Domain adaptation of lattice-free MMI based TDNN models for speech recognition

Authors: Yanhua Long; Yijie Li; Hone Ye; Hongwei Mao
Keywords: Domain adaptation; Kullback–Leibler divergence regularization; Time-delay deep neural network; Acoustic modeling
Journal: International Journal of Speech Technology
Publication year: 2017
Publication date: March 2017
Volume: 20
Issue: 1
Pages: 171-178
Journal category: Engineering
Journal subjects: Signal, Image and Speech Processing; Social Sciences, general; Artificial Intelligence (incl. Robotics)
Publisher: Springer US
ISSN: 1572-8110

Abstract

The recently proposed time-delay deep neural network (TDNN) acoustic models trained with the lattice-free maximum mutual information (LF-MMI) criterion have been shown to give significant performance improvements over other deep neural network (DNN) models in a variety of speech recognition tasks. Meanwhile, Kullback–Leibler divergence (KLD) regularization has been validated as an effective adaptation method for DNN acoustic models. However, to the best of our knowledge, no work has been reported on whether the KLD-based method is also effective for LF-MMI based TDNN models, especially for domain adaptation. In this study, we generalized KLD-regularized model adaptation to train domain-specific TDNN acoustic models. Several distinct and important observations were obtained. Experiments were performed on Cantonese-accented, in-car, and far-field noisy Mandarin speech recognition tasks. Results demonstrated that the proposed domain-adapted models achieve a relative word error rate reduction of around 7–29% on these tasks, even when only around 1K adaptation utterances are available.
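
For context, KLD-regularized adaptation in its commonly described frame-level cross-entropy form can be read as interpolating the hard adaptation label with the posterior of the unadapted model, using a weight rho between 0 and 1. The sketch below illustrates only that general idea. It is not the LF-MMI formulation developed in the paper, and the function names, the example posteriors, and the value of rho are assumptions chosen for illustration.

```python
import math

def kld_regularized_targets(one_hot, base_posteriors, rho):
    """Interpolate the hard label with the unadapted model's posterior.

    rho = 0 reduces to plain cross-entropy training on the hard label;
    rho = 1 keeps the adapted model tied to the unadapted model's output.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return [(1.0 - rho) * y + rho * p for y, p in zip(one_hot, base_posteriors)]

def cross_entropy(targets, adapted_posteriors):
    """Cross-entropy of the adapted model's output against the soft targets."""
    return -sum(t * math.log(q)
                for t, q in zip(targets, adapted_posteriors) if t > 0.0)

# Hypothetical single frame with three output classes.
one_hot = [0.0, 1.0, 0.0]          # hard label from the adaptation data
base_posteriors = [0.2, 0.7, 0.1]  # output of the unadapted (source-domain) model
targets = kld_regularized_targets(one_hot, base_posteriors, rho=0.5)
loss = cross_entropy(targets, [0.1, 0.8, 0.1])
```

A larger rho keeps the adapted model closer to the original one, which is the usual motivation when adaptation data are scarce.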
