SVitchboard-II and FiSVer-I: Crafting high quality and low complexity conversational English speech corpora using submodular function optimization


Authors: Yuzong Liu (yzliu@uw.edu); Rishabh Iyer; Katrin Kirchhoff; Jeff Bilmes
Keywords: Submodular function optimization; Automatic speech recognition; Speech corpus
Journal: Computer Speech & Language
Publication year: 2017
Publication date: March 2017
Volume: 42
Issue: Complete
Pages: 122-142

Abstract

We introduce a set of conversational English speech corpora with high acoustic quality and limited vocabulary, derived from the Switchboard-I and Fisher datasets. We investigate several state-of-the-art submodular function optimization procedures, including SCSC/SCSK, DS, and SFM optimization. We survey different submodular function instantiations, in which both "acoustic quality" and vocabulary size are measured by submodular functions. We provide baseline word recognition results on all of the resulting speech corpora for both Gaussian mixture model (GMM) and deep neural network (DNN) based systems. We have released all of the corpus definitions and Kaldi training recipes for free in the public domain.
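To make the idea of trading acoustic quality against vocabulary size concrete, the following is a minimal sketch, not the authors' procedure. It assumes a toy setting in which each utterance has a scalar quality score and a set of word types, and it uses a simple greedy ratio heuristic rather than the SCSC/SCSK, DS, or SFM algorithms named in the abstract. The function name `greedy_select`, the utterance IDs, the scores, and the budget are all hypothetical. The vocabulary size of a selection is the size of a union of word sets, which is a standard example of a submodular function.

```python
from typing import Dict, List, Set

def greedy_select(words: Dict[str, Set[str]],
                  quality: Dict[str, float],
                  vocab_budget: int) -> List[str]:
    """Greedily pick utterances with high quality per newly added word type,
    keeping the total vocabulary (a set-union, hence submodular, cost)
    within vocab_budget. Illustrative heuristic only."""
    selected: List[str] = []
    vocab: Set[str] = set()
    remaining = set(words)
    while remaining:
        best, best_ratio = None, -1.0
        # sorted() makes tie-breaking deterministic
        for utt in sorted(remaining):
            new_words = words[utt] - vocab
            if len(vocab) + len(new_words) > vocab_budget:
                continue  # adding this utterance would exceed the budget
            # quality gained per unit of vocabulary growth (+1 avoids /0)
            ratio = quality[utt] / (1 + len(new_words))
            if ratio > best_ratio:
                best, best_ratio = utt, ratio
        if best is None:
            break  # no remaining utterance fits within the budget
        selected.append(best)
        vocab |= words[best]
        remaining.remove(best)
    return selected

# Hypothetical toy data: utterance ID -> word types, and a quality score
toy_words = {
    "utt1": {"yeah", "right"},
    "utt2": {"yeah", "okay"},
    "utt3": {"submodular", "optimization", "corpus"},
    "utt4": {"right", "okay"},
}
toy_quality = {"utt1": 1.0, "utt2": 0.8, "utt3": 0.9, "utt4": 0.5}

chosen = greedy_select(toy_words, toy_quality, vocab_budget=3)
chosen_vocab = set().union(*(toy_words[u] for u in chosen))
print(chosen, sorted(chosen_vocab))
```

In this toy run, the utterance with three rare word types is left out because admitting it would exceed the vocabulary budget, while utterances that reuse already-covered words are cheap to add. That diminishing cost of adding utterances whose words are already covered is the property the submodular formulation exploits.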
