SVitchboard-II and FiSVer-I: Crafting high quality and low complexity conversational English speech corpora using submodular function optimization
Abstract
We introduced a set of conversational English speech corpora with high acoustic quality and limited vocabulary, derived from the Switchboard-I and Fisher datasets. We investigated numerous state-of-the-art submodular function optimization procedures, including SCSC/SCSK, DS, and SFM optimization. We surveyed different submodular function instantiations, in which both "acoustic quality" and vocabulary size are measured by various submodular functions. We provided baseline word recognition results on all of the resulting speech corpora for both Gaussian mixture model (GMM) and deep neural network (DNN)-based systems. We have released all of the corpora definitions and Kaldi training recipes for free in the public domain.
