Abstract
Speech synthesis is an important research area in Kazakh information processing. Converting the Arabic numerals in Kazakh text into their spoken-form text is an important preprocessing step for speech synthesis. This paper uses a rule base and an N-gram model to convert the various kinds of numbers that occur in text into their correct readings, providing high-quality number-reading text for research on Kazakh speech synthesis. The method is intended to improve the quality of speech synthesis for Kazakh and for other languages with similar characteristics.
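To make the rule-base idea concrete, the following is a minimal sketch of how a digit string might be expanded into Kazakh number words. It is not the paper's implementation: the function names `cardinal` and `digit_by_digit` are hypothetical, the word tables use the Cyrillic orthography (the paper may target a different script), and omitting "бір" before "жүз" and "мың" is an assumed convention. The N-gram step that would choose between readings from context is not shown.

```python
# Illustrative rule tables (assumption: Cyrillic-script Kazakh cardinals).
UNITS = ["нөл", "бір", "екі", "үш", "төрт", "бес",
         "алты", "жеті", "сегіз", "тоғыз"]
TENS = ["", "он", "жиырма", "отыз", "қырық", "елу",
        "алпыс", "жетпіс", "сексен", "тоқсан"]
HUNDRED = "жүз"
THOUSAND = "мың"


def _below_thousand(n):
    """Return the word list for 0 <= n < 1000 (empty list for 0)."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        # Assumed convention: a leading "one" is dropped before "hundred".
        if hundreds > 1:
            words.append(UNITS[hundreds])
        words.append(HUNDRED)
    tens, units = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if units:
        words.append(UNITS[units])
    return words


def cardinal(n):
    """Read a non-negative integer below one million as a cardinal number."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only covers 0..999999")
    if n == 0:
        return UNITS[0]
    words = []
    thousands, rest = divmod(n, 1000)
    if thousands:
        # Assumed convention: a leading "one" is dropped before "thousand".
        if thousands > 1:
            words.extend(_below_thousand(thousands))
        words.append(THOUSAND)
    words.extend(_below_thousand(rest))
    return " ".join(words)


def digit_by_digit(digits):
    """Read a digit string one digit at a time (e.g. phone or code numbers)."""
    return " ".join(UNITS[int(ch)] for ch in digits)


if __name__ == "__main__":
    print(cardinal(2010))
    print(digit_by_digit("110"))
```

In a full system, a context model such as the N-gram model mentioned in the abstract would decide which reading (cardinal, digit by digit, or another type) applies to each number token; here the caller makes that choice explicitly.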