Abstract
As China's "Belt and Road" (B&R) strategy advances, Xinjiang should make full use of its regional advantages and work to build the core area of the Silk Road Economic Belt. Improving the quality of machine translation between Chinese and Uyghur (Chinese-Uyghur) is therefore of considerable practical significance. In this paper, low-frequency Uyghur words are segmented into stems and affixes, and comparative Chinese-Uyghur machine translation experiments are carried out on three different machine translation systems. The method not only reduces the vocabulary size, thereby alleviating the out-of-vocabulary (OOV) problem, but also greatly improves translation quality. The statistical machine translation system benefits the most, gaining 3.29 BLEU points over the original, unsegmented setup.
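The abstract does not spell out the segmentation procedure, so the following is only a minimal sketch of the general idea: words that occur fewer than a threshold number of times are split into a stem plus affix tokens, while frequent words are left intact. The suffix list `TOY_SUFFIXES`, the threshold `min_count`, the `+` affix marker, and the greedy suffix-stripping rule are all assumptions made for illustration; they stand in for whatever morphological analyzer the authors actually used.

```python
from collections import Counter

# Hypothetical, illustrative suffix inventory in Latin transliteration.
# A real system would rely on a proper Uyghur morphological analyzer.
TOY_SUFFIXES = ["ning", "lar", "ler", "din", "ni", "da", "de"]


def split_stem_affixes(word, suffixes=TOY_SUFFIXES, min_stem_len=3):
    """Greedily strip the longest matching suffix from the right, repeatedly.

    Returns [stem, "+affix1", "+affix2", ...] in surface order.
    """
    affixes = []
    ordered = sorted(suffixes, key=len, reverse=True)
    stripped = True
    while stripped:
        stripped = False
        for suf in ordered:
            # Keep at least min_stem_len characters as the stem.
            if word.endswith(suf) and len(word) - len(suf) >= min_stem_len:
                affixes.insert(0, "+" + suf)
                word = word[: -len(suf)]
                stripped = True
                break
    return [word] + affixes


def segment_low_frequency(sentences, min_count=2):
    """Segment only tokens whose corpus frequency is below min_count."""
    counts = Counter(tok for sent in sentences for tok in sent.split())
    segmented = []
    for sent in sentences:
        out = []
        for tok in sent.split():
            if counts[tok] < min_count:
                out.extend(split_stem_affixes(tok))
            else:
                out.append(tok)
        segmented.append(" ".join(out))
    return segmented


def vocabulary_size(sentences):
    """Number of distinct whitespace-separated tokens."""
    return len({tok for sent in sentences for tok in sent.split()})


if __name__ == "__main__":
    corpus = ["kitab kitab", "kitablar", "kitablarni", "kitabni"]
    segmented = segment_low_frequency(corpus, min_count=2)
    print(segmented)
    print(vocabulary_size(corpus), "->", vocabulary_size(segmented))
```

On this toy corpus the rare inflected forms collapse onto a shared stem and a small set of affix tokens, which is the mechanism by which segmentation shrinks the vocabulary and reduces OOV words. The affix tokens would need to be rejoined to their stems after translation.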