Chinese-Uyghur Neural Machine Translation Model Fusing Multiple Features (融合多特征的汉维神经网络机器翻译模型)
  • English title: Optimized Chinese-Uyghur neural machine translation model based on multi-features
  • Author: ZHU Shun-le (朱顺乐)
  • Author affiliation: Donghai Science and Technology College, Zhejiang Ocean University (浙江海洋大学东海科学技术学院)
  • Keywords: Chinese-Uyghur neural machine translation; log-linear model; feature; morphological analysis; language model
  • Journal code: SJSJ
  • Journal: Computer Engineering and Design (计算机工程与设计)
  • Publication date: 2019-05-15
  • Publisher: Computer Engineering and Design (计算机工程与设计)
  • Year: 2019
  • Issue: v.40; No.389
  • Funding: Zhejiang Provincial Natural Science Foundation project (LY16F020014); Zhejiang Provincial Natural Science Foundation Youth Science Fund project (LQ16A010003)
  • Language: Chinese
  • Page: SJSJ201905051
  • Page count: 6
  • CN: 05
  • ISSN: 11-1775/TP
  • Classification number: 292-296+309
Abstract
Chinese-Uyghur neural machine translation suffers from a large number of out-of-vocabulary (OOV) words, from the difficulty of generating Uyghur morphology on the target side, and from mismatches in how Chinese and Uyghur words express meaning (n-to-1 alignment). To address these problems, a Chinese-Uyghur translation strategy is proposed that fuses an encoder-decoder feature, a Uyghur stem-affix language model feature, and a bidirectional (Chinese-to-Uyghur and Uyghur-to-Chinese) word alignment feature. Taking into account the differences between the two languages and the scarcity of Chinese-Uyghur language resources, bilingual knowledge from statistical machine translation is introduced into the neural machine translation model, which alleviates data sparsity and morphological differences during model training. The features are combined through a log-linear model. Experimental results show that the method effectively improves Chinese-Uyghur neural machine translation, with an average BLEU gain of more than 2.0 over the compared methods.
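As an illustration of how a log-linear model can combine several features, the following minimal sketch scores each candidate translation as a weighted sum of log-domain feature values, score(x, y) = sum_i lambda_i * h_i(x, y), and picks the highest-scoring candidate. The record does not give the feature definitions, the weights, or whether the combination is applied during decoding or as reranking, so the feature names, the numeric values, and the reranking setup below are all hypothetical.

```python
def log_linear_score(features, weights):
    """Return the weighted sum of feature values: sum_i lambda_i * h_i(x, y).

    `features` maps a feature name to its log-domain value for one candidate;
    `weights` maps the same names to their lambda weights.
    """
    return sum(weights[name] * value for name, value in features.items())


def rerank(candidates, weights):
    """Sort candidate translations by descending log-linear score."""
    return sorted(
        candidates,
        key=lambda cand: log_linear_score(cand["features"], weights),
        reverse=True,
    )


# Hypothetical weights; in practice they would be tuned on held-out data.
weights = {
    "encoder_decoder": 1.0,  # log-probability from the NMT model
    "stem_affix_lm": 0.5,    # Uyghur stem-affix language model log-probability
    "align_zh_ug": 0.3,      # Chinese-to-Uyghur word alignment score
    "align_ug_zh": 0.3,      # Uyghur-to-Chinese word alignment score
}

# Hypothetical candidates with made-up feature values.
candidates = [
    {"text": "candidate A",
     "features": {"encoder_decoder": -2.0, "stem_affix_lm": -6.0,
                  "align_zh_ug": -3.0, "align_ug_zh": -3.0}},
    {"text": "candidate B",
     "features": {"encoder_decoder": -2.5, "stem_affix_lm": -3.0,
                  "align_zh_ug": -2.0, "align_ug_zh": -2.0}},
]

best = rerank(candidates, weights)[0]
print(best["text"])
```

In this toy setup candidate A has the better encoder-decoder score, but candidate B wins once the language model and alignment features are included. This is the kind of correction that the extra features are meant to provide.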
