Abstract
Machine translation uses computers to convert text between different languages automatically, without human intervention, and is an important research area of natural language processing and artificial intelligence. Neural machine translation (NMT) is a brand-new machine translation paradigm that uses a neural network to map a source-language sentence to a target-language sentence. In recent years it has produced abundant research results and has surpassed statistical machine translation on many language pairs. This paper first introduces the basic ideas and main methods of neural machine translation, then surveys the latest advances, and finally discusses directions for future development.
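The core computation behind modern NMT decoders is an attention step: at each target position, the decoder scores every source position, normalizes the scores with a softmax, and takes a weighted sum of the encoder states as the context for predicting the next word. The sketch below is a hypothetical illustration, not code from this paper; it uses a simplified dot-product score rather than the additive scoring of Bahdanau-style attention, and plain Python lists in place of a real tensor library.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    decoder_state:  list[float], the current decoder hidden state
    encoder_states: list[list[float]], one hidden state per source position
    """
    # score each source position by its dot product with the decoder state
    scores = [sum(d * h for d, h in zip(decoder_state, hs))
              for hs in encoder_states]
    weights = softmax(scores)
    # context vector: attention-weighted sum of the encoder states
    dim = len(encoder_states[0])
    context = [sum(w * hs[i] for w, hs in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# toy example: a 3-word source sentence, hidden size 2
enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec = [1.0, 0.0]
w, c = attention(dec, enc)
```

Because the first and third source states score equally against this decoder state, they receive equal weight, and the second (orthogonal) state receives less; the context vector blends the source states accordingly.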