Each machine translation (MT) method has its own advantages and limitations. The purpose of a hybrid MT method is to exploit the strengths of the individual methods, avoid their shortcomings, optimize the translation output, and improve the overall performance of the MT system.
     Given the resources available to us, and building on the work of related researchers, this thesis studies a Chinese-Mongolian hybrid machine translation system. We built a phrase-based Chinese-Mongolian statistical machine translation (SMT) system using existing open-source tools. We also established language resources for an automatic evaluation platform for Chinese-Mongolian machine translation. To improve the performance of the phrase-based Chinese-Mongolian SMT system, this thesis carried out the following research and experiments:
     (1) We addressed the unknown-word problem in translation by adding a Chinese-Mongolian bilingual dictionary and by performing morphological analysis of Mongolian affixes: noun case markers, noun plural forms, and genitive markers.
     (2) We proposed a Chinese sentence reordering method based on Mongolian word order to handle the large number of word-order errors produced by phrase-based SMT. First, the Chinese sentence is parsed syntactically. Then, reordering rules convert it into a form whose word order resembles that of a Mongolian sentence. Finally, the reordered Chinese sentence is passed to the statistical decoder for monotone decoding.
     (3) While studying phrase-based Chinese-Mongolian SMT, we observed errors in the translation of quantifiers from Chinese to Mongolian. We compared how quantifiers are translated between the two languages and identified one-to-one, many-to-one, one-to-zero, and one-to-many correspondences between Chinese and Mongolian quantifiers.
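     The reordering step in item (2) can be illustrated with a minimal sketch. It is not the thesis's actual rule set: the tree representation, the Penn-Chinese-Treebank-style labels, and the two head-final rules below are assumptions chosen only to show how parse-based rules can move a Chinese SVO sentence toward a verb-final order before monotone decoding.

```python
# A parse tree node is (label, [children]); a leaf is (pos_tag, word).
# Labels such as VP, PP, VV, and P are assumed here for illustration.

# Hypothetical rules: inside a phrase with the given label, children whose
# tag is in the set are moved to the end of the phrase (head-final order).
HEAD_FINAL_RULES = {
    "VP": {"VV"},  # verb moves after its complements: SVO -> SOV
    "PP": {"P"},   # preposition moves after its noun phrase
}


def reorder(tree):
    """Return a copy of the tree with the head-final rules applied bottom-up."""
    label, body = tree
    if isinstance(body, str):  # leaf: nothing to reorder
        return tree
    children = [reorder(child) for child in body]
    head_tags = HEAD_FINAL_RULES.get(label)
    if head_tags:
        heads = [c for c in children if c[0] in head_tags]
        others = [c for c in children if c[0] not in head_tags]
        children = others + heads
    return (label, children)


def leaves(tree):
    """Collect the words of a tree from left to right."""
    label, body = tree
    if isinstance(body, str):
        return [body]
    return [word for child in body for word in leaves(child)]


# Example: 我 在 学校 读 书 ("I at school read book")
EXAMPLE = ("IP", [
    ("NP", [("PN", "我")]),
    ("VP", [
        ("PP", [("P", "在"), ("NP", [("NN", "学校")])]),
        ("VV", "读"),
        ("NP", [("NN", "书")]),
    ]),
])

if __name__ == "__main__":
    print(" ".join(leaves(reorder(EXAMPLE))))
```

     For the example sentence, the sketch yields the order 我 学校 在 书 读 ("I school at book read"), which is the kind of verb-final source order that a monotone decoder can then translate without long-distance reordering.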
     Machine translation evaluation plays an important role in the development of MT technology. We provided the training corpus, development set, and test set for the Chinese-Mongolian daily evaluation task of the CWMT2009 machine translation evaluation. To prepare the corpus, we developed a rule-based program for automatically segmenting Mongolian sentences and a program for converting Mongolian Latin transliteration to UTF-8 encoding. This thesis describes the methods and process used to develop these programs. Finally, we present the experiments and an analysis of the results for the Chinese-Mongolian hybrid machine translation system.
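     As a rough illustration of rule-based segmentation, the sketch below splits running text into sentences. It assumes that "segmentation" here means sentence splitting, which the abstract does not state explicitly. The terminator set, the abbreviation list, and the function name `split_sentences` are hypothetical and are not taken from the thesis.

```python
import re

# Assumed sentence-final marks: Latin punctuation plus the Mongolian
# full stop (U+1803). A real corpus may need a different set.
TERMINATORS = ".!?\u1803"

# Hypothetical abbreviation list; a real system would need a curated one.
ABBREVIATIONS = {"dr.", "no."}

# A boundary candidate is a run of terminators followed by whitespace or
# end of text, so the dot in a decimal such as 3.5 is never a candidate.
BOUNDARY = re.compile(r"[%s]+(?=\s|$)" % re.escape(TERMINATORS))


def split_sentences(text):
    """Split text into sentences using the simple rules above."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        candidate = text[start:end].strip()
        if not candidate:
            continue
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # rule: do not split right after a known abbreviation
        sentences.append(candidate)
        start = end
    tail = text[start:].strip()
    if tail:  # keep trailing text that has no final terminator
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    # Placeholder tokens, not real Mongolian words.
    print(split_sentences("A b? C No. 3 d. E 3.5 f."))
```

     Keeping the rules in plain data structures (a terminator string and an abbreviation set) makes it easy to adjust them as new corpus-specific exceptions are found.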