Research on Consensus Decoding Methods for Statistical Machine Translation
Abstract
Machine translation (MT) research has advanced rapidly over the past two decades. Compared with traditional rule-based and example-based approaches, statistical machine translation (SMT) has shown clear advantages in both translation quality and system robustness, and has become the mainstream approach in MT research.
     Given a source-language input, the task of machine translation is to generate a target-language output that expresses the same meaning as the input. In carrying out this task, a typical SMT system usually produces many different target-language translation hypotheses. However, owing to the limitations of the translation model itself, the hypothesis ranked highest by the system (the 1-best hypothesis) is often not the best result among all hypotheses; moreover, the emergence of many SMT systems built on different translation models further enlarges the set of candidate translations available for the same source input. Against this background, how to make effective use of different translation hypotheses and the information they contain in order to obtain better translations has become a hot topic in MT research in recent years. This doctoral thesis presents a series of in-depth and systematic studies of this topic, organized as follows:
     First, the thesis groups existing work on this topic into two broad categories and proposes a unified consensus decoding framework that subsumes both of them:
     · Translation hypothesis re-ranking, applied mainly to the hypothesis space of a single SMT system. Using a chosen evaluation criterion, such methods re-score and re-rank all hypotheses in the hypothesis space, and take the top-ranked hypothesis after re-ranking as the final translation;
     · System combination, applied mainly to the hypothesis spaces of multiple SMT systems. Depending on the translation unit used, such methods can be further divided into three levels: sentence-level, phrase-level, and word-level. Among these, word-level system combination yields the largest improvements and has therefore attracted the most research. Then, to address the shortcomings of existing representative work, the thesis proposes four new consensus decoding methods:
     · Feature subspace-based sentence-level system combination. Given any (primary) SMT system based on a log-linear model, multiple (sub-)SMT systems are constructed by selecting different subsets of the system's full feature set; a sentence-level system combination method is then applied to the outputs of all systems to select the final translation.
     The contribution of this method is a simple and effective way of constructing multiple SMT systems, which greatly reduces the heavy cost of building multiple translation systems for system combination;
     · Collaborative decoding. Given multiple log-linear SMT systems, the systems interact by sharing their translation hypothesis spaces: each system re-ranks its own (partial and full) hypothesis space using a set of n-gram statistic features computed from the hypothesis spaces generated by the other systems. After collaborative decoding, further gains can be obtained by applying system combination.
     The contribution of this method is that it acts directly during decoding and thus, to a certain extent, prevents better partial translation hypotheses from being pruned too early;
     · Mixture model-based minimum Bayes-risk (MBR) decoding. A mixture model integrates the hypothesis probability distributions of multiple SMT systems; the combined distribution is then used to compute, over the merged hypothesis space of all systems, the n-gram statistic features required by MBR decoding, and the final translation is selected from all candidates. Compared with traditional MBR decoding, mixture model-based MBR decoding has access to more, and more diverse, hypotheses, and the n-gram statistics computed under the mixture distribution are more accurate, so it yields larger improvements.
     The contribution of this method is that it extends minimum Bayes-risk decoding from a single SMT system to multiple SMT systems;
     · Hypothesis mixture decoding. Partial translation hypotheses from multiple SMT systems are used to build a larger mixture hypothesis space, and a set of consensus-based statistic features is used to select the final translation from that space. MT evaluation experiments on large-scale data show that this method significantly outperforms both translation hypothesis re-ranking and word-level system combination.
     The contribution of this method is that it inherits the advantages of both translation hypothesis re-ranking and system combination: it can use partial hypotheses generated by arbitrary SMT systems to build a larger hypothesis space, and it can produce new translations beyond the existing candidate sets.
     In the chapter devoted to each method, the thesis validates its effectiveness through Chinese-to-English MT evaluation experiments on large-scale data.
     Finally, the thesis concludes and discusses future work.
     The consensus decoding methods studied in this thesis all target statistical machine translation. The ideas behind them, however, apply equally to many other natural language processing tasks, such as statistical parsing, automatic speech recognition, and automatic word alignment. In future work we will try to extend consensus decoding methods to these areas as well.
Research on machine translation (MT) has grown rapidly over the past twenty years. Compared with traditional rule-based and example-based MT methods, statistical machine translation (SMT) has shown clear advantages in translation quality and system robustness, and has become the mainstream approach in the MT field.
     Given a source language input, the task of machine translation is to seek a target language translation that expresses the same meaning as the source input. During this process, a state-of-the-art SMT system usually generates multiple translation hypotheses. However, due to the limitations of the translation model itself, the 1-best hypothesis, i.e., the one assigned the maximum model score, is often not the best translation among all candidates in the hypothesis space. Furthermore, with the emergence of various SMT systems based on different translation models, the number of translation hypotheses available for an identical source input becomes even larger. Against this background, how to generate better translations by making use of the information contained in existing translation hypotheses has become a hot topic in current SMT research. In this thesis, we carry out a series of in-depth and systematic explorations of this topic and present several effective approaches. The thesis is organized as follows:
     First, we classify the most representative work on this topic into two categories and present a unified consensus decoding framework that covers both of them:
     · Translation hypothesis re-ranking, which usually applies to the outputs of a single SMT system. Using specific evaluation criteria, such methods compute new scores for all hypotheses in the hypothesis space and re-rank them based on these scores. The final translation is the 1-best hypothesis after re-ranking (a minimal sketch of this consensus-style selection appears right after this list);
     · System combination, which usually applies to the outputs of multiple SMT systems. Depending on the translation unit used, such methods can be further classified into three levels: sentence-level, phrase-level, and word-level system combination. Compared with the other two, word-level system combination provides the largest improvements and has therefore received the most attention over the past ten years.
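     Both categories share one underlying operation: pool a set of translation hypotheses and pick the one that agrees most with the rest. The Python sketch below illustrates that shared step on plain n-best lists; the uniform hypothesis weighting, the 4-gram order, and all function names are assumptions made for this illustration, not the thesis's actual formulation.

    from collections import Counter

    def ngrams(tokens, n):
        # All n-grams of a token list, with counts.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def consensus_score(hyp, pool, max_order=4):
        # Sum of clipped n-gram matches between `hyp` and every other pooled hypothesis.
        score = 0
        for other in pool:
            if other is hyp:
                continue
            for n in range(1, max_order + 1):
                score += sum((ngrams(hyp, n) & ngrams(other, n)).values())
        return score

    def select_consensus(nbest_lists):
        # Pool the n-best lists of one or more systems (uniform weighting assumed)
        # and return the hypothesis that agrees most with the rest of the pool.
        pool = [hyp.split() for nbest in nbest_lists for hyp in nbest]
        best = max(pool, key=lambda h: consensus_score(h, pool))
        return " ".join(best)

     Viewed this way, re-ranking (one system) and sentence-level combination (several systems) differ mainly in where the pool comes from, which is the sense in which a single consensus decoding framework can cover both.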
     Then, we present four new consensus-based decoding approaches that address the shortcomings of current state-of-the-art methods; a small illustrative sketch of each approach follows this list of methods. The four approaches are:
     · Feature subspace-based sentence-level system combination. Given any log-linear SMT system, this approach first constructs a number of sub-systems using different subsets of the full feature set of the baseline system. A sentence-level system combination method is then used to select the final translation from the outputs of both the baseline system and its sub-systems.
     The contribution of this work is a simple and effective way to construct multiple SMT systems, which greatly reduces the heavy cost of multi-system construction in the system combination framework;
     · Collaborative decoding. Given multiple log-linear SMT systems, this approach re-ranks both the partial and the full hypothesis spaces of each SMT system by leveraging a set of n-gram statistics estimated on the hypothesis spaces provided by all participating systems. Applying system combination to the outputs of the systems after collaborative decoding yields further improvements.
     The contribution of this work is that, to a certain extent, it prevents better partial hypotheses from being pruned in early decoding stages;
     · Mixture model-based minimum Bayes-risk (MBR) decoding. This approach first integrates the hypothesis distributions of multiple SMT systems using a mixture model. The final translation is then selected from the combined hypothesis space using n-gram statistics estimated under the mixture probability distribution. Because more hypotheses can be explored and more accurate n-gram statistics are used as features for Bayes-risk computation, this approach performs significantly better than traditional MBR decoding.
     The contribution of this work is that it extends MBR decoding from a single SMT system to multiple SMT systems;
     · Hypothesis mixture decoding. This approach first constructs a larger mixture hypothesis space using partial hypotheses from multiple SMT systems. A set of consensus-based statistics is then used as features to select the final translation from this mixture hypothesis space. Experiments on large-scale data sets show that this approach outperforms both translation hypothesis re-ranking and system combination.
     The contribution of this work is that it inherits the advantages of both translation hypothesis re-ranking and system combination: it can leverage partial hypotheses from arbitrary SMT systems to construct the hypothesis space, and it can generate new translation candidates beyond the existing hypothesis spaces.
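     As a rough picture of the first approach: a log-linear system scores a translation as a weighted sum of feature values, so dropping features yields inexpensive additional systems. The sketch below derives one sub-system per left-out feature; the leave-one-out scheme and the dictionary-based weight representation are assumptions made for this example, not necessarily how the thesis chooses its feature subsets.

    def loglinear_score(features, weights):
        # Standard log-linear model score: a weighted sum of the active feature values.
        return sum(weights[name] * value for name, value in features.items() if name in weights)

    def make_subsystems(full_weights):
        # Derive one sub-system per left-out feature by removing that feature's weight.
        # Each sub-system re-decodes the same input with its reduced weight vector, and
        # the resulting outputs feed a sentence-level combination step.
        return [{k: v for k, v in full_weights.items() if k != left_out}
                for left_out in full_weights]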
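     The second approach can be pictured as a mutual re-scoring loop between decoders. The sketch below shows the re-ranking step on completed n-best lists for simplicity, whereas in the thesis the same kind of consensus feature is applied to partial hypotheses during decoding; the additive interpolation, the fixed weight lam, and the function names are assumptions made for this illustration.

    def ngram_agreement(hyp, other_hyps, max_order=4):
        # Count how many n-grams of `hyp` (a token list) also occur in the
        # hypotheses produced by the other systems.
        agreement = 0
        for n in range(1, max_order + 1):
            hyp_ngrams = set(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
            for other in other_hyps:
                other_ngrams = set(tuple(other[i:i + n]) for i in range(len(other) - n + 1))
                agreement += len(hyp_ngrams & other_ngrams)
        return agreement

    def collaborative_rerank(own_nbest, other_systems_nbests, lam=0.1):
        # Re-rank one system's n-best list of (model_score, tokens) pairs using
        # agreement with hypotheses produced by the other systems.
        others = [hyp for nbest in other_systems_nbests for _, hyp in nbest]
        rescored = [(score + lam * ngram_agreement(hyp, others), hyp)
                    for score, hyp in own_nbest]
        return sorted(rescored, key=lambda x: x[0], reverse=True)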
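     The third approach is captured by the standard minimum Bayes-risk decision rule, evaluated over the merged hypothesis space and under a mixture of the member systems' posteriors. The notation below (loss L, mixture weights \lambda_k, merged space E) follows common MBR formulations rather than the thesis's exact symbols.

    % MBR rule over the merged hypothesis space E, with the posterior taken as a
    % weighted mixture of the K member systems' distributions.
    \hat{e} = \operatorname*{argmin}_{e' \in E} \sum_{e \in E} L(e', e)\, P(e \mid f),
    \qquad
    P(e \mid f) = \sum_{k=1}^{K} \lambda_k\, P_k(e \mid f), \quad \sum_{k=1}^{K} \lambda_k = 1 .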
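     The fourth approach differs from the previous ones in that the selected translation need not appear in any single system's output. The sketch below recombines partial hypotheses, represented as (source span, target fragment) pairs, into new full candidates by covering the source sentence from left to right; this monotone, gap-free recombination and the naive pruning are strong simplifications made for illustration, and the final selection can reuse the consensus scoring sketched earlier.

    def build_mixture_space(partials, src_len, beam=100):
        # partials: list of ((start, end), target_tokens) contributed by any system.
        # Returns full candidates whose fragments cover source positions [0, src_len).
        stack = [(0, [])]                        # (source position covered, target so far)
        for _ in range(src_len):
            new_stack = []
            for pos, target in stack:
                if pos == src_len:               # already complete: carry forward
                    new_stack.append((pos, target))
                    continue
                for (start, end), frag in partials:
                    if start == pos:             # extend with a fragment starting here
                        new_stack.append((end, target + frag))
            stack = new_stack[:beam]             # naive pruning, for the sketch only
        return [target for pos, target in stack if pos == src_len]

     Because fragments from different systems can be mixed inside one candidate, the resulting space contains translations that none of the participating systems proposed in full, which is the property the contribution statement above refers to.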
     In each chapter, solid experiments on large-scale MT evaluation tasks are performed to demonstrate the effectiveness of the corresponding approach.
     Finally, we conclude the thesis and outline future work. The consensus decoding approaches presented in this thesis target MT tasks; however, the philosophy behind such consensus-based techniques is also suited to many other natural language processing tasks, such as statistical parsing, automatic speech recognition, and automatic word alignment. In the future, we will try to extend the application of such consensus-based approaches to these research fields as well.
