Labeling hierarchical phrase-based models without linguistic resources
  • Authors: Gideon Maillette de Buy Wenniger; Khalil Sima’an
  • Keywords: Hierarchical statistical machine translation; Reordering; Reordering labels; Soft constraints
  • Journal: Machine Translation
  • Publication date: December 2015
  • Volume: 29
  • Issue: 3-4
  • Pages: 225-265
  • Full-text size: 1,647 KB
  • Author affiliations: Gideon Maillette de Buy Wenniger (1)
    Khalil Sima’an (1)

    1. Institute for Logic, Language and Computation, University of Amsterdam, Amsterdam, The Netherlands
  • Journal category: Humanities, Social Sciences and Law
  • Journal subjects: Linguistics
    Computational Linguistics
    Language Translation and Linguistics
    Artificial Intelligence and Robotics
  • Publisher: Springer Netherlands
  • ISSN: 1573-0573
Abstract
Long-range word order differences are a well-known problem for machine translation. Unlike the standard phrase-based models, which work with sequential and local phrase reordering, the hierarchical phrase-based model (Hiero) embeds the reordering of phrases within pairs of lexicalized context-free rules. This allows the model to handle long-range reordering recursively. However, the Hiero grammar works with a single nonterminal label, which means that rules are combined into derivations independently and without reference to context outside the rules themselves. Follow-up work explored remedies involving nonterminal labels obtained from monolingual parsers and taggers. As yet, no labeling mechanisms exist for the many languages for which there are no good-quality parsers or taggers. In this paper we contribute a novel approach for acquiring reordering labels for Hiero grammars directly from the word-aligned parallel training corpus, without the use of any taggers or parsers. The new labels represent types of alignment patterns in which a phrase pair is embedded within larger phrase pairs. In order to obtain alignment patterns that generalize well, we propose to decompose word alignments into trees over phrase pairs. Besides this labeling approach, we contribute coarse and sparse features for learning soft, weighted label substitution as opposed to standard substitution. We report extensive experiments comparing our model to two baselines: Hiero and the well-known syntax-augmented machine translation (SAMT) variant, which labels Hiero rules with nonterminals extracted from monolingual syntactic parses. We also test a simplified labeling scheme based on inversion transduction grammar (ITG). For the Chinese–English task we obtain a performance improvement of up to 1 BLEU point, whereas for the German–English task, where morphology is an issue, a minor (but statistically significant) improvement of 0.2 BLEU points is reported over SAMT. While ITG labeling does give a performance improvement, it is sometimes suboptimal relative to our proposed labeling scheme.
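
Illustrative sketch (not taken from the paper): the abstract describes labels that capture the type of alignment pattern around a phrase pair, and a simplified ITG-based variant. As a rough intuition only, the Python snippet below classifies the permutation that a word alignment induces over the children of a phrase pair. The label names (`ATOMIC`, `MONOTONE`, `INVERTED`, `BINARIZABLE`, `COMPLEX`) and both function names are hypothetical and are not claimed to match the authors' label set or implementation. The binarizability test is the standard shift-reduce check for whether a permutation can be built from monotone and inverted binary steps, as in ITG.

```python
from typing import List, Tuple


def is_binarizable(perm: List[int]) -> bool:
    """Return True if perm can be built using only binary monotone/inverted
    merges (ITG-style). Uses a greedy shift-reduce pass over target spans."""
    stack: List[Tuple[int, int]] = []  # each entry is a contiguous (lo, hi) target span
    for x in perm:
        stack.append((x, x))
        # Reduce while the two topmost spans are adjacent on the target side.
        while len(stack) >= 2:
            lo1, hi1 = stack[-2]
            lo2, hi2 = stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1


def reordering_label(perm: List[int]) -> str:
    """Assign a coarse, hypothetical reordering label to a permutation.

    perm[i] is the target-side rank of the i-th source-side child."""
    n = len(perm)
    if sorted(perm) != list(range(n)):
        raise ValueError("perm must be a permutation of 0..n-1")
    if n < 2:
        return "ATOMIC"
    if perm == list(range(n)):
        return "MONOTONE"
    if perm == list(range(n - 1, -1, -1)):
        return "INVERTED"
    return "BINARIZABLE" if is_binarizable(perm) else "COMPLEX"


if __name__ == "__main__":
    for p in ([0, 1, 2], [2, 1, 0], [1, 0, 2], [1, 3, 0, 2]):
        print(p, reordering_label(p))
```

In this sketch, permutations such as `[1, 3, 0, 2]` cannot be reduced to a single span by binary merges, which is the kind of pattern a purely ITG-based labeling would not distinguish further.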
