Research on the Identification of English Functional Noun Phrases for Machine Translation
Abstract
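Identifying English noun phrases plays an important role in machine translation. One of the bottlenecks of current English-Chinese machine translation is the disambiguation of noun phrases. This paper proposes a notion of English functional phrases and a method for identifying them automatically, with the aim of eliminating noun-phrase structural ambiguity in English-Chinese machine translation. The boundaries of a noun phrase are determined from its grammatical function in the clause; noun phrases delimited in this way are called functional noun phrases in this paper. The research covers an analysis of noun-phrase structural ambiguity in English-Chinese machine translation, automatic English part-of-speech tagging, and English phrase identification. A small English-Chinese parallel corpus in the business domain, containing 200,000 English words and 270,000 Chinese characters, was built as the research corpus.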
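     (1) Noun-phrase structural ambiguity in English-Chinese machine translation. Using a manual machine translation evaluation method that combines fidelity and fluency, the English-Chinese output of the SYSTRAN and GOOGLE translation systems was evaluated, and the lexical and syntactic ambiguity problems in machine translation were analyzed; on this basis, the structural ambiguity problems of noun phrases in machine translation were summarized. The study shows that both lexical and syntactic ambiguity are closely related to the identification and understanding of noun phrases. N1+prep+N2 is the most typical surface structure giving rise to ambiguity, and it produces four main kinds of noun-phrase structural ambiguity: ambiguity of nouns that form fixed collocations with verbs; ambiguity caused by particles; ambiguity of "preposition + noun" used as a postmodifier; and ambiguity of "preposition + noun" used as an adverbial.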
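     (2) English part-of-speech tagging for English-Chinese machine translation. An English POS tagging system for machine translation was developed to supply part-of-speech knowledge for functional noun phrase identification. On the basis of a pilot experiment, the Penn Treebank tagset was refined and improved to form the tagset used in this paper. With this tagset, tagging was performed using a maximum entropy model combined with linguistic rules. Experimental results show an accuracy of 98.14% in the open test and an accuracy of 85.65% on unknown words.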
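     (3) English functional noun phrase identification. Both the boundaries and the syntactic functions of noun phrases are identified. First, following systemic functional grammar, the functions of functional noun phrases in the clause were summarized to form the function-chunk tagset of this paper; second, a conditional random field model combined with semantic information and rules was chosen for noun phrase identification. To test the contribution of this paper's POS tagset to functional noun phrase identification, the Stanford tagger was also used for comparison in the open test. Experimental results show that the F-score reaches 89.04% with gold-standard POS tags. In addition, using this paper's POS tagset helps noun phrase identification, improving on the Penn Treebank tagset by 2.21%.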
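The N1+prep+N2 surface structure named in item (1) can be located in POS-tagged text with a simple scan. The following is only an illustrative sketch and not the method used in the study: it assumes plain Penn Treebank tags rather than the refined tagset described in item (2), and the function name `find_n_prep_n`, the `SKIPPABLE` set, and the sample sentence are all hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, Penn Treebank POS tag)

# Tags that may stand between the preposition and N2 (assumed, not from the source).
SKIPPABLE = {"DT", "JJ", "PRP$", "CD"}


def is_noun(tag: str) -> bool:
    # NN, NNS, NNP and NNPS all start with "NN" in the Penn Treebank tagset.
    return tag.startswith("NN")


def find_n_prep_n(tokens: List[Token]) -> List[Tuple[int, int, int]]:
    """Return (n1, prep, n2) index triples for each N1+prep+N2 candidate."""
    matches = []
    for i in range(len(tokens) - 2):
        # N1 must be a noun directly followed by a preposition (IN, or TO for "to").
        if not is_noun(tokens[i][1]) or tokens[i + 1][1] not in ("IN", "TO"):
            continue
        j = i + 2
        # Skip determiners, adjectives and similar modifiers before N2.
        while j < len(tokens) and tokens[j][1] in SKIPPABLE:
            j += 1
        if j < len(tokens) and is_noun(tokens[j][1]):
            matches.append((i, i + 1, j))
    return matches


# Hypothetical business-letter sentence, tagged by hand.
sentence = [
    ("We", "PRP"), ("confirm", "VBP"), ("receipt", "NN"), ("of", "IN"),
    ("your", "PRP$"), ("order", "NN"), ("on", "IN"), ("Monday", "NNP"), (".", "."),
]

for n1, prep, n2 in find_n_prep_n(sentence):
    print(sentence[n1][0], sentence[prep][0], sentence[n2][0])
```

The scan only flags candidates. In the sample, "order on Monday" matches the pattern even though "on Monday" could be read as a postmodifier of "order" or as an adjunct of "confirm", which is the kind of ambiguity that function-based chunking is meant to resolve.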
English NP chunking plays an important role in machine translation. One major problem in machine translation is resolving the ambiguity caused by noun phrases. This paper therefore presents a study on the automatic identification of English functional noun phrases (NPs), with the aim of resolving the structural ambiguity caused by noun phrases in English-Chinese machine translation (MT). Functional noun phrases are noun phrases defined by their syntactic functions in clauses; the structural ambiguity they cause can then be resolved by identifying those functions. The study covers three aspects: an analysis of ambiguity problems in English-Chinese machine translation, MT-oriented English part-of-speech (POS) tagging, and NP chunking. The NP chunking study uses a self-built English-Chinese parallel corpus in the business domain, consisting of 200,000 English words and 270,000 Chinese characters. The main research work can be summarized as follows:
     (1) Analysis of the structural ambiguity problems caused by noun phrases in English-Chinese machine translation. This paper compares the ambiguity resolution of two MT approaches, Rule-Based Machine Translation (RBMT) and Statistical Machine Translation (SMT), by analyzing the Chinese output of the SYSTRAN and GOOGLE translation systems with a manual evaluation method that combines faithfulness and fluency. The results show that both lexical ambiguity and structural ambiguity have a lot to do with NP chunking and understanding. The surface structure N1+prep+N2 is a typical source of ambiguity. The four main structural ambiguity problems caused by NPs are: ambiguity caused by NPs that form an inseparable part of a verb phrase, ambiguity caused by particles, ambiguity caused by "prep+noun" structures functioning as postmodifiers, and ambiguity caused by "prep+noun" structures functioning as adjuncts.
     (2) MT-oriented English POS tagging. The tagger provides POS tags for the subsequent NP chunking task, for the purpose of machine translation. After analyzing the results of a pre-test, a new tagset for MT purposes was derived from the Penn Treebank tagset, and the English sentences in the corpus were annotated with it. A statistical method combined with rules is applied: a maximum entropy model is used for tagging, and a rule-based approach is used in post-processing. Experiments show that the tagger achieves an accuracy of 98.14% in the open test and 85.65% on unknown words.
     (3) NP chunking. Both the scope of NP chunks and their syntactic function types are identified in this task. The function tags of noun phrases are categorized on the basis of systemic functional grammar. A conditional random fields model is adopted, combining semantic information and language rules. System performance is further compared in the open tests with that of a model trained using Stanford POS tags. Test results show that the system achieves an F-score of 89.04% in the open test using the gold-standard tags. They also indicate that the new tagset is better suited to NP chunking: it raises the F-score by 2.21% compared with the model using the Penn Treebank POS tags.
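Because item (3) scores a chunk on both its scope and its function type, an F-score of this kind is usually computed over labeled spans rather than over individual tokens. The sketch below shows one common way to do that for a single sentence, assuming a BIO encoding. The function labels `SUBJ`, `COMP` and `ADJ` and the helper names are hypothetical, since the abstract gives neither the actual function tagset nor the evaluation script.

```python
from typing import List, Set, Tuple

Chunk = Tuple[int, int, str]  # (start index, end index exclusive, function label)


def extract_chunks(tags: List[str]) -> Set[Chunk]:
    """Collect labeled spans from one sentence of BIO tags such as 'B-SUBJ', 'I-SUBJ', 'O'."""
    chunks: Set[Chunk] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel "O" closes a trailing chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            chunks.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return chunks


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """A predicted chunk counts as correct only if boundaries and label both match."""
    gold_chunks, pred_chunks = extract_chunks(gold), extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted tags for one seven-token sentence.
gold = ["B-SUBJ", "I-SUBJ", "O", "B-COMP", "I-COMP", "O", "B-ADJ"]
pred = ["B-SUBJ", "I-SUBJ", "O", "B-COMP", "I-COMP", "O", "B-COMP"]

p, r, f = precision_recall_f1(gold, pred)
print(f"precision={p:.4f} recall={r:.4f} F1={f:.4f}")
```

In the sample, the last chunk has the right boundary but the wrong function label, so it counts as an error for both precision and recall. For a whole test set, the spans would also need a sentence index so that chunks from different sentences are not confused.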