Research on Key Problems in English Discourse Structure Analysis
Abstract
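Over the past three years, Discourse Structure Analysis (DSA) has attracted wide attention in the computational linguistics community (by one count, every ACL, COLING and EMNLP conference publishes more than 8 papers on discourse structure analysis, out of more than 30 submissions in this area). DSA has become a new research hotspot, following the traditional fields of information extraction/information retrieval, machine translation and syntactic/semantic analysis.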
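     DSA studies the internal structure of natural language text: it analyzes the context of text units (words, phrases, clauses, sentences or paragraphs) globally in order to understand the semantic relations between them. It can therefore extract rich structured information from within a text, and it plays a vital role in both natural language understanding and natural language generation. Mainstream DSA research currently concentrates on lexical-level information in discourse, such as words, morphological variants of words and word pairs. Information such as the attitude of sentences and the way sentences cohere has received little study, which keeps the performance of current discourse structure analysis low.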
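     In view of this, this dissertation addresses problems of wide concern in the community and carries out research in the following three areas. Specifically: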
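     1. Research on Implicit Discourse Relation Recognition (IDRR). Building on a study of IDRR models based on word pairs, language models and tree kernel functions, this dissertation proposes an IDRR model based on attitude prosody theory. The model recognizes implicit discourse relations by computing the attitude/sentiment of sentences, and uses a composite kernel to integrate a dependency word-pair tree kernel structure. Experiments on the international benchmark corpus Penn Discourse Treebank (PDTB) 2.0 show that the attitude prosody theory-based model significantly improves IDRR accuracy over current methods based on word pairs, language models and tree kernel functions.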
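     2. Research on Discourse Argument Identification (DAI). This dissertation handles DAI separately for the intra-sentence case (the connective and its arguments are in the same sentence) and the inter-sentence case (the connective and its arguments are not in the same sentence). For the intra-sentence case, building on a study of DAI models based on chunking, classification and syntactic tree subtraction, it proposes a DAI model based on a shallow semantic parsing framework. The model treats the discourse connective as a predicate and maps the predicate's arguments onto constituents in the parse tree. This lifts the chunk-level analysis of traditional methods to the parse tree level, where rich syntactic information is available, and takes constituents rather than words as the unit of argument identification. For the inter-sentence case, it proposes a lightweight rule-based solution: the word sequence from the connective to the end of the current sentence, and the sentence preceding the connective, are taken as the two arguments of the connective. Experiments on the international benchmark corpus PDTB show that the shallow semantic parsing framework-based model significantly improves the F1 score of DAI over current chunking-based methods.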
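     3. Research on Discourse Coherence Modeling (DCM). Building on a study of entity-based and discourse relation-based coherence models, this dissertation proposes a discourse coherence model based on the cohesion theory of theme-rheme structure. The model describes discourse coherence by computing the similarity between the themes and rhemes of sentences, and uses rules to integrate two coherence filtering mechanisms, one based on theme structure and one based on coreference resolution. Experiments on international benchmark corpora covering five different genres show that the cohesion theory-based model significantly improves DCM accuracy over current supervised methods based on entities and discourse relations.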
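     The theme-rheme similarity idea in item 3 can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the dissertation's actual model: sentences are assumed to arrive already split into (theme, rheme) token lists, similarity is plain bag-of-words cosine, and the hypothetical `coherence_score` averages, over adjacent sentences, the best match between the current theme and the previous sentence's theme or rheme. The two filtering mechanisms mentioned above are not modeled.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coherence_score(sentences):
    """Hypothetical coherence measure (an assumption, not the source's formula).

    sentences: list of (theme_tokens, rheme_tokens) pairs in document order.
    For each adjacent pair, take the larger of the similarities between the
    current theme and the previous theme or previous rheme, then average.
    """
    if len(sentences) < 2:
        return 0.0
    total = 0.0
    for (prev_theme, prev_rheme), (theme, _rheme) in zip(sentences, sentences[1:]):
        total += max(cosine(theme, prev_theme), cosine(theme, prev_rheme))
    return total / (len(sentences) - 1)


# Toy input: the second sentence picks up "tree" from the first one's rheme.
doc = [
    (["the", "parser"], ["builds", "a", "tree"]),
    (["the", "tree"], ["encodes", "relations"]),
]
score = coherence_score(doc)
```

     A higher score means that each sentence's theme stays closer to material introduced by the sentence before it. A real system would need a syntactic analysis to find the theme/rheme boundary, which this sketch leaves out.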
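     On this basis, this dissertation designs a tree kernel-based platform for English discourse structure analysis that integrates the algorithms developed for the three key problems above. To verify the practical value of these methods for applications in natural language processing, it uses readability assessment of student essays as a test case: through linear fitting and related methods, the discourse relation value and the discourse coherence value are combined into a readability score. The models are built on open corpora and tested on real-world data. The results show that the platform is highly effective for assessing the readability of student essays and that, compared with current supervised methods based on entities and discourse relations, it offers advantages in accuracy and in reduced dependence on large-scale corpora.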
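     The main innovations of this dissertation are as follows. (1) For implicit discourse relation recognition, it proposes an IDRR model based on attitude prosody theory. The model recognizes implicit discourse relations by computing the attitude/sentiment of sentences and uses a composite kernel to integrate a dependency word-pair tree kernel structure. Compared with similar methods, it improves IDRR performance on the international benchmark PDTB corpus by about 6%. (2) For discourse argument identification, it proposes a DAI model based on a shallow semantic parsing framework, which lifts the chunk-level analysis of traditional methods to the parse tree level, where rich syntactic information is available, and takes constituents rather than words as the unit of argument identification. Compared with similar methods, it improves DAI performance on the international benchmark PDTB corpus by about 2% with gold-standard parse trees and about 6% with automatic parse trees. (3) For discourse coherence modeling, it proposes a coherence model based on the cohesion theory of theme-rheme structure, which describes discourse coherence by computing the similarity between the themes and rhemes of sentences and uses rules to integrate two coherence filtering mechanisms based on theme structure and coreference resolution. Compared with similar methods, it improves coherence detection performance by 3%-6% on the international benchmark Accident, Earthquake, Wall Street Journal and Britannica Elementary corpora.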
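     The main contributions of this dissertation are an in-depth study of key techniques in discourse structure analysis, a set of proposed solutions to the problems involved, and the design of the corresponding algorithms and experiments. The experiments show that the proposed methods help improve the performance of discourse structure analysis while reducing dependence on large-scale corpora. They lay an important foundation for future research on discourse structure analysis and provide a reference for similar work.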
Over the past three years, Discourse Structure Analysis (DSA) has received much attention in computational linguistics (according to statistics, ACL, COLING and EMNLP publish at least 8 papers out of more than 30 DSA-related submissions each year). DSA has been regarded as the next hot topic after traditional information extraction/information retrieval, machine translation and syntactic/semantic analysis.
     DSA aims to investigate the internal structure of natural language text and to understand the semantic relationships between text units, which can be words, phrases, clauses, sentences or even paragraphs. Doing so requires analysing the whole structure in which the text units occur. DSA can therefore extract the rich structural information within texts, and it plays an important role in both Natural Language Processing (NLP) and Natural Language Generation (NLG). Mainstream DSA methods have paid much attention to lexical information in discourse, such as tokens, token morphology and token pairs. However, the attitude of a sentence and the cohesion mechanisms among sentences in a discourse are often ignored. As a result, the performance of current DSA systems remains low.
     Against this background, this dissertation focuses on the following three key problems in DSA that are widely discussed in computational linguistics. More specifically:
     1. Research on Implicit Discourse Relation Recognition (IDRR). Building on a study of word pair-based, language model-based and tree kernel-based IDRR models, we present an IDRR model based on attitude prosody theory. Our model recognizes implicit discourse relations by computing sentence-level attitude/sentiment information, and also integrates a dependency word-pair tree structure through a composite kernel. Evaluation on the Penn Discourse Treebank (PDTB) 2.0 shows the importance of the attitude prosody theory-based IDRR model, which significantly outperforms existing word pair-based, language model-based and tree kernel-based models.
     2. Research on Discourse Argument Identification (DAI). This dissertation handles DAI in two settings: intra-sentence, where the connective and its arguments are located in the same sentence, and inter-sentence, where they are located in different sentences. For intra-sentence cases, building on a study of chunking-based, classification-based and syntactic tree subtraction-based models, we present a model based on a shallow semantic parsing framework. Our model treats the discourse connective as the predicate and maps its arguments onto constituents of the parse tree. Unlike state-of-the-art chunking approaches, our parsing approach extends DAI from the chunking level to the parse tree level, where rich syntactic information is available, and focuses on determining whether a constituent, rather than a token, is an argument. For inter-sentence cases, we present a lightweight heuristic rule-based solution that takes the word sequence between the connective and the end of the current sentence, and the sentence immediately preceding the connective, as the two discourse arguments of the connective. Evaluation on PDTB shows the effectiveness of our shallow semantic parsing framework-based model, which significantly outperforms existing chunking-based models.
     3. Research on Discourse Coherence Modeling (DCM). Building on a study of entity-based and discourse relation-based models, we present a discourse coherence model based on the cohesion theory of theme-rheme structure. Our model describes discourse coherence by computing the similarity between the themes and rhemes of sentences, and also integrates two rule-based coherence filtering mechanisms, one based on theme structure and one based on coreference. Evaluation on five different benchmark data sets reveals the effectiveness of our cohesion theory-driven discourse coherence model, which significantly outperforms existing supervised entity-based and discourse relation-based models.
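     The lightweight inter-sentence rule described in item 2 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the input is assumed to be pre-tokenized sentences, the function name `inter_sentence_arguments` and its parameters are hypothetical, the connective itself is assumed to be excluded from the argument span, and the labels Arg1/Arg2 follow the PDTB convention that Arg2 is the argument attached to the connective.

```python
def inter_sentence_arguments(sentences, sent_idx, conn_start, conn_len=1):
    """Heuristic argument spans for an inter-sentence connective.

    sentences:  list of token lists, in document order
    sent_idx:   index of the sentence that contains the connective
    conn_start: token index where the connective starts in that sentence
    conn_len:   number of tokens in the connective (assumed contiguous)

    Returns (arg1, arg2):
      arg1 = the sentence immediately before the connective's sentence
      arg2 = the tokens after the connective up to the end of its sentence
    """
    if sent_idx == 0:
        raise ValueError("no previous sentence available for Arg1")
    arg1 = sentences[sent_idx - 1]
    arg2 = sentences[sent_idx][conn_start + conn_len:]
    return arg1, arg2


# Toy example with a sentence-initial connective.
sents = [
    ["It", "rained", "."],
    ["However", ",", "we", "went", "out", "."],
]
arg1, arg2 = inter_sentence_arguments(sents, sent_idx=1, conn_start=0)
```

     The rule needs no parser or training data, which is what makes it lightweight. It will miss cases where the first argument lies further back than the previous sentence.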
     Building on this research, we integrate our solutions to the above three key problems into a tree kernel-based discourse parsing platform. To verify the practical value of these methods in NLP applications, we apply DSA to the task of student essay readability assessment, taking a linear combination of the discourse relation value and the discourse coherence value as the readability value. We train the linear parameters on an open dataset and test them on an actual dataset. Evaluation on the actual dataset shows the effectiveness of our discourse structure analysis platform for student essay readability assessment. Our model significantly outperforms existing supervised entity-based and discourse relation-based models: it not only improves system performance significantly but also reduces dependence on large-scale annotated corpora.
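     The linear combination used for readability can be made concrete with a small least-squares sketch in Python. The abstract does not give the exact fitting procedure, so the following are assumptions: an intercept term is included, the parameters are fitted by ordinary least squares through the normal equations, and the helper names (`solve`, `fit_readability`, `predict`) and the toy numbers are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_readability(samples):
    """Fit readability = w_rel * relation + w_coh * coherence + bias.

    samples: list of (relation_value, coherence_value, human_score) triples.
    Returns [w_rel, w_coh, bias] from the normal equations (A^T A) w = A^T y.
    """
    rows = [(rel, coh, 1.0) for rel, coh, _ in samples]
    ys = [score for _, _, score in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(ata, aty)


def predict(weights, relation_value, coherence_value):
    """Apply the fitted linear combination to one essay's two values."""
    w_rel, w_coh, bias = weights
    return w_rel * relation_value + w_coh * coherence_value + bias


# Toy training data generated from score = 2 * relation + 3 * coherence + 1.
train = [(0, 0, 1), (1, 0, 3), (0, 1, 4), (1, 1, 6), (2, 1, 8)]
weights = fit_readability(train)
estimate = predict(weights, 0.5, 0.5)
```

     With only two input values and a bias to fit, very little labelled data is needed, which is in keeping with the stated aim of reducing dependence on large annotated corpora. The solver fails with a division by zero if the training inputs are collinear.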
     The major innovations of this dissertation are as follows. For IDRR, we present an attitude prosody theory-based model that recognizes implicit discourse relations by computing sentence-level attitude/sentiment information and also integrates a dependency word-pair tree structure through a composite kernel. Evaluation on both open and closed corpora shows a performance improvement of about 6% over state-of-the-art approaches. For DAI, we present a model based on a shallow semantic parsing framework. Our parsing approach extends DAI from the chunking level to the parse tree level, where rich syntactic information is available, and focuses on determining whether a constituent, rather than a token, is an argument. Evaluation on the benchmark PDTB corpus shows performance improvements of about 2% and 6% over state-of-the-art approaches when using gold-standard and automatic parse trees respectively. For DCM, we present a theme-rheme structure cohesion theory-based model that describes discourse coherence by computing the similarity between the themes and rhemes of sentences and also integrates two rule-based coherence filtering mechanisms, one based on theme structure and one based on coreference. Evaluation on the benchmark Accident, Earthquake, Wall Street Journal and Britannica Elementary corpora shows performance improvements of about 3%-6% over state-of-the-art approaches.
     The major contributions of this dissertation lie in presenting solutions to key problems in DSA and designing the corresponding algorithms. Experiments show that this research not only significantly improves the performance of discourse structure analysis but also reduces its dependence on large-scale annotated corpora. The proposed approaches lay a foundation for, and offer a useful reference to, future research on discourse structure analysis.
