Research on Event-Oriented Knowledge Processing
Abstract
Treating the "event" as a basic unit of knowledge representation and an important means of organizing information has received increasing attention. Research on event-oriented knowledge can support information processing technologies such as automatic summarization and question answering. This paper studies four aspects in depth: the construction of an event-oriented Chinese corpus, event recognition, event argument recognition, and event causal relation extraction. It proposes practical solutions to shortcomings in earlier work, specifically:
     1. Corpus construction is foundational work in natural language processing. Because research goals and objects differ, existing event-oriented corpora use different annotation schemes. These schemes focus on particular types of events or event arguments and overlook events in the general sense, as well as how people understand and perceive events. Based on a questionnaire survey, this paper analyzes how people understand the ordinary notion of an "event" in text, examines the annotatability of Chinese events, and proposes a method for building a Chinese event corpus. The method is not restricted to a few event types; it targets every event mentioned in a text. It is also built on Chinese syntactic and semantic analysis and therefore suits the characteristics of Chinese. Evaluation experiments show that corpora annotated with this method reach relatively high annotation agreement. We also developed an annotation support tool, collected 200 news reports from the emergency domain as raw text, and annotated them to produce the Chinese Event Corpus (CEC). Building the corpus took 10 months and involved nearly ten people. Compared with the ACE and TimeBank corpora, CEC is smaller, but its annotation of events and event arguments is the most comprehensive of the three.
     2. Event recognition is the basis of the event extraction task. Most current approaches use machine learning, which depends on finding effective features. This paper proposes an event recognition method based on the fusion of multiple features: the feature vector combines contextual, part-of-speech, syntactic, and semantic features. The discriminative power of these features was tested and analyzed with two different classifiers. The experiments show that recognition performance improves markedly as effective features are added, and is best when all the features are combined. The method also outperforms a tf×idf-based event recognition method.
     3. Recognizing event arguments with supervised (classification) learning requires a large manually annotated corpus as training data. The approach depends heavily on the corpus, and sparse data often leads to poor results. To reduce this dependence, this paper proposes an event argument recognition method based on semi-supervised clustering and feature weighting. The method uses a small amount of labeled data as a seed set to guide clustering, and during clustering it assigns each feature a weight according to its contribution. The paper also adapts the traditional semi-supervised clustering algorithm (Constrained-KMeans) and the feature weighting algorithm (ReliefF) to the event argument recognition task. Experiments show that the method has an advantage when labeled data is scarce and achieves comparatively good recognition results.
     4. Event causality is an important semantic relation, and extracting it from text has broad application prospects. Traditional extraction methods handle only explicitly marked, intra-sentence, one-cause-one-effect relations. Text also contains many unmarked causal relations, cross-sentence and cross-paragraph relations, and relations with one cause and many effects, many causes and one effect, or many causes and many effects. To address this gap, this paper proposes an extraction method based on cascaded conditional random fields (CRFs). The method recasts causal relation extraction as a labeling problem over event sequences and uses a cascaded (two-layer) CRF model to label the causal relations between events. The first-layer CRF labels the semantic role each event plays in a causal relation; its output is passed to the second-layer CRF, which identifies the boundaries of the causal relations. Corpus analysis and experiments show that the method covers all of these kinds of causal relation (marked and unmarked; intra-sentence, cross-sentence, and cross-paragraph; one-to-one, one-to-many, many-to-one, and many-to-many) and achieves good extraction performance on each.
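The seed-guided clustering described in item 3 can be illustrated with a minimal sketch. It shows the standard seeded Constrained-KMeans idea (labeled seeds initialise the centroids and are never reassigned) combined with a weighted Euclidean distance. It is not the improved algorithm from the paper, whose details the abstract does not give. The function names, the toy feature vectors, the cluster labels "time" and "place", and the hand-set weights are all assumptions made for illustration; in the paper the weights would come from a ReliefF-style procedure.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def constrained_kmeans(points, seeds, weights, max_iter=100):
    """Seeded k-means in which seed points never leave their labeled cluster.

    points  -- list of feature vectors
    seeds   -- dict {point index: cluster label} for the few labeled points
    weights -- one non-negative weight per feature (hand-set here; an assumption)
    Returns a list with one cluster label per point.
    """
    labels = sorted(set(seeds.values()))
    # Initialise each centroid from the seed points that carry its label.
    centroids = {
        c: mean([points[i] for i, lab in seeds.items() if lab == c])
        for c in labels
    }
    assignment = [None] * len(points)
    for _ in range(max_iter):
        new_assignment = []
        for i, p in enumerate(points):
            if i in seeds:
                new_assignment.append(seeds[i])  # constraint: seeds stay fixed
            else:
                new_assignment.append(
                    min(labels,
                        key=lambda c: weighted_distance(p, centroids[c], weights))
                )
        if new_assignment == assignment:
            break  # no label changed, so the clustering has converged
        assignment = new_assignment
        for c in labels:
            members = [points[i] for i, lab in enumerate(assignment) if lab == c]
            centroids[c] = mean(members)  # never empty: each cluster keeps its seeds
    return assignment

# Hypothetical toy data: feature 0 is informative, feature 1 is mostly noise.
points = [[0.0, 5.0], [0.2, 0.0], [0.1, 9.0], [1.0, 5.0], [0.9, 0.5], [1.1, 8.0]]
seeds = {0: "time", 3: "place"}   # two labeled seed points
weights = [1.0, 0.01]             # down-weight the noisy feature
print(constrained_kmeans(points, seeds, weights))
```

With the noisy feature down-weighted, the unlabeled points follow the seed whose informative feature they resemble. Raising the second weight to 1.0 lets the noise dominate the distance, which is the situation feature weighting is meant to avoid.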
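To make the sequence-labeling formulation in item 4 more concrete, the sketch below shows how two aligned tag sequences over an event sequence could encode one-to-many and many-to-one causal relations. The abstract does not specify the tag sets, so the role tags ("C", "E", "O") and boundary tags ("B", "I", "O") are assumptions, as are the function name and the example events. No CRF is trained here; the tag sequences stand in for what the two layers would output.

```python
def decode_relations(events, role_tags, boundary_tags):
    """Turn two aligned tag sequences into cause/effect groups.

    role_tags     -- assumed layer-1 output: "C" cause, "E" effect, "O" neither
    boundary_tags -- assumed layer-2 output: "B" begins a relation,
                     "I" continues it, "O" lies outside every relation
    """
    relations = []
    current = None
    for event, role, boundary in zip(events, role_tags, boundary_tags):
        if boundary == "B":
            current = {"causes": [], "effects": []}
            relations.append(current)
        elif boundary == "O":
            current = None  # close any open relation
            continue
        if current is None:
            continue  # stray "I" with no open relation is ignored
        if role == "C":
            current["causes"].append(event)
        elif role == "E":
            current["effects"].append(event)
    return relations

# Hypothetical event sequence from an emergency news report.
events = ["earthquake", "houses collapsed", "roads blocked",
          "press conference", "heavy rain", "dam failure", "flood"]
role_tags = ["C", "E", "E", "O", "C", "C", "E"]
boundary_tags = ["B", "I", "I", "O", "B", "I", "I"]

for relation in decode_relations(events, role_tags, boundary_tags):
    print(relation["causes"], "->", relation["effects"])
```

Because the tags are attached to events rather than to sentences, a relation can span sentence or paragraph boundaries and needs no explicit causal marker. One limitation of this particular encoding is that it cannot represent relations that interleave with one another.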
