Event Ontology and Its Application in Query Expansion
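Abstract
An ontology describes the essence of things or phenomena by explicitly defining concepts and the relations among them. Ontology has become an advanced technique in artificial intelligence and knowledge engineering, and plays an increasingly important role in knowledge representation, acquisition, and application. Many research institutions in China and abroad have undertaken extensive theoretical research and application studies on it.
     Recently, the notion of an "event" has gradually been adopted in knowledge-processing fields such as computational linguistics, artificial intelligence, information retrieval, information extraction, automatic summarization, and natural language processing. A great many texts, such as novels, historical biographies, memoirs, myths and legends, folk tales, narrative poems, plays, biographies, and news reports, contain events of all kinds. An event links concepts such as participants, time, and place, and is a knowledge unit of coarser granularity than a concept.
     Humanity has suffered, and continues to suffer, from catastrophic emergencies of many kinds, including earthquakes, volcanic eruptions, floods, hurricanes, chemical spills, nuclear radiation leaks, infectious diseases, accidents, explosions, and urban fires. Because real-world events are clearly reflected on the Internet, users urgently need to obtain event-related information from it through search engines. However, as the volume of information on the Internet grows rapidly, general-purpose search engines tend to return a large number of results with low precision. After entering a keyword, a user finds little useful information, and this is even more true of searches for event information. Event-oriented information retrieval techniques are therefore in urgent need of research and development.
     This work centers on event ontology and its application in query expansion. It studies the relations between event classes and the quantification of the strength of association between event classes, and on that basis proposes an event-oriented ontology model. It then applies the idea of events to query expansion, exploring event-oriented query expansion methods based on pseudo-relevance feedback and on event ontology. The main research content and contributions are as follows:
     (1) Event classes are defined. The relations between event classes are analyzed according to the action element of an event class, and the relations between event instances are identified according to elements such as participants, time, and environment. The concept of an event class influence factor is defined to describe how strongly the occurrence of an instance of one event class affects instances of other, related event classes. On the basis of event classes, event class relations, and event class influence factors, the event ontology model is improved and refined.
     (2) A subdomain of the emergency domain is selected to study how an event ontology is constructed. The differences between dynamic events and static concepts in how they relate to one another are analyzed. Based on the way events change dynamically over time, a method for computing event importance that takes both the Authorities value and the Hubs value of an event into account is proposed (denoted HARank). The algorithm is applied to identify important events in a text collection and to compute the importance of event classes in an event ontology.
     (3) To meet users' need to obtain event information from the Internet, event-oriented query expansion techniques are proposed. The first method is event-oriented query expansion based on pseudo-relevance feedback, which focuses on identifying events in text, distinguishing qualifier terms from event terms in a query, selecting expansion events, setting query term weights, and computing the similarity between a query and a document. The second method is event-oriented query expansion based on event ontology, which focuses on identifying event terms in a query using the event ontology, and on associative expansion from event terms in the query to event classes in the ontology and from an event class to each of its elements.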
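     As a rough illustration of the hub and authority scores that item (2) builds on, the sketch below runs the standard HITS iteration (Kleinberg, 1999) on a small directed event graph. The abstract does not give the HARank formula, so the weighted sum in `combined_importance`, the parameter `alpha`, the example graph `EDGES`, and all function names are assumptions made for illustration only, not the author's method.

```python
from math import sqrt

# Hypothetical event graph: an edge (a, b) means event a leads to event b.
EDGES = [
    ("earthquake", "building_collapse"),
    ("earthquake", "rescue"),
    ("earthquake", "tsunami"),
    ("building_collapse", "rescue"),
]


def normalize(scores):
    """Scale scores to unit Euclidean length (guarding against all zeros)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits(edges, iterations=50):
    """Standard HITS: return (hub scores, authority scores) per node."""
    nodes = {n for edge in edges for n in edge}
    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of the hub scores of nodes pointing to it.
        new_auths = dict.fromkeys(nodes, 0.0)
        for src, dst in edges:
            new_auths[dst] += hubs[src]
        auths = normalize(new_auths)
        # Hub of a node: sum of the authority scores of nodes it points to.
        new_hubs = dict.fromkeys(nodes, 0.0)
        for src, dst in edges:
            new_hubs[src] += auths[dst]
        hubs = normalize(new_hubs)
    return hubs, auths


def combined_importance(edges, alpha=0.5):
    """Assumed combination (not from the source): weighted sum of both scores."""
    hubs, auths = hits(edges)
    return {n: alpha * auths[n] + (1 - alpha) * hubs[n] for n in hubs}


hubs, auths = hits(EDGES)
scores = combined_importance(EDGES)
for event in sorted(scores, key=scores.get, reverse=True):
    print(f"{event}: {scores[event]:.3f}")
```

     In this toy graph the event with the most outgoing links gets the largest hub score and the event reached from the most strongly linking events gets the largest authority score; how the two are actually weighted in HARank is left open here.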
Ontology reflects the essence of objects or phenomena by explicitly defining concepts and their relations. It has become an advanced technology in artificial intelligence and knowledge engineering, and plays an increasingly important role in the representation, acquisition, and application of knowledge. Many research institutions at home and abroad have therefore carried out extensive theoretical research and application studies.
     Recently, the notion of an 'event' has been applied in computational linguistics, artificial intelligence, information retrieval, information extraction, automatic summarization, and natural language processing. Texts of many kinds, such as novels, biographies, memoirs, myths and legends, folktales, epics, dramas, and press reports, contain many events. An event, identified by its event triggers, is associated with participants, time, location, and so on, and is a larger knowledge unit than a concept.
     Humanity has suffered, and is still suffering, from all kinds of emergencies, including earthquakes, volcanic eruptions, floods, hurricanes, chemical spills, nuclear radiation leaks, diseases, accidents, explosions, and urban fires. Because an important real-world event is always reflected on the network in different forms, obtaining event information has become a key need for users. With an overwhelming volume of information now available, the results returned by general-purpose search engines are often voluminous but inaccurate, and users' search experience is poor. This is even more true of event retrieval. It is therefore urgent to study and develop event-oriented retrieval technology.
     This paper focuses on event ontology and its application in query expansion. First, we study the relations between event classes and the strength of association between them, and propose an event-oriented ontology model. Then we apply events to query expansion, exploring query expansion methods based on pseudo-relevance feedback and on event ontology. The main contents and innovations of this paper are:
     (1) Define the concept of an event class; analyze the relations between event classes according to the action element of the event class, and the relations between event instances according to its object, time, and environment elements. Define the influence factor between event classes to describe the probability that, if an instance of one event class occurs, an instance of a related event class occurs too. Improve the event-oriented ontology model on the basis of the event class, the event class relation, and the event class influence factor.
     (2) Select a sub-area of the emergency domain to study how an event ontology is constructed. Analyze the distinctions between dynamic events and static concepts. Put forward a method for computing the importance of events that considers both the hub and authority values of events, denoted HARank (Hubs-Authorities Rank), and apply HARank to identify important events in a collection of texts and to rank event classes in an event ontology.
     (3) Propose event-oriented query expansion techniques to meet the need for obtaining event information. First, present a method of event-oriented query expansion based on pseudo-relevance feedback, covering the identification of events in texts, the discrimination of qualifier terms from event terms in a query, the selection of expansion events, the setting of query term weights, and the computation of similarity between query terms and texts. Second, present a method of event-oriented query expansion based on event ontology, covering the identification of event terms in a query using the event ontology, and associative expansion from event terms in the query to event classes in the ontology and from an event class to its elements.
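     The general idea behind pseudo-relevance feedback in item (3) can be sketched as follows: treat the top-ranked documents of an initial search as relevant, pick their most frequent terms that are not already in the query, and add them with a lower weight. This is a generic, minimal sketch, not the event-oriented method itself; it does not identify events or separate qualifier terms from event terms, and the function name `expand_query`, its parameters, and the sample documents are all hypothetical.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=2, n_expand=2, expand_weight=0.5):
    """Return {term: weight}; original terms keep weight 1.0.

    ranked_docs is a list of tokenized documents, best match first.
    The top_k documents are assumed relevant (the "pseudo" feedback).
    """
    weights = {term: 1.0 for term in query_terms}
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        # Count only candidate terms that are not already in the query.
        counts.update(term for term in doc if term not in weights)
    for term, _ in counts.most_common(n_expand):
        weights[term] = expand_weight  # assumed fixed weight for added terms
    return weights


# Hypothetical initial ranking for the query "earthquake".
docs = [
    ["earthquake", "rescue", "casualties", "rescue"],
    ["earthquake", "rescue", "aftershock", "aftershock"],
    ["football", "match"],
]
expanded = expand_query(["earthquake"], docs)
print(expanded)
```

     With these sample documents the two most frequent feedback terms are added to the query at half weight, and the third document is ignored because it falls outside the assumed top-k cutoff.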
