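Research on an Automatic Question Answering System Based on a Knowledge System for Question-and-Answer Web Forums

Author: Yu Shitao (于士涛)
Degree level: Doctoral
Discipline: Computer Application Technology
Keywords: Question and Answer Web Forum; Information Retrieval; Automatic Question Answering System; Dependency Grammar; Knowledge System
Degree year: 2009
Advisor: Yuan Xiaojie (袁晓洁)
Discipline code: 081203
Degree-granting institution: Nankai University (南开大学)
Thesis submission date: 2009-05-01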

Abstract

With the development of information retrieval technology, many types of search services have appeared on the Internet. The most widely used is the Web search engine, which acquires, processes, stores, and provides access to massive collections of Web documents so that users can find the information they need quickly and easily; it has come to play an important role in daily life. However, as the volume of information on the Internet grows and search engine technology matures, users are no longer satisfied with purely keyword-based search. They want to express their information needs in natural language and expect the search service to understand their intent and return appropriate results. Automatic question answering is therefore what Internet users hope for next. An automatic question answering system offers a user interface similar to that of a search engine, but the user submits a natural language question rather than keywords, and the system returns a list of answers ranked by their relevance to the question.
     A large body of research on automatic question answering has accumulated, covering different languages and different data sets. However, automatic question answering has not yet become a productized service in the way search engines have. This thesis aims to accumulate experience toward a productized automatic question answering service on the Internet. It studies how to build an automatic question answering system on one specific kind of data set, data from question-and-answer Web forums, and then studies a series of improvements to system performance that introduce natural language syntactic and semantic information into the information retrieval process and redefine how the data are organized. The main contributions and innovations are:
     ·An evaluation platform for automatic question answering. Following the implementation principles of search engines, an automatic question answering system is built on the question-and-answer Web forum data set using index terms and a text similarity retrieval model. Experiments show that this system performs slightly better than the forums' own "similar question search" function, so it replaces that function as the baseline against which the performance improvements in this thesis are evaluated.
     ·Performance improvement based on dependency terms. A definition of the dependency term is proposed, which extends the index term with the results of natural language dependency parsing and thereby introduces syntactic information into the information retrieval process. Experiments show that dependency terms effectively express the syntactic features of natural language questions and improve system performance without any change to the original information retrieval model.
     ·Performance improvement based on question classification. A new taxonomy of natural language questions is proposed for the question-and-answer Web forum data set, and a sufficiently accurate question classifier is trained with natural language syntactic and semantic information as features. The classification results are used to guide answer ranking. Experiments show that the classifier works well on the forum data set and that class-guided ranking markedly improves system performance.
     ·Performance improvement based on a natural language knowledge system. Drawing on the two preceding improvements, a new form of data organization is proposed: predicate links are added to a concept system to build a natural language knowledge system. The predicate links are generated from the question-answer pairs in the data set. This is a comprehensive improvement that makes full use of the answer information in the data set and draws on the relations within the natural language concept system to strengthen the system's query expansion and logical reasoning abilities. The forum data are loaded into this knowledge system, and the automatic question answering system is rebuilt on top of it. Experiments show that the rebuilt system improves performance overall.
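
The first two contributions (a term-based similarity baseline, then dependency terms added without changing the retrieval model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the smoothed TF-IDF weighting, cosine similarity as the text similarity model, the `relation(head,dependent)` string encoding of a dependency term, and the hand-written toy parses and forum entries. The names `index_terms`, `tfidf_vectors`, `cosine`, and `rank` are hypothetical.

```python
import math
from collections import Counter


def index_terms(tokens, dependencies=()):
    """Return plain index terms plus (assumed) dependency terms.

    Each dependency is a (relation, head, dependent) triple and is encoded
    as one extra term string, so the retrieval model itself stays unchanged.
    """
    return list(tokens) + [f"{rel}({head},{dep})" for rel, head, dep in dependencies]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of term lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; the exact weighting scheme is an assumption of this sketch.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_terms, corpus_terms, answers):
    """Rank stored answers by similarity of their questions to the query."""
    vectors, idf = tfidf_vectors(corpus_terms)
    # Query terms unseen in the corpus get weight 0 and do not affect the score.
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_terms).items()}
    scored = [(cosine(query, vec), ans) for vec, ans in zip(vectors, answers)]
    return sorted(scored, reverse=True)


# Toy forum data: two questions with the same words but different syntax.
forum = [
    (["dog", "chase", "cat"],
     [("subj", "chase", "dog"), ("obj", "chase", "cat")],
     "answer about dogs chasing cats"),
    (["cat", "chase", "dog"],
     [("subj", "chase", "cat"), ("obj", "chase", "dog")],
     "answer about cats chasing dogs"),
]
answers = [answer for _, _, answer in forum]

query_tokens = ["dog", "chase", "cat"]
query_deps = [("subj", "chase", "dog"), ("obj", "chase", "cat")]

# Baseline: plain index terms only.
plain_ranking = rank(
    index_terms(query_tokens),
    [index_terms(tokens) for tokens, _, _ in forum],
    answers,
)

# Same retrieval model, with dependency terms added to the index and the query.
dep_ranking = rank(
    index_terms(query_tokens, query_deps),
    [index_terms(tokens, deps) for tokens, deps, _ in forum],
    answers,
)
```

In this toy setup the plain-term baseline cannot separate the two stored questions, because they contain the same words, while the dependency terms let the unchanged ranking function prefer the question whose syntactic structure matches the query.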
