Research on an Intelligent Search Engine Based on Semantic Technology
Abstract
The Internet is the world's largest repository of data, and as its coverage and range of fields keep expanding, the volume of data stored on it is growing massively. Search engines help users extract latent, valuable information from this mass of data. Building on vertical search engines that target specific domains, research into more efficient and more intelligent search engines has become an inevitable trend.
     This paper uses Semantic Web technology to give a search engine natural language understanding grounded in knowledge and ontology concepts. The search engine is built on a knowledge base, and a semantic indexer constructs an index that combines that knowledge with data from the Web. A user's query is processed through word segmentation, semantic reasoning, and query expansion, and is then run against the index in a normalized form. Search results are ranked by combining three factors: the Page Ranking algorithm, lexical-semantic analysis, and the relevance between the retrieved content and the features of the web page. A search engine built this way reduces the impact of vague user expressions on the search, overcomes the shortcomings of mechanical keyword matching, presents things as interrelated rather than isolated, and achieves a systematic integration of knowledge.
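The query pipeline and three-factor ranking summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The toy ontology, the `Page` fields, the whitespace tokenizer (standing in for real word segmentation), the overlap-based scores, and the weights are all assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical toy ontology: each term maps to related concepts.
# A real system would query a knowledge base instead.
ONTOLOGY = {
    "car": {"automobile", "vehicle"},
    "price": {"cost"},
}

@dataclass
class Page:
    url: str
    title: str
    body: str
    link_score: float  # assumed precomputed link-analysis score in [0, 1]

def tokenize(query: str) -> list[str]:
    # Whitespace split as a stand-in for proper word segmentation.
    return query.lower().split()

def expand(tokens: list[str]) -> set[str]:
    # Query expansion: add every concept the ontology relates to a token.
    expanded = set(tokens)
    for token in tokens:
        expanded |= ONTOLOGY.get(token, set())
    return expanded

def overlap(terms, text: str) -> float:
    # Fraction of the given terms that occur as words in the text.
    unique = set(terms)
    if not unique:
        return 0.0
    words = set(text.lower().split())
    return len(unique & words) / len(unique)

def score(page: Page, tokens, expanded, weights=(0.3, 0.4, 0.3)) -> float:
    # Weighted sum of the three factors; the weights are illustrative only.
    w_link, w_semantic, w_feature = weights
    semantic = overlap(expanded, page.body)   # expanded terms vs. page text
    feature = overlap(tokens, page.title)     # original terms vs. page title
    return w_link * page.link_score + w_semantic * semantic + w_feature * feature

def search(query: str, pages: list[Page]) -> list[Page]:
    tokens = tokenize(query)
    expanded = expand(tokens)
    return sorted(pages, key=lambda p: score(p, tokens, expanded), reverse=True)
```

In this sketch a page that mentions only "automobile" and "cost" still receives a nonzero semantic score for the query "car price", which is the effect the abstract attributes to moving beyond mechanical keyword matching.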
