An Ontology Query Algorithm Based on Query Rewriting and Association Search
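Abstract
As the Internet grows rapidly and Web information resources multiply, traditional information retrieval, constrained by the inherent characteristics of its information sources and by the shortcomings of its retrieval techniques, cannot find semantically matching information. The result is missed and false retrievals, so it fails to meet users' needs in both quality and efficiency. Researchers have tried to bring various advanced ideas and methods into the field of information retrieval to drive deeper changes in its theory and technology.
     The Semantic Web, the next-generation World Wide Web advocated by its founder Tim Berners-Lee, aims to give information on the Web semantic support and, through ontology technology, to establish machine-processable semantic links among resources. It is an extension of the current World Wide Web that can improve interoperability between heterogeneous systems and promote knowledge sharing. Ontology, the core of the Semantic Web, is an advanced knowledge representation technique from artificial intelligence that reflects the essence of things or phenomena by explicitly defining concepts and the relations between them. Ontology-based semantic retrieval, an important area of intelligent information retrieval, uses ontologies to build a shareable, conceptualized knowledge space and to describe knowledge content in a structured way. It offers a degree of semantic processing and fairly good natural-language understanding, and it can handle the logic of relations between concepts. It therefore has significant research value for improving retrieval quality and for promoting the utilization and sharing of information resources.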
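     The main work of this paper is as follows:
     It analyzes and summarizes the concepts, development status, and shortcomings of information retrieval technology, and introduces the current state of research on ontology-based semantic retrieval.
     Through analysis of existing ontology theory and semantic query techniques, and building on our group's earlier results, it proposes a model of an ontology-based semantic retrieval system and describes in detail the system's design ideas, main functions, and operating flow.
     It studies in depth the main techniques and implementation methods involved in ontology querying, covering ontology persistence, semantic inference, and the SPARQL language and its implementation through Jena. On this basis it proposes an ontology query algorithm based on query rewriting and association search, sets out the algorithm's ideas and flow in detail, and analyzes its feasibility and the difficulties of implementing it.
     Finally, it develops OSea, a system prototype based on the query rewriting and association search algorithm. The system accepts any ontology as the query target, offers several retrieval modes, and ranks results by a semantic relatedness measure based on weighted semantic distance, which verifies the effectiveness and feasibility of the algorithm studied here.
     Through in-depth analysis of ontology technology, and in particular of several key techniques involved in ontology retrieval, this paper proposes a solution to the problems of inefficient information retrieval and low information utilization. The model can handle ontology-based semantic relations and the logic of relations between concepts, and it can also perform semantic expansion. Experiments show that a semantic retrieval model applying the algorithm has some advantages over traditional information retrieval in ease of use, query efficiency, and result quality. Finally, the paper implements a prototype of the model, the OSea system, and verifies the feasibility of the system model in practice.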
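The abstract does not say how the weighted semantic distance is computed, so the Python sketch below is only one plausible reading, not the OSea implementation. It assumes the ontology can be treated as an undirected graph, that each relation type carries a weight (the values in `RELATION_WEIGHTS` are invented), that the distance between two concepts is the cheapest weighted path between them, and that relatedness is `1 / (1 + distance)`. The concept names and all function names are hypothetical.

```python
import heapq

# Hypothetical relation weights: a smaller weight means a closer semantic tie.
RELATION_WEIGHTS = {"subClassOf": 1.0, "relatedTo": 2.0}

# Toy ontology as (subject, relation, object) triples; names are illustrative only.
TRIPLES = [
    ("Sedan", "subClassOf", "Car"),
    ("Car", "subClassOf", "Vehicle"),
    ("Truck", "subClassOf", "Vehicle"),
    ("Car", "relatedTo", "Engine"),
]


def build_graph(triples):
    """Build an undirected adjacency list with one weighted edge per triple."""
    graph = {}
    for subject, relation, obj in triples:
        weight = RELATION_WEIGHTS[relation]
        graph.setdefault(subject, []).append((obj, weight))
        graph.setdefault(obj, []).append((subject, weight))
    return graph


def semantic_distance(graph, source, target):
    """Cheapest weighted path (Dijkstra); infinity if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return float("inf")


def relatedness(graph, a, b):
    """Map distance to a score in [0, 1]; an assumed formula, not the paper's."""
    return 1.0 / (1.0 + semantic_distance(graph, a, b))


def rank(graph, query_concept, candidates):
    """Order candidate concepts from most to least related to the query."""
    return sorted(
        candidates,
        key=lambda c: relatedness(graph, query_concept, c),
        reverse=True,
    )
```

With this toy data, `rank(build_graph(TRIPLES), "Car", ["Engine", "Unknown", "Sedan"])` puts `Sedan` first (one `subClassOf` step), then `Engine` (one `relatedTo` step), and the unconnected `Unknown` last.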
On today's Web, as information resources grow at high speed, search engines built on traditional query techniques cannot escape their limitations in semantic processing. One critical challenge is how to implement semantics-based information retrieval and sharing for Web-scale data resources. New ideas and methods are being introduced into the field to reform its theory and technology.
     The Semantic Web is the next generation of the Web, advocated by Tim Berners-Lee. In the Semantic Web, every resource is identified by a URI (Uniform Resource Identifier), and resources are related to one another through formal definitions in an ontology. The Semantic Web is regarded as an extension of the current World Wide Web in which information resources are expressed in a clear and formal way that both people and computers can readily understand. The Semantic Web can therefore improve interoperability between heterogeneous systems and promote knowledge sharing. Ontology is an advanced knowledge representation technology from AI, in which explicitly specified concepts and relationships are used to describe information. Ontology-based semantic retrieval, one of the important directions of intelligent information retrieval, uses ontologies to build sharable, conceptualized data resources. It not only offers semantic processing and natural-language understanding but can also process the logic of related concepts. Ontology-based semantic retrieval therefore has important research value.
     The main contributions of this paper are as follows:
     It first analyzes the concept of information retrieval, its current state, and its existing shortcomings. It then introduces current research topics in the Semantic Web.
     Through a thorough study of ontology fundamentals and retrieval-related techniques, and building on the earlier results of our team, a model for an ontology-based semantic retrieval system is proposed. The design, main functions, and running mechanism of this model are explained.
     Several technologies related to ontology retrieval are studied in depth, including ontology persistence, semantic inference, the SPARQL query language, and how to implement them with Jena. Based on this research, an ontology retrieval algorithm based on query rewriting and semantic association is put forward. The algorithm's performance and implementation are discussed in detail.
     Finally, a prototype of the model, the OSea system, is developed to implement the algorithm. The results of testing and analyzing this system indicate that the model and algorithm proposed in this paper are effective and feasible.
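The abstract names query rewriting but does not describe the rewriting rules. As a hedged illustration only, the Python sketch below shows one common form of rewriting, class expansion: a query for instances of a class is rewritten into a SPARQL `UNION` over that class and its transitive subclasses. The `ex:` prefix, its namespace URI, the `SUBCLASSES` table, and both function names are assumptions made for the example; whether OSea rewrites queries this way is not stated in the source.

```python
# Hypothetical subclass table: parent class -> direct subclasses.
SUBCLASSES = {
    "ex:Vehicle": ["ex:Car", "ex:Truck"],
    "ex:Car": ["ex:Sedan"],
}


def expand_class(cls, subclasses):
    """Return cls followed by all of its transitive subclasses (depth-first)."""
    seen, stack = [], [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.append(current)
        # Reverse so that subclasses are visited in their listed order.
        stack.extend(reversed(subclasses.get(current, [])))
    return seen


def rewrite_query(cls, subclasses):
    """Rewrite 'instances of cls' into a UNION over cls and its subclasses."""
    patterns = [
        "{ ?item rdf:type %s }" % c for c in expand_class(cls, subclasses)
    ]
    body = "\n    UNION\n    ".join(patterns)
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        "PREFIX ex: <http://example.org/onto#>\n"
        "SELECT DISTINCT ?item WHERE {\n    " + body + "\n}"
    )
```

Calling `rewrite_query("ex:Vehicle", SUBCLASSES)` yields a query with four `rdf:type` patterns joined by `UNION`, which a SPARQL engine such as Jena's could then evaluate. An engine with inference enabled might make this expansion unnecessary, so the sketch applies mainly when querying a plain, uninferred model.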
