Research on Ontology-Based Semantic Retrieval and Semantic Similarity
Abstract
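With the development of network technology and the surge in the amount of information on the Internet, information retrieval systems, as an important component of network information platforms, play an important role in helping users obtain accurate information from the network. Traditional information retrieval relies only on simple syntactic matching and lacks the ability to represent, process, and understand knowledge. The underlying cause is that information resources lack a unified semantic description, so users have difficulty finding information relevant to their needs, and related information cannot be semantically integrated. The key is to raise information retrieval from simple syntax-based matching to the level of semantic knowledge.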
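     The Semantic Web, the next-generation World Wide Web advocated by WWW inventor Tim Berners-Lee, aims to represent information in a computer-processable form. Its goal is to let computers "understand" information on the Web and, on that basis, process and use it better in the service of people. An ontology has a well-defined concept hierarchy and supports logical reasoning; it can express the semantics of concepts through the relations among them, giving a semantic representation of information, and is therefore well suited to information retrieval. Unlike traditional keyword retrieval, ontology-based retrieval uses an ontology knowledge base to strengthen the intrinsic links among concepts, and logical reasoning can uncover implicit and unstated information among concepts, enabling semantic intelligent information retrieval.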
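     First, traditional information retrieval technology is analyzed. The root cause of its poor retrieval quality is that it matches on syntax and has no semantic understanding of the information being retrieved. The paper then discusses applying ontology technology to information retrieval in order to achieve semantic intelligent information retrieval.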
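     Second, the Semantic Web and ontology technology are examined, including their origins and definitions, architecture, research status, and applications. The Semantic Web is an extension and evolution of the existing World Wide Web. By expressing semantics and knowledge through metadata and ontologies, it provides semantic information rich enough for machines to understand, so that machines can process information automatically. Applications of ontology technology in the telecommunications domain are also analyzed in detail, including ontology-based integrated information models for network and system management, Semantic Web technologies for context-aware intelligent mobile Web services, and the construction of telecommunications domain ontologies.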
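     Next, the key technologies of ontology-based semantic intelligent information retrieval are studied, including ontology technology, intelligent information retrieval methods, domain ontology construction, and the system workflow. Drawing on the shortcomings of traditional retrieval and on ontology technology, a semantic intelligent retrieval system based on a domain ontology is designed. The retrieval systems of current online mobile phone shopping sites are analyzed, a framework model for an ontology-based semantic intelligent retrieval system is proposed, a mobile phone product ontology is built for the experimental system, and the semantic reasoning performed by the retrieval system is analyzed.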
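     Building on this theory and system design, an ontology-based Mobile Phone Product Semantic Retrieval System (MPPSRS) is implemented. The experimental system takes the mobile phone product domain as its retrieval target. Through ontology-based semantic reasoning it can uncover implicit associations among the retrieved information and offer users a good semantic retrieval service, thereby addressing at its root the lack of semantic information about resource objects in traditional retrieval. It finds the mobile phone product information users need more accurately and more completely, achieving semantic intelligent information retrieval.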
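As a rough illustration of the kind of inference described above, the following Python sketch expands a query class to all of its subclasses before matching products. The class names, product identifiers, and the functions `descendants` and `semantic_search` are hypothetical and are not taken from MPPSRS; a real system would more likely load an OWL ontology and delegate this step to a reasoner.

```python
# Hypothetical toy ontology: child class -> parent class (illustrative names only).
SUBCLASS_OF = {
    "SmartPhone": "MobilePhone",
    "MusicPhone": "MobilePhone",
    "CameraPhone": "MobilePhone",
    "MobilePhone": "Product",
}

# Hypothetical product instances, each annotated with its most specific class.
PRODUCTS = {
    "phone-a": "SmartPhone",
    "phone-b": "MusicPhone",
    "charger-c": "Product",
}


def descendants(cls):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def semantic_search(query_class):
    """Return products whose class is the query class or any subclass of it."""
    expanded = descendants(query_class)
    return sorted(name for name, cls in PRODUCTS.items() if cls in expanded)
```

A plain keyword match on "MobilePhone" would miss both phones in this toy data, because neither is annotated with that exact class; the subclass expansion is what recovers them.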
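     The current state of research on concept similarity is then analyzed. Using the domain ontology already built, an improved model for computing semantic similarity over a domain ontology is proposed. The model combines distance-based semantic similarity with attribute-based semantic similarity. The distance-based part takes into account several factors drawn from the ontology class hierarchy, such as semantic overlap, semantic depth, semantic distance, semantic density, and the corresponding adjustment factors, to compute the semantic similarity between concepts within the domain ontology.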
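The abstract does not give the model's formulas, so the sketch below only illustrates the general idea of mixing a distance-based score with an attribute-based score. It is a simplified stand-in, not the proposed model: the hierarchy, the attribute sets, the `alpha / (alpha + distance)` form, the Jaccard attribute score, and the `weight` parameter are all assumptions, and the overlap, depth, and density factors mentioned above are left out.

```python
# Hypothetical class hierarchy: class -> parent (None marks the root).
PARENT = {
    "Product": None,
    "Ballast": "Product",
    "Lamp": "Product",
    "ElectronicBallast": "Ballast",
    "MagneticBallast": "Ballast",
    "FluorescentLamp": "Lamp",
}

# Hypothetical attribute sets attached to some classes.
ATTRIBUTES = {
    "ElectronicBallast": {"input_voltage", "power", "frequency"},
    "MagneticBallast": {"input_voltage", "power"},
    "FluorescentLamp": {"power", "length", "color_temperature"},
}


def ancestors(cls):
    """Return the path from cls up to the root, starting with cls itself."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = PARENT[cls]
    return path


def distance_similarity(a, b, alpha=1.0):
    """Assumed form: alpha / (alpha + edge count between a and b)."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = next(c for c in path_a if c in path_b)  # lowest common ancestor
    distance = path_a.index(common) + path_b.index(common)
    return alpha / (alpha + distance)


def attribute_similarity(a, b):
    """Assumed form: Jaccard overlap of the two attribute sets."""
    set_a, set_b = ATTRIBUTES.get(a, set()), ATTRIBUTES.get(b, set())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def combined_similarity(a, b, weight=0.5, alpha=1.0):
    """Weighted mix of the two scores; weight and alpha are adjustment factors."""
    return (weight * distance_similarity(a, b, alpha)
            + (1.0 - weight) * attribute_similarity(a, b))
```

With these toy values, two sibling ballast classes score higher than a ballast compared with a lamp, and changing `weight` or `alpha` shifts how much the hierarchy counts relative to shared attributes.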
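     Finally, using the improved domain-ontology semantic similarity model discussed in the preceding chapter, an ontology-based Ballasts & Lamps Product Retrieval Recommendation System (BLPRRS) for electronic ballasts and fluorescent lamps is designed and implemented. The actual requirements of a company are analyzed, and, using ontology technology and the characteristics of that company's products, an ontology base of electronic ballasts and fluorescent lamps is built from the products the company develops and sells. The experimental system is implemented on top of this ontology base. By tuning the adjustment factors in the experimental system and comparing the experimental data with the subjective judgments of experts, the effect of the improved semantic similarity method is analyzed and verified. The results indicate that the ontology-based semantic similarity model can help expand query concepts and provide effective product retrieval results.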
With the development of network technology and the rapid growth of information on the Internet, information retrieval systems play an important role in connecting users with resources on the network. Traditional information retrieval is based only on syntactic matching and lacks the ability to represent, handle, and understand knowledge. The key problem is that information resources lack semantic description, so it is hard for users to retrieve the information they really want, and related information resources cannot be associated by their semantic features. The essential solution is to raise information retrieval from the traditional syntax-based level to a knowledge-based semantic level.
     The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. An ontology has a well-defined hierarchical structure of concepts and supports logical reasoning, and it can express semantic information through the semantic relationships among concepts. Ontology technology is therefore well suited to information retrieval. Ontology-based information retrieval differs from traditional keyword search: the ontology knowledge base strengthens the intrinsic links among concepts, and implicit and unstated information can be deduced through logical reasoning, which makes semantic intelligent information retrieval possible.
     This paper analyzes traditional information retrieval technology and finds that the fundamental reason for its low retrieval quality is that it relies on syntactic matching and lacks a semantic understanding of the information being retrieved. It then proposes applying ontology technology to information retrieval. In addition, applications of ontology technology in the telecommunications field are analyzed in detail, including an ontology-based integrated information model for network and system management, Semantic Web technologies for context-aware smart mobile Web services, and ontology construction for the telecommunications field.
     The paper then focuses on several key technologies of ontology-based semantic intelligent information retrieval, including ontology technology, methods of semantic intelligent information retrieval, the domain ontology construction process, and the system workflow. Based on the analysis of traditional information retrieval and of ontology technology, an ontology-based semantic intelligent retrieval system is designed. After an analysis of the retrieval systems of current online mobile phone shops, a framework model for an ontology-based semantic intelligent retrieval system is proposed. A mobile phone product ontology is then constructed for the experimental system, and the semantic reasoning involved in semantic intelligent information retrieval is analyzed.
     After that, the Mobile Phone Product Semantic Retrieval System (MPPSRS) is developed on the basis of the theory and system design in the preceding sections. Mobile phone products are the retrieval target of this experimental system. Through ontology-based semantic reasoning, the system can uncover implicit relationships in the retrieved information. It offers a good semantic retrieval service that addresses at its root the lack of semantic information about resources in traditional information retrieval, gives users more accurate and more complete results for their queries, and achieves semantic intelligent information retrieval.
     In the last two sections, traditional models for computing the semantic similarity of concepts are analyzed, and an improved semantic similarity algorithm based on a domain ontology is proposed, which integrates distance-based and attribute-based semantic similarity. For the distance-based part, several factors implicit in the domain ontology are taken into account, such as semantic overlap (shared ancestors), semantic depth, semantic distance, semantic density, and the related adjustment factors. An ontology base of ballasts and lamps is then built from the products of a real company, and an experimental semantic similarity retrieval system, the Ballasts & Lamps Product Retrieval Recommendation System (BLPRRS), is developed. The experimental results indicate that this semantic similarity model can help extend the set of query concepts and provide effective product retrieval results.
