Research and Application of Semi-Automatic Ontology Construction in a Patent Information Retrieval System
Abstract
Because an ontology has a well-defined concept hierarchy and supports logical reasoning, ontologies are increasingly used in information retrieval, where they have considerably improved recall and precision. Against this background, this thesis studies the semi-automatic construction of ontologies, with the aim of building a domain ontology that can be used in a patent retrieval system. Applying the ontology in the system is intended to improve retrieval effectiveness, strengthen interaction with users, and support the discovery of patent intelligence. The main work of the thesis is as follows.
     First, a survey of well-known patent retrieval systems in China and abroad identifies three areas in which such systems can still be improved: retrieval effectiveness, interaction with users, and patent intelligence discovery. These three areas establish the goals for constructing and applying an ontology in a patent system.
     Second, a scheme for constructing a patent domain ontology is proposed, with solutions to the two difficult problems of ontology construction: acquiring concepts and acquiring the relations between concepts. A semi-automatic ontology construction system for the patent retrieval system is designed and implemented. Following the order of ontology construction, the system is divided into three functional modules: ontology prototype construction, concept discovery, and relation discovery. For the concept extraction step, a method is proposed for computing the weight of a single term across multiple documents, which is used to identify the main characteristic features of patent information in the domain.
     Finally, ontology-based patent retrieval is designed and implemented on top of an existing patent retrieval system. The design of the retrieval module is presented, and semantic retrieval over the ontology is implemented with Jena, a development toolkit for ontology applications. A comparison of ontology-based semantic retrieval with keyword-matching retrieval shows that the former achieves higher recall and precision, with improvements also in result presentation and patent intelligence discovery. The experiments indicate that the proposed semi-automatic construction scheme is feasible and that applying the ontology improves the retrieval system to some extent.
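The abstract compares the two retrieval methods by recall and precision but does not restate how those measures are computed. The following is a minimal sketch of the standard definitions, not code from the thesis: the function name `precision_recall` and the patent identifiers are hypothetical and serve only as illustration, and the numbers are not experimental results from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical patent IDs, for illustration only (not data from the thesis).
relevant_patents = {"P1", "P2", "P3", "P4"}
keyword_results = ["P1", "P2", "P8", "P9"]
semantic_results = ["P1", "P2", "P3", "P9"]

keyword_scores = precision_recall(keyword_results, relevant_patents)
semantic_scores = precision_recall(semantic_results, relevant_patents)
print("keyword :", keyword_scores)
print("semantic:", semantic_scores)
```

Under these definitions a method can raise recall without raising precision, or the reverse, so a comparison of this kind generally reports both measures over the same query set.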
