Research on a Deep Web Integration System Based on Ontology and Bayesian Networks
  • English title: Research on Deep Web integrated system based on ontology and Bayesian network
  • Authors: ZHU Guojin (朱国进); HUANG Qiqi (黄琪琪)
  • Institution: School of Computer Science and Technology, Donghua University (东华大学计算机科学与技术学院)
  • Keywords: Deep Web query interface integrated system; attribute extraction; semantic ontology tree; Bayesian network
  • Journal title (Chinese): DLXZ
  • Journal title (English): Intelligent Computer and Applications
  • Publication date: 2018-02-28
  • Publisher: Intelligent Computer and Applications (智能计算机与应用)
  • Year: 2018
  • Issue: v.8
  • Language: Chinese
  • Pages: DLXZ201801003
  • Page count: 8
  • CN: 01
  • ISSN: 23-1573/TN
  • Classification number: 12-19
Abstract
The Deep Web refers to content hidden in back-end databases that search engines and Web crawlers cannot easily retrieve, yet this content is often rich in information and data. An effective way to obtain the information held in the Deep Web is to build a Deep Web integration framework. Because the query interface is the only access point to a Deep Web source, the key to a Deep Web integration system is constructing an integrated Deep Web interface. The goal of this research is to represent Deep Web interface information by automatically building a domain-specific ontology, so that Deep Web interfaces in that domain can be identified automatically, indexed, and used to extract the rich resources in the underlying databases. The whole process runs without human intervention. The proposed method extracts Deep Web interface information and derives the domain ontology fully automatically, then uses the ontology together with a Bayesian network to identify and match new Deep Web interfaces. Within a specific domain, a new method automatically extracts attributes from Deep Web interfaces and uses WordNet to build them into an ontology semantic tree; the resulting domain semantic ontology tree is combined with a Bayesian network to perform domain classification, and after classification the schemas of the query interfaces are matched against the integrated interface. The classification and schema-matching results are compared with those obtained from a semantic tree built from manually extracted attributes, which verifies the usability and applicability of the method.
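The domain classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the structure of their Bayesian network or how the ontology tree feeds it, so a Bernoulli naive Bayes classifier (the simplest Bayesian network, with the domain as the only parent of every attribute) stands in for it here. The function names `train` and `classify`, the domain names, and the attribute labels are all hypothetical, and the ontology step is reduced to the assumption that attribute labels have already been normalised to shared concepts.

```python
import math
from collections import Counter, defaultdict


def train(labelled_interfaces):
    """Count how often each attribute appears on interfaces of each domain."""
    domain_counts = Counter()
    attr_counts = defaultdict(Counter)
    vocabulary = set()
    for domain, attributes in labelled_interfaces:
        domain_counts[domain] += 1
        for attr in set(attributes):
            attr_counts[domain][attr] += 1
            vocabulary.add(attr)
    return domain_counts, attr_counts, vocabulary


def classify(attributes, model):
    """Return the domain with the highest posterior log-probability."""
    domain_counts, attr_counts, vocabulary = model
    total = sum(domain_counts.values())
    present = set(attributes) & vocabulary  # attributes never seen in training are ignored
    best_domain, best_score = None, -math.inf
    for domain, n in domain_counts.items():
        score = math.log(n / total)  # log prior of the domain
        for attr in vocabulary:
            # Bernoulli likelihood with add-one (Laplace) smoothing.
            p = (attr_counts[domain][attr] + 1) / (n + 2)
            score += math.log(p if attr in present else 1 - p)
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain


# Hypothetical training data: (domain, attributes extracted from one interface).
training = [
    ("books", ["title", "author", "isbn"]),
    ("books", ["title", "author", "publisher"]),
    ("flights", ["departure", "destination", "date"]),
    ("flights", ["departure", "destination", "passengers"]),
]
model = train(training)
new_interface_domain = classify(["author", "title", "keyword"], model)
```

In this toy setting a new interface is assigned to whichever domain makes its observed attribute set most probable. Schema matching against the integrated interface would then only need to consider interfaces classified into the same domain.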