Research on an RDF-Based Grid Data Query and Integration System
Abstract
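With the rapid development of network technology, distributed architecture has become an important network architecture for processing, disseminating, and exchanging information. Data resources are stored at different sites and in different ways across distributed, heterogeneous databases, which raises problems of sharing and integrating data resources in distributed networks.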
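     To better address the distribution and heterogeneity of data resources, this work analyzes and compares existing heterogeneous data integration systems and related technologies, and on that basis proposes a data query and integration scheme based on grid technology and RDF. The scheme uses grid technology to handle the distribution of data sources and the heterogeneity in how they are accessed, and uses RDF to handle syntactic and semantic heterogeneity among data sources. Based on this scheme, an RDF-based grid data query and integration system (RGDS) is constructed, and the functions of its main modules and the relationships among them are analyzed.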
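     As a rough illustration of how RDF can smooth over syntactic and semantic differences between sources, the sketch below maps two records with different local schemas onto shared properties and queries them with one triple pattern. It is not the RGDS implementation: the records, the subject URIs, the `MAPPINGS` table, and the helpers `to_triples` and `match` are all hypothetical, and the Dublin Core element namespace is used only as a convenient shared vocabulary.

```python
# Shared vocabulary: the Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical records from two sources that name the same concepts differently.
relational_row = {"id": 7, "name": "Granite survey", "creator": "Li"}
xml_like_record = {"docId": "7b", "title": "Basalt map", "author": "Wang"}

# Hypothetical per-source mappings from local field names to shared properties.
MAPPINGS = {
    "db": {"name": DC + "title", "creator": DC + "creator"},
    "xml": {"title": DC + "title", "author": DC + "creator"},
}


def to_triples(source, subject, record):
    """Translate one local record into (subject, predicate, object) triples."""
    mapping = MAPPINGS[source]
    return [(subject, mapping[k], v) for k, v in record.items() if k in mapping]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


triples = (
    to_triples("db", "urn:example:db/7", relational_row)
    + to_triples("xml", "urn:example:xml/7b", xml_like_record)
)

# One query over the shared property reaches both sources.
titles = [o for _, _, o in match(triples, p=DC + "title")]
print(titles)  # ['Granite survey', 'Basalt map']
```

     Once both sources are expressed against the same properties, the query no longer needs to know which local field name each source used.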
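     In a grid data integration system, the quality of data sources varies widely, so the system must choose which sources to query. To this end, a trust degree associated with each data source is defined as a measure of the quality of the source's service, a method for computing this trust degree is given, and a trust-based greedy data source selection algorithm is proposed. Experiments verify the effectiveness of this method for data source selection.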
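     The abstract does not spell out the trust computation or the selection rule, so the following is only a minimal sketch of one plausible trust-based greedy selection, under stated assumptions: each source carries a precomputed trust value in [0, 1] and a set of attributes it can answer, and at each step the source with the highest product of trust and newly covered attributes is chosen. The `DataSource` type, the scoring rule, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    trust: float           # assumed precomputed, in [0, 1]
    attributes: frozenset  # attributes this source can answer


def greedy_select(sources, required):
    """Greedily pick sources until `required` is covered, favoring high trust.

    Returns (chosen sources in pick order, attributes left uncovered).
    """
    remaining = set(required)
    candidates = list(sources)
    chosen = []

    def score(src):
        # Assumed scoring rule: trust times the number of still-missing attributes.
        return src.trust * len(remaining & src.attributes)

    while remaining and candidates:
        best = max(candidates, key=score)
        if score(best) <= 0:
            break  # no candidate contributes anything useful
        chosen.append(best)
        remaining -= best.attributes
        candidates.remove(best)
    return chosen, remaining


sources = [
    DataSource("A", 0.9, frozenset({"title", "author"})),
    DataSource("B", 0.5, frozenset({"title", "author", "year"})),
    DataSource("C", 0.8, frozenset({"year"})),
]
chosen, uncovered = greedy_select(sources, {"title", "author", "year"})
print([s.name for s in chosen])  # ['A', 'C']
```

     In this sample, source B covers every attribute on its own, yet the greedy rule prefers the two more trusted sources A and C. Weighting coverage by trust in this way is a design choice of the sketch, not a detail taken from the source.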
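     Finally, the implementation of the RDF-based grid data query and integration system (RGDS) is described, including the registration and management of grid data source services. A worked example illustrates the RGDS workflow and the concrete implementation of each functional module, and a method for reasoning over RDF-based query results is briefly introduced.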
With the development of network technology, distributed architecture has become an important network architecture for processing, transmitting, and exchanging information. All kinds of data sources are stored in distributed, heterogeneous databases in different locations and in different forms. This situation leads to problems of data sharing and integration in distributed networks.
     Existing data integration systems and technologies are reviewed. To address the distribution and heterogeneity of data sources, a new solution for data integration based on grid services and RDF technology is proposed. Grid services are used to handle heterogeneity in data access, and RDF technology is used to handle syntactic and semantic heterogeneity. Following this solution, we build an RDF-Based Grid-Data Query and Integration System (RGDS) and analyze the functions of its main modules and their relationships.
     Because data sources differ in quality, selecting correct and appropriate data sources is a problem in a data query system. We therefore define a trust value to evaluate the quality of a data source and propose a greedy selection algorithm based on the trust values of data sources. A preliminary experiment is carried out to evaluate the effectiveness of the proposed algorithm.
     Finally, the implementation of the RDF-Based Grid-Data Query and Integration System (RGDS) is illustrated through an example. Reasoning over RDF query results is also briefly introduced.
