Research and Implementation of Heterogeneous Database Integration in CGSP
Abstract
In recent years, grid computing has attracted wide attention from the scientific community as an emerging technology and has been called the next-generation Internet. A grid integrates geographically distributed, heterogeneous resources of varying performance, both hardware and software, into virtual organizations connected by high-speed networks, forming a wide-area environment for resource sharing and collaborative computing. As grid technology has developed, the data it handles has grown larger and more complex, and more and more grid applications need to access large heterogeneous databases. This creates a pressing need for middleware that can access and integrate heterogeneous databases, which is the problem this paper addresses.
Set in the context of the ChinaGrid Support Platform (CGSP) project, the paper proposes an infrastructure for heterogeneous database access and integration in a grid environment. Building on a study of Open Grid Services Architecture Data Access and Integration (OGSA-DAI), it extends the OGSA-DAI core framework: it introduces the concepts of virtual table, physical table and temporary intermediate database, designs and implements an interpreter and a dispatcher for SQL query statements, handles queries over massive data by means of file streams, and extends the data transport module so that it is tightly integrated with the rest of CGSP. Users access the system through a uniform Web service interface, which lets them share, query and use resources conveniently. Most importantly, the system supports distributed joint queries across multiple heterogeneous databases.
The paper is organized as follows. It first introduces grid concepts and related technologies, current grid standards and their status, and the relationship between grids and databases; by comparison, it argues that traditional cluster computing and other forms of distributed computing such as P2P are not grids. It then introduces the ChinaGrid Support Platform project and describes in detail the key CGSP modules related to heterogeneous databases. Finally, it proposes a new and feasible method for heterogeneous database access and integration, defines the relevant concepts, describes the overall architecture, internal structure and implementation of the module, and presents and discusses experimental performance results.
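The abstract names virtual tables, physical tables and a temporary intermediate database without showing how they fit together. The following is a minimal Python sketch of one way such a joint query could work, not the paper's actual implementation and not the OGSA-DAI API. It assumes that two in-memory SQLite databases can stand in for the heterogeneous back ends, and the names `CATALOG`, `federated_query` and `batch_size` are hypothetical. Each virtual table named in the query is copied in batches from its physical table into a temporary database, and the original SQL statement is then run there.

```python
import re
import sqlite3

# Two in-memory SQLite databases stand in for heterogeneous back ends (assumption).
students_db = sqlite3.connect(":memory:")
students_db.execute("CREATE TABLE stu (id INTEGER, name TEXT)")
students_db.executemany("INSERT INTO stu VALUES (?, ?)", [(1, "Li"), (2, "Wang")])

scores_db = sqlite3.connect(":memory:")
scores_db.execute("CREATE TABLE sc (student_id INTEGER, score INTEGER)")
scores_db.executemany("INSERT INTO sc VALUES (?, ?)", [(1, 90), (2, 75)])

# Hypothetical catalog: virtual table name -> (connection, physical table name).
CATALOG = {
    "students": (students_db, "stu"),
    "scores": (scores_db, "sc"),
}


def federated_query(sql, batch_size=1000):
    """Run `sql` over virtual tables by staging them in a temporary database."""
    temp = sqlite3.connect(":memory:")  # the temporary intermediate database
    try:
        # Crude "interpreter" step: find which virtual tables the statement names.
        words = set(re.findall(r"[A-Za-z_]\w*", sql))
        for virtual in sorted(words & CATALOG.keys()):
            conn, physical = CATALOG[virtual]
            cur = conn.execute(f'SELECT * FROM "{physical}"')
            cols = [d[0] for d in cur.description]
            col_list = ", ".join(f'"{c}"' for c in cols)
            temp.execute(f'CREATE TABLE "{virtual}" ({col_list})')
            marks = ", ".join("?" for _ in cols)
            # "Dispatcher" step: stream rows in batches rather than loading all at once.
            while True:
                rows = cur.fetchmany(batch_size)
                if not rows:
                    break
                temp.executemany(f'INSERT INTO "{virtual}" VALUES ({marks})', rows)
        # The original statement now runs against the staged copies.
        return temp.execute(sql).fetchall()
    finally:
        temp.close()


rows = federated_query(
    "SELECT students.name, scores.score FROM students "
    "JOIN scores ON students.id = scores.student_id ORDER BY students.id"
)
print(rows)  # [('Li', 90), ('Wang', 75)]
```

Staging whole tables keeps the sketch short. A real system would more plausibly parse the statement properly and push selections down to each source, so that only the rows it needs are transferred.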
