Research and Implementation of Distributed Replica Location Technology in Data Grids
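Abstract
A data grid is a large-scale, scalable architecture for sharing and managing storage resources and distributed data resources in a grid environment. It meets the data sharing and processing needs of data-intensive applications and gives users a mechanism for transparent access to remote, heterogeneous data resources. Replica management is an important component of a data grid: creating replicas of data reduces the network latency and bandwidth consumed by remote access, improves network load balancing, and increases data security and reliability as well as the fault tolerance of the system. A good replica management strategy is an important factor in improving the quality of service of a data grid; such strategies cover replica creation, replica selection, and replica location, among others. Replica location in particular is a key step in improving system performance. This thesis studies the replica location strategy within replica management. The main work is as follows: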
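     1. Based on a study of existing replica location techniques, this thesis proposes the PM-Chord algorithm, which improves and extends the Chord-based P2P replica location mechanism. PM-Chord has the following new features:
     (1) It changes how data is stored: data is stored according to a prefix-matching principle, which separates node lookup from data lookup, and prefix matching is applied on top of binary (halving) search to query data.
     (2) It adds a predecessor replica mechanism to address query "hot spots" in the data grid, balance the query load across the system, and improve the stability and reliability of the system.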
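     2. On top of the Globus grid middleware, and with PM-Chord as the core algorithm, a replica location system, PM-RLS, has been largely implemented. The performance of the system was tested and analyzed and compared with the Chord algorithm. The results show that PM-RLS achieves high replica location efficiency with good stability and reliability.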
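The abstract does not give the details of PM-Chord, so the following is only a minimal sketch of the general idea behind feature (1): a standard Chord finger-table lookup finds the responsible node (the "halving" search), and a separate local lookup then finds the data on that node. The rule used here for prefix matching (route on the first `PREFIX_LEN` characters of a logical file name) is an assumption made for illustration, as are all names (`Node`, `build_ring`, `register`, `locate`), the 32-bit identifier space, and the sample URLs. It should not be read as the thesis's actual design.

```python
import hashlib
from bisect import bisect_left

M = 32            # identifier bits (assumed for this sketch)
RING = 1 << M
PREFIX_LEN = 8    # assumed: names sharing their first 8 characters share a home node

def chord_id(text):
    """Hash a string onto the identifier ring."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b      # interval wraps past zero

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.fingers = []       # fingers[i] = successor(id + 2**i)
        self.store = {}         # logical name -> list of replica locations

    @property
    def successor(self):
        return self.fingers[0]

    def closest_preceding(self, key):
        # Scan from the farthest finger: each hop at least halves the distance.
        for finger in reversed(self.fingers):
            if in_open(finger.id, self.id, key):
                return finger
        return self

    def find_successor(self, key):
        """Node lookup: return (node responsible for key, hops taken)."""
        node, hops = self, 0
        while not in_half_open(key, node.id, node.successor.id):
            nxt = node.closest_preceding(key)
            if nxt is node:
                break
            node, hops = nxt, hops + 1
        return node.successor, hops

def build_ring(names):
    """Build a static ring with fully populated finger tables."""
    nodes = sorted((Node(chord_id(n)) for n in names), key=lambda n: n.id)
    ids = [n.id for n in nodes]
    for n in nodes:
        for i in range(M):
            start = (n.id + (1 << i)) % RING
            n.fingers.append(nodes[bisect_left(ids, start) % len(nodes)])
    return nodes

def routing_key(logical_name):
    # Assumed prefix rule: only the name prefix decides the home node.
    return chord_id(logical_name[:PREFIX_LEN])

def register(entry_node, logical_name, location):
    home, _ = entry_node.find_successor(routing_key(logical_name))
    home.store.setdefault(logical_name, []).append(location)
    return home

def locate(entry_node, logical_name):
    home, hops = entry_node.find_successor(routing_key(logical_name))  # node lookup
    return home.store.get(logical_name, []), hops                       # data lookup

if __name__ == "__main__":
    ring = build_ring([f"node-{i}" for i in range(8)])
    register(ring[0], "lfn://cms/run1.dat", "gsiftp://site-a/run1.dat")
    print(locate(ring[5], "lfn://cms/run1.dat"))
```

Under this assumed rule, names with a common prefix are resolved by one node lookup and then distinguished locally, which is one way to read "separating node lookup from data lookup".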
The Data Grid architecture provides a large-scale, scalable infrastructure for managing storage resources and data distributed across grid environments. Driven by the data sharing and management requirements of data-intensive computing applications, it provides mechanisms for transparent remote access to heterogeneous data resources. Replica management is one of the critical parts of a data grid. Creating replicas reduces the network delay and bandwidth consumed when accessing data and improves network load balance. It also improves the security and reliability of the data and the fault tolerance of the system. Good replica management strategies are important for improving QoS in data grids. They include replica creation, replica selection, and replica location strategies, among others. Replica location is a key link in improving system performance. This thesis investigates replica location strategies within replica management in data grids. The main work is as follows:
     1. After studying existing replica location strategies, this thesis proposes PM-Chord (Prefix Matching-Chord), a method that improves and extends the Chord-based P2P replica location strategy. PM-Chord has the following new features:
     (1) It changes the way data is stored and separates node search from data search; data search applies the prefix-matching principle on top of binary search.
     (2) It adds a predecessor replication mechanism to solve the hot-spot problem in data grids and improve the reliability of the system.
     2. PM-RLS, a replica location system that uses PM-Chord as its replica location strategy, is implemented on top of Globus. Analysis and experiments show that PM-Chord outperforms Chord at replica location and achieves good reliability and stability.
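The predecessor replication mechanism in feature (2) is not specified in the abstract. One plausible reading, sketched below under stated assumptions, is that a node copies a frequently queried entry to its predecessor on the ring. Because a Chord lookup reaches the key's predecessor before its successor, the predecessor can then answer directly, taking load off the home node and covering for it if it fails. The threshold `HOT_THRESHOLD`, the class `RingNode`, the helper names, and the sample URL are hypothetical, and routing is reduced to a sorted list for brevity.

```python
from bisect import bisect_left

HOT_THRESHOLD = 3   # assumed: queries answered before an entry counts as hot

class RingNode:
    def __init__(self, ident):
        self.id = ident
        self.primary = {}     # key -> replica locations this node is home for
        self.mirrored = {}    # key -> copies pushed here by the successor
        self.hits = {}        # key -> queries answered as home node
        self.predecessor = None
        self.alive = True

def build_ring(ids):
    nodes = [RingNode(i) for i in sorted(ids)]
    for i, node in enumerate(nodes):
        node.predecessor = nodes[i - 1]   # index -1 wraps to the last node
    return nodes

def home_of(nodes, key):
    """Successor of key: first node whose id is >= key, wrapping around."""
    ids = [n.id for n in nodes]
    return nodes[bisect_left(ids, key) % len(nodes)]

def publish(nodes, key, locations):
    home_of(nodes, key).primary[key] = list(locations)

def query(nodes, key):
    """Return (locations, id of the node that answered)."""
    home = home_of(nodes, key)
    pred = home.predecessor
    # The predecessor is on the lookup path anyway, so a mirrored copy
    # there can answer first, or stand in for a failed home node.
    if pred.alive and key in pred.mirrored:
        return pred.mirrored[key], pred.id
    if not home.alive:
        return [], None
    home.hits[key] = home.hits.get(key, 0) + 1
    if key in home.primary and home.hits[key] >= HOT_THRESHOLD:
        pred.mirrored[key] = list(home.primary[key])   # push the hot entry back
    return home.primary.get(key, []), home.id

if __name__ == "__main__":
    ring = build_ring([10, 20, 30, 40])
    publish(ring, 25, ["gsiftp://site-a/f"])
    for _ in range(4):
        print(query(ring, 25))
```

In this sketch the first three queries for key 25 are answered by node 30; after the entry becomes hot, node 20 answers, and it keeps answering even if node 30 is marked as failed.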
