Research on a Distributed Replica Location Model for Data Management in the China Education and Research Grid
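Abstract
As an emerging distributed high-performance computing technology, the grid is playing an increasingly important role in everyday life. Data management is one of the grid's key technologies, and it requires an efficient replica location model to solve the replica location problem in the wide-area network environment in which grids operate.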
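     The ChinaGrid Supporting Platform (CGSP), the common supporting platform of the China Education and Research Grid (ChinaGrid), is a general-purpose platform that supports multiple grid applications; on this platform, data is distributed across different logical domains according to application. During the development of CGSP, replica location requests in grid data management were found to have two characteristics. First, in the vast majority of cases, because of security considerations between logical domains, a replica's mapping is stored in the logical domain to which its data file belongs. Second, in the vast majority of cases, a user's replica location request is issued in the logical domain where the replica mapping was generated. Exploiting these two characteristics to guarantee both the security and the efficiency of replica location is therefore a major challenge. The many replica location solutions proposed so far struggle to meet the needs of a grid environment in which multiple application logical domains coexist. In the new distributed replica location model, RSS (Replica Service System), replica service nodes are organized within and across logical domains into a distributed multi-ring topology. In the RSS model, replica mappings are divided by their logical-domain attribute into Global Replica Mappings (GRM) and Local Replica Mappings (LRM). The location process takes the replica's logical-domain attribute into account and returns the corresponding replica mapping, so as to satisfy the user's replica location request. RSS offers locality, self-organization, and intra-domain load balancing. Boundary-Chord is the core algorithm of RSS; as a distributed hash table (DHT) algorithm, it reduces the number of routing hops at both the physical and logical layers during replica location and keeps object placement controllable.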
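     In simulation experiments, RSS achieves higher replica location performance than the existing distributed replica location model P-RLS (Peer-to-Peer Replica Location Service). The test results also show that Boundary-Chord has certain advantages over other distributed hash algorithms for replica location.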
The emerging grids need an efficient replica location model to solve the replica location problem.
     While developing the ChinaGrid Supporting Platform (CGSP), a grid middleware that provides a uniform platform for multiple grid-based applications, we found two locality characteristics in the process of replica location. One is that, for security reasons, replica mappings are stored with high probability in the logical domains they belong to. The other is that a query for replica mappings is initiated with high probability in the logical domain where those mappings were generated. The main challenge is therefore to build a replica location mechanism that uses these locality properties to guarantee both the performance and the security of replica location. Previous work has produced replica location mechanisms, but they are not suitable for a grid environment with multiple applications such as ChinaGrid. In this paper, we present a distributed replica location model, Replica Service System (RSS). In the model, two kinds of replica mappings, Global Replica Mapping (GRM) and Local Replica Mapping (LRM), are defined based on domain properties. RSS locates these replica mappings to answer users' queries about replica locations, and it has the merits of locality awareness, self-organization, and domain load balancing. Boundary-Chord is the key algorithm of RSS. It requires statistically fewer message routing hops at both the application level and the IP level, and it keeps data placement manageable.
     In simulation experiments, RSS outperforms the existing distributed replica location model, P-RLS. Simulation results also show that Boundary-Chord performs better than other structured DHT solutions to the replica location problem.
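The split between local and global replica mappings can be illustrated with a small sketch. It is not the Boundary-Chord algorithm itself, whose routing details the abstract does not give: it uses plain consistent hashing (the key-placement rule of Chord, without finger tables or hop-by-hop routing) and assumes one ring per logical domain plus one global ring. The names `Ring`, `ReplicaService`, `register`, and `locate`, the 32-bit identifier space, and the example node names and URLs are all hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # assumed identifier-space size, chosen only for illustration


def ring_id(name: str) -> int:
    """Hash a name onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


class Ring:
    """Consistent-hashing ring: a key is owned by its clockwise successor node."""

    def __init__(self, nodes):
        self.points = sorted((ring_id(n), n) for n in nodes)
        self.ids = [p[0] for p in self.points]

    def successor(self, key: str) -> str:
        # Wrap around to the first node when the key hashes past the last one.
        idx = bisect_left(self.ids, ring_id(key)) % len(self.points)
        return self.points[idx][1]


class ReplicaService:
    """Hypothetical two-level layout: one ring per logical domain, one global ring."""

    def __init__(self, domains):
        # domains: dict mapping a logical domain name to its replica service nodes
        self.domain_rings = {d: Ring(nodes) for d, nodes in domains.items()}
        self.global_ring = Ring([n for nodes in domains.values() for n in nodes])
        self.store = {}  # node -> {(scope, logical name): [physical locations]}

    def _owner(self, logical_name, domain=None):
        ring = self.domain_rings[domain] if domain else self.global_ring
        return ring.successor(logical_name)

    def register(self, logical_name, location, domain=None):
        """With a domain: store a local mapping (LRM). Without: a global one (GRM)."""
        node = self._owner(logical_name, domain)
        entries = self.store.setdefault(node, {})
        entries.setdefault((domain, logical_name), []).append(location)
        return node

    def locate(self, logical_name, domain):
        """Try the requester's own domain first, then fall back to global mappings."""
        for scope in (domain, None):
            node = self._owner(logical_name, scope)
            hit = self.store.get(node, {}).get((scope, logical_name))
            if hit:
                return scope, node, hit
        return None
```

In this sketch a local mapping never leaves the nodes of its own domain, and a query issued inside that domain is resolved on the domain's ring alone, which is one way the two locality characteristics described above could be used. A query from another domain sees only global mappings.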
