Research on Grid Data Transfer Technology Supporting Dynamic Updating
Abstract
Data-intensive scientific and engineering applications often need to transfer massive amounts of data in grid environments, where the performance and flexibility of data transfer are key metrics. This thesis studies grid data transfer technology that supports dynamic updating. On one hand, it proposes two data transfer approaches that improve transfer performance; on the other hand, it studies dynamic updating techniques for grid data transfer that improve its flexibility. Specifically, the thesis covers the following:
     (1) This thesis proposes a multi-replica striped data transfer approach. It transfers data by simultaneously using multiple replicas of the same data distributed across the grid, and it hides both the differences in the transfer protocols supported by the servers hosting the replicas and the differences between the domains in which the replicas reside. Experiments show that this approach improves data transfer performance.
     (2) When grid applications operate on multidimensional data, query and transfer are two basic operations. To support efficient queries, multidimensional data is often stored in partitioned blocks, so the data returned by a query is physically non-contiguous; experiments show that transferring such data directly performs poorly. This thesis investigates the problem and proposes the logically linked file (LLF) approach to solve it. By virtually linking the physically non-contiguous data into a single LLF and using a transfer method designed for LLFs, the transfer performance of multidimensional data is improved.
     (3) Mission-critical grid applications require data transfer software to support online evolution, that is, to be dynamically updated without interrupting ongoing data transfers. To meet this requirement, this thesis proposes a dynamic update solution consisting of the Du-OSGi Specification for building dynamically updatable software, the Du-OSGi Framework for dynamic updating, and a set of dynamic update tools. Applying the solution to grid data transfer gives the transfer software online evolution capability, and experiments show that data transfer performance is almost unaffected during updates.
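The logically linked file idea in item (2) can be illustrated with a small sketch. The abstract does not describe the actual LLF format or its transfer mechanism, so everything below is an assumption made for illustration: the names `Segment`, `LogicallyLinkedFile`, and `resolve` are hypothetical, and the code only shows how physically non-contiguous blocks might be presented as one contiguous logical byte range that a transfer tool could then read in order.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Segment:
    """One physically contiguous block (hypothetical layout, not from the thesis)."""
    path: str    # file that holds the block
    offset: int  # byte offset of the block inside that file
    length: int  # block length in bytes


class LogicallyLinkedFile:
    """Presents a list of scattered segments as one contiguous logical file."""

    def __init__(self, segments: Iterable[Segment]) -> None:
        # Empty segments contribute no bytes, so they are dropped up front.
        self.segments: List[Segment] = [s for s in segments if s.length > 0]
        self.starts: List[int] = []  # logical start offset of each segment
        position = 0
        for seg in self.segments:
            self.starts.append(position)
            position += seg.length
        self.size = position  # total logical length in bytes

    def resolve(self, logical_offset: int, length: int) -> List[Tuple[str, int, int]]:
        """Map a logical byte range to (path, physical_offset, length) pieces."""
        if logical_offset < 0 or length < 0 or logical_offset + length > self.size:
            raise ValueError("range lies outside the logical file")
        pieces: List[Tuple[str, int, int]] = []
        if length == 0:
            return pieces
        # Find the segment that contains the first requested byte.
        index = bisect_right(self.starts, logical_offset) - 1
        position, remaining = logical_offset, length
        while remaining > 0:
            seg = self.segments[index]
            skip = position - self.starts[index]       # bytes to skip in this segment
            take = min(seg.length - skip, remaining)   # bytes served by this segment
            pieces.append((seg.path, seg.offset + skip, take))
            position += take
            remaining -= take
            index += 1
        return pieces


# Three blocks returned by a hypothetical query, stored in two files.
llf = LogicallyLinkedFile([
    Segment("a.dat", 100, 10),
    Segment("b.dat", 0, 5),
    Segment("a.dat", 500, 20),
])
pieces = llf.resolve(8, 10)
```

In this sketch a sender would iterate over the returned pieces and stream them back to back, so the receiver sees a single sequential file instead of many small, separate transfers.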
