Research on Load Balancing Methods Based on Data Migration in Computational Grids
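  • English title: Load Balancing Research Based-on Data Migration in Computational Grid
  • Author: Meng Fan'er (孟繁二)
  • Degree level: Master's
  • Discipline: Computer Architecture
  • Degree year: 2004
  • Advisor: Hu Liang (胡亮)
  • Discipline code: 081201
  • Degree-granting institution: Jilin University (吉林大学)
  • Thesis submission date: 2004-04-01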
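Abstract
The computational grid is a product of advanced networking technology, especially high-speed networking, and of advanced computing architectures; it is an integrated architecture of hardware and software. From the hardware point of view, a computational grid is a combination of geographically distributed, heterogeneous, and dynamic high-performance computing resources, including remote computers, networks, storage devices, scientific instruments, visualization and virtual reality display equipment, and personal computers. From the software point of view, a computational grid is middleware that integrates these resources and presents them as a single, very powerful computing resource on the user's desktop, so that users can use it transparently, seamlessly, and efficiently, free of geographic boundaries, to solve complex problems that cannot be solved with local resources alone. With the computational grid, applications are no longer confined to the client/server model; an entirely new class of computation-oriented applications becomes possible and will eventually dominate networked high-performance computing, just as the WWW today supports the development of information-oriented applications.
    The main function of a computational grid system is to give grid users a transparent, distributed, shared, secure, and fault-tolerant high-performance computing environment. Users in this environment can share files, computing resource objects, rich data, and expensive instruments. They do not have to decide where to run their programs or copy the necessary program and data files themselves; the grid system does this automatically. Because the system places users in the same virtual environment, users at different sites and in different disciplines can collaborate more effectively. In addition, because programs in a grid environment run in parallel and the resources of offline sites can be used, higher application performance can be achieved. The grid system also provides a simple programming environment and model, which ultimately gives users higher programming productivity.
    Task scheduling and load balancing are key parts of building a computational grid. In a computational grid system, the scheduling system is very important. A good scheduling system distributes the aggregated processing capacity of the whole cluster efficiently among many users. The scheduling system determines the efficiency of the whole cluster system; especially for computation-intensive tasks submitted by users, a good scheduling system can greatly speed up task execution.
    This thesis studies, in some depth, task scheduling and load balancing strategies based on data migration in computational grids. The work offers reference and practical value for applications of computational grids.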
The computational grid emerged from advanced computer network technology and advanced computing architectures, and is an integrated framework of computer hardware and software. From the hardware viewpoint, a computational grid consists of all kinds of high-performance computing resources, such as large remote computers, network resources, storage equipment, scientific instruments of all sorts, visualization and virtual reality display equipment, and even personal computers. From the software viewpoint, it is middleware. It integrates the resources mentioned above and turns them into a single independent computing resource on the user's desktop, powerful enough to solve many kinds of complex problems that cannot be solved with the methods and resources in use today. Users can use these resources transparently and seamlessly, without being limited by geographic boundaries. The computational grid makes possible a new, computation-oriented kind of application that is not confined to the client/server model. We predict that grid computing will ultimately dominate high-performance computing, just as WWW technologies support the development of information-oriented applications.
    The main function of a computational grid system is to provide grid users with a transparent, distributed, shared, secure, and fault-tolerant environment. Users in this environment can share computing resources, network resources, data resources, expensive equipment, and so forth. How, when, and where a program runs is decided not by the user but by the grid system itself. Because all users are in the same virtual environment, they can work together even when they are in different places and work in different disciplines. In addition, grid computing offers high application performance because programs can run concurrently and the resources of offline sites can be used. The grid also provides a simple programming environment and model, so grid users obtain higher programming productivity.
    Task scheduling and load balancing based on data migration are among the most important parts of the design of a computational grid. In a computational grid, the scheduling system is very important: a good scheduling system manages the total processing capability of the hosts in the grid efficiently. The scheduling system determines the efficiency of the grid; especially for a computation-intensive task, a good scheduling system provides high execution speed and reduces the task's execution time.
    This thesis studies task scheduling and load balancing based on data migration. The work should benefit further research on task scheduling and load balancing technology for computational grids.
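    The abstract does not describe the balancing algorithm itself, so the following is only a minimal sketch of the general idea of load balancing by data migration, not the thesis's method. It assumes each node has a known processing speed and holds data chunks of known size, and it greedily moves chunks from the node that would finish last to the node that would finish first. The names Node, finish_time, and rebalance are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A grid host (hypothetical model): a speed and the data chunks it holds."""
    name: str
    speed: float                                 # work units per second (assumed known)
    chunks: list = field(default_factory=list)   # sizes of the chunks, in work units

    def finish_time(self) -> float:
        # Estimated time for this node to process all of its chunks.
        return sum(self.chunks) / self.speed


def rebalance(nodes, max_moves=1000):
    """Greedily migrate chunks from the latest-finishing node to the
    earliest-finishing node. Returns the list of migrations performed
    as (source name, destination name, chunk size) tuples."""
    moves = []
    for _ in range(max_moves):
        src = max(nodes, key=Node.finish_time)
        dst = min(nodes, key=Node.finish_time)
        if src is dst or not src.chunks:
            break
        chunk = min(src.chunks)  # move the smallest chunk to avoid overshooting
        new_dst_time = (sum(dst.chunks) + chunk) / dst.speed
        # Stop once a migration would no longer leave the receiver
        # finishing earlier than the current slowest node.
        if new_dst_time >= src.finish_time():
            break
        src.chunks.remove(chunk)
        dst.chunks.append(chunk)
        moves.append((src.name, dst.name, chunk))
    return moves


if __name__ == "__main__":
    grid = [
        Node("a", speed=1.0, chunks=[4, 4, 4, 4]),
        Node("b", speed=2.0),
        Node("c", speed=1.0, chunks=[2]),
    ]
    for move in rebalance(grid):
        print(move)
    print({n.name: n.finish_time() for n in grid})
```

    A real grid scheduler would also have to account for the cost of transferring each chunk over the network, which this sketch ignores; a migration is only worthwhile when the time saved exceeds the transfer time.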