Research on Key Technologies for QoS Guarantee and Resource Optimization in Data Grids
Abstract
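The grid is an advanced information technology infrastructure. Its aim is to integrate the computing, storage, communication, and information resources that are widely distributed across the Internet and to offer users a virtual, unified, and transparent computing environment. The data grid, a branch of grid computing, has attracted great attention from the academic community. A data grid is an integrated architecture for the distributed management, analysis, and use of large-scale data sets over a wide area. It provides secure, reliable, and efficient data transfer, access, storage, and replica management, together with a unified interface to different storage systems, and thereby makes data-intensive high-performance computing and scientific research possible.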
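     Ian Foster pointed out that one of the most basic characteristics of a grid is to "provide exceptional quality of service (QoS)". To give a data grid a high QoS, many sources of instability in the network and in grid nodes must be overcome. Resource (capacity) reservation, replica deployment, buffering, parallel data transfer, and data storage and recovery are the main means of addressing these problems and are active research topics. The storage and transfer of massive data sets wastes large amounts of network transmission capacity, storage, and node resources; this sharply lowers the acceptance rate of grid services at peak times and degrades overall QoS. Most current research concentrates on improving service quality in one particular respect and pays less attention to the optimal scheduling of resources, so QoS is guaranteed at the cost of somewhat lower overall resource utilization in the grid system.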
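     Starting from the principle that "guaranteeing QoS is the foundation and optimizing resource use is the goal", this thesis studies in depth how to optimize the resources used while guaranteeing the QoS of a data grid. The most basic functions of a data grid (data storage and data transfer) are decomposed at the service-level QoS layer into five main sub-services (transfer, storage, caching, node selection, and resource reservation), and a specific technique is applied to each sub-service to achieve the dual goals of QoS guarantee and resource optimization. Specifically: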
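     (1) Deploying multiple replicas improves data reliability and data service bandwidth and reduces network load, and parallel transfer algorithms based on multiple replicas can greatly increase transfer speed and thus guarantee the QoS of data services. However, deploying several complete replicas consumes a great deal of storage space and network transmission capacity. This thesis first proposes a distributed data storage model. The model has a considerable advantage in storage space usage (storage optimization) and has the property of P-integrity: the data remains complete when any P nodes fail. Based on the storage model, a parallel transfer scheduler is given; at a redundancy of two replicas, the scheduler can adapt to large differences in speed among nodes. Building on the scheduler, a parallel transfer algorithm is given; with suitably configured parameters, the algorithm can reach the transfer speed of parallel transfer based on multiple complete replicas.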
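     (2) To guarantee the reliability of data storage, a data grid should be capable of dynamic data recovery based on parallel transfer. While storage space usage is optimized, the basic QoS indicators of data storage, reliability and availability, must be guaranteed, and the ease of use of the data must also be taken into account. Based on the distributed storage model, and combining node failure behavior, the dynamic recovery process, and a data exchange center strategy with the scheduler and parallel transfer algorithm proposed in this thesis and the Poisson distribution theorem, this thesis proposes a dynamic data recovery model. The data recovery model has a lower probability of data failure than two-replica storage and is easier to use than an erasure coding strategy.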
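     (3) Data caching is a commonly used strategy in network applications for overcoming network instability. Given the massive volume of data and the limited resources, the buffer size in a data caching service needs to be configured optimally, and many factors should be considered, including the failure behavior of the data source nodes, the set of nodes participating in the service, the transfer speed of each node, the task's constraint on data failure time, and the requirement on overall failure. By introducing a finite buffer model, and taking the data consumer's point of view with multi-replica storage and parallel transfer as the basis, this thesis derives a service failure model that expresses the quantitative relationships among the parameters that affect service failure. Simulation experiments were carried out and the theoretical values of the model were compared with the experimental values, with good results. The goals of optimally configuring the buffer and the service nodes and of guaranteeing the QoS of the caching service were achieved.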
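     (4) An important problem in parallel data transfer based on multiple replicas is how to select node resources sensibly while satisfying QoS constraints such as service reliability and transfer time. This thesis proposes two models that make a combined decision over factors such as the transfer speed, reliability, and transfer distance of grid nodes, the network state, and bandwidth constraints, and output the optimal set of service nodes. This achieves several optimization goals at once: using node resources sensibly, reducing network load, reducing the tolerance of service requests, raising the acceptance rate and service quality of the grid system at peak times, and minimizing the cost of a single service.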
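     (5) Resource reservation is a basic prerequisite for the successful completion of data grid tasks, and the acceptance rate of reservation requests directly affects the QoS of a service. Resource capacity describes the quantity and usage of resources well from a macro perspective and provides strong support for resource reservation services. Allocating resource capacity sensibly can reduce resource capacity fragmentation and raise the throughput and acceptance rate of the grid system at peak times, achieving the dual goals of resource optimization and QoS guarantee. This thesis proposes resource capacity reservation strategies based on parallel speedup and on the four-tuple method. Compared with earlier mechanisms, they allow the grid system to make active decisions based on the combined information in reservation requests and to apply a certain transformation of resource capacity to those requests, which further optimizes resource use, effectively reduces resource capacity fragmentation, and raises the service acceptance rate at peak times.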
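     Guaranteeing QoS and optimizing resources for individual services can raise the overall resource utilization of the grid system, and can therefore raise the acceptance rate of service requests and the QoS of individual services at peak times.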
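     The P-integrity property named in item (1) can be illustrated with a minimal sketch. The abstract does not specify the thesis's storage model, so the layout below is an assumption made only for illustration: each block is copied to P + 1 consecutive nodes in round-robin order, and the function names `place_blocks`, `is_complete`, and `has_p_integrity` are hypothetical.

```python
from itertools import combinations


def place_blocks(num_blocks, num_nodes, p):
    """Place each block on p + 1 distinct nodes.

    Assumed round-robin layout for illustration; not the thesis's model.
    """
    if p + 1 > num_nodes:
        raise ValueError("need at least p + 1 nodes")
    placement = {}
    for block in range(num_blocks):
        # Consecutive indices modulo num_nodes are distinct because p + 1 <= num_nodes.
        placement[block] = {(block + k) % num_nodes for k in range(p + 1)}
    return placement


def is_complete(placement, failed_nodes):
    """Return True if every block still has at least one copy on a live node."""
    failed = set(failed_nodes)
    return all(holders - failed for holders in placement.values())


def has_p_integrity(placement, num_nodes, p):
    """Exhaustively check that the data survives the failure of any p nodes."""
    return all(
        is_complete(placement, failed)
        for failed in combinations(range(num_nodes), p)
    )
```

     With 12 blocks on 5 nodes and P = 1, each node in this layout holds either 4 or 5 block copies, whereas a node holding a complete replica would store all 12 blocks. The exhaustive check is only practical for small node counts.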
The grid is an advanced information technology infrastructure. It aims to integrate a variety of widely distributed computing, storage, communication, and information resources effectively and to provide users with a virtual, unified, and transparent computing environment. The data grid, as a branch of grid computing, has attracted great interest from academics. A data grid is an integrated architecture that can effectively manage, analyse, and use distributed data sets over a wide area. It achieves safe, reliable, and efficient data transmission, access, storage, and replica management, and provides a unified interface to different storage systems, making data-intensive high-performance computing and scientific research possible.
     Ian Foster pointed out that one of the basic features of a grid is "to provide exceptional quality of service (QoS)". To guarantee a high QoS for the data grid, the grid system must overcome many uncertainties in the network and in grid nodes. Currently, resource (capacity) reservation, replica deployment, buffering, parallel transmission, and data storage and recovery are the primary means of solving such problems and are active research issues. Mass data storage and transmission result in unnecessary waste of network transmission capacity, storage, and node resources, which causes a dramatic reduction in the acceptance rate of grid services at peak hours and a decline in overall QoS. At present, most research focuses on enhancing quality of service in particular respects and gives less consideration to the optimal scheduling of resources, so QoS is ensured at the price of lower overall resource utilization.
     Based on the principle that "the guarantee of QoS is the basis, and the optimization of resources is the goal", this thesis studies in depth how to guarantee QoS while using grid resources effectively. We divide the basic functions of the data grid (data storage and data transfer) into five sub-services (transport, storage, cache, node selection, and resource reservation) at the service level of QoS. A specific strategy is used for each sub-service, and the dual objectives of QoS guarantee and resource optimization are achieved together. Specifically:
     (1) Multi-replica deployment can improve data reliability and service bandwidth and decrease network load. Parallel transmission algorithms based on multiple replicas can further increase transmission speed and guarantee the QoS of data services. However, deploying multiple complete replicas consumes a large amount of storage space and network transmission capacity. In this thesis, a distributed storage model is proposed first. The model has a large advantage in the use of storage space (storage optimization) and has the characteristic of "P integrity", i.e., when any P nodes fail, the complete data can still be obtained from the remaining nodes. A parallel transmission scheduler is put forward based on the storage model; when double redundancy is used, the scheduler can adapt to large differences in transmission speed among replica nodes. Based on the storage model and the scheduler, a parallel transmission algorithm is proposed; when reasonable parameters are configured, the algorithm can achieve the transmission performance of algorithms based on full replicas.
     (2) To guarantee the reliability of data storage, dynamic data recovery based on parallel transmission is a basic capability that a data grid should have. While the use of storage space is optimized, not only must the basic QoS of data storage (reliability and availability) be guaranteed, but ease of use must also be considered. A dynamic data recovery model (DDRM) is proposed based on the storage model, the scheduler, the parallel transmission algorithm, node failure, the dynamic recovery process, and a data exchange center strategy. DDRM has a lower data failure probability than double-replica storage and is easier to use than an erasure-code strategy.
     (3) Data buffering is a key strategy for overcoming the instability of the network. Taking into account the characteristics of mass data and limited resources, the buffer size should be optimized in the cache service, and the following factors should be considered: the failure probability of service nodes, the set of service nodes, the transmission speed of each service node, the task's constraint on failure time, and the overall service failure probability. By introducing a limited buffer model, and from the perspective of the data consumer, a service failure model is proposed based on the parallel transmission mode. The model represents the quantitative relationships among the parameters that affect service failure. In simulation experiments, the theoretical values of the model were compared with the experimental values, and good results were obtained. The objectives of buffer optimization, service node optimization, and QoS guarantee for caching services were achieved.
     (4) An important problem for parallel transmission based on multiple replicas is how to select replica nodes while meeting QoS constraints such as service reliability and transmission time. Two node selection models are proposed. The models take as input the transmission speed, reliability, and transmission distance of nodes, the network status, bandwidth, and other factors, and output the optimal set of service nodes. In this way, multiple optimization objectives are achieved together: rational use of node resources, reduced network load, reduced tolerance of service requests, a higher acceptance rate at peak times, and minimum cost for a single service.
     (5) Resource reservation is a basic prerequisite for the successful completion of grid tasks, and the acceptance rate of reservation requests directly affects QoS. Resource capacity represents the amount and status of resources from a macro point of view and provides strong support for resource reservation services. Reasonable allocation of resource capacity can reduce resource capacity fragmentation, enhance throughput and the acceptance rate at peak times, and achieve the dual goals of resource optimization and QoS guarantee. Two resource capacity reservation strategies, parallel speedup and four-tuple, are put forward. Compared with previous research, the proposed strategies let the grid system make active decisions according to the comprehensive information in reservation requests and apply certain resource capacity transformations to those requests, so the use of resource capacity can be optimized further.
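     The idea in item (5) of transforming a reservation request by means of parallel speedup can be sketched as follows. The abstract does not give the thesis's speedup model or its four-tuple definition, so this sketch assumes Amdahl's law as a stand-in, and the names `amdahl_speedup`, `reshape_request`, `parallel_fraction`, and `max_duration` are hypothetical.

```python
def amdahl_speedup(nodes, parallel_fraction):
    """Speedup on `nodes` processors under Amdahl's law (assumed model)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / nodes)


def reshape_request(nodes, duration, free_nodes, max_duration, parallel_fraction):
    """Shrink a request (nodes, duration) to fit the free capacity.

    Returns the transformed (nodes, duration) pair, or None when even the
    largest feasible node count cannot finish within max_duration.
    """
    usable = min(nodes, free_nodes)
    if usable < 1:
        return None
    # Recover the single-node running time implied by the original request.
    serial_time = duration * amdahl_speedup(nodes, parallel_fraction)
    new_duration = serial_time / amdahl_speedup(usable, parallel_fraction)
    if new_duration > max_duration:
        return None
    return usable, new_duration
```

     Under these assumptions, a perfectly parallel request for 8 nodes over 10 time units becomes a request for 4 nodes over 20 time units when only 4 nodes are free, so the request can be accepted with a longer slot instead of being rejected.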
     In this thesis, the most basic functions of the data grid (data storage and data transmission) are divided into five detailed services: transmission, storage, buffering, node selection, and resource capacity reservation. For each service, a specific strategy is adopted to achieve the objectives of QoS guarantee and resource optimization.
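     A parallel transmission scheduler of the kind described in item (1) has to cope with replica nodes of very different speeds. The sketch below is not the thesis's scheduler, which the abstract does not describe; it assumes a simple greedy rule in which each block is fetched from whichever holder would finish it earliest. The function name `schedule_transfer` and its parameters are hypothetical.

```python
def schedule_transfer(placement, speeds, block_size=1.0):
    """Greedily assign each block to the holder that would finish it earliest.

    placement: dict mapping block id -> set of node ids holding a copy.
    speeds: dict mapping node id -> transfer speed (block_size units per second).
    Returns (assignment, makespan), where assignment maps block id -> node id.
    """
    busy_until = {node: 0.0 for node in speeds}
    assignment = {}
    for block in sorted(placement):
        # Ties on finish time are broken by the smaller node id.
        best = min(
            placement[block],
            key=lambda n: (busy_until[n] + block_size / speeds[n], n),
        )
        busy_until[best] += block_size / speeds[best]
        assignment[block] = best
    return assignment, max(busy_until.values())
```

     For five blocks held by both a fast node (speed 4.0) and a slow node (speed 1.0), this rule sends four blocks to the fast node and one to the slow node, so both finish after one second and neither sits idle while the other works.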