Research on Data Replica Management Strategies in Cloud Storage Environments
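Abstract
With the rapid development of Internet technology, data on the Internet is arriving like a tsunami, which directly exposes how limited our capacity to process data is by comparison. Cloud storage, with its simple pay-as-you-go model, offers users low-cost, highly reliable data and online storage services.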
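     Data replica management is an important part of any storage system, and it matters even more for cloud storage systems, which are characterized by high reliability, high scalability, and large storage capacity. Theoretical research on data replica management in cloud storage systems is still largely at an early stage, so many theoretical questions merit further study. A reasonable number of data replicas not only reduces network latency and bandwidth consumption when data is accessed remotely, but also improves network load balancing, and it improves data security, reliability, and the fault tolerance of the system as well. A good data replica management strategy is likewise an important part of improving cloud storage quality of service (QoS).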
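     As a rough illustration of how the number of replicas relates to reliability, consider the simplest textbook model. It assumes that each replica is independently available with the same probability p, which is an assumption made here for illustration and not necessarily the availability criterion derived in the thesis. The symbols p, n, and A_target are likewise introduced only for this sketch.

```latex
\[
A(n) = 1 - (1 - p)^{n},
\qquad
n_{\min} = \left\lceil \frac{\ln\bigl(1 - A_{\mathrm{target}}\bigr)}{\ln(1 - p)} \right\rceil
\]
```

     For example, with p = 0.9 and A_target = 0.999, n_min = ceil(ln 0.001 / ln 0.1) = 3, and A(3) = 1 - 0.1^3 = 0.999. Under this model each extra replica adds less availability than the one before it, which is one reason to bound the replica count and not simply maximize it.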
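     To improve the quality of cloud storage services, this thesis studies three aspects of data replica management in cloud storage systems: replica count control, replica selection, and replica consistency maintenance. The main contributions are:
     (1) A replica creation strategy and a replica deletion strategy, both based on dual constraints, are proposed. The two constraint criteria, an availability criterion and a predictable access frequency criterion, are analyzed and derived in detail, algorithm descriptions are given for both, and the feasibility of the strategy is argued theoretically.
     (2) A data replica selection strategy based on grey prediction is proposed. When choosing a replica, the user is mainly concerned with replica response time. The strategy predicts replica response time from the historical information of the storage nodes, which gives the user a basis for choosing the best replica and so improves QoS for cloud users.
     (3) A replica consistency maintenance strategy based on an optimized replica chain is proposed. The replica chain is built at the time the secondary replica nodes are created, and a locking mechanism prevents conflicts when replica nodes are updated. The thesis covers building and maintaining the replica chain and analyzes the replica consistency maintenance process.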
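     The grey-prediction idea in contribution (2) can be sketched with the standard GM(1,1) grey model. The abstract does not give the thesis's own prediction equation, so the code below is only a minimal sketch under that assumption: the function names gm11_forecast and select_replica, the node names, and the response-time figures are all hypothetical.

```python
import math


def gm11_forecast(history, steps=1):
    """Fit a standard GM(1,1) grey model to `history` and forecast `steps` values.

    Assumption: the textbook GM(1,1) model is used; the thesis's exact
    prediction equation is not stated in the abstract.
    """
    n = len(history)
    if n < 4:
        # Common rule of thumb: GM(1,1) is fitted on at least four points.
        raise ValueError("need at least 4 observations")
    if any(v <= 0 for v in history):
        raise ValueError("GM(1,1) expects a positive series")

    # Accumulated generating operation (1-AGO): running sums of the raw series.
    x1 = []
    total = 0.0
    for v in history:
        total += v
        x1.append(total)

    # Background values: means of neighbouring accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(history[1:])

    # Ordinary least squares for the grey equation x0(k) = -a * z(k) + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det

    if abs(a) < 1e-12:
        # Degenerate case (flat series): the model reduces to a constant.
        return [b] * steps

    def x1_hat(k):
        # Time response of the accumulated series (k is a 0-based index).
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Undo the accumulation to get forecasts of the raw series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


def select_replica(response_history):
    """Pick the node whose predicted next response time is smallest."""
    predicted = {node: gm11_forecast(times)[0]
                 for node, times in response_history.items()}
    best = min(predicted, key=predicted.get)
    return best, predicted


if __name__ == "__main__":
    # Hypothetical response times in milliseconds, oldest first.
    history = {
        "node-a": [150, 152, 149, 151, 150],
        "node-b": [80, 95, 112, 131, 152],
        "node-c": [200, 170, 145, 123, 105],
    }
    best, predicted = select_replica(history)
    print(best, predicted)
```

     In this made-up data, node-b has been the fastest on average but its response time is rising, while node-c is improving steadily. A predictor that looks at the trend can therefore choose differently from one that only averages the history.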
With the rapid development of Internet technology, data on the Internet is bearing down on us like a tsunami, which directly reflects how limited human capacity to process data is by comparison. With a simple "on-demand" model, cloud storage provides users with low-cost, highly reliable data and application services in the form of online storage.
     Data replica management is an important component of every storage system. It is even more important for cloud storage, which features high reliability, versatility, scalability, and large storage capacity. However, the theory of data replica management in cloud storage systems is still at an initial stage, and many issues are worthy of further study. A reasonable number of data replicas can reduce network latency and bandwidth consumption when remote data is accessed and can also improve network load balancing. At the same time, it can improve data security, reliability, and system fault tolerance. A sound data replica management strategy is also an important aspect of improving cloud storage quality of service (QoS).
     To improve the quality of cloud storage services, this paper studies three aspects of data replica management: replica count control, replica selection, and replica consistency maintenance. The main innovations are as follows.
     (1) This paper proposes a replica creation strategy and a replica deletion strategy, both based on dual constraints. It analyzes the constraint criteria in detail, namely the availability criterion and the predictable access frequency criterion, and gives an algorithm description for each.
     (2) This paper analyzes and summarizes the state of research on replica selection strategies, proposes a replica selection strategy based on grey prediction after analyzing the basis of replica selection, and derives the prediction equation for data replicas.
     (3) Replica consistency is an important aspect of ensuring data availability. After analyzing and summarizing the relevant academic research, this paper proposes a replica consistency strategy based on an optimized replica chain, builds and maintains the replica chain, and analyzes the process of maintaining it.
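     To make the replica chain in item (3) more concrete, the following is a minimal single-process sketch. The abstract does not describe how the thesis builds the chain or which locking protocol it uses, so everything here is an assumption: the class names ReplicaNode and ReplicaChain, the single chain-wide lock that serializes updates, and the version counter are hypothetical choices for illustration.

```python
import threading


class ReplicaNode:
    """One replica in the chain (hypothetical structure)."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0
        self.next = None                # next replica in the chain
        self.lock = threading.Lock()    # guards this node's value/version


class ReplicaChain:
    """Chain that starts at the primary; replicas are linked when created."""

    def __init__(self, primary_name):
        self.head = ReplicaNode(primary_name)
        self.tail = self.head
        # Assumption: one chain-wide lock serializes updates and membership
        # changes, so two writers can never interleave along the chain.
        self._chain_lock = threading.Lock()

    def add_replica(self, name):
        """Create a secondary replica and link it into the chain."""
        with self._chain_lock:
            node = ReplicaNode(name)
            # The new replica starts from the primary's current state.
            node.value, node.version = self.head.value, self.head.version
            self.tail.next = node
            self.tail = node
            return node

    def update(self, value):
        """Apply one update to every replica, walking the chain in order."""
        with self._chain_lock:
            version = self.head.version + 1
            node = self.head
            while node is not None:
                with node.lock:
                    node.value = value
                    node.version = version
                node = node.next
            return version

    def read(self, node):
        """Read a single replica without blocking the rest of the chain."""
        with node.lock:
            return node.value, node.version

    def snapshot(self):
        """Return (name, value, version) for every replica in chain order."""
        with self._chain_lock:
            out, node = [], self.head
            while node is not None:
                out.append((node.name, node.value, node.version))
                node = node.next
            return out


if __name__ == "__main__":
    chain = ReplicaChain("primary")
    chain.add_replica("replica-1")
    chain.add_replica("replica-2")
    writers = [threading.Thread(target=chain.update, args=(f"v{i}",))
               for i in range(10)]
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    print(chain.snapshot())
```

     Serializing writers with one lock is the simplest way to avoid update conflicts, at the cost of write throughput. A real system would also have to handle node failures and network partitions, which this sketch ignores.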
