面向数据抗毁性的对等网络数据冗余存储策略研究

设为首页

收藏本站

网站地图 | English | 公务邮箱

About the library

Background
History
Leadership
Organization

Readers' Guide

Opening Hours
Collections
Help Via Email

Publications

Electronic Information Resources

面向数据抗毁性的对等网络数据冗余存储策略研究

详细信息本馆镜像全文| 推荐本文 | | 获取CNKI官网全文

英文题名：Data Invulnerability Oriented Redundant Storage in Peer-to-peer Networks
作者：徐浩
论文级别：硕士
学科专业名称：管理科学与工程
中文关键词：数据抗毁性 ; 结构化对等网络 ; 分块复制 ; 动态中心
英文关键词：Data Invulnerability ; Structured P2P Networks ; Blocking Replication ; Central Point
学位年度：2009
导师：肖卫东
学科代码：1201
学位授予单位：国防科学技术大学
论文提交日期：2009-12-01

摘要

在分布式网络化作战环境中引入对等(Peer-to-Peer,P2P)网络,并对战场资源进行冗余存储,可以避免将这些战场资源集中存储在少数几个存储量大、处理能力很强的服务器中,以至于使其成为敌方的重点打击目标。因此,研究面向数据抗毁性的对等网络数据冗余存储策略,有着极其重要的意义。
     当节点在不可预测的情况下失效时,现有的评价指标不能准确评价对等网络中数据被访问的概率,因而需要引入一种与节点有效性无关的评价指标来进行描述,由此本文提出了数据抗毁性这一概念。随后,本文从全复制、分块复制和有固定中心节点的分块复制三个角度对数据抗毁性进行建模分析,并进一步讨论了数据恢复机制对数据抗毁性的影响。
     基于本文所进行的数据抗毁性的描述,在相同的存储开销下,分块复制策略下的数据抗毁性在失效节点数量较少时要高于全复制,而当失效节点数量较多时,全复制策略下的数据抗毁性要更高。因此,本文设计了一个分块复制与全复制相结合的冗余存储策略,对数据进行分块存储的同时在网络中保留一个完整的数据副本,将分块复制与全复制的优势相结合。然而现有的数据存储策略在进行冗余存储时,网络中通常会有固定的中心节点进行这些冗余数据的管理,由中心节点主动发出探询消息,以检测网络中的副本数量、位置等信息。这些节点的存在方面简化了算法,但同时也造成了中心节点的瓶颈效应。而且通过本文描述,采用有固定中心节点的分块复制算法所获得的数据抗毁性要低于全复制和分块复制算法。因此,本文充分考虑到中心节点的瓶颈效应,提出了基于动态中心确认机制的数据存储策略DCDS(Dynamic Center-based Data Storage)及其改进策略IDCDS(Improved DCDS),将数据副本的确认模式改为数据块保存节点主动确认。利用基于DHT的结构化P2P网络的特性,由数据块保存节点将数据D的确认信息发往该数据的key值管理节点,并由该数据的key值管理节点来负责检测网络中数据块的数量。在这种模式下,即使节点失效,根据基于DHT的P2P网络的特点,会有新的节点来管理该key值,那么数据块的确认信息就会发至这个新的节点上,从而确保数据不会随着确认节点的失效而丢失。
     仿真实验表明本文提出的基于动态中心确认机制的数据存储策略在网络节点不断失效情况下,对提高网络中可用数据的数量是有效的。
A peer-to-peer(P2P) networks and redundant storage strategy in a distributed networked operations environment can avoid the centralized storage in the few and high capacity of the host which can easily become the focus of the enemy's attack. Therefore, data invulnerability oriented redundant storage in peer-to-peer networks is of great significance.
     When node failure appears in an unpredictable way, the existing evaluation is not accurate to the probability of data being accessed. Hence, a new evaluation, which is irrelative about the node availability, is introduced. After that, this dissertation models the data invulnerability through three perspectives:replication, blocking replication, blocking replication based on a fixed central point. Besides, data reconstruction strategy is discussed.
     Based on the data invulnerability proposed in this dissertation, blocking replication strategy gains higher data invulnerability than replication strategy when the data storage costs are the same and there are few node failures in the networks. But the replication strategy gains higher data invulnerability than blocking replication strategy when the failure increased. So, the dissertation presents a mixed redundant storage strategy, which stores a full replica in the networks when useing the blocking strategy. However, there is always a central point in current storage strategy which managing the redundant data, such as sending probe message actively and detecting the number of replica. On the one hand, these central point can reduce the complexity of the storage strategy. On the other hand, they become the bottleneck of the efficient storage, and the blocking replication based on a fixed central point gains less than the replication and blocking replication. Thus, a dynamic center-based data storage(DCDS) strategy and a improving-DCDS(IDCDS) strategy are proposed. It changes the passive confirmation of blocking data's storage node into the active confirmation. The blocking data's storage node send confirmation message using distributed hash table(DHT) to the node which manage the blocking's key value. And the latter one will detect the status of the data. When it fails, the confirmation message will be sent to new comer under the DHT-based P2P networks. Therefore the data won't be lost with the node failure.
     Simulation results show that the DCDS strategy and the IDCDS strategy can incerase the number of data in networks efficiently on the premise of node failure continuously.

引文

[1]Jeff Cares.分布式网络化作战：网络中心战基础.北京邮电大学出版社.2006.
    [2]L.Gong. JXTA. A Network Programming Environment. IEEE Internet Computing, Vol.5, No.3,2001:88-95.
    [3]Peer-to-Peer Working Group.http://www.P2Pwg.org.
    [4]Dejan S.Milojicic, Vana Kalogeraki, Rajan Lukose, Kiran Nagaraja, Jim Pruyne, Bruno Richard, Sami Rollins, Zhichen Xu. Peer-to-Peer Computing.Tech.Rep. HPL-2002-57, HP Laboratories Palo Alto, March 2002.
    [5]Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker. Making Gnutellalike P2P Systems Scalable. In SIGCOMM, Karlsruhe, Germany,2003.
    [6]The Gnutella Protocol 0.6[EB/OL]. http://rfc-gnutella.sourceforge.net,2002.
    [7]Ion Stoica, Robert Morris, David Liben-Nowell, David R.Karger, et al. Chord:A Scalable Peer-to-peer Lookup Protocol for Internet Applications. IEEE/ACM Transactions on Networking, Vol.11,No.1, February 2003:17-32.
    [8]S.Ratnasamy, P.Francis, M.Handley, et al. A Scalable Content-Addressable Network. In Proceedings of ACM SIGCOMM'2001, New York, USA,2001.
    [9]A. Rowstron, and P.Druschel. Pastry:Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems, Lecture Notes in Computer Science,2001,2218:329-350.
    [10]B.Y. Zhao, J. Kubiatowicz, and A.D. Joseph. Tapestry:A Fault-Tolerant Wide-Area Application Infrastructure, Computer Communication Review,2002, 32(1):81.
    [11]N.J.A. Harvey, M.B. Jones, S. Saroiu, M. Theimer, and A. Wolman. Skipnet:A Scalable Overlay Network with Practical Locality Properties. In Proceedings of 4th USENIX Symposium on Internet Technologies and Systems, Washington, USA,2003.
    [12]X.Luo, Z.Qin, J. Han, and H.Chen. Dht-Assisted Probabilistic Exhaustive Search in Unstructured P2P Network. In Proceedings of 22nd International Parallel and Distributed Processing Symposium (IPDPS), Miami, Florida USA,2008.
    [13]X.Shi, J.Han, Y.Liu, and L.M.Ni. Popularity Adaptive Search in Hybrid P2P Systems. In Proceedings of 21st International Parallel and Distributed Processing Symposium (IEEE IPDPS), Long Beach, California USA,2007.
    [14]William J.Bolosky, John R.Douceur, David Ely, and Marvin Theimer. Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs. In Proc. the International Conference on Measurement and Modeling of Computer Systems(SIGMETRICS'2000),2000:34-43.
    [15]Robbert van Renesse. Efficient Reliable Internet Storage. Workshop on Dependable Distributed Data Management, October 2004:40-49.
    [16]F.J.Mac Williams and N.J.A.Sloane. The Theory of Error-Correcting Codes, Part I. North-Holland Publishing Company, Amsterdam, New York,Oxford,1977.
    [17]Kavitha Ranganathan, Adriana Iamnitchi, Ian Foster. Improving Data Availability through Dynamic Model-Driven Replication in Large Peer-to-Peer Communities. In Proceedings of the 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID.02),2002.
    [18]Vijay Gopalakrishnan, Bujor Silaghi, Bobby Bhattacharjee, and Pete Keleher. Adaptive Replication in Peer-to-Peer Systems. In Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS'04),2004.
    [19]Bong-Jun Ko, DanRubenstein, Distributed Self-Stabilizing Placement of Replicated Resources in Emerging Networks. IEEE/ACM transactions on networking, vol.13, NO.3, JUNE 2005.
    [20]Hiroshi Yamamoto, Daisuke Maruta, Yuji Oie. Replication Methods for Load Balancing on Distributed Storages in P2P Networks. In Proceedings of the 2005 Symposium on Applications and the Internet (SAINT'05),2005.
    [21]S Rajasekhar, B Rong, K Y Lai, I Khalil and Z Tari. Load Sharing in Peer-to-Peer Networks using Dynamic Replication. In Proceedings of the 20th International Conference on Advanced Information Networking and Applications (AINA'06),2006.
    [22]Edith Cohen, Scott Shenker. Replication Strategies in Unstructured Peer-to-Peer Networks. ACM SIGCOMM,2002.
    [23]Jian Zhou, Laxmi N. Bhuyan and Anirban Banerjee. An Effective Pointer Replication Algorithm in P2P Networks. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing(IPDPS),2008.
    [24]Jian Ni, Jie Lin Steven, J. Harrington, Naveen Sharma. Designing File Replication Schemes for Peer-to-Peer File Sharing Systems. In Proceedings of IEEE International Conference on Communications(ICC'08),2008.
    [25]Shinya NOGAMI, Masato UCHIDA, and Takeo ABE. Replication Scheme for Traffic Load Balancing and Its Parameter Tuning in Pure P2P Communication. In Proceedings of Autonomous Decentralized Systems(ISADS 2005),2005:701-706.
    [26]W.K.Lin, C.Ye, D.M.Chiu. Decentralized Replication Algorithms for Improving File Availability in P2P Networks. In Proceedings of Fifteenth IEEE International Workshop on Quality of Service,2007:29-37.
    [27]J.Kubiatowicz, D.Bindel, Y.Chen, S.Czerwinski, P.Eaton, D.Geels, R. Gummadi, S.Rhea, H.Weatherspoon, W.Weimer, C.Wells, and B.Zhao. Oceanstore:An architecture for global-scale persistent storage. In Proceedings of 9th International Conference on Architectural Support for Programming Languages and Operating Systems(ASPLOS), ACM,2000:190--201.
    [28]Predrag K., Andreas Wombacher, Thomas Risse. Enabling High Data Availability in a DHT. In Proceedings of the 16th International Workshop on Database and Expert Systems Applications (DEXA'05),2005.
    [29]Predrag, Andreas Wombacher, Thomas Risse. Managing and Recovering High Data Availability in a DHT under Churn. In Proceedings of International Conference on Collaborative Computing:Networking, Applications and Worksharing,2006.
    [30]Jian Zhou, Xin Zhang, Laxmi, Bhuyan, Bin Liu. Clustered K-Center:Effective Replica Placement in Peer-to-Peer Systems. In Proceedings of IEEE Global Telecommunications Conference,2007.
    [31]Min Cai, Ann Chervenak, Martin Frank. A Peer-to-Peer Replica Location Service Based on A Distributed Hash Table. In Proceedings of the ACM/IEEE SC2004 Conference,2004:56-56.
    [32]R.Bhagwan, S.Savage, and G.M.Voelker. Replication strategies for highly available peer-to-peer systems. Technical Report CS2002-0726, University of California, San Diego, Nov 2002.
    [33]F.Dabek, M. Kaashoek, D.Karger, R.Morris, and I.Stoica. Wide-area cooperative storage with CFS. In Proceedings of the 18th ACM symposium on Operating System Principles (SOSP),2001.
    [34]A.Rowstron and P.Druschel. Storage management and caching in past, a largescale, persistent peer-to-peer storage utility. In Proceedings of the 18th ACM Symposium on Operating Systems Principles(SOSP'01),2001.
    [35]J.Chu, K.Labonte, and B.Levine. Availability and locality measurements of peer-to-peer file systems. In Proceedings of ITCom:Scalability and Traffic Control in IP Networks, July 2002.
    [36]S.Saroiu, P.K.Gummadi, and S.D.Gribble. A measurement study of peer-to-peer file sharing systems. In Proceedings of MMCN,2002.
    [37]S.Sen and J.Wang. Analyzing peer-to-peer traffic over large networks. In Proceedings of ACM SIGCOMM Internet Measurement Workshop, November 2002.
    [38]H.Weatherspoon, T.Moscovitz, and J.Kubiatowicz. Introspective failure analysis: Avoiding correlated failures in peer-to-peer systems. In Proceedings of the International Workshop on Reliable Peer-to-peer Distributed Systems,2002.
    [39]D.Long, A.Muir, and R.Golding. A longitudinal study of internet host reliability. In Proceedings of the Fourteenth Symposium on Reliable Distributed Systems, September 1995.
    [40]H.Weatherspoon and J.Kubiatowicz. Erasure coding vs. replication:a quantitative approach. In Proceedings of the First International Workshop on Peer-to-peer Systems,2002.
    [41]K.Sripanidkulchai. The Popularity of Gnutella Queries and Its Implications on Scalability. White Paper, Carnegie Mellon University. http://www.cs.cmu.edu/ ～kunwadee/research/P2P/gnutella.html,Mar 2001.
    [42]Charles Blake, Rodrigo Rodrigues. High Availability, Scalable Storage, Dynamic Peer Networks:Pick Two. In Proceedings of The 9th Workshop on Hot Topics in Operating Systems(HotOS).2003.
    [43]Sit E., Haeberlen A., Dabek F., Chun B., Weatherspoon H., Morris R., Kaashoek M. F., and Kuiatowicz J.. Proactive replication for data durability. In Proceedings of IPTPS, Santa Barbara, USA,2006.
    [44]Bhagwan R., Tati K., Cheng Y., Savage S., and Voelker G.. Total recall:system support for automated availability management. In Proceedings of NSDI, San Francisco, USA,2004.
    [45]Chun B., Dabek F., Haeberlen A., Sit E., Weatherspoon H., Kaashoek M. F., Kubiatowicz J., and Morris R.. Efficient replica maintenance for distributed storage systems. In Proceedings of NSDI, San Jose, Canada,2006.
    [46]On G., Schmitt J., and Steinmetz R.. The effectiveness of realistic replication strategies on quality of availability for peer-to-peer systems. In Proceedings of P2P, Linkoping, Sweden,2003.
    [47]Saurabh Tewari, Leonard Kleinrock. Proportional Replication in Peer-to-Peer Networks. In Proceedings of INFOCOM 2006,2006:1-12.
    [48]Haiying Shen. EAD:An Efficient and Adaptive Decentralized File Replication Algorithm in P2P File Sharing Systems. In Proceedings of Eighth International Conference on Peer-to-Peer Computing (P2P'08),2008.
    [49]L.P.Cox, C.D.Murray, and B.D.Noble. Pastiche:Making Backup Cheap and Easy. In Proceedings of 5th Symposium on Operating Systems Design and Implementation(OSDI'02). December 2002:285-298.
    [50]Predrag K., Andreas Wombacher, Thomas Risse. DHT-Based Self-adapting Replication Protocol for Achieving High Data Availability. SITIS 2006.
    [51]Guangping Xu, Gang Wang, Jing Liu. A hybrid redundancy approach for data availability in structured P2P network systems. In Proceedings of 13th IEEE International Symposium on Pacific Rim Dependable Computing.2007.
    [52]Mirko Knoll, Haitham Abbadi, and Torben Weis, Replication in Peer-to-Peer Systems. IWSOS 2008.
    [53]Chang F, Minwen Ji, et al. Myriad:Cost-effective Disaster Tolerance. In Proceedings of FAST,2002
    [54]Mario Blaum, Jim Brady, Jehoshua Bruck, jai Menon. EVENODD:An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures. IEEE transaction on computers, vol.44, NO.2, FEBRUARY 1995.
    [55]Lihao Xu, Vasken Bohossian, Jehoshua Bruck, and David G. Wagner. Low-Density MDS Codes and Factors of Complete Graphs. IEEE transaction on imformation theory, vol.45, NO.6, SEPTEMBER 1999.
    [56]Lihao Xu and Jehoshua Bruck. X-Code:MDS Array Codes with Optimal Encoding. IEEE transaction on imformation theory, vol.45, NO.1, JANUARY 1999.
    [57]James S. Plank, Lihao Xu. Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications. In Proceedings of Fifth IEEE International Symposium on Network Computing and Applications (NCA'06), 2006.
    [58]Michael Luby. LT Codes. In Proceedings of the 43rd Symposium on Foundations of Computer Science,2002:271.
    [59]Alexandros G. Dimakis, P. Brighten Godfrey, Martin J. Wainwright and Kannan Ramchandran. Network Coding for Distributed Storage Systems. IEEE INFOCOM 2007.
    [60]Amin Shokrollahi. Raptor codes. IEEE/ACM Transactions on Networking (TON),2006.
    [61]Evangelos P. Markatos. Tracing a Large-Scale Peer to Peer System:An Hour in the Life of Gnutella. In Proceedings of Second IEEE International Symposium on Cluster Computing and the Grid (CCGRID'02),2002.
    [62]Edith Cohen, Scott Shenker. Replication strategies in unstructured peer-to-peer networks. SIGCOMM 2002:177-190.
    [63]Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer. Farsite:federated,available,and reliable storage for an incompletely trusted environment. In Proceedings of the 5th symposium on Operating systems design and implementation(OSDI'02),2002:1-14.
    [64]John W.Byers,Michael Luby,Michael Mitzenmacher,Ashutosh Rege. A Digital Fountain Approach to Reliable Distribution of Bulk Data. ACM SIGCOMM, 1998.