Replication Migration Technology for Access Hotspots Based on Hadoop
  • English title: Replication Migration Technology for Access Hotspots Based on Hadoop
  • Authors: FENG Jun; WANG Chun; ZHU Kang-kang; WEI Tong-tong
  • Keywords: cloud storage; big data; data migration; dynamic replica; access hotspots
  • Chinese journal title: JYXH
  • English journal title: Computer and Modernization
  • Institution: College of Computer and Information, Hohai University
  • Publication date: 2016-01-26 13:42
  • Publisher: Computer and Modernization (计算机与现代化)
  • Year: 2016
  • Issue: No.245
  • Funding: National Natural Science Foundation of China (61370091; 61170200)
  • Language: Chinese
  • Page: JYXH201601022
  • Number of pages: 6
  • CN: 01
  • ISSN: 36-1137/TP
  • Classification number: 111-116
Abstract
This paper proposes a load-balancing model for access hotspots in a cloud environment. A node load determination module is built using node throughput and response time as the main reference metrics. As files are stored in HDFS, the block IDs of each file are combined with its storage path to form a block-to-file-directory mapping table kept on the data nodes. A method is proposed for selecting the migration source and target nodes based on node load and node storage space, and a dynamic replica migration scheme is formulated on top of the rack-awareness mechanism. Finally, an executor issues instructions to the relevant data nodes, which carry out the migration tasks and then update parameters such as the replication factor. By rapidly spreading replicas, the approach increases the number of replicas of hot files, allowing the system to deliver higher throughput and a shorter response time.
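The abstract does not give the load formula or the selection rules, so the following Python sketch is only one possible reading of the pipeline it describes, not the authors' implementation. Every name here (`Node`, `load_score`, `pick_source`, `pick_target`, `block_to_file`), the equal weighting of throughput and response time, the 200 ms response ceiling, and the 0.8 overload threshold are assumptions chosen for illustration. It also assumes that a hot block is copied from its least-loaded current holder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    throughput_mbps: float      # observed throughput
    max_throughput_mbps: float  # assumed capacity of the node
    response_ms: float          # observed mean response time
    free_gb: float              # remaining storage space


# Hypothetical block-to-file mapping table: block ID -> HDFS file path.
block_to_file = {"blk_1001": "/logs/hot.log", "blk_1002": "/logs/hot.log"}


def load_score(node, max_response_ms=200.0, w_throughput=0.5):
    """Assumed load score in [0, 1]: weighted mix of throughput
    utilisation and normalised response time."""
    util = min(node.throughput_mbps / node.max_throughput_mbps, 1.0)
    latency = min(node.response_ms / max_response_ms, 1.0)
    return w_throughput * util + (1.0 - w_throughput) * latency


def pick_source(replica_holders):
    """Copy from the least-loaded node that already holds the block."""
    return min(replica_holders, key=load_score)


def pick_target(nodes, replica_holders, block_gb, overload=0.8):
    """Choose a node without the block that has enough space and is not
    overloaded; prefer a rack with no replica yet, then lower load,
    then more free space. Returns None if no node qualifies."""
    holders = {n.name for n in replica_holders}
    used_racks = {n.rack for n in replica_holders}
    candidates = [
        n for n in nodes
        if n.name not in holders
        and n.free_gb >= block_gb
        and load_score(n) < overload
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda n: (n.rack in used_racks, load_score(n), -n.free_gb),
    )


# Usage with made-up measurements for a five-node, two-rack cluster.
cluster = [
    Node("dn1", "r1", 90.0, 100.0, 180.0, 300.0),
    Node("dn2", "r1", 40.0, 100.0, 60.0, 300.0),
    Node("dn3", "r1", 10.0, 100.0, 20.0, 500.0),
    Node("dn4", "r2", 30.0, 100.0, 40.0, 200.0),
    Node("dn5", "r2", 20.0, 100.0, 20.0, 0.05),
]
holders = cluster[:2]  # dn1 and dn2 hold the hot block blk_1001
source = pick_source(holders)
target = pick_target(cluster, holders, block_gb=0.128)
print(block_to_file["blk_1001"], source.name, target.name)
```

Ranking rack diversity ahead of load mirrors the rack-awareness idea in the abstract: a new replica on a rack that has none spreads read traffic across more network paths, even when a node on an already-used rack is slightly less loaded.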
