A Distributed File Placement Algorithm without Depending on Popularity Information
  • English title: A Distributed File Placement Algorithm without Depending on Popularity Information
  • Authors: XUE Hong-ye (薛弘晔); TIAN Zhi-wu (田治武); LUO Xiang-yu (罗香玉); FENG Jian (冯健); WANG Dan (王丹)
  • Affiliation: School of Computer Science and Technology, Xi'an University of Science and Technology (西安科技大学计算机科学与技术学院)
  • Keywords: distributed file storage system; file popularity; file placement; load balance
  • Journal code: KXJS
  • Journal: Science Technology and Engineering (科学技术与工程)
  • Publication date: 2018-01-18
  • Publisher: Science Technology and Engineering (科学技术与工程)
  • Year: 2018
  • Volume/issue: v.18; No.435
  • Funding: Doctoral Start-up Fund of Xi'an University of Science and Technology (2015QDJ031); Special Scientific Research Program of the Shaanxi Provincial Department of Education (15JK1468)
  • Language: Chinese
  • Record ID: KXJS201802045
  • Page count: 5
  • Issue number: 02
  • CN: 11-4688/T
  • Pages: 290-294
Abstract
File placement has long been a research focus in distributed storage. The distributed file storage system HDFS places files on randomly selected nodes, which leaves the access load unevenly distributed. Many placement algorithms based on file access popularity have been proposed; however, popularity changes dynamically and is difficult to predict accurately. This paper proposes a distributed file placement algorithm that does not depend on popularity information. The algorithm uses only each file's creation time and exploits the correlation between the time elapsed since a file was created and its access popularity. It first partitions time into intervals, then counts the volume of data created on each node within each interval, and during placement keeps the volume of data belonging to the same interval roughly equal across nodes. Experimental results show that the algorithm not only balances the storage load across nodes but also improves the balance of access load, eliminating the performance bottleneck caused by uneven file popularity.
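The placement idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `AgeBalancedPlacer`, the fixed interval boundaries, and the tie-breaking rule (prefer the node with the smaller total stored volume) are all assumptions made for illustration. The sketch only shows the core bookkeeping: map a file's creation time to a time interval, then place the file on the node that currently holds the least data from that interval.

```python
from bisect import bisect_right
from collections import defaultdict


class AgeBalancedPlacer:
    """Hypothetical sketch: balance per-interval data volume across nodes."""

    def __init__(self, nodes, boundaries):
        # nodes: node identifiers; boundaries: sorted timestamps that split
        # the time axis into len(boundaries) + 1 intervals (an assumption).
        self.nodes = list(nodes)
        self.boundaries = sorted(boundaries)
        # load[node][interval] = bytes created in that interval on that node
        self.load = {n: defaultdict(int) for n in self.nodes}

    def interval_of(self, created_at):
        # Index of the time interval that contains the creation timestamp.
        return bisect_right(self.boundaries, created_at)

    def place(self, size, created_at):
        k = self.interval_of(created_at)
        # Pick the node with the least data in interval k; break ties by
        # total stored volume (assumed rule), then by node order.
        target = min(
            self.nodes,
            key=lambda n: (self.load[n][k], sum(self.load[n].values())),
        )
        self.load[target][k] += size
        return target


placer = AgeBalancedPlacer(["n1", "n2", "n3"], boundaries=[100, 200])
placements = [
    placer.place(10, created_at=50),
    placer.place(10, created_at=60),
    placer.place(10, created_at=150),
    placer.place(5, created_at=70),
]
print(placements)
```

Because files created in the same interval tend to have similar popularity under the abstract's premise, spreading each interval's data evenly is a proxy for spreading access load, without ever measuring popularity directly.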
References
    1. Zhang Guangbin, Pan Jun, Zeng Zhiqiang. Data center 2013: hardware refactoring and software definition. 2014
    2. Yang Chuanhui. Large-scale distributed storage systems: principles and architecture in practice. Beijing: China Machine Press, 2013
    3. Shvachko K, Kuang H, Radia S, et al. The Hadoop distributed file system. IEEE Symposium on Mass Storage Systems and Technologies. IEEE Computer Society, 2010: 1-10
    4. Wang Yongzhou, Mao Su. A data placement strategy in HDFS. Journal of Computer Technology and Development, 2013; (5): 90-92
    5. Yang Junjie, Liao Zhuofan, Feng Chaochao. Survey of big data storage architectures and algorithms. Journal of Computer Application, 2016; 36(9): 2465-2471
    6. Luo Peng, Gong Xun. Research and improvement of HDFS data storage strategy. Journal of Computer Engineering and Design, 2014; 35(4): 1127-1131
    7. Luo Jun, Chen Shiqiang. Improved HDFS replica placement strategy based on support vector machine. Journal of Computer Engineering, 2015; 41(11): 114-119
    8. Zhang Song, Du Qingwei, Sun Jing, et al. Prediction-based replica factor decision algorithm for hot data in cloud computing. Journal of Computer and Modernization, 2015; (2): 62-66
    9. Wu Wenjing, Cheng Yaodong, Wang Lu, et al. Dynamic replica strategy for local distributed storage systems. Computer Engineering and Applications, 2010; 46(12): 21-24
    10. Tao Yongcai, Zhang Ningning, Shi Lei, et al. Research on dynamic management of data replicas of cloud computing in heterogeneous environments. Journal of Chinese Computer Systems, 2013; 34(7): 1487-1492
    11. Wang Ning, Yang Yang, Meng Kun, et al. A customer-experience-based cost minimization strategy of storing data in cloud computing. Acta Electronica Sinica, 2014; 42(1): 20-27
    12. Tang W, Fu Y, Cherkasova L, et al. Modeling and generating realistic streaming media server workloads. International Journal of Computer and Telecommunications Networking, 2007; 51(1): 336-356
    13. Chai Yunpeng, Yang Nan. Cold data concentration: energy saving method for streaming media storage systems. Journal of Computer Science, 2012; 39(10): 148-151
