Method for optimizing the storage and access performance of massive small meteorological files in the Ceph system
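  • English title: Optimization of massive meteorological small files storage and accessing in Ceph system
  • Authors: 陆小霞 (LU Xiaoxia); 王勇 (WANG Yong); 雷晓春 (LEI Xiaochun)
  • Authors (English): LU Xiaoxia; WANG Yong; LEI Xiaochun; School of Information and Communication, Guilin University of Electronic Technology; School of Computer and Information Security, Guilin University of Electronic Technology; Guangxi Cooperative Innovation Center of Cloud Computing and Big Data, Guilin University of Electronic Technology
  • Keywords: Ceph distributed file system; small files; correlation-based merging; prefetching
  • English keywords (as published): Ceph distributed file system; small files; correlation merger; prepare reading
  • Chinese journal title: GLDZ
  • English journal title: Journal of Guilin University of Electronic Technology
  • Affiliations: School of Information and Communication, Guilin University of Electronic Technology; School of Computer and Information Security, Guilin University of Electronic Technology; Guangxi Cooperative Innovation Center of Cloud Computing and Big Data, Guilin University of Electronic Technology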
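  • Publication date: 2019-06-14 15:37
  • Publisher: Journal of Guilin University of Electronic Technology (桂林电子科技大学学报)
  • Year: 2019
  • Issue: v.39; No.160
  • Funding: National Natural Science Foundation of China (61662018, 61661015); China Postdoctoral Science Foundation (2016M602922XB); Guangxi Cooperative Innovation Center of Cloud Computing and Big Data project (YDQ17001)
  • Language: Chinese
  • Page: GLDZ201901011
  • Number of pages: 6
  • CN: 01
  • ISSN: 45-1351/TN
  • Classification number: 65-70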
Abstract
When Ceph handles massive numbers of small meteorological files, double writing of cluster data degrades storage performance. To address this problem, this paper proposes a method for optimizing the storage and access performance of massive small meteorological files in a Ceph system. The method analyzes historical file access logs to obtain the association probability between small meteorological files, and then uses a merging algorithm designed around that probability to combine associated small files before storing them in the Ceph cluster. When files are accessed, the correlation between merged small files is measured by the utilization rate and correlation rate of file blocks, and files are prefetched according to that correlation, which reduces the number of interactions between the user and the cluster and improves small-file access efficiency. Experiments show that, compared with existing methods, the proposed method significantly improves both the storage efficiency and the access efficiency of massive small meteorological files in a Ceph system.
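The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea under stated assumptions: the association probability is estimated as session-level co-occurrence in an access log, files are merged when that probability meets a threshold in both directions, and prefetching simply returns the other members of a merged group. The function names, the `threshold` value, and the log format are all hypothetical, and the block-utilization and correlation-rate measures mentioned in the abstract are not modeled.

```python
from collections import Counter, defaultdict
from itertools import combinations


def association_probabilities(sessions):
    """Estimate P(b accessed | a accessed) from co-occurrence within a session.

    `sessions` is a list of lists of file names (an assumed log format).
    """
    file_count = Counter()
    pair_count = Counter()
    for session in sessions:
        files = set(session)
        file_count.update(files)
        for a, b in combinations(sorted(files), 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1
    return {(a, b): n / file_count[a] for (a, b), n in pair_count.items()}


def merge_groups(files, probs, threshold=0.5):
    """Group files that are associated in both directions (union-find).

    The threshold is an illustrative assumption, not a value from the paper.
    """
    parent = {f: f for f in files}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for (a, b), p in probs.items():
        if p >= threshold and probs.get((b, a), 0.0) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for f in sorted(files):
        groups[find(f)].append(f)
    return sorted(groups.values())


def prefetch_candidates(requested, groups):
    """Return the other files stored in the same merged group as `requested`."""
    for group in groups:
        if requested in group:
            return [f for f in group if f != requested]
    return []


# Hypothetical access log: each inner list is one user session.
sessions = [
    ["t.dat", "p.dat"],
    ["t.dat", "p.dat", "w.dat"],
    ["w.dat"],
    ["t.dat", "p.dat"],
]
probs = association_probabilities(sessions)
groups = merge_groups({"t.dat", "p.dat", "w.dat"}, probs)
```

In this toy log, `t.dat` and `p.dat` always appear together and end up in one merged group, while `w.dat` stays on its own, so a request for `t.dat` would also prefetch `p.dat`.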
References
[1] WEIL S A,BRANDT S A,MILLER E L,et al.Ceph:a scalable,high-performance distributed file system[C]//7th Symposium on Operating Systems Design and Implementation.USENIX,2006:307-320.
[2] WEIL S A.Ceph:reliable,scalable,and high performance distributed storage[D].Santa Cruz:University of California,2007:3-10.
[3] WEIL S A,BRANDT S A,MILLER E L,et al.CRUSH:controlled,scalable,decentralized placement of replicated data[C]//ACM/IEEE Conference on Supercomputing.ACM,2006:31-43.
[4] FU Songling,HE Ligang,HUANG Chenlin,et al.Performance optimization for managing massive numbers of small files in distributed file systems[J].IEEE Transactions on Parallel and Distributed Systems,2015,26(12):3433-3448.
[5] YANG Hongzhang,ZHANG Junwei,ZENG Xiangchao,et al.Research of massive small files reading optimization based on parallel network file system[C]//High Performance Computing and Communications(HPCC),IEEE 7th International Symposium on Cyberspace Safety and Security(CSS),IEEE 12th International Conference on Embedded Software and Systems(ICESS),2015:204-212.
[6] LI Hongqi,ZHU Liping,SUN Guoyu,et al.Design and implementation of distributed mass small file storage system[J].Computer Engineering and Design,2016,37(1):86-92.
[7] 张毕涛,辛阳.Optimization method for massive small file storage based on Ceph[C]//Annual Academic Conference of the China Institute of Communications,2014:229-234.(in Chinese)
[8] 穆彦良,徐振明.Improvement of the CRUSH algorithm based on a temperature factor in Ceph storage[J].Journal of Chengdu University of Information Technology,2015,30(6):563-567.(in Chinese)
[9] DONG Bo,ZHENG Qinghua,TIAN Feng,et al.An optimized approach for storing and accessing small files on cloud storage[J].Journal of Network and Computer Applications,2012,35(6):1847-1862.
[10] 李铁,燕彩蓉,黄永锋,et al.Small file access optimization method for the Hadoop distributed file system[J].Journal of Computer Applications,2014,34(11):3091-3095.(in Chinese)
[11] 游小容,曹晟.Research on the storage of small files in massive educational resources[J].Computer Science,2015,42(10):76-80.(in Chinese)
[12] SABINA P,GADADHAR S.A modified and memory saving approach to B+ tree index for search of an image database based on chain codes[J].International Journal of Computer Applications,2011,9(3):24-28.
[13] 施恩,顾大权,冯径,et al.Research and optimization of the B+ tree index mechanism[J].Application Research of Computers,2017,34(6):1766-1769.(in Chinese)