Storage strategy of raster data under the distributed computing environment (分布式计算环境下的栅格数据存储策略)
  • English title: Storage strategy of raster data under the distributed computing environment
  • Authors: 张剑波; 夏灯城; 赵加奥; 李谢清; 崔永键; 袁国斌
  • Authors (English): ZHANG Jianbo; XIA Dengcheng; ZHAO Jiaao; LI Xieqing; CUI Yongjian; YUAN Guobin
  • Keywords: distributed computing; raster data; storage strategy; map algebra
  • Journal code: GFKJ
  • Journal: Journal of National University of Defense Technology
  • Institution: Faculty of Information Engineering, China University of Geosciences
  • Publication date: 2017-12-28
  • Publisher: Journal of National University of Defense Technology (国防科技大学学报)
  • Year: 2017
  • Issue: v.39
  • Funding: National Natural Science Foundation of China (41001225, 41501584)
  • Language: Chinese
  • Pages: GFKJ201706009
  • Page count: 8
  • CN: 06
  • ISSN: 43-1067/T
  • Classification number: 54-61
Abstract
Traditional raster data storage strategies cannot meet the coarse-grained data access requirements of a distributed computing environment and are inefficient when computing over massive raster datasets. To address this, a raster tile storage strategy based on the Hadoop distributed file system is proposed. It combines the storage characteristics of distributed file systems with the way map algebra operators compute during the Map/Reduce stages, where the raster tile is the unit of processing. The strategy is implemented in three parts: partitioning raster data into tiles, organizing and storing the compressed tile data, and improving the distributed file input and output interfaces. It is verified with a distributed computing workflow for local map algebra operators built on this storage strategy. Theoretical analysis and experimental results show that the strategy can significantly improve the computing speed of spatial analysis operators in a distributed computing environment.
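The tile-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the tile format, the tile ordering, or the Hadoop input/output interfaces, so everything below is an assumption made for illustration. An in-memory dictionary stands in for the distributed file system, zlib stands in for whatever compression the paper uses, and all function names (`split_into_tiles`, `pack_tile`, `map_local_add`, and so on) are hypothetical.

```python
import zlib
from array import array


def split_into_tiles(raster, tile_size):
    """Yield ((tile_row, tile_col), tile) for a raster given as a list of rows."""
    rows, cols = len(raster), len(raster[0])
    for r0 in range(0, rows, tile_size):
        for c0 in range(0, cols, tile_size):
            tile = [row[c0:c0 + tile_size] for row in raster[r0:r0 + tile_size]]
            yield (r0 // tile_size, c0 // tile_size), tile


def pack_tile(tile):
    """Serialize a tile as (height, width, zlib-compressed float64 payload)."""
    flat = array("d", (v for row in tile for v in row))
    return len(tile), len(tile[0]), zlib.compress(flat.tobytes())


def unpack_tile(packed):
    """Inverse of pack_tile: rebuild the tile as a list of rows."""
    height, width, payload = packed
    flat = array("d")
    flat.frombytes(zlib.decompress(payload))
    return [list(flat[i * width:(i + 1) * width]) for i in range(height)]


def build_store(raster, tile_size):
    """Stand-in for the tile store: a dict keyed by (tile_row, tile_col)."""
    return {key: pack_tile(tile) for key, tile in split_into_tiles(raster, tile_size)}


def local_add(tile_a, tile_b):
    """Local map algebra operator: cell-by-cell sum of two aligned tiles."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(tile_a, tile_b)]


def map_local_add(store_a, store_b):
    """Map-style step: each tile key is processed independently of the others."""
    return {
        key: pack_tile(local_add(unpack_tile(store_a[key]), unpack_tile(store_b[key])))
        for key in store_a
    }


def assemble(store, rows, cols, tile_size):
    """Reassemble a full raster from the tile store."""
    out = [[0.0] * cols for _ in range(rows)]
    for (tr, tc), packed in store.items():
        for i, row in enumerate(unpack_tile(packed)):
            r = tr * tile_size + i
            c = tc * tile_size
            out[r][c:c + len(row)] = row
    return out
```

Because a local operator only needs the cells at the same position in each input, every tile pair can be handled by a separate task with no communication between tiles, which is why the tile is a natural unit for coarse-grained access in this kind of workflow.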
