User-Access-Driven Storage and Organization Strategy for Spatial Data
  • English title: User-driving Based Storage and Organization Strategy for Spatial Data
  • Authors: PAN Shaoming (潘少明); LAI Xinguo (赖新果); CHONG Yanwen (种衍文); LI Hong (李红)
  • Affiliation: State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University
  • Keywords: spatial data; load balance; data organization; access correlation; distributed GIS
  • Journal code: WHCH
  • Journal: Geomatics and Information Science of Wuhan University
  • Publication date: 2019-01-25 17:09
  • Publisher: Geomatics and Information Science of Wuhan University (武汉大学学报(信息科学版))
  • Year: 2019
  • Issue: v.44
  • Funding: National Natural Science Foundation of China (41671382, 61572372, 41271398); National Key Research and Development Program of China (2017YFB0504202)
  • Language: Chinese
  • Page: WHCH201902021
  • Page count: 7
  • CN: 02
  • ISSN: 42-1676/TN
  • Classification number: 141-146+154
Abstract
To resolve the conflict between distributed storage, which load balancing for user access services requires, and merged storage, which sequential disk reads require, this paper proposes a spatial data storage and organization method driven by user access behavior that jointly considers the sequential read efficiency of storage nodes and the efficiency of network load balancing (combined strategy of data placement and load balance, CSDL). Based on user access behavior, the scheme computes the access correlation among spatial data and distributes hot data across different servers, so that load is balanced when users access data concurrently. It also computes the concurrency degree of the data stored on the same server and places data with a high concurrency degree in contiguous disk space, so that the disk storage service can read them sequentially. CSDL thus attempts to organize spatial data storage from two sides at once, load balancing at the application layer and disk efficiency at the underlying storage layer, in order to improve the service efficiency of geographic information systems (GIS). Experimental results show that the scheme improves average request response time by 45.2% to 245.3% and the load balance degree of the distributed server nodes by about 0.5% to 440.9%, which can meet the requirements of large-scale distributed environments.
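The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CSDL implementation: the abstract does not define the access correlation or concurrency degree, so the sketch assumes both can be approximated by how often two tiles appear in the same user session. The function names (`co_access_counts`, `place_tiles`, `contiguous_order`) and the greedy heuristics are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(sessions):
    """Count per-tile accesses and per-pair co-accesses (assumed correlation proxy)."""
    popularity = Counter()
    pair_counts = Counter()
    for session in sessions:
        tiles = sorted(set(session))  # each tile counted once per session
        popularity.update(tiles)
        pair_counts.update(combinations(tiles, 2))
    return popularity, pair_counts


def pair_weight(pair_counts, a, b):
    """Look up the co-access count of an unordered pair of tiles."""
    return pair_counts[(a, b) if a < b else (b, a)]


def place_tiles(popularity, pair_counts, num_servers):
    """Greedy placement: hottest tiles first, each on the server where it
    conflicts least with tiles already there, then on the least loaded one."""
    servers = [[] for _ in range(num_servers)]
    load = [0] * num_servers
    for tile in sorted(popularity, key=lambda t: (-popularity[t], t)):
        def cost(s):
            conflict = sum(pair_weight(pair_counts, tile, o) for o in servers[s])
            return (conflict, load[s], s)
        best = min(range(num_servers), key=cost)
        servers[best].append(tile)
        load[best] += popularity[tile]
    return servers


def contiguous_order(tiles, popularity, pair_counts):
    """Greedy on-disk order for one server: start from the hottest tile and
    keep appending the remaining tile most often co-accessed with the last one."""
    remaining = set(tiles)
    if not remaining:
        return []
    current = min(remaining, key=lambda t: (-popularity[t], t))
    order = [current]
    remaining.remove(current)
    while remaining:
        current = min(
            remaining,
            key=lambda t: (-pair_weight(pair_counts, current, t), -popularity[t], t),
        )
        order.append(current)
        remaining.remove(current)
    return order


# Hypothetical access log: each inner list is one user session of tile IDs.
sessions = [["a", "b"], ["a", "b"], ["a", "b", "c"], ["c", "d"], ["d"]]
popularity, pair_counts = co_access_counts(sessions)
servers = place_tiles(popularity, pair_counts, num_servers=2)
layouts = [contiguous_order(s, popularity, pair_counts) for s in servers]
```

In this toy log, tiles `a` and `b` are requested together most often, so the placement step puts them on different servers, while the ordering step keeps co-accessed tiles on the same server adjacent. A real system would presumably weight recency, tile size, and server capacity, none of which the abstract specifies.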
