Big Data Analysis of Commercial Banks Based on Hadoop Distributed File System
  • English title: Big Data Analysis of Commercial Banks Based on Hadoop Distributed File System
  • Author: ZHANG Deng-yao (张登耀)
  • Affiliation: School of Finance, Dongbei University of Finance and Economics
  • Keywords: Hadoop file; commercial bank; big data
  • Journal code: SCHO
  • Journal: Journal of Shandong Agricultural University (Natural Science Edition)
  • Publication date: 2018-09-13 17:25
  • Publisher: 山东农业大学学报(自然科学版)
  • Year: 2018
  • Volume: 49
  • Language: Chinese
  • Record ID: SCHO201805033
  • Page count: 5
  • Issue: 05
  • CN: 37-1132/S
  • Pages: 157-161
Abstract
To address the long data read times and low data locality rate that arise when analyzing data on the Hadoop Distributed File System (HDFS), this paper proposes a big data analysis method for commercial banks based on HDFS. First, the working principle and workflow of HDFS are analyzed to identify the causes of these shortcomings. Then, based on the characteristics of commercial bank big data, the number of data replicas and their placement in HDFS are adjusted accordingly. Finally, simulation experiments are used to analyze data read speed, locality rate, and disk load. The results show that the method effectively reduces data read time, improves the data locality rate, and balances disk load. Its overall performance is clearly better than that of the comparison methods, which gives it greater practical value.
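The abstract evaluates the method by data locality rate and disk load but does not define either metric in this record. As a rough illustration only, the sketch below assumes the common definition of locality rate (the fraction of tasks scheduled on a node that already holds a replica of their input block) and uses the spread of replica counts per node as a simplified stand-in for disk load. The function names, node names, and sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def locality_rate(tasks, replicas):
    """Fraction of tasks that run on a node holding a replica of their block.

    tasks: list of (node, block) pairs, one per scheduled task (hypothetical input).
    replicas: dict mapping block id -> set of nodes that store a replica.
    """
    if not tasks:
        return 0.0
    local = sum(1 for node, block in tasks if node in replicas.get(block, ()))
    return local / len(tasks)


def load_imbalance(replicas, nodes):
    """Max minus min replica count per node; 0 means replicas are spread evenly.

    This is an assumed proxy for disk load, not the paper's measurement.
    """
    counts = Counter(n for holders in replicas.values() for n in holders)
    per_node = [counts.get(n, 0) for n in nodes]
    return max(per_node) - min(per_node)


# Hypothetical three-node cluster with two replicas per block.
replicas = {"b1": {"n1", "n2"}, "b2": {"n2", "n3"}, "b3": {"n1", "n3"}}
tasks = [("n1", "b1"), ("n3", "b1"), ("n2", "b2"), ("n2", "b3")]

print(locality_rate(tasks, replicas))
print(load_imbalance(replicas, ["n1", "n2", "n3"]))
```

Under these assumptions, raising the replica count of frequently read blocks or placing replicas on the nodes that usually run the tasks would raise the first number, while uneven placement would raise the second.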
