Research and Implementation of a Big Data Acquisition System Based on Flume and HDFS (基于Flume和HDFS的大数据采集系统的研究与实现)
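  • English title (as published): Research and implementation of big data acquisition system based on Flume and HDFS
  • Authors: 方中纯; 赵江鹏
  • Authors (English): FANG Zhong-chun; ZHAO Jiang-peng; Engineering and Training Center, Inner Mongolia University of Science and Technology; Information Engineering School, Inner Mongolia University of Science and Technology
  • Keywords: HDFS; Flume; big data acquisition system; Web Server; BDAS
  • English keywords: HDFS; Flume; Big Data Acquisition; Web Server; BDAS
  • Chinese journal title: BTGX
  • English journal title: Journal of Inner Mongolia University of Science and Technology
  • Affiliations: Engineering and Training Center, Inner Mongolia University of Science and Technology; Information Engineering School, Inner Mongolia University of Science and Technology
  • Publication date: 2018-09-15
  • Publisher: Journal of Inner Mongolia University of Science and Technology
  • Year: 2018
  • Issue: v.37; No.126
  • Funding: National Natural Science Foundation of China (61462069); Natural Science Foundation of Inner Mongolia (2017MS0604, 2017MS(LH)0603); Inner Mongolia University of Science and Technology Key Teaching Reform Project (JY2016003)
  • Language: Chinese
  • Page: BTGX201803011
  • Number of pages: 5
  • CN: 03
  • ISSN: 15-1357/N
  • Classification number: 55-59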
Abstract
Building on a study of big data collection, big data storage, HDFS, and Flume, and drawing on relevant domain knowledge, this paper presents a conceptual model and architecture for BDAS, a big data acquisition system that combines Flume and HDFS. Under the BDAS architecture, the concrete work needed to implement a given data collection task reduces to one step: configuring the Flume Agent. Following the architecture, the paper gives a specific method and steps for collecting Web Server logs. The BDAS conceptual model and architecture have theoretical and practical significance for big data analysis and research, and they offer a general-purpose means of acquiring data for research in the big data field.
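The abstract says that collecting a new data source under BDAS comes down to configuring a Flume Agent. This record does not reproduce the paper's configuration, so the fragment below is only a minimal sketch of what a Web Server log agent writing to HDFS could look like. The agent name `bdasAgent`, the component names, the log path, the NameNode address, and all sizing values are assumptions chosen for illustration, not values from the paper.

```properties
# Illustrative Flume agent: tail a web server access log and write it to HDFS.
# All names, paths, hosts, and sizes below are hypothetical.

# Declare one source, one channel, and one sink for the agent.
bdasAgent.sources = weblog
bdasAgent.channels = mem
bdasAgent.sinks = toHdfs

# Source: follow the access log with tail -F (exec source).
bdasAgent.sources.weblog.type = exec
bdasAgent.sources.weblog.command = tail -F /var/log/httpd/access_log
bdasAgent.sources.weblog.channels = mem

# Channel: in-memory buffer between source and sink.
bdasAgent.channels.mem.type = memory
bdasAgent.channels.mem.capacity = 10000
bdasAgent.channels.mem.transactionCapacity = 1000

# Sink: write events as plain text into a dated HDFS directory.
bdasAgent.sinks.toHdfs.type = hdfs
bdasAgent.sinks.toHdfs.channel = mem
bdasAgent.sinks.toHdfs.hdfs.path = hdfs://namenode:8020/bdas/weblogs/%Y-%m-%d
bdasAgent.sinks.toHdfs.hdfs.fileType = DataStream
bdasAgent.sinks.toHdfs.hdfs.writeFormat = Text
# Use the agent's clock to resolve %Y-%m-%d, since exec events carry no timestamp header.
bdasAgent.sinks.toHdfs.hdfs.useLocalTimeStamp = true
# Roll files every 300 seconds only; 0 disables size-based and count-based rolling.
bdasAgent.sinks.toHdfs.hdfs.rollInterval = 300
bdasAgent.sinks.toHdfs.hdfs.rollSize = 0
bdasAgent.sinks.toHdfs.hdfs.rollCount = 0
```

Assuming the fragment is saved as `weblog-hdfs.conf`, an agent of this kind would typically be started with `flume-ng agent --conf conf --conf-file weblog-hdfs.conf --name bdasAgent`. The exec source and memory channel keep the sketch short but can lose events if the agent stops; a deployment that needs stronger delivery guarantees might prefer a spooling directory source or a file channel.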
