Design of a New Parallel Image Processing Model on the Hadoop Platform
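  • English title: New Design of Image Parallel Processing Model Based on Hadoop Platform
  • Authors: 刘军; 李威; 吴梦婷; 陈起凤
  • Authors (English): LIU Jun; LI Wei; WU Mengting; CHEN Qifeng; School of Computer Science and Engineering, Wuhan Institute of Technology
  • Keywords: Hadoop; parallel computing framework (MapReduce); image processing; OpenCV
  • Keywords (English): Hadoop; MapReduce; image processing; OpenCV
  • Journal (Chinese title code): JSGG
  • Journal (English title): Computer Engineering and Applications
  • Institution: School of Computer Science and Engineering, Wuhan Institute of Technology
  • Publication date: 2018-05-22 13:18
  • Publisher: Computer Engineering and Applications (计算机工程与应用)
  • Year: 2019
  • Issue: v.55; No.925
  • Funding: Open Fund of the Hubei Key Laboratory of Intelligent Robot (No. HBIR 201608); Graduate Innovation Fund of Wuhan Institute of Technology (No. CX2016063)
  • Language: Chinese
  • Page: JSGG201906029
  • Page count: 5
  • CN: 06
  • Classification number: 192-196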
Abstract
When processing massive numbers of small images, Hadoop suffers from two problems: too many input splits, and the difficulty of storing huge numbers of small image files. To address these problems, a new parallel image processing model is proposed that differs from approaches such as HIPI and SequenceFile. Because Hadoop is well suited to processing plain-text data, the model takes a text file containing image paths as input in place of the image data itself, so no custom image data type has to be designed. Image reading, processing, and storage are all completed directly in the Map stage. To simplify the image processing algorithms, OpenCV is combined with the Map function, and a corresponding storage method is designed to store the small image files. Experiments show that, on the Hadoop distributed platform, the model delivers good throughput and stability on both small and large test data sets.
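The idea in the abstract, feeding the Map stage a text file of image paths and doing the read, process, and store steps inside the map function, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `map_image_paths` and `invert_bytes` are hypothetical, the local filesystem stands in for HDFS, and a byte-inversion placeholder stands in for the OpenCV routine that the paper combines with the Map function.

```python
import os
from typing import Callable, Iterable, Iterator, Tuple


def invert_bytes(data: bytes) -> bytes:
    """Hypothetical placeholder for the image-processing step.

    In the setting the abstract describes, an OpenCV routine would run here.
    """
    return bytes(255 - b for b in data)


def map_image_paths(
    lines: Iterable[str],
    out_dir: str,
    process: Callable[[bytes], bytes] = invert_bytes,
) -> Iterator[Tuple[str, str]]:
    """Map-style function: each input record is one line holding an image path.

    For every path it reads the file, processes it, stores the result in
    out_dir, and emits (input_path, output_path_or_error) as the key/value pair.
    Assumption: files live on a local filesystem rather than HDFS.
    """
    os.makedirs(out_dir, exist_ok=True)
    for line in lines:
        path = line.strip()
        if not path:
            continue  # skip blank records
        try:
            with open(path, "rb") as src:
                data = src.read()
            result = process(data)
            out_path = os.path.join(out_dir, os.path.basename(path))
            with open(out_path, "wb") as dst:
                dst.write(result)
            yield path, out_path
        except OSError as exc:
            # Report the failure as the value instead of aborting the whole task.
            yield path, f"ERROR: {exc}"
```

Because each record is only a short path string, the size of the input text file, not the number of image files, determines how the input is split. This is the property the abstract relies on to avoid excessive input splits.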
