Research on Image Compression Technology Based on GPU Computing

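English title: Research on Image Compression Technologies of Operation Based on GPU
Author: Gong Huixuan (拱慧璇)
Degree level: Master's
Discipline: Information and Communication Engineering
Keywords (Chinese): GPU ; CUDA ; 并行图像压缩 (parallel image compression) ; JPEG ; MPEG-2
Keywords (English): GPU ; CUDA ; Parallel Image Compression ; JPEG ; MPEG-2
Degree year: 2011
Advisor: Zhao Bin (赵彬)
Discipline code: 081001
Degree-granting institution: Harbin Institute of Technology
Submission date: 2011-06-01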

Abstract

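With the rapid development of digital technology, the volume of data in both still images and video has grown substantially. A current focus of the field is therefore how to reduce redundant information in image data without sacrificing quality, so that the data can be stored more efficiently or transmitted in real time. Because of the growing data volume and rising computational complexity, most compression algorithms running on the CPU can no longer meet real-time requirements. NVIDIA's GPUs have by now gone through eight generations; the GPU has gradually taken a leading position in high-performance general-purpose computing, and its application and development have shown steady, strong growth. GPUs are characterized by parallel processing of large volumes of data-intensive computation. This thesis therefore studies the use of the GPU to implement JPEG compression coding of still images and MPEG-2-based compression coding of video.
     The thesis first describes the CUDA programming model, from the relationship between host and device and the use of kernel functions to the CUDA thread hierarchy, and analyzes the CUDA memory model. Taking the GeForce GT240 as an example, it analyzes the GPU hardware architecture and hardware mapping, as well as warp issue and execution. On this basis, it studies and implements GPU-based image coding. The thesis adopts a CPU+GPU architecture: the CPU handles the serial, logic-heavy work, while the GPU handles the computationally heavy parallel work. The two divide the labor and jointly complete the image compression task.
     The thesis focuses on GPU-based JPEG compression coding of still images and GPU-based MPEG-2 compression coding of video.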
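     The thesis implements parallel JPEG image compression coding on the GPU. Based on a study of the original JPEG coding algorithm, it proposes a JPEG coding algorithm suited to parallel computation on the CUDA platform and describes its optimization on the GPU. Most importantly, it studies a parallelizable DCT method suited to GPU implementation and, for entropy coding, analyzes the Huffman coding method. The GPU-based JPEG encoder is implemented and analyzed from several aspects, demonstrating the feasibility of GPU-based parallel compression.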
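As a rough illustration of why the DCT stage lends itself to parallel execution, the sketch below splits an image into independent 8x8 blocks and applies the standard orthonormal 2-D DCT-II to each. It is a plain-Python sketch under stated assumptions, not the thesis's CUDA kernel: the names `dct_8x8` and `blockwise_dct` are hypothetical, and the loop over blocks stands in for what could be one thread block per image block on a GPU.

```python
import math

N = 8  # JPEG operates on 8x8 blocks

# Orthonormal DCT-II basis: C[u][x] = a(u) * cos((2x + 1) * u * pi / 16)
C = [[(math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def dct_8x8(block):
    """2-D DCT-II of one 8x8 block, computed separably as C * block * C^T."""
    # First pass: tmp = C * block
    tmp = [[sum(C[u][x] * block[x][y] for x in range(N)) for y in range(N)]
           for u in range(N)]
    # Second pass: out = tmp * C^T
    return [[sum(tmp[u][y] * C[v][y] for y in range(N)) for v in range(N)]
            for u in range(N)]


def blockwise_dct(image):
    """Apply dct_8x8 to every 8x8 block of an image whose sides are multiples of 8.

    No block depends on any other block, which is the property a GPU
    implementation can exploit (hypothetically, one thread block per 8x8 block).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, N):
        for bx in range(0, w, N):
            block = [row[bx:bx + N] for row in image[by:by + N]]
            coeffs = dct_8x8(block)
            for u in range(N):
                out[by + u][bx:bx + N] = coeffs[u]
    return out
```

The sketch omits the level shift (subtracting 128 from 8-bit samples) that baseline JPEG applies before the transform, as well as quantization and entropy coding. The separable row/column form is used because each of the two passes is a small matrix product, a shape that maps naturally onto per-thread dot products.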
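     The thesis also implements GPU-based MPEG-2 video compression coding. It analyzes the basic principles of MPEG-2 video compression coding, further analyzes the feasibility of running it in parallel on the GPU, and proposes CPU+GPU parallel computation together with two-level parallelism within CUDA. It then studies the key MPEG-2 modules in detail, including motion estimation, motion compensation, comparison computation, transform and inverse transform, quantization and inverse quantization, and entropy coding. Based on the characteristics of the original algorithms and of the GPU-based CUDA programming model, it proposes parallel methods suited to the GPU and analyzes the parallel algorithm flow of each module and the correspondence between the parallel resources of the CUDA programming model and the image processing units. The experimental environment for MPEG-2 video compression on the GPU is given, and the overall parallel compression algorithm is evaluated in terms of compression ratio, peak signal-to-noise ratio (PSNR), and coding efficiency, showing that the proposed method achieves comparatively good results. The coding speed of several modules is also analyzed in detail.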
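Motion estimation is the module in the list above where block-level independence is easiest to see. The following is a minimal sketch of generic full-search block matching with the sum of absolute differences (SAD) as the cost. It is an assumption for illustration only: the abstract does not say which search strategy or cost function the thesis uses, and the names `sad`, `full_search`, and `motion_field`, the 16x16 block size default, and the +/-4 pixel search range are all hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between the block at (cx, cy) in cur
    and the block at (rx, ry) in ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def full_search(cur, ref, bx, by, size=16, search=4):
    """Return ((dx, dy), best_sad) for the block with top-left corner (bx, by).

    Candidates are all integer displacements within +/-search pixels that keep
    the reference block inside the frame. The zero vector is the starting
    candidate; a displacement replaces it only if it is strictly cheaper.
    """
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, bx, by, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + size > w or ry + size > h:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def motion_field(cur, ref, size=16, search=4):
    """Motion vector for every macroblock; each entry is computed
    independently of all the others."""
    h, w = len(cur), len(cur[0])
    return {(bx, by): full_search(cur, ref, bx, by, size, search)[0]
            for by in range(0, h - size + 1, size)
            for bx in range(0, w - size + 1, size)}
```

Two levels of independence are visible here: macroblocks do not depend on one another, and within one macroblock every candidate displacement can be scored separately. That is one plausible reading of the two-level parallelism the abstract mentions, though the thesis's actual mapping is not given in this record.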

English abstract

With the rapid development of digital technology, the volume of data in both still images and video has increased greatly. An important question is how to reduce redundant data so that information can be stored or transmitted more efficiently. NVIDIA's GPUs have so far evolved through eight generations and increasingly dominate the field of high-performance general-purpose computing. This thesis shows how the GPU can be used to encode images.
     The thesis first describes the CUDA programming model and adopts a CPU+GPU architecture: the CPU is in charge of the serial, logic-heavy work, while the GPU handles the computationally intensive parallel processing.
     A GPU-based JPEG image compression coding system is implemented. Based on an analysis of the original JPEG coding algorithm, a parallel JPEG coding algorithm suited to the CUDA platform is proposed. Most importantly, the thesis studies a parallel DCT that can be implemented on the GPU and analyzes the Huffman coding method used for entropy coding. The GPU-based JPEG compression coding method is implemented and analyzed from several aspects, indicating the feasibility of parallel compression on the GPU.
     GPU-based MPEG-2 video compression coding is also implemented, and the basic principles of MPEG-2 are analyzed. The thesis further examines the feasibility of a parallel algorithm and proposes CPU+GPU parallelism together with two-level parallelism within CUDA. In addition, the key modules of MPEG-2 are studied in detail. The thesis proposes GPU-based parallel algorithms and analyzes the flow of each module, the distribution of parallel resources in the CUDA programming model, and the corresponding image processing units. The overall performance of the parallel compression algorithm is evaluated in terms of compression ratio, PSNR, and coding efficiency, which shows that the proposed method performs comparatively well. The coding speed of several modules is also analyzed in detail.
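For reference, two of the evaluation metrics named above have standard definitions that can be written down directly. The sketch below assumes 8-bit grayscale frames stored as lists of rows; the function names are hypothetical, and nothing here reproduces the thesis's measurements.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized frames."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).

    Returns infinity when the two frames are identical.
    """
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def compression_ratio(raw_bytes, coded_bytes):
    """Uncompressed size divided by compressed size."""
    return raw_bytes / coded_bytes
```

A higher PSNR means the decoded frame is closer to the original, so a codec comparison normally reports PSNR alongside compression ratio rather than either figure alone.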

