Research on H.264 Encoding Algorithms and Assembly Optimization

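English title: Research of H.264 Encoding Algorithms and Instruction Optimize
Author: Xiao Minlei (肖敏雷)
Thesis level: Master's
Discipline: Computer Software and Theory
Chinese keywords (translated): H.264; real-time encoding; video coding standard; motion estimation; instruction optimization
English keywords: H.264; real time encoding; video coding standard; motion estimation; instruction optimize
Degree year: 2008
Advisor: Man Jiaju (满家巨)
Discipline code: 081202
Degree-granting institution: Hunan Normal University
Thesis submission date: 2008-05-01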

Abstract

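H.264 is the new-generation international video coding standard jointly developed by ITU-T VCEG and ISO/IEC MPEG. It still uses a hybrid coding scheme based on block motion compensation and transform coding, but it differs from other video coding standards in several respects. It adopts an integer transform, a new approximation of the DCT, to avoid the inverse-transform mismatch error that the DCT caused in earlier standards. It uses intra prediction to improve coding efficiency. For inter coding it uses flexible, variable block sizes to describe the actual motion of objects accurately, together with high-precision fractional-pixel motion estimation and compensation and multiple reference frame selection to increase prediction accuracy. It applies an adaptive filter to remove blocking artifacts at block boundaries, and it uses context-based binary coding to reduce the number of bits required. Combined, these techniques let an H.264 encoder save about 50% of the bit rate relative to earlier video coding standards at the same reconstructed image quality. The same techniques, however, make H.264 highly complex to implement. By simulating and comparing various video coding standards, the author found that although H.264 achieves higher coding quality than the other standards, it encodes and decodes more slowly, which limits its use in real-time applications. Achieving high coding efficiency at low implementation complexity is therefore an important research topic for bringing H.264 video coding to real-time use.
     To achieve real-time encoding, fast algorithms must be found to replace the most complex algorithms in H.264. To increase encoding speed further, beyond algorithmic optimization, frequently used functional modules of the encoder can be optimized at the instruction level according to the characteristics of the target platform, and the program structure and data structures of the whole encoder can be optimized as needed. Following this approach, this thesis compares some of the key algorithms used in the H.264 encoder.
     These algorithms were integrated into the H.264 encoder, and the program and data structures were optimized within the framework of the reference software JM8.6. Several key modules were then optimized at the instruction level using the multimedia instruction set of the PC: the sum of absolute differences (SAD) of the residual in integer-pixel motion estimation; the Hadamard transform of the residual in fractional-pixel motion estimation and the sum of absolute values of the transformed residual matrix (SATD); the integer transform and its inverse; and sub-pixel interpolation. The resulting speedup was satisfactory.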
H.264 is the newest international video coding standard, proposed by the Joint Video Team of ITU-T VCEG and ISO/IEC MPEG. Like earlier standards, H.264 uses a hybrid video coding scheme based on motion compensation and transform coding. To avoid the mismatch error in the IDCT that the DCT caused in prior video standards, H.264 uses a new approximation of the DCT, the integer transform. It uses intra prediction to improve intra coding efficiency. It adopts variable block sizes, high-precision fractional-pixel motion estimation and compensation, and multiple reference frame selection to describe the actual motion of objects accurately. To eliminate blocking artifacts, H.264 uses an adaptive in-loop deblocking filter, and it uses CABAC to reduce the number of bits needed for encoding. Together, these advances allow H.264 to reduce the bit rate by about 50% at the same reconstructed picture quality compared with prior standards. However, this high efficiency directly leads to high implementation complexity. A simulation comparison of various video coding standards shows that H.264 achieves higher quality than the other standards but encodes more slowly, which limits its application in real-time video coding. It is therefore important to implement the H.264 coder with low implementation complexity and high encoding efficiency.
     To achieve high encoding efficiency, H.264 uses many algorithms of high complexity, which limits its application in real-time video coding. Fast algorithms must therefore be found to replace the high-complexity ones in the H.264 reference software. In addition, to accelerate the H.264 encoder, the modules that are used repeatedly should be optimized with multimedia instructions, and the program and data structures of the encoder can be modified according to actual needs. Following these ideas, this dissertation studies some key algorithms and the fast, efficient implementation of an H.264 encoder on a PC, based on the reference software JM8.6. The main work is as follows.
     The algorithms mentioned above were embedded in the H.264 reference software JM8.6, and the program and data structures were optimized according to actual needs. Several key modules were optimized with the multimedia instructions of the PC: computing the sum of absolute differences (SAD) in integer-pixel motion estimation, computing the Hadamard transform of the difference matrix and the sum of absolute transformed differences (SATD), the integer transform and its inverse, and sub-pel interpolation. The results were satisfactory.
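For readers unfamiliar with the two cost measures named above, the following plain C sketch shows what SAD and SATD compute for a single 4x4 block. It is a scalar illustration only, not the thesis's code: the function names `sad_4x4` and `satd_4x4`, the stride parameters, and the choice of a 4x4 block are assumptions made for this example, and any scaling an encoder may apply to the SATD sum (for instance halving it) is omitted. Inner loops of this shape are the usual targets for multimedia (SIMD) instruction optimization.

```c
#include <stdlib.h>

/* Hypothetical helper names for illustration; not taken from the thesis. */

/* SAD: sum of absolute differences between a 4x4 current block
 * and a 4x4 reference block. Strides are in bytes per image row. */
int sad_4x4(const unsigned char *cur, int cur_stride,
            const unsigned char *ref, int ref_stride)
{
    int sum = 0;
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            sum += abs(cur[y * cur_stride + x] - ref[y * ref_stride + x]);
    return sum;
}

/* SATD: apply a 4x4 Hadamard transform to the residual block,
 * then sum the absolute values of the 16 transformed coefficients.
 * No normalization is applied to the result. */
int satd_4x4(const unsigned char *cur, int cur_stride,
             const unsigned char *ref, int ref_stride)
{
    int d[4][4], m[4][4];
    int sum = 0;

    /* Residual block. */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y][x] = cur[y * cur_stride + x] - ref[y * ref_stride + x];

    /* Horizontal pass: butterfly form of the 1-D Hadamard transform. */
    for (int y = 0; y < 4; y++) {
        int s0 = d[y][0] + d[y][3];
        int s1 = d[y][1] + d[y][2];
        int s2 = d[y][1] - d[y][2];
        int s3 = d[y][0] - d[y][3];
        m[y][0] = s0 + s1;
        m[y][1] = s3 + s2;
        m[y][2] = s0 - s1;
        m[y][3] = s3 - s2;
    }

    /* Vertical pass, accumulating absolute values directly. */
    for (int x = 0; x < 4; x++) {
        int s0 = m[0][x] + m[3][x];
        int s1 = m[1][x] + m[2][x];
        int s2 = m[1][x] - m[2][x];
        int s3 = m[0][x] - m[3][x];
        sum += abs(s0 + s1) + abs(s3 + s2) + abs(s0 - s1) + abs(s3 - s2);
    }
    return sum;
}
```

Both functions take pointers to the top-left pixel of each block, so they can be called on sub-blocks of a larger frame by passing the frame width as the stride. SAD needs only subtractions and absolute values, whereas SATD adds the two transform passes, which is one reason both are natural candidates for instruction-level optimization.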

