Research and Implementation of Fast Motion Estimation Algorithm for H.264

Original title: H.264快速运动估计算法研究与实现
Author: Zhou Donghui (周冬辉)
Thesis level: Master's
Discipline: Computer Software and Theory
Keywords: H.264; Motion Estimation; Mode Decision; Rate Distortion Optimization
Degree year: 2010
Advisor: Duanmu Chunjiang (端木春江)
Discipline code: 081202
Degree-granting institution: Zhejiang Normal University (浙江师范大学)
Submission date: 2010-05-01

Abstract

H.264, the latest international video coding standard, was published in 2003 by the Joint Video Team (JVT), which was formed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Because H.264 adopts many new techniques, such as variable block size motion estimation, fractional-pixel motion estimation, multi-reference-frame motion estimation, rate-distortion optimization (RDO), context-based adaptive binary arithmetic coding (CABAC), and context-adaptive variable length coding (CAVLC), its coding efficiency is higher than that of earlier video coding standards. These same techniques, however, make motion estimation in H.264 far more computationally complex than in earlier standards. For this reason, algorithms that reduce the complexity of motion estimation in H.264 have been an active research topic in recent years.
     Taking into account the motion characteristics of a wide range of video sequences, we propose a motion estimation algorithm based on vector prediction and Multi-Direction Gradient Descent Search (MDGDS). First, the algorithm exploits the temporal and spatial correlation of motion vectors to obtain a predicted vector that determines the initial search point. Second, it uses adaptive thresholds to determine the motion type of the current block and selects the search pattern and strategy accordingly. The algorithm can expand the search range quickly and speed up the motion search while keeping the search from being trapped in a local minimum. Experimental results show that, compared with conventional algorithms such as the hybrid Unsymmetrical-cross Multi-Hexagon-grid Search (UMHexagonS), simplified UMHexagonS, and Enhanced Predictive Zonal Search (EPZS), the proposed algorithm saves a substantial amount of encoding time while maintaining motion estimation accuracy.
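The general idea of starting from a predicted vector and then descending on the block-matching cost can be illustrated with a minimal Python sketch. This is not the MDGDS algorithm of the thesis: the function names, the eight-direction greedy step, the median predictor, and the fixed `static_threshold` (standing in for the adaptive thresholds) are all assumptions made for illustration only.

```python
from statistics import median

# Eight unit steps around the current best vector (illustrative choice).
DIRECTIONS = [(-1, -1), (0, -1), (1, -1), (-1, 0),
              (1, 0), (-1, 1), (0, 1), (1, 1)]


def block_sad(cur, ref, bx, by, mvx, mvy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (mvx, mvy) in `ref`.
    Returns None when the displaced block leaves the reference frame."""
    height, width = len(ref), len(ref[0])
    x0, y0 = bx + mvx, by + mvy
    if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
        return None
    return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
               for j in range(n) for i in range(n))


def median_predictor(neighbor_mvs):
    """Component-wise median of the motion vectors of already coded
    neighbors (spatial or temporal); (0, 0) when none are available."""
    if not neighbor_mvs:
        return (0, 0)
    return (int(median(v[0] for v in neighbor_mvs)),
            int(median(v[1] for v in neighbor_mvs)))


def descent_search(cur, ref, bx, by, n, start,
                   search_range=8, static_threshold=0):
    """Greedy descent on SAD starting from the predicted vector `start`.
    `static_threshold` is a hypothetical fixed early-exit threshold."""
    best_mv, best_cost = start, block_sad(cur, ref, bx, by, *start, n)
    if best_cost is None:  # predictor points outside the frame
        best_mv, best_cost = (0, 0), block_sad(cur, ref, bx, by, 0, 0, n)
    if best_cost <= static_threshold:  # block already well matched
        return best_mv, best_cost
    improved = True
    while improved:
        improved = False
        for dx, dy in DIRECTIONS:
            cand = (best_mv[0] + dx, best_mv[1] + dy)
            if abs(cand[0]) > search_range or abs(cand[1]) > search_range:
                continue
            cost = block_sad(cur, ref, bx, by, cand[0], cand[1], n)
            # Move as soon as a strictly better neighbor is found.
            if cost is not None and cost < best_cost:
                best_mv, best_cost = cand, cost
                improved = True
    return best_mv, best_cost
```

A caller would pass the result of `median_predictor` as `start`; when the prediction is good, the search exits after a single cost evaluation, which is where the time saving of predictive methods comes from. A plain greedy descent like this one can still stop in a local minimum, which is the problem the multi-direction strategy described above is meant to address.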
     From a statistical analysis of the encoding results for multiple QCIF test sequences, we observe three features: the modes are not selected with uniform probability; the mode of a macroblock is strongly correlated with the modes of its neighboring macroblocks; and upper-layer and lower-layer modes are correlated to some degree. Moreover, because the mode distribution is not uniform, in some cases we can skip modes that have very little effect on coding efficiency and are rarely selected. Based on these observations, we propose a fast mode decision algorithm for H.264 based on statistical features. The algorithm uses these statistical features to terminate the mode search early, which reduces the computational load and makes encoding more efficient. Experimental results show that, compared with the High Complexity Mode (HCM) and Fast High Complexity Mode (FHCM) mode decision algorithms, the proposed algorithm saves a substantial amount of encoding time with negligible degradation of rate-distortion performance.
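How such statistics might drive early termination can be sketched in Python as follows. This is an illustrative sketch, not the algorithm of the thesis: the mode list is limited to inter partition modes, and `order_candidates`, `decide_mode`, `min_frequency`, and `early_stop_cost` are hypothetical names and parameters. The caller supplies `rd_cost`, which is assumed to return the Lagrangian rate-distortion cost J = D + λR of coding the macroblock in a given mode.

```python
from collections import Counter

# Inter macroblock modes considered in this sketch (intra modes omitted).
MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8"]


def order_candidates(neighbor_modes, mode_frequency, min_frequency=0.02):
    """Rank candidate modes for the current macroblock.

    Modes chosen by neighboring macroblocks come first (most votes first),
    then the remaining modes by their overall frequency. Modes that no
    neighbor used and that occur less often than `min_frequency`
    (a hypothetical cut-off) are dropped."""
    votes = Counter(neighbor_modes)
    kept = [m for m in MODES
            if votes[m] > 0 or mode_frequency.get(m, 0.0) >= min_frequency]
    return sorted(kept, key=lambda m: (-votes[m], -mode_frequency.get(m, 0.0)))


def decide_mode(rd_cost, neighbor_modes, mode_frequency, early_stop_cost):
    """Evaluate candidates in ranked order and stop as soon as the best
    cost so far is at or below `early_stop_cost`.
    Returns (best mode, its cost, number of modes evaluated)."""
    best_mode, best_cost = None, float("inf")
    evaluated = 0
    for mode in order_candidates(neighbor_modes, mode_frequency):
        cost = rd_cost(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if best_cost <= early_stop_cost:
            break  # good enough: skip the remaining modes
    return best_mode, best_cost, evaluated
```

In use, `mode_frequency` would come from statistics gathered over previously coded frames, and `neighbor_modes` from the left, upper, and co-located macroblocks. The saving comes from the third return value: each skipped mode avoids a full motion search and cost evaluation for that partition.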
     The thesis is organized as follows. First, the principles of H.264 video coding are introduced, with emphasis on the principles and key technologies of motion estimation in H.264. Next, the fast integer-pixel motion estimation algorithms and the fast mode decision algorithm in JM15.1 (the H.264 reference software) are described in detail. Finally, a new fast integer-pixel motion estimation algorithm and a new fast mode decision algorithm are presented.

