Optimization of AVS Video Encoder's Algorithms and Its Implementation on DSP

Original title (Chinese): AVS视频编码器的模块算法优化及其在DSP上的实现
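Author: Wei Jianyun (魏建云)
Degree level: Master's
Discipline: Signal and Information Processing
Keywords: AVS; sub-pixel motion estimation; inter-mode selection; DM642
Degree year: 2010
Advisors: Peng Yuhua (彭玉华); Yang Mingqiang (杨明强)
Discipline code: 081002
Degree-granting institution: Shandong University
Thesis submission date: 2010-03-31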

Abstract

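AVS is China's second-generation source coding standard with independent intellectual property rights. Its introduction both ended the lag of first-generation source coding technology and resolved the deadlock over AVC patent licensing. Compared with the currently popular international standards MPEG-4 and H.264/AVC, it offers high performance, low computational complexity, and low patent fees. However, AVS codec chips and software are not yet mature, and industrialization still has a long way to go. Research on the AVS encoder is therefore of great significance.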
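     To this end, building on a study of the current state of multimedia technology and of existing video compression techniques, this thesis studies the sub-pixel motion estimation and inter-mode selection modules of the AVS encoder and proposes fast algorithms that raise encoding speed. The encoder and the fast algorithms are then ported to the DM642 platform, laying a foundation for real-time AVS encoding and its applications.
     The main work of the thesis includes: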
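     1. Sub-pixel motion estimation is studied and two effective fast search algorithms are proposed. To obtain good image quality and high coding efficiency, AVS uses motion estimation with 1/4-pixel accuracy. However, the sub-pixel full-search algorithm used in the standard accounts for a very large share of the motion estimation computation, so reducing sub-pixel search time is especially important. Because video sequences exhibit a center-biased property and strong spatio-temporal correlation, this work attacks the sub-pixel search cost from these two angles and proposes two fast algorithms. The first exploits the center-biased property: it predicts the direction of the sub-pixel motion vector from the relative position of the best and second-best integer-pixel points, reducing the number of sub-pixel points to search from 16 to between 4 and 8. It is simple to implement, gives good coding results, and suits hardware and real-time systems. The second exploits spatio-temporal correlation: it predicts the motion vector of the current macroblock from the motion vectors of its spatially and temporally neighboring blocks, which shortens the search. It cuts the encoder's encoding time by 21% to 36% and gives slightly better coding results than the first method, but is more complex to implement. In a given video application, either algorithm can be chosen according to the requirements.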
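     2. Based on statistics, a fast inter-mode selection algorithm using prediction and threshold decision is proposed. The standard searches all 6 modes: the 4 inter modes plus Skip and intra. Motion estimation is performed for every inter mode, and the rate-distortion cost is computed for every searched mode, so this search strategy is very time-consuming. Statistics show that the best coding mode of the current block is very likely to be the same as the best coding mode of its temporally and spatially neighboring blocks. The neighbors' best modes can therefore be used to predict the most probable mode of the current block, which reduces the number of modes searched and saves encoding time. Experiments show that, across different video sequences, the algorithm reduces total encoding time by 23% to 59% while keeping image quality and bit rate almost unchanged.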
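     3. The AVS encoder and the proposed fast algorithms are ported to the embedded DM642 platform, and two video acquisition methods and data return are implemented. In the first, video sequences arrive over the network using UDP/IP and the DM642 encodes what it receives. In the second, the DM642 encodes video captured by a camera, that is, encoding for a surveillance system. The encoded bitstream is sent over the network and stored by a program on the host computer. This lays a good foundation for real-time AVS encoding on embedded platforms.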
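
The direction-prediction idea in item 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the quarter-pel coordinate convention, the dot-product rule for choosing candidates, and the caller-supplied cost function are all assumptions made for illustration. With a directional hint this sketch evaluates 3 points per stage, which need not match the exact search pattern used in the thesis.

```python
from typing import Callable, List, Tuple

Vec = Tuple[int, int]  # (x, y); quarter-pel units unless stated otherwise

# The 8 neighbours of a point on a square grid.
NEIGHBOURS: List[Vec] = [
    (dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def sign(v: int) -> int:
    return (v > 0) - (v < 0)


def directional_candidates(centre: Vec, direction: Vec, step: int) -> List[Vec]:
    """Neighbours of `centre` at spacing `step` that lean towards `direction`.

    Assumed rule: keep a neighbour only if its offset has a positive dot
    product with the predicted direction. Without a hint, keep all 8.
    """
    dx, dy = direction
    if (dx, dy) == (0, 0):
        picked = NEIGHBOURS
    else:
        picked = [(nx, ny) for nx, ny in NEIGHBOURS if nx * dx + ny * dy > 0]
    return [(centre[0] + nx * step, centre[1] + ny * step) for nx, ny in picked]


def subpel_search(
    best_int: Vec, second_int: Vec, cost: Callable[[Vec], float]
) -> Tuple[Vec, int]:
    """Refine an integer-pel match to quarter-pel accuracy.

    `best_int` and `second_int` are the best and second-best integer-pel
    positions. Returns the refined vector in quarter-pel units and the
    number of sub-pixel points whose cost was evaluated.
    """
    direction = (
        sign(second_int[0] - best_int[0]),
        sign(second_int[1] - best_int[1]),
    )
    centre = (best_int[0] * 4, best_int[1] * 4)
    best_cost = cost(centre)  # integer-pel cost, normally already known
    evaluated = 0
    for step in (2, 1):  # half-pel stage, then quarter-pel stage
        stage_best = centre
        for cand in directional_candidates(centre, direction, step):
            evaluated += 1
            c = cost(cand)
            if c < best_cost:
                best_cost, stage_best = c, cand
        centre = stage_best
    return centre, evaluated
```

In a real encoder `cost` would be a block-matching measure such as SAD on interpolated reference samples. Here it is left as a callable so the candidate-selection logic can be exercised on its own.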

Abstract (author's English version)

AVS is the second-generation source coding standard with China's independent intellectual property rights. The standard not only ends the lag of first-generation source coding technology but also avoids the heavy patent fees of MPEG-4 and H.264/AVC. Compared with the currently popular international standards MPEG-4 and H.264/AVC, AVS offers high performance, low computational complexity, and low patent costs. However, AVS codec chips and software are not yet mature, and there is still a long way to go before industrialization. Research on the AVS codec is therefore of vital significance.
     Building on a study of the development and basic techniques of video coding, the thesis focuses on the sub-pixel motion estimation and inter-mode selection algorithms and proposes fast algorithms for both modules. The AVS encoder and the proposed fast algorithms are then ported to the DM642. The main achievements of the thesis are:
     1. Sub-pixel motion estimation is studied and two fast algorithms are proposed.
     To obtain better image quality and higher coding efficiency, AVS uses 1/4-pixel motion estimation. The computation required by fractional-pixel motion estimation is significant, so reducing sub-pixel search time is particularly important. Video sequences have a center-biased property and strong spatio-temporal correlation, and both can be used to reduce the time cost of sub-pixel search. The first algorithm uses the center-biased property: the position of the sub-optimal integer match point relative to the optimal one is used to predict the direction of the sub-pixel motion vector, which reduces the number of search points from 16 to between 4 and 8. The algorithm is simple, gives good coding results, and is well suited to hardware and real-time systems. The second algorithm uses spatial and temporal correlation: the motion vectors of spatially and temporally neighboring blocks are used to predict the direction of the current block's motion vector, which reduces search time. This algorithm reduces total encoding time by 21% to 36%. Its coding results are slightly better than those of the first algorithm, but it is harder to implement. Either algorithm can be chosen according to the needs of the application.
     2. Based on statistics, a fast inter-mode selection algorithm using prediction and a threshold is proposed.
     To achieve the best compression performance, the AVS video standard adopts a full-search mode selection method based on rate-distortion optimization (RDO). It calculates the rate-distortion cost (RDCost) of all six possible modes and chooses the one with the minimum RDCost. This mode selection algorithm has high computational complexity and consumes a large amount of coding time. Our investigation showed that the probability that the best mode of the current macroblock equals the best mode of its co-located or surrounding macroblocks ranges from 60% to 90%, depending on the video sequence and its motion intensity. For the current macroblock, the best modes of the temporally and spatially adjacent blocks can therefore serve as a candidate mode list that is shorter than the full mode list. When one of the candidates satisfies the preset condition, the remaining modes are skipped, which reduces encoding time substantially.
     3. The AVS encoder and the fast algorithms are ported to the embedded DM642 platform, and two video access methods are implemented. The first is network transmission: a PC sends the video sequences to be encoded to the DM642 using UDP/IP. On the DM642, the network card is operated directly to receive the data and to return a flag indicating whether the received data are correct. The second uses a video camera to capture video, that is, encoding the input of a monitoring system. The encoded data are transmitted over the network to the host computer, which stores them. This work helps promote the industrialization of AVS.
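
The prediction-and-threshold mode decision described in item 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from the thesis: the mode labels, the `select_mode` function, the single fixed threshold, and the fallback to a full search are all hypothetical choices, and the abstract does not specify how the threshold is set.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Illustrative labels for the six modes: Skip, four inter partitions, intra.
ALL_MODES: List[str] = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA"]


def select_mode(
    neighbour_modes: Iterable[str],
    rd_cost: Callable[[str], float],
    threshold: float,
) -> Tuple[str, int]:
    """Pick a macroblock mode, trying the neighbours' best modes first.

    `neighbour_modes` holds the best modes of spatially and temporally
    adjacent blocks. `rd_cost` returns the rate-distortion cost of one
    mode. Returns the chosen mode and how many modes were evaluated.
    """
    # Build the candidate list: neighbours' modes, de-duplicated, in order.
    candidates: List[str] = []
    for mode in neighbour_modes:
        if mode in ALL_MODES and mode not in candidates:
            candidates.append(mode)

    costs: Dict[str, float] = {}
    for mode in candidates:
        costs[mode] = rd_cost(mode)
        if costs[mode] <= threshold:
            # Assumed stopping rule: accept the first candidate whose
            # cost is at or below the threshold and skip the rest.
            return mode, len(costs)

    # No candidate was good enough: evaluate the remaining modes.
    for mode in ALL_MODES:
        if mode not in costs:
            costs[mode] = rd_cost(mode)
    best = min(costs, key=lambda m: costs[m])
    return best, len(costs)
```

The saving comes from the early return: when a predicted mode passes the threshold, motion estimation and cost computation for the other modes never run. A looser threshold saves more time at a higher risk of accepting a mode that is not the true minimum.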

