Research of Motion Estimation and Compensation Algorithms in Video Compression Based on Redundant Discrete Wavelet Transform

Original title (Chinese): 基于冗余小波变换的视频压缩运动估计与补偿算法研究
Author: Xie Hongtu (谢洪途)
Degree level: Master's
Discipline: Electronic Science and Technology
Keywords: Motion Estimation and Compensation ; Redundant Discrete Wavelet Transform ; Block Matching Algorithm ; Adaptive ; DT Mesh ; Improved SIFT
Degree year: 2010
Advisor: Gao Guangzhu (高广珠)
Discipline code: 080903
Degree-granting institution: National University of Defense Technology (国防科学技术大学)
Thesis submission date: 2010-11-01

Abstract

Video compression coding is one of the most active research areas in information technology. Motion estimation and compensation are the main means of removing temporal redundancy from a video signal and are among the key techniques of video compression coding. Conventional video coding standards all use block-matching motion estimation in the spatial domain for this purpose. Because the wavelet transform has good time-frequency localization and matches the characteristics of human vision, motion estimation in the wavelet domain has recently become a research focus. However, the discrete wavelet transform (DWT) is not shift invariant, so accurate motion estimation and compensation are hard to achieve in the wavelet domain. The redundant discrete wavelet transform (RDWT) is shift invariant, and motion estimation in the redundant wavelet domain performs well, but its computational complexity is too high. The block matching algorithm (BMA) assumes a translational motion model, which limits its applicability. Triangular mesh methods are based on the affine transform; they overcome this limitation and give good motion estimation and compensation for video with non-translational motion.
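The shift-invariance argument can be checked on a toy signal. The sketch below is an illustration only, not code from the thesis: it assumes a one-level Haar filter with circular boundary handling, and the function names are hypothetical. It compares the detail coefficients of a step edge before and after a one-sample shift, once with decimation (DWT) and once without (RDWT).

```python
def haar_detail_dwt(x):
    """Decimated one-level Haar detail coefficients (one value per sample pair)."""
    return [(x[2 * k] - x[2 * k + 1]) / 2 for k in range(len(x) // 2)]


def haar_detail_rdwt(x):
    """Undecimated (redundant) one-level Haar detail coefficients, circular boundary."""
    n = len(x)
    return [(x[i] - x[(i + 1) % n]) / 2 for i in range(n)]


def shift(x, s):
    """Circularly shift a list right by s samples."""
    s %= len(x)
    return x[-s:] + x[:-s] if s else list(x)


signal = [0, 0, 4, 4, 0, 0, 0, 0]   # a short plateau with two edges
moved = shift(signal, 1)            # the same plateau, one sample later

# RDWT: transforming the shifted signal gives the shifted coefficients.
rdwt_commutes = haar_detail_rdwt(moved) == shift(haar_detail_rdwt(signal), 1)

# DWT: the edges fall between sample pairs in one case and inside a pair in
# the other, so the detail energy changes and the coefficient sets cannot be
# shifted copies of each other.
dwt_energy = sum(c * c for c in haar_detail_dwt(signal))
dwt_energy_moved = sum(c * c for c in haar_detail_dwt(moved))
```

Here `rdwt_commutes` is true, while `dwt_energy` and `dwt_energy_moved` differ. This is the reason matching coefficients between frames is unreliable in the decimated domain. The price of the redundant transform is that every subband keeps the full number of samples.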
     This thesis aims to further improve the accuracy of motion estimation and compensation and the subjective quality of the reconstructed image, while reducing the computational complexity of motion estimation and raising its efficiency. It first studies block-matching motion estimation and compensation in the redundant wavelet domain and then, for video with non-translational motion, Delaunay triangulation (DT) mesh motion estimation and compensation in the redundant wavelet domain. The main contributions are as follows:
     (1) The starting-point prediction and search methods used in the spatial domain are brought into motion estimation in the redundant wavelet domain and improved, yielding a fast adaptive block-matching motion estimation algorithm based on the RDWT. First, the algorithm introduces a way to select and compute adaptive thresholds that classify the motion state of each image block and terminate the search early, and it uses these thresholds to extract potential motion blocks. Second, an adaptive starting-point prediction method in the redundant wavelet domain accurately predicts the search starting point of each potential motion block. Finally, building on spatial-domain search methods, a search method in the redundant wavelet domain adjusts the search direction and radius to the intensity of each block's motion and stops early once the threshold is met, so that potential motion blocks are estimated quickly. Experimental results indicate that the algorithm maintains good motion estimation accuracy (a high peak signal-to-noise ratio) while effectively reducing motion estimation time, and that the subjective quality of the reconstructed image is good. Compared with existing block-matching motion estimation algorithms in the redundant wavelet domain, it has a clear advantage in both quality and efficiency, and it adapts well to video sequences with different motion characteristics.
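The three ingredients named above (a static-block test, a predicted starting point, and threshold-based early termination) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis algorithm: it works on plain pixel arrays rather than RDWT subbands, uses a fixed small-diamond pattern instead of an adaptive direction and radius, and the names `estimate_block`, `threshold`, and `max_steps` are hypothetical, as is the synthetic test image.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and its displaced match in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def in_bounds(ref, bx, by, dx, dy, size):
    """True if the displaced block lies entirely inside the reference frame."""
    return (0 <= by + dy and by + dy + size <= len(ref)
            and 0 <= bx + dx and bx + dx + size <= len(ref[0]))


def estimate_block(cur, ref, bx, by, size=4, start=(0, 0), threshold=0, max_steps=8):
    """Small-diamond search from a predicted start with threshold-based early stop.

    threshold and max_steps are illustrative parameters, not values from the thesis.
    """
    # Static-block test: a block that already matches at zero displacement is not searched.
    zero_cost = sad(cur, ref, bx, by, 0, 0, size)
    if zero_cost <= threshold:
        return (0, 0), zero_cost
    best = start if in_bounds(ref, bx, by, start[0], start[1], size) else (0, 0)
    best_cost = sad(cur, ref, bx, by, best[0], best[1], size)
    for _ in range(max_steps):
        if best_cost <= threshold:          # early termination
            break
        centre = best
        for ox, oy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            dx, dy = centre[0] + ox, centre[1] + oy
            if in_bounds(ref, bx, by, dx, dy, size):
                cost = sad(cur, ref, bx, by, dx, dy, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
        if best == centre:                  # no neighbour improved: local minimum
            break
    return best, best_cost


def intensity(x, y):
    """Smooth synthetic image, chosen so the matching cost has a single minimum."""
    return x * x + 3 * y * y


ref = [[intensity(x, y) for x in range(12)] for y in range(12)]
cur = [[intensity(x + 2, y + 1) for x in range(12)] for y in range(12)]  # content displaced by (2, 1)
mv, cost = estimate_block(cur, ref, bx=4, by=4)
```

On this synthetic pair the search walks from (0, 0) to the true displacement (2, 1) in three steps and then stops because the cost has reached the threshold. A better `start` prediction shortens that walk, which is where the time saving described above comes from.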
     (2) Because BMA is poorly suited to video with non-translational motion, the thesis builds on the classical DT mesh motion estimation algorithm in the redundant wavelet domain and proposes a DT mesh motion estimation and compensation algorithm in the redundant wavelet domain based on improved SIFT feature extraction. First, a new template for feature point extraction is proposed in the redundant wavelet domain, which brings the SIFT algorithm into feature extraction in that domain. Next, an improved SIFT algorithm with rotation resistance and a low-dimensional feature descriptor is proposed, together with a new feature similarity measure and matching criterion. The improved SIFT algorithm and the matching criterion are then applied to the new template to extract and accurately match feature points, and the matched feature points serve as the nodes from which the DT mesh is generated. Finally, the potential motion area (PMA) is extracted in the redundant wavelet domain, and motion estimation is carried out within the PMA using the motion vectors of the mesh nodes and the affine transform. Theoretical analysis and experiments indicate that the algorithm extracts stable feature points quickly and effectively and matches them accurately, which yields accurate motion vectors and improves the accuracy and efficiency of motion estimation; the subjective quality of the reconstructed image is good. Compared with the classical DT mesh motion estimation algorithm in the redundant wavelet domain, prediction accuracy and efficiency both improve considerably, especially for video sequences with non-translational motion.
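The last step, deriving motion inside a mesh triangle from the motion vectors of its three nodes, reduces to solving for six affine parameters. The sketch below shows that standard computation for a single triangle. It is an illustration, not the thesis implementation; the function names and the example coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    out = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        out.append(det3(mc) / d)
    return out


def affine_from_triangle(nodes, motion_vectors):
    """Return (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f.

    nodes are the three triangle vertices; motion_vectors are their displacements.
    """
    m = [[x, y, 1] for x, y in nodes]
    xs = [x + u for (x, _), (u, _) in zip(nodes, motion_vectors)]
    ys = [y + v for (_, y), (_, v) in zip(nodes, motion_vectors)]
    return tuple(solve3(m, xs) + solve3(m, ys))


def warp_point(params, x, y):
    """Map a point inside the triangle to its position in the other frame."""
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


nodes = [(0, 0), (4, 0), (0, 4)]
motion_vectors = [(1, 1), (1, 2), (0, 1)]     # each node moves differently
params = affine_from_triangle(nodes, motion_vectors)
inner = warp_point(params, 2, 2)              # motion of an interior pixel
```

Because the three nodes move by different amounts, the interior point (2, 2) receives an interpolated displacement that no single block translation could represent. When all three motion vectors are equal, the parameters reduce to a pure translation, which is the block-matching model.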

