Variable-Partition Compression Coding of Color Video Streams Using Multidimensional Vector Matrices
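Abstract
With the rapid advance of communication technology and the growing variety of communication services, digital video has come into wide use because it is easy to understand, transmits with good quality, resists interference well, is highly reliable, and is easy to encrypt. However, digital video involves large volumes of data, occupies a wide frequency band, and takes up considerable storage space, so compression of digital video images is imperative.
     Orthogonal transform coding is an effective compression method that exploits the fact that, in most images, DC and low-frequency components account for most of the content while high-frequency components account for little; it transforms the image signal from the spatial domain to the frequency domain before compressing it. Because the basis vectors of the discrete cosine transform (DCT) closely approximate the eigenvectors of the covariance matrix of natural images, the DCT is a quasi-optimal transform coding method under the minimum mean square error criterion. The DCT also uses only real arithmetic and has fast algorithms, so it is widely used in international video and image compression standards.
     The DCT removes redundancy by exploiting the correlation among the pixels within an image block. In theory, the whole image should be transformed as a single unit in order to exploit all the potential correlation among its pixels, but the computational cost would be very high. Images are therefore generally partitioned into many small blocks for transformation. The blocks are usually square, and the side length is usually a power of 2 so that fast algorithms can be used. Pixels within an 8×8 neighborhood are commonly considered to be strongly correlated, so the fixed block size is usually 8. Different regions of an image have different statistical characteristics, however. If the content within a region is very uniform, the pixels in that region are strongly correlated and a large transform gives better energy compaction. Conversely, if the content within a region is complex, the pixels are weakly correlated; a large transform then produces ringing artifacts after quantization, whereas a small transform preserves more detail after quantization. Adaptive techniques are needed to handle both cases.
     When the average number of bits per pixel is below 1, adaptive coding techniques make a considerable difference. An adaptive transform kernel partitions the image into rectangular blocks of different sizes according to the complexity of the content and adaptively applies an orthogonal transform of the corresponding block size. Block sizes of 4, 8, and 16 are a suitable choice for the DCT; in addition, a multi-size coding mode should preferably use no more than three sizes.
     The 3D-DCT is regarded as an alternative to motion compensation. It can exploit the correlation among multiple frames to remove redundancy, whereas motion compensation can remove redundancy between at most two frames. In addition, because the structure of the 3D-DCT is non-recursive, transmission errors do not propagate indefinitely. For some image sequences its compression performance rivals that of motion-compensated transform coding, and it is very effective when the total amount of motion is low. Because the 3D-DCT has a long coding delay and needs a large amount of memory, there was little research on it at first; with the rapid development of computer hardware and ever faster computation, however, research on the 3D-DCT has been growing.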
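     Current research on the 3D-DCT nevertheless faces the following problems:
     1. Because there is no unified rule for operations between multidimensional matrices, research on 3D-DCT matrix operations is very scarce.
     2. The criteria for adaptive block partitioning are inconsistent.
     3. There is no concrete visual model to serve as a reference for three-dimensional quantization.
     4. Entropy coding of the transform coefficients after three-dimensional scanning still largely uses the RL-VLC method of traditional JPEG.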
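     To address the first problem, Sang Aijun proposed the multidimensional vector matrix theory, a recent theory of operations between multidimensional matrices that generalizes the form of two-dimensional matrix operations to multidimensional space. Its notation is concise, its form is easy to understand, its computational complexity is moderate, and it extends readily to higher dimensions. Hu Tiegen applied the theory to orthogonal transform compression of color images with multidimensional matrices and derived the multidimensional vector kernel matrix of the 3D-DCT matrix transform.
     To address the second problem and reduce computational complexity, this work proposes a fast partitioning method based on an image activity measure. The gradient-based image activity measure is the best such measure: it is not only very effective at distinguishing between images but also correlates strongly with PSNR. Because the DCT compacts energy along the horizontal and vertical directions, the horizontal and vertical gradients reflect well the energy compaction capability, and hence the compression capability, of the DCT in each direction.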
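     For the third problem, the key factors to consider in quantizer design are:
     1. The probability distribution function of the transformed coefficients, which helps in designing a minimum-distortion quantizer.
     2. The first-order low-pass modulation transfer function of the human visual system model. The two are combined to construct the quantization matrix.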
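     For the fourth problem, the RL-VLC method of JPEG is modified and a new entropy coding method, LL-VLC, is proposed, together with a simple context model. The code table is of moderate size, and the method has good potential for extension to entropy coding of higher-dimensional video.
     To some extent this work demonstrates the effectiveness of the multidimensional vector matrix orthogonal transform coding framework in image compression. The framework was proposed only recently, however, and it still has many incomplete aspects and much room for improvement. For example, fast algorithms for the multidimensional vector matrix discrete cosine transform, closer integration with the prevailing international compression standards, and combination with more new techniques all require further study.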
With the rapid development of communication technology and the growing variety of communication services, digital video is widely used because it is easy to understand, offers good transmission quality and strong resistance to interference, is highly reliable, and is easy to encrypt. However, digital video involves so much data that it demands wide bandwidth and large storage space, which makes compression imperative. Orthogonal transform coding is an efficient compression method: because DC and low-frequency components dominate most images while high-frequency components make up only a small share, the image signal is transformed from the spatial domain into the frequency domain and compressed there. Under the minimum mean square error criterion, the discrete cosine transform (DCT) is a quasi-optimal transform coding method, since its basis vectors closely approximate the eigenvectors of the covariance matrix of natural images. The DCT also uses real arithmetic and has fast algorithms, so it is widely used in international video and image compression standards.
     The DCT exploits the correlation between pixels within an image block to remove redundancy. In theory, the whole image should be transformed at once to exploit all the correlation among its pixels, but the computational cost would be very high. The image is therefore partitioned into many small blocks before transformation, normally square, with a side length that is a power of 2 so that fast algorithms apply. Pixels within an 8×8 neighborhood are generally considered to be strongly correlated, so fixed-partition transforms normally use a block size of 8. However, different areas of an image have different statistical characteristics. If the content of a region is very uniform, its pixels are strongly correlated and a larger transform block gives better energy compaction. If the content of a region is complex, its pixels are weakly correlated; a large transform then produces ringing artifacts after quantization, whereas a small transform preserves more detail. Adaptive coding techniques address this problem.
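     The energy compaction described above can be illustrated with a small sketch. It assumes the orthonormal DCT-II applied separably to rows and columns; the function names and the smooth test block are hypothetical and are not taken from the thesis.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to [row][col]

def low_freq_energy_fraction(coeffs, k=2):
    """Share of total energy held by the k x k lowest-frequency coefficients."""
    total = sum(c * c for row in coeffs for c in row)
    low = sum(coeffs[u][v] ** 2 for u in range(k) for v in range(k))
    return low / total

# Hypothetical smooth 8x8 block: a gentle brightness ramp.
smooth = [[100 + 2 * i + 3 * j for j in range(8)] for i in range(8)]
coeffs = dct_2d(smooth)
fraction = low_freq_energy_fraction(coeffs)
```

     For this smooth block almost all of the energy falls into the four lowest-frequency coefficients, which is why the remaining coefficients can be quantized coarsely at little cost.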
     When the average bit rate is below 1 bit per pixel, adaptive coding plays an important role in image compression. An adaptive transform kernel divides the image into rectangular blocks of different sizes according to the complexity of the image content and applies an orthogonal transform of the corresponding block size to each. Block sizes of 4, 8, and 16 are appropriate for the DCT; moreover, variable block-size transform coding should preferably use no more than three sizes.
     The three-dimensional discrete cosine transform (3D-DCT) is regarded as an alternative to motion compensation. The 3D-DCT can exploit the correlation among several frames to remove redundancy, whereas motion compensation can remove redundancy between at most two frames. Moreover, because the structure of the 3D-DCT is non-recursive, transmission errors do not propagate indefinitely. The encoding and decoding complexity of the 3D-DCT are also the same, which suits real-time coding well. The 3D-DCT is very efficient when the total amount of motion is low. Its drawbacks are a longer coding delay and a larger memory requirement, so early research on it was limited; with the rapid development of computer hardware and ever faster computation, however, more and more research now focuses on the 3D-DCT.
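     A minimal sketch of why the 3D-DCT is efficient for low-motion content follows. It assumes the standard separable formulation (a 1D orthonormal DCT-II along each axis in turn), not the multidimensional vector matrix kernel derived in the thesis; the names and the static test sequence are hypothetical.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def dct_3d(cube):
    """Separable 3D DCT of cube[t][y][x]: transform along x, then y, then t."""
    L, M, N = len(cube), len(cube[0]), len(cube[0][0])
    out = [[dct_1d(cube[t][y]) for y in range(M)] for t in range(L)]  # along x
    for t in range(L):                                                # along y
        for x in range(N):
            col = dct_1d([out[t][y][x] for y in range(M)])
            for y in range(M):
                out[t][y][x] = col[y]
    for y in range(M):                                                # along t
        for x in range(N):
            line = dct_1d([out[t][y][x] for t in range(L)])
            for t in range(L):
                out[t][y][x] = line[t]
    return out

# Hypothetical static scene: four identical 4x4 frames (no motion at all).
frame = [[10 * y + x for x in range(4)] for y in range(4)]
static = [[row[:] for row in frame] for _ in range(4)]
coeffs = dct_3d(static)
# Largest coefficient magnitude outside the zero temporal-frequency plane.
max_temporal_ac = max(abs(c) for t in range(1, 4)
                      for row in coeffs[t] for c in row)
```

     With no motion, every coefficient at a nonzero temporal frequency vanishes up to rounding error, so the whole group of frames costs roughly as much to code as a single frame.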
     At present, research on the 3D-DCT faces the following problems:
     1. Because there is no unified definition of operations between multidimensional matrices, studies of the matrix form of the three-dimensional discrete cosine transform are very scarce.
     2. The criteria for adaptive block partitioning are inconsistent.
     3. There is no concrete human visual system (HVS) model to serve as a reference for 3D quantization.
     4. Entropy coding of the 3D-scanned transform coefficients still largely relies on the traditional RL-VLC method of JPEG.
     To address the first problem, Sang proposed the multidimensional vector matrix (MDVM) theory, which defines operations between multidimensional matrices by extending the form of two-dimensional matrix operations. Its notation is concise, its form is easy to understand, its computational complexity is moderate, and it extends readily to higher dimensions. Hu applied the theory to orthogonal transform compression of color images and derived the transform kernel matrix of the 3D-MDCT.
     To address the second problem and reduce computational complexity, this work proposes a fast partitioning scheme based on an image activity measure. In lossy coding, most gray-level histogram statistics of an image have no direct effect on coding performance, and the image activity measure is the only feature that correlates negatively with the PSNR value. The gradient-based activity measure is the best such measure: it is not only very effective in differentiating between images but also correlates well with PSNR. Moreover, because the DCT compacts energy along the horizontal and vertical directions, the horizontal and vertical gradients reflect well the energy compaction capability, and hence the compression capability, of the DCT in those directions.
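     One way such a scheme could work is sketched below, under stated assumptions: activity is taken as the mean absolute horizontal and vertical pixel difference inside a block, and a 16×16 block is split quadtree-style down to 4×4 whenever its activity exceeds a threshold. The exact measure, the threshold value, and the two-dimensional quadtree are illustrative choices, not the partitioning rule of the thesis.

```python
def gradient_activity(img, y0, x0, size):
    """Mean absolute horizontal and vertical difference inside one block."""
    total, count = 0, 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            if x + 1 < x0 + size:
                total += abs(img[y][x + 1] - img[y][x])
                count += 1
            if y + 1 < y0 + size:
                total += abs(img[y + 1][x] - img[y][x])
                count += 1
    return total / count

def partition(img, y0, x0, size, threshold, min_size=4):
    """Split a block into four while it is 'active'; return (y, x, size) leaves."""
    if size <= min_size or gradient_activity(img, y0, x0, size) <= threshold:
        return [(y0, x0, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += partition(img, y0 + dy, x0 + dx, half, threshold, min_size)
    return blocks

# Hypothetical 16x16 test image: flat gray, with a checkerboard in the
# top-left 8x8 quadrant standing in for detailed content.
img = [[128] * 16 for _ in range(16)]
for y in range(8):
    for x in range(8):
        img[y][x] = 255 if (x + y) % 2 else 0

blocks = partition(img, 0, 0, 16, threshold=10)  # threshold is an assumption
```

     In this example the flat quadrants stay as 8×8 blocks while the detailed quadrant is split into 4×4 blocks, so only the sizes 4, 8, and 16 ever occur.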
     For the third problem, two key factors in quantizer design are considered:
     1. The probability distribution function of the transformed coefficients, which helps in designing a minimum-distortion quantizer.
     2. The first-order low-pass modulation transfer function of the human visual system model.
     The two are combined to construct the three-dimensional quantization matrix.
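     The abstract does not give the transfer function or the way the two factors are combined, so the following is only a rough sketch of the visual-weighting half of the idea. It assumes the magnitude response of a generic first-order low-pass filter as a stand-in for the HVS model and divides a base step size by it, so that higher-frequency coefficients receive coarser steps. The function names, `base_step`, and `f_cut` are all hypothetical, and the coefficient distribution is not modeled.

```python
import math

def mtf_first_order(f, f_cut):
    """Magnitude response of a first-order low-pass filter (assumed HVS stand-in)."""
    return 1.0 / math.sqrt(1.0 + (f / f_cut) ** 2)

def quant_table_3d(m, n, l, base_step=8.0, f_cut=2.0):
    """Step sizes table[w][v][u]; steps grow as the assumed MTF falls off."""
    table = []
    for w in range(l):          # temporal frequency index
        plane = []
        for v in range(n):      # vertical frequency index
            row = []
            for u in range(m):  # horizontal frequency index
                f = math.sqrt(u * u + v * v + w * w)
                row.append(base_step / mtf_first_order(f, f_cut))
            plane.append(row)
        table.append(plane)
    return table

table = quant_table_3d(4, 4, 4)
```

     The DC coefficient keeps the base step, and the step size increases monotonically with distance from the origin in frequency space.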
     For the last problem, the RL-VLC method of JPEG is modified, and a new entropy coding method named LL-VLC is proposed, based on the levels of nonzero coefficients and the levels of zero-coefficient run lengths. A simple context model is built, and M×N×L coefficients are encoded at a time. The code table is of moderate size, and the method has good potential for entropy coding in even higher-dimensional orthogonal transforms.
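     For context, the baseline that LL-VLC modifies can be sketched as follows. This is a simplified form of run-length symbol formation in the style of JPEG: each nonzero coefficient in scan order is paired with the number of zeros preceding it, and trailing zeros collapse into an end-of-block marker. It omits JPEG's size categories, its cap on run length, and the Huffman tables, and it is not the LL-VLC method itself, whose details the abstract does not give. The function names are hypothetical.

```python
def run_level_pairs(coeffs):
    """Turn scanned coefficients into (zero_run, level) pairs plus optional EOB."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    if run:                      # trailing zeros are signalled by one marker
        pairs.append("EOB")
    return pairs

def expand(pairs, length):
    """Inverse of run_level_pairs for a block of known length."""
    out = []
    for p in pairs:
        if p == "EOB":
            break
        run, level = p
        out.extend([0] * run + [level])
    return out + [0] * (length - len(out))

# Hypothetical quantized coefficients after scanning.
scanned = [5, 0, 0, -3, 1, 0, 0, 0]
pairs = run_level_pairs(scanned)
restored = expand(pairs, len(scanned))
```

     Each pair would then be mapped to a variable-length code; the longer the zero runs left by quantization, the fewer symbols need to be coded.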
     To some extent, this work demonstrates the effectiveness of multidimensional vector matrix orthogonal transform coding in image compression. The theory is still at an early stage, however, and much remains to be refined and improved. Examples include fast algorithms for the multidimensional vector matrix DCT and closer integration with current international compression standards and newer techniques, all of which call for further study.
