Research on Visual Coding Technology Based on the FOVEA Framework
Abstract
As in the real world, computing faces a bottleneck between huge volumes of data and limited processing capacity. With the development of digital imaging technology in recent years, large amounts of digital visual information (images, video, and so on) are produced every day. The growth of this information, however, far outpaces the capacity of existing hardware and puts pressure on applications. For example, the sheer data volume of images and video places a heavy burden on storage systems; transmission consumes a great deal of network traffic and is constrained by network bandwidth, so users must wait a long time. Relieving this bottleneck, making effective use of existing hardware, and giving users convenient access to video material that carries a large amount of visual redundancy has become a focus of coding research in recent years. It would be valuable if a compression system could apply visual techniques to achieve a lower bit rate together with higher visual quality. To this end, this thesis studies the characteristics of the human visual system (HVS) and proposes a video compression method based on the HVS "center-surround" property together with luminance saliency, chrominance saliency, and motion saliency.
     In video compression, what the human eye actually receives is the decompressed image, so assessing the quality of the reconstructed image is a major concern. After surveying existing video quality assessment methods, this thesis experimentally compares SSIM, which is based on structural distortion, with the traditional PSNR measure, and adopts SSIM to evaluate the proposed compression method.
     The cone cells and nerve cells of the human eye are distributed very unevenly: density is high at the fovea and falls off rapidly toward the periphery. The resolution with which the HVS perceives a video image is therefore also highly non-uniform. Building on this property, this thesis studies fovea-based video coding models and proposes a video coding method built on a content-adaptive foveation model that combines three kinds of saliency: luminance saliency, based on background luminance and the luminance gradient; chrominance saliency, based on the chrominance gradient and the temporal chrominance difference; and motion saliency, based on motion intensity, motion contrast, and the spatial phase entropy of motion vectors. Assuming that the point of fixation is at the center of the video image, the method adjusts the resolution of different regions according to the content of the sequence. Experimental results show that the method achieves good compression efficiency.
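
The foveation idea summarized in the abstract (resolution highest at a fixation point assumed to be the image center, falling off toward the periphery, and then adjusted by content) can be illustrated with a minimal Python sketch. This is not the thesis's model: the hyperbolic falloff, the `half_res_ecc` parameter, and the use of a per-pixel maximum to combine the foveation weights with a saliency map are all assumptions made here for illustration only.

```python
import math


def foveation_weights(width, height, half_res_ecc=0.25):
    """Return a height x width grid of relative resolution weights in (0, 1].

    Assumptions (not taken from the thesis): the fixation point is the image
    center, eccentricity is the distance from that center normalized by the
    half-diagonal, and the weight halves at `half_res_ecc` (a hypothetical
    parameter).
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    max_dist = math.hypot(cx, cy) or 1.0  # avoid dividing by zero for a 1x1 image
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            ecc = math.hypot(x - cx, y - cy) / max_dist  # 0 at center, 1 at corners
            row.append(half_res_ecc / (half_res_ecc + ecc))
        grid.append(row)
    return grid


def content_adaptive_weights(fov, saliency):
    """Raise the weight wherever a saliency map (values in [0, 1]) is higher.

    Taking the per-pixel maximum is a hypothetical combination rule.
    """
    return [[max(f, s) for f, s in zip(frow, srow)]
            for frow, srow in zip(fov, saliency)]


if __name__ == "__main__":
    fov = foveation_weights(5, 5)
    saliency = [[0.0] * 5 for _ in range(5)]
    saliency[0][0] = 0.9  # pretend a salient object sits in the top-left corner
    adapted = content_adaptive_weights(fov, saliency)
    for row in adapted:
        print(" ".join(f"{w:.2f}" for w in row))
```

An encoder could, for instance, map lower weights to coarser quantization or stronger low-pass filtering; how the thesis actually maps saliency to resolution is not stated in the abstract.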
