Research on Multi-view Stereo Video Coding Based on H.264 MVC
Abstract
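As living standards have risen, people's expectations for video have risen as well: from black-and-white to color television, and from analog television to high-definition digital television. Today, viewing is no longer limited to flat (2D) video and is moving into the era of stereoscopic video. With the development of stereoscopic display technology, research on three-dimensional (3D) vision has steadily gained momentum. As an important way of representing 3D video signals, multi-view video has become a major research direction in the image and video field. Multi-view stereo video is difficult to synthesize, involves large amounts of data, and is hard to transmit; these problems have long been the main bottleneck holding back its industrialization and practical use. The original video must therefore be compressed effectively to meet the needs of stereo video technology.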
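     The extensions to H.264 under active development by the Joint Video Team include Multi-view Video Coding (MVC). MVC inherits the strengths of H.264: high coding efficiency, a flexible coding structure, and good network compatibility. MVC also adds coding tools such as temporal scalability, view scalability, illumination compensation, and inter-view prediction, which make it well suited to compressing stereo video.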
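     This thesis first reviews the history and current state of research on stereo video and then, starting from the characteristics of the human visual system, presents the theory of stereo vision and stereo display. It next introduces the H.264 coding standard, proposes a multi-view video coding scheme based on H.264, and studies the joint multi-view video coding scheme: the parameters of the Joint Multi-view Video Model are summarized and its motion estimation is analyzed in detail. The model can encode high-resolution video. Finally, experiments verify the coding efficiency and compression ratio of the joint multi-view video coding scheme.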
With rising living standards, the demands placed on video quality have grown steadily, for example in the move from analog TV to HDTV. Nowadays, viewers are moving from flat (2D) video to multi-view three-dimensional TV. Advances in 3D display technology are drawing growing attention to studies of three-dimensional (3D) vision. As a critical component of 3D display technology, multi-view stereo video has recently become an important research direction. Multi-view stereo video is difficult to synthesize and involves a large amount of data, which makes it hard to transmit. These problems have become the bottleneck that keeps multi-view stereo video from industrialization and practical application. Effective compression of the original data is therefore needed to meet the demands of stereo video technology.
     The Joint Video Team (JVT) is developing extensions to H.264, including Multi-view Video Coding (MVC), which inherits the merits of H.264: high coding efficiency, a flexible coding structure, and good network compatibility. In addition, MVC adds several new coding tools, such as temporal scalability, view scalability, and illumination compensation, which make it well suited to stereo video compression.
     First, the history and research status of stereo video are introduced, followed by several theories of stereo vision and stereo display. The subsequent chapters present the H.264 coding standard, propose a multi-view stereo video coding scheme based on H.264, study the Joint Multi-view Video Model (JMVM) coding scheme, summarize the JMVM parameters, and analyze motion estimation in detail. The JMVM can encode high-resolution video. Finally, its coding efficiency and compression ratio are verified experimentally.