Research on the Visual Quality of 3D Video and Its Enhancement Processing
Abstract
3D video has been one of the major topics in video processing over the past decade. It vividly extends traditional planar images into three-dimensional space, giving viewers a more realistic visual experience. Following the conventional two-view stereoscopic video system, a new depth-based 3D video system is emerging: through virtual view synthesis, the texture image of a target virtual viewpoint is generated from the texture and depth images of reference viewpoints, so that the reference and virtual views together form a stereo pair with suitable disparity, enabling a user-defined and more comfortable 3D viewing experience. For this reason, it has been established by organizations such as ITU and MPEG as the second-generation 3DTV system.
The visual quality of 3D video is more complex than that of 2D video. In stereoscopic viewing, the different images presented to the two eyes evoke multiple binocular visual properties, such as depth perception, binocular fusion and binocular rivalry, thereby introducing quality factors related to binocular perception. The visual quality of 3D video is therefore regarded as a multi-dimensional problem involving image quality, depth sensation and other factors; solving this complex problem may require treating the many factors separately, building quantitative computational models from the relevant visual properties, and gradually completing the theoretical foundation.
A 3D video system comprises several stages, including content acquisition, compression and transmission, and 3D display, each of which may distort the ideal 3D video signal and introduce visual artifacts. 3D visual quality enhancement can either remove certain visually sensitive types of distortion and thus directly improve the visual naturalness of the images, or better preserve the fidelity of the impaired signal with respect to the ideal signal, reducing the overall distortion strength and indirectly enhancing subjective quality.
Chapter 1 compares the conventional stereoscopic video system with the new depth-based 3D video system, and briefly describes the factors affecting 3D visual quality as well as how visual distortions arise in a 3D video system. It then reviews recent progress in 3D visual quality and 3D video processing, and sets out the research approach to 3D visual quality and two feasible directions for 3D visual quality enhancement.
Chapter 2 presents three studies on modeling the factors affecting 3D visual quality: 1) an objective quality metric that combines several important visual properties, including visual sensitivity, visual persistence and the recency effect, and analyzes the perceived distortion strength of impaired video jointly in the spatial and temporal domains; it systematically simulates the subjective quality evaluation process and consistently achieves state-of-the-art accuracy on several quality assessment databases; 2) a binocular just-noticeable-difference model based on binocular fusion and masking effects, which extends the conventional just-noticeable-difference model into a more complete binocular model and provides a theoretical basis for perceiving distortions in binocular signals; 3) the world's first subjective database of binocular rivalry artifacts, which shows that 3D video processing can trigger binocular rivalry phenomena to which the human eye is sensitive, and which provides subjective samples for research on detecting and suppressing binocular rivalry artifacts.
Chapter 3 focuses on the view synthesis module and, through in-depth theoretical analysis, answers two important questions: why do salient visual distortions tend to appear around object boundaries in synthesized images, and how can view synthesis automatically yield a comfortable depth sensation? Based on this analysis, two visual quality enhancement methods are proposed: 1) a texture-image refinement method that suppresses object boundary artifacts; it analyzes and prevents the formation of boundary artifacts at their root, and offers better theoretical performance and practical results than conventional methods; 2) a 3D experience enhancement method that maintains the same comfortable depth sensation on different displays; this method has been adopted by JCT-3V as an auxiliary post-processing enhancement for 3D video. The two methods improve the visual naturalness of 3D video from the perspectives of texture realism and depth comfort, respectively.
Chapter 4 centers on 3D video coding and proposes theoretical models and enhancement methods oriented at the fidelity of reconstructed 3D video. On the one hand, a depth-distortion tolerance model is derived that characterizes depth errors which cause no change in the synthesized view; this model serves as a theoretical basis for depth compression and other processing aimed at the quality of target virtual views, and two coding pre-processing methods built on it are proposed to improve depth coding efficiency. On the other hand, using the inter-view correspondences indicated by depth, a coding post-processing method is proposed for asymmetric texture coding that enhances the fidelity of the low-quality view from the reconstructed high-quality view; it markedly improves the signal fidelity of the low-quality view and helps suppress its ringing and blocking artifacts. Both lines of work aim to produce decoded images with higher fidelity at the same transmission cost, and thus also help improve the operating efficiency and service quality of 3D video systems.
Chapter 5 first summarizes the work of this thesis, then briefly analyzes future trends in 3D video quality analysis and enhancement processing, and offers a preliminary outlook on the development of the 3D video industry.
In summary, starting from understanding and improving the visual quality of 3D video, this thesis advances the understanding of 3D visual quality through subjective experiments and theoretical analysis, proposes novel theories and techniques for 3D video quality assessment and enhancement, helps address several topical scientific problems in improving 3D visual quality, and provides a theoretical and technical foundation for the future wide deployment of 3D video.
3D video (3DV) has been a hot topic in video processing over the past decade. 3DV vividly extends 2D images into a third dimension, creating an immersive viewing experience. Following the conventional stereoscopic video system, a new depth-based 3D video system has received increasing interest. The depth-based 3DV system employs view synthesis to generate virtual views, and can realize a user-defined depth sensation by composing stereo pairs of arbitrary baseline from synthesized virtual views. Owing to this benefit, depth-based 3DV is considered a strong candidate for the second-generation 3DTV by several organizations, e.g., the International Telecommunication Union (ITU) and the Moving Picture Experts Group (MPEG).
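To make the view-synthesis step concrete, the sketch below shows the one-dimensional horizontal warping at the core of depth-image-based rendering for rectified cameras. It illustrates the general technique only, not the thesis's algorithm; the parameter names (focal_px, baseline, z_near, z_far) and the inverse-depth level convention are assumptions for illustration.

```python
import numpy as np

def level_to_depth(v, z_near, z_far):
    """Map an 8-bit depth level to metric depth (inverse-depth quantization, assumed convention)."""
    return 1.0 / (v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)

def warp_row(texture_row, depth_row, focal_px, baseline, z_near, z_far):
    """Warp one scanline of a rectified reference view toward a virtual view.

    Disparity in pixels is focal_px * baseline / Z; occlusions are resolved by
    letting closer pixels overwrite farther ones via a simple z-buffer.
    """
    w = texture_row.shape[0]
    virtual = np.zeros_like(texture_row)
    z_buffer = np.full(w, np.inf)
    for x in range(w):
        z = level_to_depth(depth_row[x], z_near, z_far)
        d = focal_px * baseline / z            # horizontal disparity in pixels
        xv = int(round(x - d))                 # target column in the virtual view
        if 0 <= xv < w and z < z_buffer[xv]:
            z_buffer[xv] = z
            virtual[xv] = texture_row[x]
    return virtual  # remaining holes (zeros) are filled by inpainting in practice
```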
The visual quality of 3D video is more complex than that of 2D video. In stereoscopic viewing, the two different images presented to the left and right eyes may evoke binocular properties such as stereopsis, binocular fusion and binocular rivalry, and the activated binocular properties can further affect the overall 3D visual quality. Therefore, 3D visual quality is generally considered a high-dimensional problem involving at least image quality and depth quality. To solve this complicated issue, it may be necessary to separate the attributes of 3D visual quality and to establish computational models for them. These research efforts are expected to contribute to the future development of 3D visual quality theory.
A 3DV system includes several components that bring a real 3D scene to remote viewers, including content acquisition, data compression and transmission, and 3D visualization. Each of them may distort the ideal 3D video signal and introduce visual impairments, resulting in visual quality degradation. There are two ways to promote 3D visual quality: 1) to enhance the naturalness of 3D video by eliminating specific, visually sensitive distortions, and 2) to improve the fidelity of the processed 3D video signal, thereby reducing the overall distortion strength and raising subjective quality.
Chapter 1 introduces the frameworks of the conventional stereoscopic system and the new depth-based 3DV system, and elaborates on the factors contributing to 3D visual quality as well as the causes of visual distortions in 3DV. Recent research progress in 3D visual quality and quality-enhancing processing is then reviewed, and the research directions for 3D quality analysis and enhancement pursued in this thesis are introduced.
Chapter 2 discusses three works on modeling the attributes of 3D visual quality: 1) a video quality model based on several monocular visual properties (e.g., visual sensitivity, visual persistence and the recency effect), which systematically simulates the subjective quality evaluation process and consistently outperforms state-of-the-art metrics on several VQA databases; 2) a binocular just-noticeable-difference (JND) model which accounts for binocular fusion and masking effects and extends the conventional JND model into a binocular model; and 3) the world's first subjective database of binocular rivalry in real processed images, which demonstrates the existence of binocular rivalry in processed 3D video and collects subjective samples of binocular rivalry artifacts for the future development of binocular rivalry detection and suppression methods.
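For context on what a "conventional" JND model looks like before the binocular extension, here is a toy spatial JND profile combining luminance adaptation and contrast masking. It is not the binocular model proposed in Chapter 2; the filter size and all constants are illustrative assumptions only.

```python
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def jnd_map(gray):
    """Toy spatial JND profile: the visibility threshold rises in very dark or very
    bright regions (luminance adaptation) and in busy regions (contrast masking).
    All constants are illustrative, not taken from the thesis."""
    gray = gray.astype(np.float64)
    bg = uniform_filter(gray, size=5)                       # local background luminance
    lum_thr = np.where(bg <= 127,
                       17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                       (3.0 / 128.0) * (bg - 127.0) + 3.0)  # Weber-like adaptation curve
    grad = np.hypot(sobel(gray, axis=0), sobel(gray, axis=1))
    mask_thr = 0.1 * grad                                   # stronger edges tolerate larger errors
    return np.maximum(lum_thr, mask_thr)
```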
Chapter 3 focuses on the view synthesis module. Two questions are answered through in-depth analysis: why salient visual distortions in synthesized images tend to appear at object boundaries, and how to create a comfortable depth sensation with synthesized views. Based on the answers, we develop a method to improve texture quality by suppressing boundary artifacts in synthesized images, which is theoretically and experimentally superior to existing methods, and we propose a scheme to maintain the same comfortable depth sensation on different 3D displays, which has been adopted by JCT-3V as a recommended post-processing technique for 3D video. The two works enhance the naturalness of 3D video in terms of image quality and depth sensation, respectively.
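The display-adaptation work rests on simple stereo geometry: the same stored disparity yields different perceived depths on displays of different size and viewing distance. The sketch below shows that standard relation as background; it is not the JCT-3V-adopted method itself, and the parameter names are assumptions.

```python
def perceived_depth(screen_disparity_m, view_dist_m, eye_sep_m=0.065):
    """Viewer-to-point distance implied by an on-screen disparity (similar triangles):
        Z = e * D / (e - d)
    Positive (uncrossed) disparity places the point behind the screen, negative
    (crossed) in front; as d approaches e the point recedes to infinity."""
    e, D, d = eye_sep_m, view_dist_m, screen_disparity_m
    return e * D / (e - d)

def pixel_disparity_to_metres(d_px, screen_width_m, width_px):
    """The same pixel disparity becomes a different physical disparity on each display,
    which is why disparity must be re-scaled per target display."""
    return d_px * screen_width_m / width_px
```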
Chapter 4 proposes a theoretical model and quality-enhancing techniques oriented at 3DV signal fidelity. First, we develop a model describing the largest depth distortions that still result in the same synthesized images; this model provides the theoretical foundation for depth processing optimized for target synthesized views, and we propose its applications in depth map coding. Second, using the inter-view correspondences indicated by depth data, we use the high-quality view to enhance the low-quality view in the context of asymmetric coding of texture video in 3DV, which significantly improves the signal quality of the low-quality view and effectively suppresses blocking and ringing artifacts in the more severely distorted view. These two algorithms yield better signal fidelity of the coded and synthesized views at the same data transmission cost.
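The depth-distortion tolerance idea can be illustrated as follows: rendering rounds disparity to a finite precision (e.g., quarter-pel), so a whole interval of depth levels maps to the same rendered position and therefore to an identical synthesized pixel. The sketch below computes such an interval under an assumed linear depth-level-to-disparity mapping; it is a simplified illustration, not the thesis's full model.

```python
import numpy as np

def disparity(v, focal_px, baseline, z_near, z_far):
    """Disparity in pixels is affine in the 8-bit depth level v under the assumed
    inverse-depth quantization: d(v) = f * B * (v/255 * (1/zn - 1/zf) + 1/zf)."""
    return focal_px * baseline * (v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)

def depth_tolerance(v0, focal_px, baseline, z_near, z_far, precision=0.25):
    """Range of depth levels whose disparity rounds to the same rendering-grid
    position as v0 (quarter-pel here). Any depth error inside this range leaves
    the synthesized pixel unchanged."""
    target = np.round(disparity(v0, focal_px, baseline, z_near, z_far) / precision)
    levels = np.arange(256)
    same = np.round(disparity(levels, focal_px, baseline, z_near, z_far) / precision) == target
    return levels[same].min(), levels[same].max()
```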
Chapter 5 first concludes the thesis and then points out future directions in the field of 3D visual quality and related enhancement processing. The prospects of 3DV applications are also briefly discussed.
In summary, this thesis is oriented at understanding and enhancing the visual quality of 3D video. Through psychophysical experiments and theoretical derivation, we deepen the understanding of 3D visual quality and propose novel theories and techniques for visual quality evaluation and enhancement, which help solve key issues in improving 3D visual quality and contribute to the development of 3DV systems.