Research on Disparity Estimation in Stereo Video
Abstract
With the rapid development of computer technology, video coding and decoding techniques have improved considerably. Single-view video, however, no longer satisfies the growing demand for visual information. In recent years, stereo video and multi-view video, which provide 3D visual perception, have attracted wide attention and become an active research topic.
     According to statistics, about 60%-75% of the information humans obtain from the outside world comes through the visual system. Meeting the growing demand for visual information therefore requires advanced techniques for processing and transmitting video, and the primary problem is how to compress the huge volume of data effectively. Stereo video carries the depth information of a scene and so represents natural scenes more realistically, but its data volume is far larger than that of single-channel video, which makes efficient stereo video compression particularly important. Meanwhile, stereo vision theory is expected to be widely applied in automatic navigation, industrial measurement, virtual reality, biomedicine, military reconnaissance and other fields. Disparity estimation is a key research topic in stereo video and plays a very important role in the whole system.
     This thesis studies disparity estimation in stereo video. Because the traditional dynamic programming algorithm minimizes the global energy only within each scan line, the resulting disparity map shows obvious "streaking" artifacts. The thesis modifies the global energy function to strengthen the constraints between scan lines, which reduces the streaking. It also studies disparity estimation based on belief propagation: by reducing the energy function and post-processing the final disparity map, the accuracy is improved.
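     The scan-line dynamic programming idea described above can be illustrated with a minimal sketch. It is not the thesis's actual formulation, which the abstract does not give: the absolute-difference matching cost, the linear smoothness penalty, and the names `scanline_dp`, `estimate_disparity`, `smooth`, `inter` and `oob` are all assumptions made for illustration. The hypothetical `inter` term penalizes disagreement with the disparity already chosen on the previous scan line, which is one simple way to couple neighbouring lines and reduce streaking.

```python
def scanline_dp(left_row, right_row, max_disp, smooth=1.0,
                prev_disp=None, inter=0.0, oob=255.0):
    """Minimize a 1-D energy along one scan line with dynamic programming.

    Energy (an assumed form): sum of |L[x] - R[x - d]| data costs,
    plus smooth * |d(x) - d(x-1)| within the line,
    plus inter * |d(x) - prev_disp[x]| toward the previous line (optional).
    """
    n = len(left_row)
    labels = range(max_disp + 1)

    def data_cost(x, d):
        if x - d < 0:
            cost = oob  # no matching pixel in the right row
        else:
            cost = abs(left_row[x] - right_row[x - d])
        if prev_disp is not None:
            cost += inter * abs(d - prev_disp[x])  # inter-scan-line coupling
        return cost

    cost = [[0.0] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in labels:
        cost[0][d] = data_cost(0, d)

    # Forward pass: best accumulated cost of ending at pixel x with label d.
    for x in range(1, n):
        for d in labels:
            p = min(labels, key=lambda q: cost[x - 1][q] + smooth * abs(d - q))
            cost[x][d] = data_cost(x, d) + cost[x - 1][p] + smooth * abs(d - p)
            back[x][d] = p

    # Backward pass: follow the stored predecessors from the cheapest end label.
    d = min(labels, key=lambda k: cost[n - 1][k])
    disp = [0] * n
    for x in range(n - 1, -1, -1):
        disp[x] = d
        d = back[x][d]
    return disp


def estimate_disparity(left, right, max_disp, smooth=1.0, inter=0.0):
    """Process rows top to bottom, passing each result to the next row."""
    result, prev = [], None
    for left_row, right_row in zip(left, right):
        prev = scanline_dp(left_row, right_row, max_disp, smooth, prev, inter)
        result.append(prev)
    return result


# Usage: the right view is the left view shifted by two pixels.
left_row = [10, 50, 90, 130, 170, 210, 250, 30, 70, 110]
right_row = left_row[2:] + [0, 0]
disparity = estimate_disparity([left_row, left_row], [right_row, right_row],
                               max_disp=3, smooth=1.0, inter=0.5)
```

     With `inter=0.0` each scan line is optimized independently, which corresponds to the traditional behaviour the abstract criticizes. Coupling only to the previous line is a deliberately simple choice; other ways of constraining neighbouring lines are possible.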
