Research on Video Object Segmentation Techniques
Abstract
With advances in communication and information processing technologies, video-based applications have shown great flexibility and extensibility, and visual communication has become the fastest-growing information medium. Digital applications and services are emerging in large numbers, such as digital television, teleconferencing, videophones, and image-based interactive multimedia. These data-intensive applications and services demand more advanced digital signal processing techniques for more efficient storage and transmission, more accurate analysis, and more flexible manipulation. Video object segmentation is one such technique.
    Video object segmentation aims to segment the moving objects in a video sequence and to track their evolution along the time axis. Many applications in image processing, video compression, and pattern recognition rely on the segmentation of moving objects. Video object segmentation is also an important tool for content-based video coding, video content manipulation, and interactive multimedia. It usually partitions the content of a video into semantically meaningful regions, which are then treated as objects. These semantically segmented objects can be coded independently, which enables object-based manipulation of video content in interactive multimedia. In the MPEG-4 standard, for example, a video sequence is regarded as a set of independently moving objects and is encoded object by object. In MPEG-7, segmentation results based on frame-to-frame motion information, as well as abrupt changes in object shape, are used for high-level (object-level) semantic description.
    This paper first describes the background against which video object segmentation emerged and developed, and then reviews the current state of the field. It then studies video object segmentation in depth. First, information fusion is applied to video object segmentation: using both the image information and the motion information of the video stream, a new segmentation method is proposed, offering a new approach to real-time foreground extraction from video streams. Second, to improve generality, an automatic video object segmentation method based on dynamic programming is proposed. Finally, as a complement, an interactive (user-assisted) video object segmentation scheme is implemented. The central idea of the work is to achieve automatic and semi-automatic segmentation of video objects through research on key techniques such as image segmentation and video tracking, and on that basis to support applications in video coding, editing, and retrieval, video conferencing, and video understanding. The paper closes with an outlook on the directions and prospects of this field.
