Research on Crowd Target Analysis in Video Scenes

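English title: Research on Crowd Target in Video Scene
Author: Qiao Wei (乔伟)
Degree level: Master's
Discipline: Signal and Information Processing
Keywords (translated from Chinese): crowd target detection ; finite-time Lyapunov exponent ; contour extraction ; gray-level co-occurrence matrix ; small-contour fusion algorithm
English keywords: crowd target detection ; FTLE ; contour extraction ; GLCM ; small-contour fusion algorithm
Degree year: 2010
Advisor: Wang Huiyuan (王汇源)
Discipline code: 081002
Degree-granting institution: Shandong University
Thesis submission date: 2010-04-12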

Abstract

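Computer vision has become a very active research field. It draws on signal acquisition, image processing, machine learning, pattern recognition, behavior control, and even cognitive science, and it mainly studies the detection, tracking, behavior analysis, and recognition of targets in video image sequences. Target detection and tracking in video is one branch of computer vision, and most current work in this area concentrates on detecting and tracking a single object or person, or several of them. Even multi-target detection and tracking usually limits the number of targets to ten or fewer.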
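     As the population grows, urbanization accelerates, and social activities become more frequent, public places are becoming increasingly crowded and large gatherings are increasingly common. The need for video monitoring of crowd activity is therefore becoming more and more urgent, yet little research so far addresses crowd target detection and analysis. To address this problem, we carried out the following work: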
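     First, this thesis gives an overview of the motion field and the optical flow field. The motion field is the vector field that describes target motion; in the absence of illumination effects, the optical flow field can be used to represent it. The optical flow field is the apparent motion of the image brightness pattern, and it can be obtained by adding constraints and solving the optical flow constraint equation. The Lucas-Kanade method is an optical flow method well suited to detecting and classifying moving crowd targets, and this thesis uses it to compute the optical flow field.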
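     As a rough illustration of the Lucas-Kanade idea, the sketch below solves the optical flow constraint equation I_x u + I_y v + I_t = 0 in the least-squares sense over a single window. It is plain Python rather than the C and OpenCV implementation reported in the thesis, and the function name `lucas_kanade_window`, the list-of-lists image layout, and the window size are assumptions made only for this example.

```python
def lucas_kanade_window(prev, curr, cx, cy, half=2):
    """Estimate the flow (u, v) at pixel (cx, cy) from two grayscale frames.

    prev, curr: 2-D lists of floats indexed as image[y][x].
    The window spans cx-half..cx+half and cy-half..cy+half and must stay
    at least one pixel away from the image border.
    Returns None when the 2x2 normal matrix is singular (aperture problem).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Central differences for the spatial gradient,
            # frame difference for the temporal derivative.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T (Cramer's rule).
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v
```

     A window with no texture returns None, which is one reason additional constraints are needed before the constraint equation can be solved everywhere.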
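     In chaotic dynamics, the Lagrangian approach, which attempts to track the trajectory of each pixel in the moving flow, is one way of treating fluids. Because of its high density, a moving crowd can be regarded as a fluid and handled with the methods of chaotic dynamics. The finite-time Lyapunov exponent (FTLE) characterizes the mixing and separation of neighboring particles and reflects how strongly they separate. Based on the flow map, the FTLE image is obtained by solving the Runge-Kutta equations. The thesis compares how cubic interpolation and three-dimensional inverse distance weighted interpolation affect the results, and concludes that the computationally simple three-dimensional inverse distance weighted interpolation is better suited to real-time detection and analysis of moving crowd targets.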
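     To make the FTLE step concrete, here is a minimal sketch that uses the standard definition FTLE = ln(sqrt(lambda_max(J^T J))) / |T|, where J is the spatial gradient of the flow map. The flow map is taken as given (in the thesis it comes from numerically integrating the optical flow). The function name `ftle_field`, the grid layout, and the use of central differences are assumptions for illustration, not the thesis code.

```python
import math

def ftle_field(flow_x, flow_y, T, dx=1.0, dy=1.0):
    """Compute an FTLE image from a 2-D flow map.

    flow_x[j][i], flow_y[j][i]: final position, after integration time T,
    of the particle seeded at grid node (i, j).
    Border nodes are left at 0.0 because central differences need neighbors.
    """
    h, w = len(flow_x), len(flow_x[0])
    out = [[0.0] * w for _ in range(h)]
    for j in range(1, h - 1):
        for i in range(1, w - 1):
            # Jacobian J = [[a, b], [c, d]] of the flow map.
            a = (flow_x[j][i + 1] - flow_x[j][i - 1]) / (2.0 * dx)
            b = (flow_x[j + 1][i] - flow_x[j - 1][i]) / (2.0 * dy)
            c = (flow_y[j][i + 1] - flow_y[j][i - 1]) / (2.0 * dx)
            d = (flow_y[j + 1][i] - flow_y[j - 1][i]) / (2.0 * dy)
            # Cauchy-Green deformation tensor C = J^T J (symmetric 2x2).
            c11 = a * a + c * c
            c12 = a * b + c * d
            c22 = b * b + d * d
            # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
            mean = 0.5 * (c11 + c22)
            diff = 0.5 * (c11 - c22)
            lam = mean + math.sqrt(diff * diff + c12 * c12)
            if lam > 0.0:
                out[j][i] = math.log(math.sqrt(lam)) / abs(T)
    return out
```

     For a flow map that stretches x by a factor of 2 and compresses y by a factor of 2 over T = 1, every interior node gets the value ln 2, so large values mark places where neighboring particles separate quickly.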
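     The motion region can be obtained from the FTLE image. This thesis proposes an improved Bernsen adaptive binarization algorithm to produce a binary image from the FTLE image, and then uses morphological methods to obtain an approximate motion region. At this stage the motion region still contains some "holes". To deal with this, Freeman contour extraction followed by filling can be used.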
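     The record does not describe how the thesis modifies Bernsen's method, so the following sketch shows only the classic Bernsen local thresholding for reference. The function name `bernsen_binarize` and the default values of `radius`, `min_contrast`, and `global_thresh` are assumptions chosen for the example.

```python
def bernsen_binarize(img, radius=1, min_contrast=15, global_thresh=128):
    """Classic Bernsen adaptive thresholding on a grayscale image.

    img: 2-D list of intensities indexed as img[y][x].
    Each pixel is compared with the mid-gray (max + min) / 2 of its local
    window; low-contrast windows fall back to a global threshold.
    Returns a 2-D list containing 0 or 255.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Window clipped at the image border.
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            lo, hi = min(window), max(window)
            mid = (lo + hi) / 2.0
            if hi - lo < min_contrast:
                # Nearly uniform neighborhood: classify the whole window.
                out[y][x] = 255 if mid >= global_thresh else 0
            else:
                out[y][x] = 255 if img[y][x] >= mid else 0
    return out
```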
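     Once the motion region has been obtained, it needs to be analyzed. The analysis in this thesis covers two aspects: direction and density. Direction is derived from the optical flow field: an improved K-means clustering method is proposed to obtain the optical flow direction image. Because of the quality of the video itself, the result is heavily mottled, so a small-contour fusion algorithm is further proposed to improve it, which effectively removes the various impurities. For density analysis, the thesis adopts a texture-based approach. Gray-level co-occurrence matrix (GLCM) analysis is a texture analysis method whose feature parameters describe the image texture from different aspects. Among them, contrast reflects the clarity of the grooves and the fineness of the texture, and it can be used to obtain density information for the motion region. Based on the density information for each direction, Bayesian classification divides the moving crowd targets of the different directions into three classes: sparse, medium, and dense.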
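     The GLCM contrast feature can be sketched as follows, using the usual definition: contrast is the sum of (i - j)^2 p(i, j) over the normalized co-occurrence matrix p. The function name `glcm_contrast`, the single pixel offset `(dx, dy)`, and the assumption that the image is already quantized to `levels` gray levels are illustrative choices, not details taken from the thesis.

```python
def glcm_contrast(img, levels, dx=1, dy=0):
    """Contrast of the gray-level co-occurrence matrix for one offset.

    img: 2-D list of integer gray levels in range(levels), indexed img[y][x].
    (dx, dy): displacement between the two pixels of each counted pair.
    Returns sum_{i,j} (i - j)^2 * p(i, j), with p the normalized GLCM.
    """
    h, w = len(img), len(img[0])
    glcm = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                glcm[img[y][x]][img[y2][x2]] += 1
                total += 1
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * glcm[i][j]
                   for i in range(levels) for j in range(levels))
    return weighted / total
```

     A uniform patch gives a contrast of 0, while a patch whose neighboring pixels differ strongly gives a large value; this is the quantity the thesis uses as its density cue.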
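     All algorithms in this thesis are implemented in C together with the Intel OpenCV library. The dense crowd system is treated as a chaotic system, and the FTLE field is computed first. Inverse distance weighted interpolation is used when solving the Runge-Kutta equations, and points that the FTLE marks as non-chaotic are excluded, so that the remaining points represent the flowing part. The FTLE field image is then processed with morphological methods to obtain the motion region. Next, direction analysis is performed on the motion region using the improved K-means clustering method, and the small-contour fusion algorithm is proposed to absorb impurities and refine the segmentation. Density analysis follows: the GLCM is computed for each segmented region, and density information is obtained using contrast as the criterion. The results show that the proposed algorithms effectively detect and analyze moving targets ranging from multiple bodies to crowds, and the work is of exploratory value for research on moving crowd targets.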
Computer vision has become a very active research area that involves signal acquisition, image processing, machine learning, pattern recognition, behavior control, cognitive science, and many other disciplines. It studies the detection, tracking, and behavior analysis of targets in video image sequences. Target detection and tracking in video is one aspect of computer vision. Most current research concentrates on detecting and tracking a single object or person, or several of them. In multi-target detection and tracking, the number of targets is usually limited to 10 or fewer.
     With population growth and accelerating urbanization, public places are becoming more and more crowded, and large gatherings are increasing. As a result, the need for crowd target monitoring has become urgent, but little research has addressed crowd target detection and analysis. To address this problem, we have done the following work:
     First, this thesis gives an overview of the motion field and the optical flow field. The motion field consists of vectors that describe target motion. When the effect of illumination is ignored, the optical flow field can be used to represent the motion field. The optical flow field is the apparent motion of image brightness patterns. It can be obtained by adding constraints and solving the optical flow constraint equation. The Lucas-Kanade optical flow method is suitable for crowd target detection and classification, so we use it to calculate the optical flow field in this thesis.
     In chaotic dynamics, the Lagrangian approach is a way of treating fluids that attempts to track the motion trajectory of each pixel. Because of its high density, a crowd target can be regarded as a fluid. The finite-time Lyapunov exponent (FTLE) characterizes the mixing and separation of particles and reflects the degree of separation between them. Based on the flow map, FTLE images can be obtained by solving the Runge-Kutta-Fehlberg equations. The thesis compares the effects of cubic interpolation and inverse distance weighted interpolation on the results. The simpler inverse distance weighted interpolation algorithm is more suitable for real-time detection and analysis of moving crowd targets.
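     The inverse distance weighted interpolation can be sketched as below for samples in (x, y, t) space. This is the textbook scheme; the function name `idw_interpolate`, the sample layout, and the default `power=2.0` are assumptions for illustration, and the thesis may select neighbors or weights differently.

```python
import math

def idw_interpolate(samples, query, power=2.0):
    """Inverse distance weighted interpolation in three dimensions.

    samples: non-empty list of ((x, y, t), value) pairs, for example one
             component of the optical flow at grid nodes of nearby frames.
    query:   (x, y, t) point at which a value is needed.
    Each sample is weighted by 1 / distance**power; a query that coincides
    with a sample returns that sample's value exactly.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    num = 0.0
    den = 0.0
    for point, value in samples:
        d = math.dist(point, query)
        if d == 0.0:
            return value
        weight = 1.0 / d ** power
        num += weight * value
        den += weight
    return num / den
```

     The scheme needs only distances and a weighted average per query, which is consistent with the abstract's point that it is cheaper than cubic interpolation.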
     The motion area is obtained from the FTLE image. This thesis presents an improved Bernsen adaptive binarization algorithm to obtain a binary FTLE image. The motion area is then obtained by morphological processing. The resulting motion area contains holes. To solve this problem, Freeman contour extraction and filling can be used.
     The motion area then needs to be analyzed. The analysis covers two aspects: direction and density. For direction, the optical flow field is used. An improved K-means clustering approach is presented to obtain the optical flow direction image. Due to the quality of the video, the results are mottled. A small-contour fusion algorithm is proposed to improve the results, which effectively removes the impurities. For density analysis, this thesis adopts a texture-based approach. Gray-level co-occurrence matrix (GLCM) analysis is a texture analysis method whose characteristic parameters describe the image texture from different aspects. Contrast reflects the clarity of the texture, and from it the density of the motion area can be obtained. Based on the density in each direction and Bayesian classification, the targets are classified as sparse, medium, or dense.
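     The improved K-means variant is not specified in this record. As a baseline only, the sketch below clusters flow vectors by direction with ordinary K-means on unit vectors, using cosine similarity so that angles near +pi and -pi are treated as neighbors. The function name `kmeans_directions`, the evenly spaced initial centers, and the fixed iteration count are assumptions for illustration.

```python
import math

def kmeans_directions(flow, k, iters=20):
    """Cluster flow vectors by direction.

    flow: list of non-zero (u, v) vectors (filter out zero flow beforehand).
    Returns (labels, center_angles) with angles in radians.
    """
    units = []
    for u, v in flow:
        norm = math.hypot(u, v)
        units.append((u / norm, v / norm))
    # Deterministic start: k directions evenly spaced around the circle.
    centers = [(math.cos(2.0 * math.pi * c / k), math.sin(2.0 * math.pi * c / k))
               for c in range(k)]
    labels = [0] * len(units)
    for _ in range(iters):
        # Assignment step: pick the center with the largest dot product.
        labels = [max(range(k),
                      key=lambda c: ux * centers[c][0] + uy * centers[c][1])
                  for ux, uy in units]
        # Update step: renormalized mean direction of each cluster.
        for c in range(k):
            sx = sum(p[0] for p, lab in zip(units, labels) if lab == c)
            sy = sum(p[1] for p, lab in zip(units, labels) if lab == c)
            norm = math.hypot(sx, sy)
            if norm > 0.0:
                centers[c] = (sx / norm, sy / norm)
    return labels, [math.atan2(cy, cx) for cx, cy in centers]
```

     Painting each pixel with its cluster label gives a direction image of the kind described above; on noisy flow that image is speckled, which is the problem the small-contour fusion step is meant to address.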
     All the proposed algorithms are implemented in C with the Intel OpenCV library. We treat the crowd target as a fluid. First, the FTLE field is computed. Inverse distance weighted interpolation is used when solving the Runge-Kutta-Fehlberg equations. Non-chaotic points are excluded, and the rest represent the flow region. Second, the motion region is obtained after morphological processing. Then, K-means clustering is applied for motion direction analysis, and impurities are absorbed by the small-contour fusion algorithm to optimize the results. Finally, the GLCM is computed for density analysis, and density information is given by the GLCM contrast. The results show that the algorithms are effective and offer one approach to crowd target research.

