Abnormal Object Description in Multi-View Scenes
Abstract
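Abnormal object description in multi-view scenes means describing objects that behave abnormally within scenes observed from multiple views, and it is a challenging topic in computer vision. The research has significant academic value and broad application prospects, and it matters for the study of human activity as well as for national defense security and public safety. The main difficulties are: traditional algorithms detect moving objects in complex scenes with low accuracy; the behavior of abnormal objects is highly random; the multi-view scene strongly influences the description of abnormal objects; and tracking and describing an object over a long time is difficult. To address these problems, this thesis studies abnormal object description in multi-view scenes. Its main work and contributions are summarized as follows: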
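     1). To address the low accuracy of traditional moving object detection methods, a moving object detection method based on the conditional random field (CRF) model is proposed. The method extracts motion features and color features from the video sequence and then models the feature vectors with a CRF to detect moving objects. Experiments show an error rate of 14.38%, lower than the 81.34% of traditional frame differencing, the 33.59% of optical flow, and the 19.73% of the Gaussian mixture model. Its time complexity is lower than that of optical flow and the Gaussian mixture model and close to that of frame differencing.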
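     2). To exploit multiple classes of features of abnormal objects, an abnormal object description method based on the MCRF model is proposed. Multiple classes of features are extracted from the object, each class is modeled with a basic CRF to form a CRF unit, and all CRF units are combined into the MCRF model. The parameters of the MCRF model are obtained by training, and abnormal objects are finally described through inference on the model. Experimental results show that the method can describe certain specific abnormalities of objects fairly accurately.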
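     3). To address the low accuracy of traditional scene description methods, a scene description method based on a latent semantic model is proposed. Multiple classes of scene features are extracted and clustered with the K-means algorithm to form visual words. The pLSA model then partitions the visual words into semantically meaningful topic distributions, and a CRF finally models the semantic topic distributions to describe the scene. Experiments show that the description accuracy reaches 91.4%, better than the SVM and Bayes models.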
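     4). Because traditional abnormal object description methods do not consider the influence of the scene, an abnormal object description method based on blob and trajectory features is proposed. Blobs are formed through scene description, the trajectory features of the moving object are extracted, and the blobs and trajectory features are combined into a composite feature vector. An HMM models the composite feature vector to describe abnormal objects. Experiments show that the method fuses the semantic state of the scene into the motion trajectory of the object, thereby combining scene and object, which is valuable for describing abnormal objects in certain specific scenes.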
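     5). Because a single-view scene cannot support long-term tracking and description of an object, a multi-view abnormal object description method is proposed. The scene description method is applied to the multi-view scenes in a preset order to obtain their semantic descriptions, the trajectory features of the moving object across the multi-view scenes are extracted, and the semantic descriptions and trajectory features are combined into a composite feature vector. An HMM then describes the abnormal object. Experiments show that the method can describe abnormal objects in specific multi-view scenes fairly accurately.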
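Items 4) and 5) model a sequence of composite feature vectors with an HMM. The abstract does not give the model structure or the decision rule, so the following is only a minimal sketch under stated assumptions: each composite feature vector is assumed to have been quantized to a discrete symbol, the toy parameters `START`, `TRANS` and `EMIT` are hypothetical, and a sequence is flagged as abnormal when its average per-step log-likelihood falls below a hypothetical threshold.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM) for discrete symbols."""
    n_states = len(start)
    # Joint probability of each hidden state and the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight
            # each state by how well it explains the current observation.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid numerical underflow on long sequences.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def is_abnormal(obs, start, trans, emit, threshold):
    """Flag a sequence whose average per-step log-likelihood is below threshold."""
    return forward_log_likelihood(obs, start, trans, emit) / len(obs) < threshold


# Hypothetical toy model (not from the thesis): 2 hidden states, 3 symbols.
START = [0.8, 0.2]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
EMIT = [[0.7, 0.25, 0.05], [0.1, 0.3, 0.6]]

typical = [0, 0, 0, 0]   # symbols the toy model emits often
unusual = [2, 2, 2, 2]   # symbols the toy model emits rarely at the start
print(is_abnormal(typical, START, TRANS, EMIT, threshold=-0.8))
print(is_abnormal(unusual, START, TRANS, EMIT, threshold=-0.8))
```

Normalizing by sequence length keeps long and short trajectories comparable. In practice the parameters would be trained (for example with Baum-Welch) on sequences of normal behavior rather than set by hand.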
Abnormal object description in multi-view scenes means describing abnormal objects in scenes observed from multiple views, and it is a challenging topic in the field of computer vision. Research on abnormal object description in multi-view scenes has important academic value and broad application prospects. It is also significant for the study of human activities, national security and public safety. Abnormal object description in multi-view scenes faces several difficulties: the accuracy of traditional algorithms for moving object detection in complex scenes is low; the behavior of abnormal objects is highly random; the multi-view scene has a great influence on abnormal object description; and it is difficult to track and describe an object over a long time. To solve these problems, we carry out research on abnormal object description in multi-view scenes. The main work and innovations are as follows:
     1). A moving object detection method based on CRF is proposed to address the low accuracy of traditional moving object detection. Motion features and color features are extracted, and the feature vector is then modeled by a CRF to detect moving objects. The experimental results show that the error rate of this method is 14.38%, lower than that of traditional methods such as frame differencing (81.34%), optical flow (33.59%) and the Gaussian mixture model (19.73%). The computation time of this method is less than that of optical flow and the Gaussian mixture model and close to that of frame differencing.
     2). In order to use many types of features, an abnormal object description method based on the MCRF model is proposed. Several feature subsets are formed by extracting multiple types of features. A CRF model is then applied to each feature subset to obtain CRF units. Finally, all the CRF units are combined to produce the MCRF model, which is used to describe abnormal objects. The experimental results indicate that this method can describe certain specific abnormalities of objects fairly accurately.
     3). A scene description method based on a latent topic model is proposed to address the low accuracy of traditional methods. Several types of features are extracted and clustered into visual words by the k-means algorithm. The visual words are divided into semantic topic distributions by the pLSA model. The semantic topic distributions are then modeled by a CRF to describe the scene. The experimental results show that the accuracy of this method is 91.4%, better than that of the SVM and Bayesian models.
     4). An abnormal object description method based on blobs and trajectories is proposed because traditional methods do not consider the impact of the scene. The scene is described as blobs and the trajectory of the object is extracted. Blobs and trajectory are combined to form a feature vector. The feature vector is modeled by an HMM to describe the abnormal object. The experimental results show that this method combines the semantic state of the scene with the object trajectory, and is of considerable value for describing abnormal objects in certain scenes.
     5). An abnormal object description method for multi-view scenes is proposed to solve the problem that an object in a single-view scene cannot be tracked and described over a long time. The multi-view scenes are described as semantic states in accordance with the order of the camera locations. The trajectory of the object is extracted. The semantic states of the multi-view scenes and the trajectory are combined to form a feature vector. The feature vector is modeled by an HMM to describe the abnormal object. The experimental results show that this method can describe abnormal objects in specific multi-view scenes fairly accurately.
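Item 1) compares the proposed detector against frame differencing by error rate. The abstract does not define how the error rate is measured, so the sketch below assumes a per-pixel definition (the fraction of pixels whose foreground/background label disagrees with a ground-truth mask). The frames, the threshold and the function names are hypothetical and only illustrate the baseline, not the CRF method.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Mark a pixel as moving (1) when its absolute intensity change exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def pixel_error_rate(predicted, ground_truth):
    """Fraction of pixels whose predicted label differs from the ground truth."""
    total = 0
    wrong = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            total += 1
            wrong += int(p != g)
    return wrong / total


# Hypothetical 3x4 grayscale frames (intensities 0-255) and ground-truth mask.
PREV = [[10, 10, 10, 10],
        [10, 10, 10, 10],
        [10, 10, 10, 10]]
CURR = [[10, 10, 10, 10],
        [10, 200, 200, 10],
        [10, 10, 40, 10]]
TRUTH = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 0, 0]]

mask = frame_difference_mask(PREV, CURR, threshold=25)
print(mask)
print(pixel_error_rate(mask, TRUTH))
```

Frame differencing labels each pixel independently from a single pair of frames, so it tends to miss slowly changing object pixels and to fire on noise. Labeling pixels jointly from motion and color features, as the CRF method in item 1) does, is one way to reduce such errors.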
