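Research on Video Classification Based on Multiple Features

English title: Video Classification Based on Multi-features
Author: Song Gang (宋刚)
Thesis level: Master's
Discipline: Computer Architecture
Keywords (Chinese, translated): sports video; active relevance feedback; SVM multi-class classification; multi-features; principal component analysis
Keywords (English): multi-features; active relevance feedback; principal component analysis (PCA); support vector machine (SVM)
Degree year: 2010
Advisor: Xiao Guoqiang (肖国强)
Discipline code: 081201
Degree-granting institution: Southwest University (西南大学)
Thesis submission date: 2010-04-30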

Abstract

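With the rapid development of computer and multimedia technology, digital video has become ever easier to produce and store. Now that communications and the Internet are widespread, digital video also spreads more easily over the network, and massive video databases have formed worldwide. As a carrier of sound, images, text, and other information, video presents different kinds of information to users. People want to search these massive video databases for useful information and to find the videos that interest them. This requires that videos first be classified and organized, so that searches can follow a regular structure. How to classify and organize such massive video data, however, is one of the pressing problems in video processing. In recent years video classification has gradually become a research hotspot, and it remains a highly challenging research topic.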
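     This thesis studies video classification in some depth. It first analyzes the current state of research and the development trends in video classification. Then, after summarizing the strengths and weaknesses of existing algorithms, it proposes a method that classifies sports videos by extracting multiple features and applying a support vector machine (SVM) based on active relevance feedback.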
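     Different categories of sports video differ to some degree in field color, field position, regional brightness, texture, motion intensity, and the movement patterns and movement regions of the athletes. Features extracted from these aspects can therefore be used to characterize a video and represent its information. The thesis proposes a region-based video feature extraction method. The video is first divided into blocks by region. The color moments of each block in the key frames are computed as the color feature, and the mean brightness values of the blocks are compared with one another to obtain the block intensity comparison code (BICC) as the brightness feature. Next, information such as motion intensity and motion direction is extracted from each region of the video as the motion feature, and texture features are extracted from the gray-level co-occurrence matrix of the key frames. To improve processing efficiency, principal component analysis (PCA) is used to reduce the dimensionality of the features. On this basis, the thesis designs a tree-structured multi-class SVM model based on active relevance feedback for video classification. Each classification node of the model uses one or more binary SVM classifiers, and voting is used to tally the classification results at each node and determine the class of a video sample. Finally, the method is tested on a collected set of videos. The experimental results show that the proposed approach, which combines multiple features with the SVM tree classification model based on active relevance feedback, performs well.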
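The block-level brightness and color features described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`block_means`, `bicc`, `color_moments`), the 2 x 2 block grid, and the rule "one bit per pair of blocks, set when the first block is brighter" are assumptions chosen for illustration, and the frame is a plain nested list rather than a decoded video key frame.

```python
import math
from itertools import combinations


def block_means(frame, rows, cols):
    """Split a grayscale frame (list of equal-length pixel rows) into
    rows x cols blocks and return each block's mean intensity, row-major."""
    height, width = len(frame), len(frame[0])
    bh, bw = height // rows, width // cols
    means = []
    for r in range(rows):
        for c in range(cols):
            pixels = [frame[y][x]
                      for y in range(r * bh, (r + 1) * bh)
                      for x in range(c * bw, (c + 1) * bw)]
            means.append(sum(pixels) / len(pixels))
    return means


def bicc(means):
    """Assumed BICC variant: one bit per unordered block pair (i, j), i < j,
    set to 1 when block i is brighter on average than block j."""
    return [1 if means[i] > means[j] else 0
            for i, j in combinations(range(len(means)), 2)]


def color_moments(values):
    """First three moments of one color channel within a block:
    mean, standard deviation, and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1 / 3), third)
    return mean, math.sqrt(variance), skew


# Toy 4 x 4 "key frame" with four visibly different 2 x 2 regions.
frame = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 50, 90, 90],
    [50, 50, 90, 90],
]
means = block_means(frame, rows=2, cols=2)
print(bicc(means))  # [0, 0, 0, 1, 1, 0]
```

With four blocks the code has six bits, one per pair. Because it encodes only the relative ordering of block brightness, a uniform brightness shift across the whole frame leaves it unchanged.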
With the rapid development of computer and multimedia technology, digital videos have become easy to produce and store, making them a useful source of information for many kinds of users, including researchers. In addition, current information and communication technologies provide the infrastructure to transport bits anywhere. However, it is difficult for users to find the videos they are interested in among massive video databases. In other words, much of this recorded video data is currently hard to use, mainly because appropriate techniques for making video content more accessible are lacking. Given the rapid growth in the amount of video data generated and the wide range of video applications, efficient and effective management of video records is in great demand. Manual indexing of video content is currently the most accurate method, but it is very time-consuming. For a user to retrieve the required information, automatic classification and categorization of video content is essential.
     This dissertation studies automatic video classification in depth. It analyzes the state of the art and summarizes the development trends and the strengths and weaknesses of existing algorithms. It then proposes a new algorithm for automatic video classification that combines multiple features with a support vector machine (SVM) based on active relevance feedback.
     Considering that average intensity, motion intensity, and color distribution differ among the regions of a video, we propose a novel region-based feature extraction method. First, the video is divided into blocks. The average intensities of different blocks in each key frame are then compared to obtain the block intensity comparison code (BICC) feature, a block color histogram is computed from the statistics of the color components in each block, and the texture of the key frames is extracted. Next, the motion intensity of every region of the video is extracted. Principal component analysis (PCA) is then applied to reduce redundancy in the extracted features by exploiting the correlations between feature elements. Finally, we design a tree classification model based on SVM with active relevance feedback and use it to classify videos from the extracted features. The experimental results show that the proposed approach outperforms other methods based on features such as video saliency regions or BICC alone.
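The tree classification model can be pictured as follows: each node holds one or more binary classifiers, a majority vote selects the branch, and a leaf names the class. The sketch below is a rough illustration under stated assumptions. The `Node` class, the branch labels, the class names `class_A` to `class_C`, and the simple threshold functions standing in for trained binary SVMs are all hypothetical, and the active relevance feedback step is not modeled.

```python
from collections import Counter


class Node:
    """One classification node: several binary classifiers vote on a branch."""

    def __init__(self, classifiers, children):
        # classifiers: callables mapping a feature vector to a branch label
        #              (stand-ins for trained binary SVMs).
        # children:    dict mapping a branch label to a Node or a class name.
        self.classifiers = classifiers
        self.children = children

    def predict(self, x):
        # Tally the votes of this node's classifiers and follow the winner.
        votes = Counter(clf(x) for clf in self.classifiers)
        branch = votes.most_common(1)[0][0]
        child = self.children[branch]
        return child.predict(x) if isinstance(child, Node) else child


# Hypothetical two-dimensional feature vectors and threshold "classifiers".
inner = Node(
    classifiers=[lambda x: "low" if x[1] < 0.6 else "high"],
    children={"low": "class_B", "high": "class_C"},
)
root = Node(
    classifiers=[
        lambda x: "left" if x[0] > 0.5 else "right",
        lambda x: "left" if x[1] < 0.3 else "right",
        lambda x: "left" if x[0] + x[1] > 0.8 else "right",
    ],
    children={"left": "class_A", "right": inner},
)

for sample in [(0.9, 0.1), (0.2, 0.5), (0.2, 0.9)]:
    print(sample, root.predict(sample))
```

Using an odd number of voters at each node avoids ties. In the third sample one root classifier disagrees with the other two, and the majority still routes the sample to the inner node.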

