Video Semantic Analysis Based on Kernel Discriminative Features-Blocked Sparse Representation

Original title (Chinese): 核可鉴别的特征分块稀疏表示的视频语义分析
Authors: Zhan Yongzhao (詹永照); Tian Huafeng (田华锋); Mao Qirong (毛启容)
Affiliation: School of Computer Science and Communication Engineering, Jiangsu University (江苏大学计算机科学与通信工程学院)
Keywords: video semantic analysis; kernel discriminative; features-blocked sparse representation; fusing analysis
Chinese journal title: JSJF
English journal title: Journal of Computer-Aided Design & Computer Graphics
Publication date: 2014-08-15
Publisher: Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报)
Year: 2014
Issue: v.26
Funding: National Natural Science Foundation of China (61170126, 61272211)
Language: Chinese
Page: JSJF201408011
Page count: 7
CN: 08
ISSN: 11-2925/TP
Classification number: 82-88

Abstract

To address the diversity of video features and the redundancy of sparse dictionaries, this paper proposes a video semantic analysis method based on kernel discriminative features-blocked sparse representation. First, multiple types of features are extracted from each video segment according to practical requirements, and a blocked sparse dictionary is built for each feature type according to its dimensionality. Each blocked dictionary is then optimized by adding a kernel discriminative criterion to the K-SVD algorithm, so that the sparse representation of each feature type discriminates better between classes. To analyze the semantics of a video segment, the optimized dictionaries are used to compute the sparse representation of each feature type, and a weighted KNN algorithm classifies each of these sparse representations. Finally, the per-feature results are fused according to each feature type's degree of support for the decision to obtain the semantic analysis result for the video segment. Experimental results show that the proposed method improves both the accuracy and the speed of video semantic analysis.
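The last two stages described in the abstract (weighted KNN classification per feature type, followed by fusion according to decision support) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the inverse-distance vote weights, the `support` values, the feature names `color` and `motion`, and both function names are assumptions made for illustration. The sparse codes are supplied as toy vectors rather than computed from a learned dictionary.

```python
import math
from collections import defaultdict


def weighted_knn_scores(query, train_codes, train_labels, k=3, eps=1e-9):
    """Return normalized per-class scores for one feature type.

    Each of the k nearest training codes votes for its label with weight
    1 / (distance + eps). This weighting is an assumption; the abstract
    does not say which weighting the paper uses.
    """
    neighbours = sorted(
        (math.dist(query, code), label)
        for code, label in zip(train_codes, train_labels)
    )
    scores = defaultdict(float)
    for distance, label in neighbours[:k]:
        scores[label] += 1.0 / (distance + eps)
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


def fuse(per_feature_scores, support):
    """Sum per-feature class scores, weighting each feature type by its
    (assumed) decision support degree, and return the winning label."""
    fused = defaultdict(float)
    for name, scores in per_feature_scores.items():
        for label, s in scores.items():
            fused[label] += support[name] * s
    return max(fused, key=fused.get), dict(fused)


# Toy sparse codes for two hypothetical feature types.
labels = ["sport", "sport", "news", "news"]
train = {
    "color": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    "motion": [[0.0, 1.0], [0.2, 0.8], [1.0, 0.0], [0.8, 0.2]],
}
query = {"color": [0.8, 0.2], "motion": [0.1, 0.9]}
support = {"color": 0.6, "motion": 0.4}  # assumed support degrees

per_feature = {
    name: weighted_knn_scores(query[name], train[name], labels, k=3)
    for name in train
}
label, fused = fuse(per_feature, support)
print(label, fused)
```

Because each per-feature score vector is normalized and the assumed support degrees sum to one, the fused scores also sum to one, which makes them easy to compare across video segments.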

