Crowd Behavior Recognition Algorithm Based on Combined Features and Deep Learning

Original title: 基于复合特征及深度学习的人群行为识别算法
Authors: YUAN Ya-jun (袁亚军); LEE Fei-fei (李菲菲); CHEN Qiu (陈虬)
Affiliation: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology (上海理工大学光电信息与计算机工程学院)
Keywords: crowd behavior recognition; static features; dynamic features; CNN; data extraction
Journal code: JSJA
Journal: Computer Science (计算机科学)
Publication date: 2019-06-15
Publisher: Computer Science (计算机科学)
Year: 2019
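Volume: 46
Issue: 06
Pages: 311-316
Number of pages: 6
Funding: Supported by the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (ES2015XX)
Language: Chinese
CN: 50-1075/TP
Record ID: JSJA201906046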

Abstract

The goal of crowd behavior analysis is to better understand and manage the state and trend of crowd movement. This paper proposes a deep-learning-based crowd behavior recognition method that uses two types of crowd behavior features. Treating the crowd as the main object, a foreground extraction method extracts static crowd information, and changes in crowd movement provide dynamic crowd information. A convolutional neural network (CNN) model learns these two different crowd behavior features, which are then combined to analyze common crowd behaviors. The location and interval at which crowd data are extracted are also important factors in crowd behavior analysis. Experimental results show that the two features better describe crowd states in the spatial dimension and crowd changes in the temporal dimension, and that a reasonable data location and data interval effectively improve the expressiveness of crowd information. Finally, the proposed method is compared with other crowd behavior analysis methods; the quantitative and qualitative results verify its effectiveness and show that it yields a better confusion matrix and higher accuracy.
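As a rough illustration of the two cues named in the abstract, the following sketch derives a static foreground mask and a dynamic motion map from a short grayscale clip. The abstract does not specify the foreground extraction method, the motion measure, or the CNN architecture, so median-background subtraction and simple frame differencing are stand-ins assumed here. The function names and the `threshold` and `interval` parameters are hypothetical, and frames are plain nested lists of pixel intensities.

```python
from statistics import median


def foreground_mask(frames, index, threshold=25):
    """Static cue (assumed method): mark pixels of frames[index] that differ
    from the per-pixel median background by more than `threshold`."""
    height, width = len(frames[0]), len(frames[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            background = median(frame[y][x] for frame in frames)
            changed = abs(frames[index][y][x] - background) > threshold
            row.append(1 if changed else 0)
        mask.append(row)
    return mask


def motion_map(frames, index, interval=5):
    """Dynamic cue (assumed method): absolute intensity change between
    frames[index] and the frame `interval` steps later (clamped to the clip)."""
    later = min(index + interval, len(frames) - 1)
    return [
        [abs(b - a) for a, b in zip(row_now, row_later)]
        for row_now, row_later in zip(frames[index], frames[later])
    ]


def two_channel_sample(frames, index, interval=5, threshold=25):
    """Pair both cues as a 2-channel sample that a classifier could consume."""
    return [
        foreground_mask(frames, index, threshold),
        motion_map(frames, index, interval),
    ]
```

Calling `two_channel_sample(frames, index, interval)` at different values of `index` and `interval` is one way to explore how the extraction location and interval change the sample, which the abstract identifies as important factors.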

References

[1] SHAO J,KANG K,CHEN C L,et al.Deeply learned attributes for crowded scene understanding[C]//International Conference on Computer Vision and Pattern Recognition.2015:4657-4666.
    [2] PALANISAMY G,MANIKANDAN T T.Group Behaviour Profiling for Detection of Anomaly in Crowd[C]//International Conference on Technical Advancements in Computers and Communications.2017:11-15.
    [3] RODRIGUES F,LOURENCO M,RIBEIRO B,et al.Learning Supervised Topic Models for Classification and Regression from Crowds[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,PP(99):1-1.
    [4] YI S,LI H,WANG X.Pedestrian Behavior Modeling from Stationary Crowds with Applications to Intelligent Surveillance[J].IEEE Transactions on Image Processing,2016,25(9):4354-4368.
    [5] SENGUPTA S,WANG H,BLACKBURN W,et al.Spatial information in classification of activity videos[C]//2015 Federated Conference on Computer Science and Information Systems.2015:145-153.
    [6] SHAO J,CHEN C L,WANG X.Learning Scene-Independent Group Descriptors for Crowd Understanding[J].IEEE Transactions on Circuits & Systems for Video Technology,2017,27(6):1290-1303.
    [7] ZHANG C,KANG K,LI H,et al.Data-Driven Crowd Understanding:A Baseline for a Large-Scale Crowd Dataset[J].IEEE Transactions on Multimedia,2016,18(6):1048-1061.
    [8] JI S,YANG M,YU K.3D convolutional neural networks for human action recognition[J].IEEE Transactions on Pattern Analysis & Machine Intelligence,2013,35(1):221-231.
    [9] SIMONYAN K,ZISSERMAN A.Two-Stream Convolutional Networks for Action Recognition in Videos[J].Advances in Neural Information Processing Systems,2014,1(4):568-576.
    [10] SHAO J,CHEN C L,KANG K,et al.Slicing Convolutional Neural Network for Crowd Video Understanding[C]//International Conference on Computer Vision and Pattern Recognition.2016:5620-5628.
    [11] JING S,CHEN C L,KAI K,et al.Crowded Scene Understanding by Deeply Learned Volumetric Slices[J].IEEE Transactions on Circuits & Systems for Video Technology,2017,27(3):613-623.
    [12] KARPATHY A,TODERICI G,SHETTY S,et al.Large-Scale Video Classification with Convolutional Neural Networks[C]//International Conference on Computer Vision and Pattern Recognition.2014:1725-1732.
    [13] SHAO J,LOY C C,WANG X.Scene-Independent Group Profiling in Crowd[C]//International Conference on Computer Vision and Pattern Recognition.2014:2227-2234.
    [14] BURNEY A,SYED T Q.Crowd Video Classification Using Convolutional Neural Networks[C]//International Conference on Frontiers of Information Technology.2017:1255-1259.
    [15] YI S,WANG X.Profiling stationary crowd groups[C]//International Conference on Multimedia and Expo.2014:1-6.
    [16] YI S,LI H,WANG X.Understanding pedestrian behaviors from stationary crowd groups[C]//International Conference on Computer Vision and Pattern Recognition.2015:3488-3496.
    [17] SENST T,EISELEIN V,SIKORA T.A local feature based on lagrangian measures for violent video classification[C]//International Conference on Imaging for Crime Prevention and Detection.2015:1-6.
    [18] FEICHTENHOFER C,PINZ A,ZISSERMAN A.Convolutional Two-Stream Network Fusion for Video Action Recognition[C]//International Conference on Computer Vision and Pattern Recognition.2016:1933-1941.
    [19] GOMEZ-DONOSO F,GARCIA-GARCIA A,GARCIA-RODRIGUEZ J,et al.LonchaNet:A sliced-based CNN architecture for real-time 3D object recognition[C]//International Joint Conference on Neural Networks.2017:412-418.
    [20] MARSDEN M,MCGUINNESS K,LITTLE S,et al.Fully Convolutional Crowd Counting On Highly Congested Scenes[J].arXiv preprint arXiv:1612.00220.
    [21] SHI Y,TIAN Y,WANG Y,et al.Sequential Deep Trajectory Descriptor for Action Recognition with Three-Stream CNN[J].IEEE Transactions on Multimedia,2017,19(7):1510-1520.