Crowd counting using multi-scale multi-task convolutional neural network

Original title (Chinese): 基于多尺度多任务卷积神经网络的人群计数
Authors: CAO Jinmeng (曹金梦); NI Rongrong (倪蓉蓉); YANG Biao (杨彪)
Keywords: crowd counting; multi-scale; multi-task learning; Convolutional Neural Network (CNN); adaptive human-shaped kernel; weighted loss function
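Journal (Chinese abbreviation): JSJY
Journal (English): Journal of Computer Applications
Affiliations: School of Information Science & Engineering, Changzhou University; Department of Energy Management, Changzhou Vocational Institute of Textile and Garment
Publication date: 2018-08-16 15:59
Publisher: 计算机应用 (Journal of Computer Applications)
Year: 2019
Volume and serial issue: v.39; No.341
Funding: National Natural Science Foundation of China (61501060); Natural Science Foundation of Jiangsu Province (BK20150271); Open Project of the Jiangsu Key Laboratory of New Technology Application for Road Conveyance (BM20082061708)
Language: Chinese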
File ID: JSJY201901036
Page count: 6
Issue number: 01
CN: 51-1307/TP
Pages: 205-210

Abstract

Crowd counting is of significant value in intelligent surveillance. To address scale variation, non-uniform density distribution, and occlusion in crowds, a crowd counting method based on a Multi-scale Multi-task Convolutional Neural Network (MMCNN) is proposed. First, a novel adaptive human-shaped kernel is used to generate density maps that describe crowd information and eliminate the effect of occlusion. Second, a multi-scale convolutional neural network is constructed to handle scale variation, and a multi-task learning mechanism simultaneously estimates the density map and the crowd density level to handle non-uniform density distribution. Finally, a weighted loss function is designed to improve counting accuracy. Evaluations on the UCF_CC_50 and World Expo'10 datasets verify the effectiveness of the adaptive human-shaped kernel. Experimental results show that, compared with the method of Sindagi et al. (SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway, NJ: IEEE, 2017: 1-6), the proposed method reduces the Mean Absolute Error (MAE) and Mean Squared Error (MSE) on the UCF_CC_50 dataset by about 1.7 and 45, respectively. Compared with the method of Zhang et al. (ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2016: 589-597), it reduces the MAE on the World Expo'10 dataset by about 1.5. On a real-world bus dataset, the counting error is only 0 to 3 people, which indicates that the method is practical.
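
The quantities named in the abstract (density map, weighted multi-task loss, MAE, MSE) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the abstract does not specify the adaptive human-shaped kernel, so a fixed-width Gaussian stands in for it; the loss form (pixel-wise squared error plus a weighted cross-entropy on the density level) and the `weight` value are hypothetical; and the abstract does not say whether MSE is reported with or without a square root, a convention that varies across crowd counting papers, so both are offered.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Build a density map with one normalized kernel per annotated head.

    Stand-in only: a fixed-width Gaussian is assumed here, not the paper's
    adaptive human-shaped kernel. Each kernel is normalized over the image,
    so the map sums to the number of annotated people.
    """
    density = [[0.0] * width for _ in range(height)]
    for (py, px) in points:  # (row, column) of an annotated head
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / total
    return density


def count_from_density(density):
    """The estimated count is the sum over all pixels of the density map."""
    return sum(map(sum, density))


def weighted_multitask_loss(pred_density, true_density,
                            level_probs, true_level, weight=0.1):
    """Hypothetical loss: density regression term + weight * level term."""
    num_pixels = len(true_density) * len(true_density[0])
    density_loss = sum(
        (p - t) ** 2
        for pred_row, true_row in zip(pred_density, true_density)
        for p, t in zip(pred_row, true_row)
    ) / num_pixels
    # Cross-entropy on the predicted probability of the true density level.
    level_loss = -math.log(max(level_probs[true_level], 1e-12))
    return density_loss + weight * level_loss


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and true per-image counts."""
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / len(true_counts)


def mse(pred_counts, true_counts, root=False):
    """Mean squared error of per-image counts; root=True returns its square root."""
    value = sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts)) / len(true_counts)
    return math.sqrt(value) if root else value
```

For example, `count_from_density(gaussian_density_map([(2, 3), (5, 5)], 8, 8))` recovers a count of 2 up to floating-point error, because each kernel is normalized before it is added to the map.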

References

[1] RYAN D, DENMAN S, SRIDHARAN S, et al. An evaluation of crowd counting methods, features and regression models[J]. Computer Vision and Image Understanding, 2015, 130(C):1-17.
[2] FELZENSZWALB P F, GIRSHICK R B, MCALLESTER D, et al. Object detection with discriminatively trained part-based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9):1627-1645.
[3] GAO C, LIU J, FENG Q, et al. People-flow counting in complex environments by combining depth and color information[J]. Multimedia Tools and Applications, 2016, 75(15):9315-9331.
[4] LUO J, WANG J, XU H, et al. Real-time people counting for indoor scenes[J]. Signal Processing, 2016, 124:27-35.
[5] ANTIC B, LETIC D, CULIBRK D, et al. K-means based segmentation for real-time zenithal people counting[C]//Proceedings of the 2009 16th IEEE International Conference on Image Processing. Piscataway, NJ:IEEE, 2009:2565-2568.
[6] RAO A S, GUBBI J, MARUSIC S, et al. Estimation of crowd density by clustering motion cues[J]. The Visual Computer, 2015, 31(11):1533-1552.
[7] CHAN A B, LIANG Z S J, VASCONCELOS N. Privacy preserving crowd monitoring:counting people without people models or tracking[C]//Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2008:1-7.
[8] JI L N, CHEN Q K, CHEN Y J, et al. Real-time crowd counting method from video stream based on GPU[J]. Journal of Computer Applications, 2017, 37(1):145-152. (in Chinese)
[9] HASHEMZADEH M, FARAJZADEH N. Combining keypoint-based and segment-based features for counting people in crowded scenes[J]. Information Sciences, 2016, 345:199-216.
[10] SIVA P, SHAFIEE M J, JAMIESON M, et al. Scene invariant crowd segmentation and counting using scale-normalized Histogram of Moving Gradients(HoMG)[J]. arXiv Preprint, 2016, 2016:1602.00386.
[11] ZHANG C, LI H, WANG X, et al. Cross-scene crowd counting via deep convolutional neural networks[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2015:833-841.
[12] OÑORO-RUBIO D, LÓPEZ-SASTRE R J. Towards perspective-free object counting with deep learning[C]//Proceedings of the 2016 European Conference on Computer Vision. Berlin:Springer, 2016:615-629.
[13] HU Y, CHANG H, NIAN F, et al. Dense crowd counting from still images with convolutional neural networks[J]. Journal of Visual Communication and Image Representation, 2016, 38:530-539.
[14] SHENG B, SHEN C, LIN G, et al. Crowd counting via weighted VLAD on dense attribute feature maps[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 28(8):1788-1797.
[15] KANG D, DHAR D, CHAN A B. Crowd counting by adapting convolutional neural networks with side information[J]. arXiv Preprint, 2016, 2016:1611.06748.
[16] SHI Z L, YE Y D, WU Y P, et al. Crowd counting using rank-based spatial pyramid pooling network[J]. Acta Automatica Sinica, 2016, 42(6):866-874. (in Chinese)
[17] ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2016:589-597.
[18] SINDAGI V A, PATEL V M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting[C]//Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway, NJ:IEEE, 2017:1-6.
[19] MARSDEN M, MCGUINNESS K, LITTLE S, et al. ResnetCrowd:a residual deep learning architecture for crowd counting, violent behaviour detection and crowd density level classification[C]//Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance. Piscataway, NJ:IEEE, 2017:1-7.
[20] ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2016:589-597.
[21] ZEILER M D, RANZATO M, MONGA R, et al. On rectified linear units for speech processing[C]//Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway, NJ:IEEE, 2013:3517-3521.
[22] WANG T, LI G, LEI J, et al. Crowd counting based on MMCNN in still images[C]//Proceedings of the 2017 Scandinavian Conference on Image Analysis. Berlin:Springer, 2017:468-479.
[23] FU M, XU P, LI X, et al. Fast crowd density estimation with convolutional neural networks[J]. Engineering Applications of Artificial Intelligence, 2015, 43:81-88.
[24] IDREES H, SALEEMI I, SEIBERT C, et al. Multi-source multi-scale counting in extremely dense crowd images[C]//Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2013:2547-2554.
[25] KANG D, MA Z, CHAN A B. Beyond counting:comparisons of density maps for crowd analysis tasks - counting, detection, and tracking[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2017, PP(99):1-1.
[26] QIN X H, WANG X F, ZHOU X, et al. Counting people in various crowed density scenes using support vector regression[J]. Journal of Image and Graphics, 2013, 18(4):392-398. (in Chinese)
