People counting algorithm for participants based on multi-task convolutional network

Original title (Chinese): 基于多任务卷积网络的参会人员人数统计算法
Authors: LIU Yuming (刘宇明); LING Zhixiang (凌志祥); WU Qiang (吴强); ZHAO Wendi (赵闻迪); LI Hui (李辉)
Affiliations: Electric Power Dispatching Control Center, Yunnan Power Grid Company Limited; Chengdu Information Technology of Chinese Academy of Sciences Corporation Limited
Keywords: multi-task; Convolutional Neural Network (CNN); face alignment and detection; spatial-temporal feature; people counting
Journal: Journal of Computer Applications (计算机应用)
Journal code: JSJY
Publisher: Journal of Computer Applications (计算机应用)
Publication date: 2018-12-25
Year: 2018
Volume: 38
Issue: S2
Pages: 56-59
Page count: 4
CN: 51-1307/TP
Record ID: JSJY2018S2011
Language: Chinese

Abstract

Indoor venues are difficult for traditional face detection and people counting because the background is complex and attendees occlude one another. A refined people counting algorithm is proposed for face detection and facial landmark regression in Yunnan Power Grid video conferences. The algorithm builds on a multi-task cascaded convolutional neural network, exploits both the differences and the correlations among its tasks, and incorporates a weight self-learning module to obtain the best weight distribution among the tasks of the network layers. This improves the real-time performance and accuracy of face alignment for participants in the video stream, and with it the detection efficiency of people counting. In addition, image sequences are generated from the video stream and multi-scale spatial-temporal features are introduced to associate and label person detections across consecutive frames, which addresses blur between image frames. Intermittent interference from the environmental background is filtered out so that the algorithm can determine whether a person is occluded, further improving its accuracy.
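
The inter-frame association step described in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify how detections are linked, so greedy IoU matching is swapped in here as a simple stand-in for the multi-scale spatial-temporal features. The names `count_people`, `Track`, `iou_thr`, `min_hits`, and `max_misses` are hypothetical, and the per-frame face boxes are assumed to come from an upstream detector such as a multi-task cascaded CNN.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    box: Box          # last known position of the face
    hits: int = 1     # number of frames in which the face was detected
    misses: int = 0   # consecutive frames without a matching detection


def count_people(frames: List[List[Box]], iou_thr: float = 0.3,
                 min_hits: int = 3, max_misses: int = 5) -> int:
    """Count distinct faces over a sequence of per-frame detections.

    Assumptions (illustrative only): a track that goes unmatched for up to
    max_misses frames is treated as occluded and may be re-matched later;
    a track seen in fewer than min_hits frames is treated as intermittent
    background interference and is not counted.
    """
    tracks: List[Track] = []
    for detections in frames:
        unmatched = list(detections)
        for t in tracks:
            if t.misses > max_misses:
                continue  # track expired, no longer eligible for matching
            best = max(unmatched, key=lambda d: iou(t.box, d), default=None)
            if best is not None and iou(t.box, best) >= iou_thr:
                t.box, t.hits, t.misses = best, t.hits + 1, 0
                unmatched.remove(best)
            else:
                t.misses += 1  # possibly occluded in this frame
        # Every detection left over starts a new candidate track.
        tracks.extend(Track(box=d) for d in unmatched)
    return sum(1 for t in tracks if t.hits >= min_hits)


# Person A is visible in every frame, person B is occluded in frames 2-3,
# and a spurious background detection appears once in frame 2.
A, B, NOISE = (10, 10, 50, 50), (100, 10, 140, 50), (200, 200, 220, 220)
frames = [
    [A, B],
    [(12, 11, 52, 51), B],
    [A, NOISE],
    [A],
    [A, B],
    [A, B],
]
print(count_people(frames))  # prints 2
```

In this toy sequence the occluded participant is still counted once, while the single-frame background detection is discarded. A production system would likely replace the greedy matching with an optimal assignment and use appearance features in addition to box overlap.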

