A Context-Sensitive Multi-Scale Face Detection Method

English title: Context-Sensitive Multi-Scale Face Detection
Authors: 陈龙; 庞彦伟
Authors (English): Chen Long; Pang Yanwei; School of Electrical and Information Engineering, Tianjin University
Keywords: image processing; face detection; deep convolutional neural network; context sensitivity; multi-scale
Chinese journal title: JGDJ
English journal title: Laser & Optoelectronics Progress
Affiliation: School of Electrical and Information Engineering, Tianjin University
Publication date: 2019-02-25
Publisher: 激光与光电子学进展 (Laser & Optoelectronics Progress)
Year: 2019
Issue: v.56; No.639
Funding: National Natural Science Foundation of China (61632081)
Language: Chinese
Page: JGDJ201904010
Page count: 10
CN: 04
ISSN: 31-1690/TN
Classification number: 104-113

Abstract

Densely packed and low-resolution faces are difficult to detect in unconstrained environments, where pose, occlusion, and scale variation all degrade detection. To address this problem, we propose a context-sensitive multi-scale (CSMS) face detection method. First, the method introduces an extraction module that incorporates face context information and enriches the discriminative information of the target by effectively fusing features from multiple receptive fields. Second, from the perspective of model structure design, it uses multi-scale features to extract scale-specific feature vectors, which makes face detection robust to scale variation. Training is end-to-end and adopts a strategy that focuses on hard negative examples to address the class imbalance problem in small-scale target detection, which improves the network's ability to discriminate hard examples. Experimental results show that the method is robust for face detection in unconstrained environments and achieves advanced detection performance on the Wider Face dataset.
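The abstract does not say which loss is used to focus training on hard negative examples. One widely used loss with exactly this motivation is the focal loss of Lin et al., and the sketch below illustrates that idea only; it is not the authors' implementation. The function name `focal_loss` and the default values `gamma=2.0` and `alpha=0.25` are assumptions taken from common practice, not from this record.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction (illustrative sketch).

    p: predicted probability of the face class, strictly between 0 and 1
    y: ground-truth label, 1 for face and 0 for background
    gamma, alpha: assumed hyperparameters, not values from the paper
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A background patch the model already rejects with confidence (easy negative)
easy_negative = focal_loss(0.01, 0)
# A background patch the model mistakes for a face (hard negative)
hard_negative = focal_loss(0.90, 0)

print(f"easy negative loss: {easy_negative:.2e}")
print(f"hard negative loss: {hard_negative:.2e}")
```

With these assumed settings the hard negative contributes a loss several orders of magnitude larger than the easy one, so the many easy background samples that dominate small-face detection no longer swamp the gradient.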

References

[1] Wang L L, Liu J H, Fu X M. Facial expression recognition based on fusion of local features and deep belief network[J]. Laser & Optoelectronics Progress, 2018, 55(1): 011002. (in Chinese)
[2] Viola P, Jones M J. Robust real-time face detection[J]. International Journal of Computer Vision, 2004, 57(2): 137-154.
[3] Cao J L, Pang Y W, Li X L. Pedestrian detection inspired by appearance constancy and shape symmetry[J]. IEEE Transactions on Image Processing, 2016, 25(12): 5538-5551.
[4] Dollar P, Appel R, Belongie S, et al. Fast feature pyramids for object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(8): 1532-1545.
[5] Zhang S S, Benenson R, Schiele B. Filtered channel features for pedestrian detection[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition, June 7-12, 2015, Boston, MA, USA. New York: IEEE, 2015: 1751-1760.
[6] Kong Y P, Liu X, Xie X Q, et al. Face liveness detection method based on histogram of oriented gradient[J]. Laser & Optoelectronics Progress, 2018, 55(3): 031009. (in Chinese)
[7] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 2014: 580-587.
[8] Girshick R. Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision, December 7-13, 2015, Santiago, Chile. New York: IEEE, 2015: 1440-1448.
[9] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[10] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 2016: 779-788.
[11] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[M]//Leibe B, Matas J, Sebe N, et al. eds. Computer Vision-ECCV 2016. Cham: Springer, 2016: 21-37.
[12] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision, October 22-29, 2017, Venice, Italy. New York: IEEE, 2017: 2999-3007.
[13] Yang S, Luo P, Loy C C, et al. From facial parts responses to face detection: A deep learning approach[C]//2015 IEEE International Conference on Computer Vision, December 7-13, 2015, Santiago, Chile. New York: IEEE, 2015: 3676-3684.
[14] Zhang K P, Zhang Z P, Li Z F, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499-1503.
[15] Sun X D, Wu P C, Hoi S C H. Face detection using deep learning: An improved faster RCNN approach[J]. Neurocomputing, 2018, 299: 42-50.
[16] Zhu C, Zheng Y, Luu K, et al. CMS-RCNN: Contextual multi-scale region-based CNN for unconstrained face detection[M]//Bhanu B, Kumar A. eds. Deep Learning for Biometrics. Advances in Computer Vision and Pattern Recognition. Cham: Springer, 2017: 57-79.
[17] Wan S, Chen Z, Zhang T, et al. Bootstrapping face detection with hard negative examples[EB/OL]. (2016-08-07)[2018-08-07]. http://arxiv.org/abs/1608.02236.
[18] Huang J, Rathod V, Sun C, et al. Speed/accuracy trade-offs for modern convolutional object detectors[C]//2017 IEEE International Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 3296-3297.
[19] Deng X Q, Zhu Q B, Huang M. Variety discrimination for single rice seed by integrating spectral, texture and morphological features based on hyperspectral image[J]. Laser & Optoelectronics Progress, 2015, 52(2): 021001. (in Chinese)
[20] Hou Z Q, Wang L P, Guo J X, et al. An object tracking algorithm based on color, space and texture information[J]. Opto-Electronic Engineering, 2018, 45(5): 39-46. (in Chinese)
[21] Sun Y J, Dong J N, Wang Z F. Estimation of lighting parameters for uniform texture image[J]. Laser & Optoelectronics Progress, 2017, 54(6): 061002. (in Chinese)
[22] Hu P Y, Ramanan D. Finding tiny faces[C]//2017 IEEE International Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 1522-1530.
[23] Najibi M, Samangouei P, Chellappa R, et al. SSH: Single stage headless face detector[C]//IEEE International Conference on Computer Vision, October 22-29, 2017, Venice, Italy. New York: IEEE, 2017: 4885-4894.
[24] Samangouei P, Najibi M, Davis L, et al. FaceMagNet: Magnifying feature maps to detect small faces[C]//2018 IEEE Winter Conference on Applications of Computer Vision, March 12-15, 2018, Lake Tahoe, NV, USA. New York: IEEE, 2018: 122-130.
[25] Yang S, Luo P, Loy C C, et al. Wider face: A face detection benchmark[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 2016: 5525-5533.
[26] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2014-09-04)[2018-08-07]. https://arxiv.org/abs/1409.1556.
[27] Cao J L, Pang Y W, Li X L. Exploring multi-branch and high-level semantic networks for improving pedestrian detection[EB/OL]. (2018-04-03)[2018-08-07]. http://arxiv.org/abs/1804.00872.
[28] Lin T Y, Dollár P, Girshick R B, et al. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 936-944.
[29] Huang G, Liu Z, van der Maaten L, et al. Densely connected convolutional networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 2261-2269.
[30] Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions[EB/OL]. (2015-11-23)[2018-08-07]. http://arxiv.org/abs/1511.07122.
[31] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[32] Shensa M J. The discrete wavelet transform: Wedding the atrous and Mallat algorithms[J]. IEEE Transactions on Signal Processing, 1992, 40(10): 2464-2482.
[33] Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM international conference on Multimedia, November 4-7, 2014, Dallas, Texas, USA. New York: ACM, 2014: 675-678.
[34] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[J]. Journal of Machine Learning Research, 2010, 9: 249-256.
[35] Shrivastava A, Gupta A, Girshick R. Training region-based object detectors with online hard example mining[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 2016: 761-769.
[36] Yang S, Xiong Y, Loy C C, et al. Face detection through scale-friendly deep convolutional networks[EB/OL]. (2017-06-09)[2018-08-07]. http://arxiv.org/abs/1706.02863.
[37] Ohn-Bar E, Trivedi M M. To boost or not to boost? On the limits of boosted trees for object detection[C]//IEEE International Conference on Pattern Recognition, December 4-8, 2016, Cancun, Mexico. New York: IEEE, 2016: 3350-3355.
