Fast face detection algorithm based on cascade network

English title: Fast face detection algorithm based on cascade network
Authors: BAO Xiaoan (包晓安); HU Lingling (胡玲玲); ZHANG Na (张娜); WU Biao (吴彪); GUI Jiangsheng (桂江生)
Affiliations: School of Information Science and Technology, Zhejiang Sci-Tech University; Department of East Asian Studies, Yamaguchi University
Keywords: face detection; pyramid network; network acceleration; miniaturization; cascade network
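Journal code: ZJSG
Journal: Journal of Zhejiang Sci-Tech University (Natural Sciences Edition)
Publication date: 2018-12-03 11:21
Publisher: Journal of Zhejiang Sci-Tech University (Natural Sciences Edition)
Year: 2019
Issue: v.41
Funding: National Natural Science Foundation of China (61502430, 61562015); Key Project of the Guangxi Natural Science Foundation (2015GXNSFDA139038); Zhejiang Sci-Tech University 521 Talent Training Program
Language: Chinese
Pages: ZJSG201903009
Page count: 7
CN: 03
ISSN: 33-1338/TS
Classification number: 74-80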

Abstract

Convolutional neural networks (CNNs) can effectively improve the accuracy of face detection, but their models carry so many parameters that detection is slow on ordinary devices. To address this problem, this paper proposes a face detection algorithm built from a cascade of three networks: cascading keeps each network small, and multi-task learning improves detection accuracy. The first stage uses a pyramid-structured network combined with an anchor mechanism to extract multi-scale face proposal boxes. On this basis, a convolution factorization strategy and network acceleration methods are applied to further improve the effectiveness of feature extraction and to reduce the number of model parameters. Experimental results show that on FDDB the algorithm outperforms MTCNN in both detection accuracy and detection speed, and that it reaches 80 fps on an eight-core device clocked at 2.0 GHz.
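
The abstract does not say which convolution factorization the authors use. As a minimal sketch, assuming a depthwise separable factorization (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution), the parameter saving for a single layer can be counted as below. The function names and layer sizes are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weight count of a depthwise separable convolution (bias ignored).

    Assumed factorization: one k x k filter per input channel (depthwise),
    then a 1 x 1 convolution that mixes channels (pointwise).
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
# prints "standard: 18432, separable: 2336, ratio: 7.89"
```

Under this assumption the factorized layer needs roughly one eighth of the weights of the standard layer, which is the kind of reduction in model parameters the abstract refers to.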

