Abstract
Convolutional neural networks (CNNs) can substantially improve the accuracy of face detection, but their models carry so many parameters that detection is slow on ordinary devices. To address this problem, this paper proposes a face detection algorithm built from a cascade of three networks: cascading keeps each network small, and multitask learning improves detection accuracy. The first stage uses a pyramid-structured network combined with an anchor mechanism to extract multi-scale face proposal boxes. On this basis, a convolution decomposition strategy and network acceleration methods are combined to make feature extraction more effective and to reduce the number of model parameters. Experimental results show that on FDDB the algorithm outperforms MTCNN in both detection accuracy and detection speed, and that it reaches a detection speed of 80 fps on an eight-core device clocked at 2.0 GHz.
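To make the parameter-reduction claim concrete, the sketch below counts the weights of a single convolutional layer before and after factorization. The abstract does not say which decomposition the paper uses, so the depthwise separable form (a per-channel k x k convolution followed by a 1 x 1 pointwise convolution) is an assumption chosen for illustration. The function names and the layer sizes are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored).

    Assumed factorization, not confirmed by the abstract:
    one k x k filter per input channel, then a 1 x 1 convolution
    that mixes channels.
    """
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1 x 1 channel-mixing convolution
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    k, c_in, c_out = 3, 32, 64
    full = standard_conv_params(k, c_in, c_out)
    factored = separable_conv_params(k, c_in, c_out)
    print(f"standard:  {full} weights")
    print(f"separable: {factored} weights")
    print(f"reduction: {full / factored:.1f}x")
```

For this illustrative 3 x 3 layer with 32 input and 64 output channels, the standard convolution has 18432 weights and the factored one has 2336, roughly 7.9 times fewer. The actual savings in the proposed networks depend on their layer shapes, which the abstract does not give.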