Streamlined Face Landmark Detection Network with Global Constraint

Original title (Chinese): 引入全局约束的精简人脸关键点检测网络
Authors: Zhang Wei; Qian Yuntao
Keywords: deep learning; convolutional neural network; global constraints; face landmark detection
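Journal: Journal of Signal Processing (信号处理)
Journal code: XXCN
Institution: College of Computer Science, Zhejiang University
Publication date: 2019-03-25
Year: 2019
Issue: Volume 35, Issue 03 (No. 235 overall)
Funding: National Key R&D Program of China (2018YFB0505000)
Language: Chinese
Record ID: XXCN201903024
Pages: 195-203
Page count: 9
CN: 11-2406/TN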

Abstract

Face landmark detection is a classic problem in computer vision and is important for 3D face reconstruction, expression recognition, head pose estimation, and face tracking. Models based on deep neural networks currently achieve the best face landmark detection performance and are widely used. However, the architectures of existing landmark detection networks are becoming increasingly complex, and the computing and storage resources they require for training and testing keep growing. In this paper, we propose a new streamlined landmark detection network to replace the existing architectures. Compared with other architectures, the streamlined network contains only one feature extraction module and an up-sampling module made up of a few deconvolution layers. In addition, we add a global constraint over all landmarks of a face to the network to reduce the number of predicted outliers. Experiments show that, on the 300-W dataset, the streamlined network with the global constraint outperforms typical current deep neural network detection models.
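
To make the role of the up-sampling module concrete, the sketch below computes how a stack of deconvolution (transposed convolution) layers enlarges a feature map. The abstract only says the module has a few deconvolution layers; the layer count (3), the kernel/stride/padding values (4, 2, 1), and the 8x8 input size are assumptions chosen for illustration, not the authors' reported configuration.

```python
def deconv_output_size(size, kernel, stride, padding, output_padding=0):
    """Spatial size after one transposed convolution (dilation fixed at 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def upsampled_sizes(size, layers):
    """Return the feature-map size before and after each layer.

    `layers` is a list of (kernel, stride, padding) tuples.
    """
    sizes = [size]
    for kernel, stride, padding in layers:
        size = deconv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Hypothetical configuration: three layers, each doubling the resolution.
ASSUMED_LAYERS = [(4, 2, 1)] * 3

if __name__ == "__main__":
    # An assumed 8x8 feature map from the feature extraction module.
    print(upsampled_sizes(8, ASSUMED_LAYERS))  # [8, 16, 32, 64]
```

With these assumed settings each layer doubles the resolution, so a handful of layers is enough to bring a coarse feature map back to a resolution suitable for locating landmarks.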

