Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation
  • Authors: Sijin Li; Weichen Zhang; Antoni B. Chan
  • Keywords: Structured learning; Deep learning; Human pose estimation
  • Journal: International Journal of Computer Vision
  • Publication date: March 2017
  • Volume: 122
  • Issue: 1
  • Pages: 149-168
  • Journal category: Computer Science
  • Journal subjects: Computer Imaging, Vision, Pattern Recognition and Graphics; Artificial Intelligence (incl. Robotics); Image Processing and Computer Vision; Pattern Recognition
  • Publisher: Springer US
  • ISSN: 1573-1405
Abstract
This paper focuses on structured-output learning using deep neural networks for 3D human pose estimation from monocular images. Our network takes an image and 3D pose as inputs and outputs a score value, which is high when the image-pose pair matches and low otherwise. The network structure consists of a convolutional neural network for image feature extraction, followed by two sub-networks for transforming the image features and pose into a joint embedding. The score function is then the dot-product between the image and pose embeddings. The image-pose embedding and score function are jointly trained using a maximum-margin cost function. Our proposed framework can be interpreted as a special form of structured support vector machines where the joint feature space is discriminatively learned using deep neural networks. We also propose an efficient recurrent neural network for performing inference with the learned image-embedding. We test our framework on the Human3.6m dataset and obtain state-of-the-art results compared to other recent methods. Finally, we present visualizations of the image-pose embedding space, demonstrating the network has learned a high-level embedding of body-orientation and pose-configuration.
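The scoring and training idea described in the abstract can be illustrated with a minimal sketch. It assumes the image and pose embeddings have already been computed (the convolutional network and the two embedding sub-networks are not modeled here), and it uses a standard margin-rescaled structured hinge loss with a mean per-joint distance as the pose error. The function names, the candidate-pose set, and the choice of pose error are illustrative assumptions and may differ from the exact cost function used in the paper.

```python
import math


def score(image_emb, pose_emb):
    """Dot product between an image embedding and a pose embedding."""
    return sum(a * b for a, b in zip(image_emb, pose_emb))


def pose_distance(pose_a, pose_b):
    """Mean per-joint Euclidean distance between two poses.

    Each pose is a list of (x, y, z) joint positions. Using this as the
    margin term is an assumption made for illustration.
    """
    return sum(math.dist(p, q) for p, q in zip(pose_a, pose_b)) / len(pose_a)


def max_margin_loss(image_emb, true_emb, true_pose, candidates):
    """Margin-rescaled structured hinge loss (hypothetical helper).

    candidates is a list of (pose_embedding, pose) pairs. The loss is zero
    only when the true pose outscores every candidate by at least the
    distance between that candidate and the true pose.
    """
    true_score = score(image_emb, true_emb)
    worst_violation = max(
        score(image_emb, emb) + pose_distance(true_pose, pose)
        for emb, pose in candidates
    )
    return max(0.0, worst_violation - true_score)


# Toy example with a single-joint pose and 2-D embeddings.
image_emb = [1.0, 0.0]
true_emb = [2.0, 0.0]
true_pose = [(0.0, 0.0, 0.0)]
candidates = [
    ([0.5, 0.0], [(3.0, 4.0, 0.0)]),  # far from the true pose, low score
    ([0.0, 1.0], [(0.0, 0.0, 0.0)]),  # identical pose, zero score
]
loss = max_margin_loss(image_emb, true_emb, true_pose, candidates)
```

In this toy case the first candidate violates the margin, so the loss is positive; raising the true pair's score far enough would drive it to zero.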
