Estimate Hand Poses Efficiently from Single Depth Images


Authors: Chi Xu; Ashwin Nanjappa; Xiaowei Zhang…
Keywords: Hand pose estimation; Depth images; GPU acceleration; Regression forests; Consistency analysis; Annotated hand image dataset
Journal: International Journal of Computer Vision
Publication year: 2016
Publication date: January 2016
Volume: 116
Issue: 1
Pages: 21-45
Full-text size: 4,777 KB
Author affiliations: Chi Xu (1)
Ashwin Nanjappa (1)
Xiaowei Zhang (1)
Li Cheng (1) (2)

1. The Bioinformatics Institute, A*STAR, Singapore, Singapore
2. School of Computing, National University of Singapore, Singapore, Singapore
Journal category: Computer Science
Journal subjects: Computer Imaging, Vision, Pattern Recognition and Graphics
Artificial Intelligence and Robotics
Image Processing and Computer Vision
Pattern Recognition
Publisher: Springer Netherlands
ISSN: 1573-1405

Abstract

This paper tackles the practically challenging problem of efficient and accurate hand pose estimation from single depth images. A dedicated two-step regression forest pipeline is proposed: given an input hand depth image, step one mainly estimates the 3D location and in-plane rotation of the hand using a pixel-wise regression forest. Step two uses this result and delivers the final hand pose estimate with a similar regression forest model based on the entire hand image patch. Moreover, our estimation is guided by internally executing a 3D hand kinematic chain model. For an unseen test image, the kinematic model parameters are estimated by a proposed dynamically weighted scheme. As a combined effect of these building blocks, our approach is able to deliver more precise estimation of hand poses. In practice, our approach runs at 15.6 frames per second (FPS) on an average laptop when implemented on CPU, and is further sped up to 67.2 FPS when running on GPU. In addition, we introduce and make publicly available a data-glove annotated depth image dataset covering various hand shapes and gestures, which enables us to conduct quantitative analyses on real-world hand images. The effectiveness of our approach is verified empirically on both synthetic and annotated real-world datasets for hand pose estimation, as well as on related applications including part-based labeling and gesture classification. In addition to the empirical studies, the consistency property of our approach is analyzed theoretically.
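To put the two frame rates quoted in the abstract into perspective, the short Python sketch below converts them into an approximate speed-up factor and a per-frame processing time. Only the values 15.6 FPS and 67.2 FPS come from the abstract; the constant and function names (`CPU_FPS`, `GPU_FPS`, `frame_time_ms`) are illustrative, and the calculation assumes the reported rates are averages over whole frames.

```python
# Frame rates reported in the abstract (frames per second).
CPU_FPS = 15.6
GPU_FPS = 67.2


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# Ratio of the two reported rates (GPU relative to CPU).
speedup = GPU_FPS / CPU_FPS
cpu_ms = frame_time_ms(CPU_FPS)
gpu_ms = frame_time_ms(GPU_FPS)

print(f"GPU speed-up over CPU: {speedup:.2f}x")
print(f"Per-frame time: CPU {cpu_ms:.1f} ms, GPU {gpu_ms:.1f} ms")
```

Under these assumptions the GPU version is roughly 4.3 times faster, which corresponds to about 64 ms per frame on CPU versus about 15 ms per frame on GPU.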
