Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
  • Authors: Mehrtash Harandi ; Richard Hartley ; Chunhua Shen…
  • Keywords: Riemannian geometry ; Grassmann manifolds ; Sparse coding ; Dictionary learning
  • Journal: International Journal of Computer Vision
  • Publication year: 2015
  • Publication date: September 2015
  • Volume: 114
  • Issue: 2-3
  • Pages: 113-136
  • Full-text size: 2,098 KB
  • Author affiliations: Mehrtash Harandi (1) (2)
    Richard Hartley (1) (2)
    Chunhua Shen (3)
    Brian Lovell (4)
    Conrad Sanderson (2) (4)

    1. College of Engineering and Computer Science, Australian National University, Canberra, Australia
    2. NICTA, Canberra, Australia
    3. School of Computer Science, The University of Adelaide, Adelaide, SA, 5005, Australia
    4. The University of Queensland, Brisbane, Australia
  • Journal category: Computer Science
  • Journal subjects: Computer Imaging, Vision, Pattern Recognition and Graphics
    Artificial Intelligence and Robotics
    Image Processing and Computer Vision
    Pattern Recognition
  • Publisher: Springer Netherlands
  • ISSN: 1573-1405
Abstract
Sparsity-based representations have recently led to notable results in various visual recognition tasks. In a separate line of research, Riemannian manifolds have been shown to be useful for dealing with features and models that do not lie in Euclidean spaces. With the aim of building a bridge between the two realms, we address the problem of sparse coding and dictionary learning on Grassmann manifolds, i.e., the space of linear subspaces. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping. This in turn enables us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we propose an algorithm for learning a Grassmann dictionary, atom by atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann sparse coding and dictionary learning algorithms through embedding into higher dimensional Hilbert spaces. Experiments on several classification tasks (gender recognition, gesture classification, scene analysis, face recognition, action recognition and dynamic texture classification) show that the proposed approaches achieve considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as the kernelized Affine Hull Method and graph-embedding Grassmann discriminant analysis.
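
The embedding idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the mapping, so the code below assumes the standard projection embedding, which sends a subspace with orthonormal basis X to the symmetric matrix X X^T. The sparse coding step uses a plain l1-penalized least-squares objective in the embedded space, solved with ISTA (proximal gradient); this solver is an illustrative choice and is not claimed to be the authors' algorithm. All function names (`orthonormal_basis`, `embed`, `projection_distance_sq`, `sparse_code`) and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def orthonormal_basis(A):
    """Return an orthonormal basis (n x p) for the column span of A, via QR."""
    Q, _ = np.linalg.qr(A)
    return Q

def embed(X):
    """Assumed projection embedding: orthonormal basis X -> symmetric matrix X X^T."""
    return X @ X.T

def projection_distance_sq(X, Y):
    """Squared Frobenius distance between two embedded subspaces."""
    return np.linalg.norm(embed(X) - embed(Y), "fro") ** 2

def sparse_code(X, atoms, lam=0.1, n_iter=500):
    """ISTA sketch for min_y ||X X^T - sum_j y_j D_j D_j^T||_F^2 + lam * ||y||_1.

    Expanding the norm gives y^T K y - 2 y^T k + const, where
    K[i, j] = ||D_i^T D_j||_F^2 and k[j] = ||X^T D_j||_F^2, so the
    n x n embedded matrices never need to be formed explicitly.
    """
    K = np.array([[np.linalg.norm(Di.T @ Dj, "fro") ** 2 for Dj in atoms]
                  for Di in atoms])
    k = np.array([np.linalg.norm(X.T @ Dj, "fro") ** 2 for Dj in atoms])
    # Step size 1/L, where L = 2 * lambda_max(K) bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.eigvalsh(K).max())
    y = np.zeros(len(atoms))
    for _ in range(n_iter):
        z = y - step * 2.0 * (K @ y - k)                           # gradient step
        y = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold
    return y

# Toy data: four random 2-dimensional subspaces of R^6 act as dictionary atoms.
rng = np.random.default_rng(0)
n, p = 6, 2
atoms = [orthonormal_basis(rng.standard_normal((n, p))) for _ in range(4)]

# The query spans the same subspace as atom 1 but uses a rotated basis.
R = orthonormal_basis(rng.standard_normal((p, p)))  # p x p orthogonal matrix
X = atoms[1] @ R

codes = sparse_code(X, atoms, lam=0.01)
print("distance to atom 1:", projection_distance_sq(X, atoms[1]))
print("sparse codes:", codes)
```

Because X X^T is unchanged when X is replaced by X R for any orthogonal R, the embedding depends only on the subspace and not on the basis chosen for it, which is why the query above is matched to atom 1 despite the rotated basis.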
