Object Recognition Based on Tensor Decomposition Fusing RGB-D Image
  • Original title (Chinese): 基于张量分解融合RGB-D图像的物体识别
  • Authors: 余霆; 文元美; 凌永权 (YU Tingsong; WEN Yuanmei; LING Yongquan)
  • Affiliation: School of Information Engineering, Guangdong University of Technology (广东工业大学信息工程学院)
  • Keywords: RGB-D image fusion; convolutional neural network; tensor decomposition; Tucker decomposition; object recognition
  • Journal: Computer Engineering and Applications (计算机工程与应用); CNKI journal code: JSGG
  • Online publication date: 2018-07-04
  • Year: 2019; Issue: 02 (Vol.55, No.921)
  • Pages: 180-184 (5 pages)
  • CNKI citation code: JSGG201902028
  • Funding: National Natural Science Foundation of China (No.61372173, No.61671163, No.11871168); Natural Science Foundation of Guangdong Province (No.2018A030310593); 2017 Special Fund of the Central Finance for the Development of Local Universities, Guangdong University of Technology (No.201707)
  • Language: Chinese
Abstract
To make full use of the depth information in RGB-D images, an object recognition method based on tensor decomposition is proposed. The RGB-D image is first constructed as a fourth-order tensor, which is then decomposed into a core tensor and four factor matrices. The original tensor is projected onto the corresponding factor matrices to obtain the fused RGB-D data, which is finally fed into a convolutional neural network for recognition. Recognition results for three groups of similar objects from the RGB-D dataset show that fusing RGB-D images by tensor decomposition yields higher recognition accuracy than recognition without tensor decomposition, and the accuracy on a single misclassified instance can be improved by up to 99%.
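The pipeline summarized in the abstract (stack RGB and depth into a fourth-order tensor, Tucker-decompose it into a core tensor and four factor matrices, project the original tensor with those factors, and feed the fused result to a CNN) can be sketched in a few lines of Python. The snippet below is a minimal illustration using the TensorLy library with made-up image sizes and ranks; the paper's actual tensor construction, rank choices, and projection details are not given here, so every name and number in the sketch is an assumption.

    # Minimal sketch of Tucker-based RGB-D fusion (illustrative only).
    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import tucker
    from tensorly.tenalg import multi_mode_dot

    # Stack the RGB channels and the depth map of one sample into a
    # fourth-order tensor: height x width x channel x modality.
    rgb = np.random.rand(64, 64, 3)    # placeholder RGB image
    depth = np.random.rand(64, 64, 3)  # placeholder depth map (replicated to three channels)
    x = tl.tensor(np.stack([rgb, depth], axis=-1))  # shape (64, 64, 3, 2)

    # Tucker decomposition: one core tensor and four factor matrices,
    # one per mode (the ranks below are illustrative, not the paper's).
    core, factors = tucker(x, rank=[32, 32, 3, 1])

    # Project the original tensor onto the factor matrices (mode products
    # with the transposed factors); collapsing the modality mode to rank 1
    # yields a fused height x width x channel array a CNN can take as input.
    fused = multi_mode_dot(x, factors, modes=[0, 1, 2, 3], transpose=True)
    fused_image = tl.to_numpy(fused)[..., 0]  # shape (32, 32, 3)
    print(fused_image.shape)

Projecting the original tensor onto the transposed factor matrices is the standard Tucker/HOSVD projection; for orthonormal factors it recovers the core tensor, which here plays the role of the fused, lower-dimensional RGB-D representation.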