Recognition and Classification for Three-Dimensional Model Based on Deep Voxel Convolution Neural Network

Original title (Chinese): 基于深度体素卷积神经网络的三维模型识别分类
Authors: Yang Jun; Wang Shun; Zhou Peng
Affiliations: School of Electronic and Information Engineering, Lanzhou Jiaotong University; School of Automation and Electrical Engineering, Lanzhou Jiaotong University
Keywords: image processing; computer vision; three-dimensional model recognition; convolutional neural network; voxelization; Softmax classifier
Journal: Acta Optica Sinica (光学学报; journal code GXXB)
Publication date: 2018-12-27 12:05
Publisher: Acta Optica Sinica
Year: 2019
Volume and serial number: v.39; No.445
Funding: National Natural Science Foundation of China (61862039, 61462059)
Language: Chinese
Article ID: GXXB201904037
Page count: 11
Issue: 04
CN: 31-1252/O4
Pages: 314-324

Abstract

A three-dimensional (3D) model recognition and classification algorithm based on a deep voxel convolutional neural network is proposed. Voxelization is used to convert a 3D polygon mesh model into a voxel matrix, and the deep voxel convolutional neural network extracts deep features from this matrix, which strengthens the expressive power and discriminability of the features. Experiments on the ModelNet40 dataset show that the algorithm reaches an accuracy of about 87% in recognizing and classifying 3D mesh models. The constructed network effectively improves feature extraction and representation for 3D models and raises classification accuracy on large-scale, complex 3D mesh models, and the proposed method outperforms current mainstream methods.
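
The two ends of the pipeline named in the abstract, voxelization and Softmax classification, can be illustrated with a minimal sketch. This record does not give the paper's network architecture, grid resolution, or voxelization procedure, so everything below is an assumption made for illustration: the function names `voxelize` and `softmax` are hypothetical, the grid is filled from sample points (for example mesh vertices) rather than from full polygon surfaces, and the convolutional layers between the two steps are omitted.

```python
import math


def voxelize(points, resolution=32):
    """Map 3D points to a resolution^3 nested list of 0/1 occupancy values.

    Assumption: points stand in for a mesh (e.g. its vertices); a real
    mesh voxelizer would also mark cells crossed by faces.
    """
    mins = [min(p[a] for p in points) for a in range(3)]
    maxs = [max(p[a] for p in points) for a in range(3)]
    # One uniform scale for all axes preserves the model's aspect ratio.
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for p in points:
        # Clamp so points on the far boundary land in the last cell.
        i, j, k = (
            min(int((p[a] - mins[a]) / extent * resolution), resolution - 1)
            for a in range(3)
        )
        grid[i][j][k] = 1
    return grid


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Eight corners of a unit cube as a stand-in for mesh vertices.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
grid = voxelize(cube, resolution=4)

# Hypothetical scores for three classes, as a network's last layer might emit.
probs = softmax([2.0, 1.0, 0.1])
```

In a full classifier the occupancy grid would be the input tensor to the 3D convolutional layers, and `softmax` would be applied to the final layer's scores, one per ModelNet40 category.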

