3D Model Classification and Retrieval Based on Convolutional Neural Networks and a Voting Mechanism


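English title: 3D Model Classification and Retrieval Based on CNN and Voting Scheme
Authors: Bai Jing (白静); Si Qinglong (司庆龙); Qin Feiwei (秦飞巍)
Affiliations: School of Computer Science and Engineering, North Minzu University; School of Computer Science and Technology, Hangzhou Dianzi University
Keywords: 3D model retrieval; convolutional neural network; voting scheme; deep learning; non-rigid 3D models
Journal code: JSJF
Journal: Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报)
Publication date: 2019-02-15
Publisher: Journal of Computer-Aided Design & Computer Graphics
Year: 2019
Issue: v.31
Funding: National Natural Science Foundation of China (61762003, 61502129); Natural Science Foundation of Ningxia (2018AAC03124); First-Class Discipline Construction Program of Ningxia Higher Education Institutions (Electronic Science and Technology: NXYLXK2017A07); "Image and Intelligent Information Processing Innovation Team" of the State Ethnic Affairs Commission; Young and Middle-Aged Talents Program of the State Ethnic Affairs Commission (2016GQR08); Natural Science Foundation of Zhejiang Province (LQ16F020004)
Language: Chinese
Pages: JSJF201902013
Page count: 12
CN: 02
ISSN: 11-2925/TP
Classification number: 123-134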

Abstract

Existing deep learning algorithms for multi-view 3D model classification fuse view information with pixel-level operations such as max pooling and average pooling, which can drown out or confuse useful information about the model. To address this problem, a 3D model classification and retrieval algorithm based on a convolutional neural network and a voting scheme is proposed. First, each 3D model is converted into a set of 2D views. Then, the 2D views are classified using the mature deep learning image model CaffeNet together with the rich image library ImageNet. Finally, the 3D model is classified by weighted voting. Based on the same voting scheme, four distance measures between 3D models are also proposed to support retrieval. The algorithm is applied to the rigid 3D model libraries ModelNet10 and ModelNet40 and the non-rigid 3D model libraries SHREC10, SHREC11 and SHREC15, where it achieves classification accuracies of 93.18%, 93.07%, 99.5%, 99.5% and 99.4%, respectively; its retrieval performance is on par with or better than state-of-the-art methods. The experiments demonstrate the effectiveness of the algorithm.
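
The pipeline in the abstract ends with a weighted vote over per-view classifications. Below is a minimal Python sketch of that step, assuming each 2D view has already been classified into a probability vector (for example by a CNN). The abstract does not state how votes are weighted or how the four distance measures are defined, so the weighting used here (each view's top class probability), the function names `weighted_vote` and `vote_distance`, and the L1 distance between vote distributions are illustrative assumptions, not the authors' method.

```python
def weighted_vote(view_probs):
    """Classify a 3D model from per-view class probabilities.

    view_probs: list of per-view probability lists, one entry per class.
    Each view votes for its top class; the vote weight is that view's
    top probability (an assumed weighting, not taken from the paper).
    Returns (predicted_class_index, normalized_vote_distribution).
    """
    if not view_probs:
        raise ValueError("at least one view is required")
    num_classes = len(view_probs[0])
    votes = [0.0] * num_classes
    for probs in view_probs:
        if len(probs) != num_classes:
            raise ValueError("all views must score the same classes")
        top = max(range(num_classes), key=lambda c: probs[c])
        votes[top] += probs[top]
    total = sum(votes)
    dist = [v / total for v in votes] if total > 0 else votes
    predicted = max(range(num_classes), key=lambda c: dist[c])
    return predicted, dist


def vote_distance(dist_a, dist_b):
    """L1 distance between two vote distributions (hypothetical measure)."""
    if len(dist_a) != len(dist_b):
        raise ValueError("distributions must have the same length")
    return sum(abs(a - b) for a, b in zip(dist_a, dist_b))


# Toy input: three views of one model scored over three classes.
views = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
label, distribution = weighted_vote(views)
print(label)  # prints 0
```

In this toy input, averaging the three probability vectors would favor class 1 (mean of about 0.47 against about 0.43 for class 0), while the vote favors class 0 because two of the three views choose it. This only illustrates that the two fusion strategies can disagree; it says nothing about which is more accurate on real data.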

