Abstract
Scene recognition based on deep learning is an important direction in computer vision, but open problems remain: many methods extract only the high-level semantic features of an image and discard its low-level features. To address this, the paper proposes an indoor RGB-D image recognition method that combines improved SIFT features with a deep neural network. SIFT features are first extracted from each image and then filtered by importance using the random forest algorithm. The filtered features are combined with a ResNet-based deep neural network, and a depth loss function built on the depth histogram and the depth mean histogram is proposed to accelerate model convergence. Experimental results show that the algorithm reaches a recognition rate of 71.52% on the NYUD v2 dataset, effectively improving the accuracy of indoor scene recognition.
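The abstract names a depth histogram as one ingredient of the proposed depth loss but does not define it. As a minimal sketch, assuming the histogram is the normalized distribution of valid per-pixel depth values, it could be computed as follows. The function name `depth_histogram`, the bin count, and the depth range are illustrative assumptions, not the authors' implementation; the depth mean histogram and the loss itself are not reconstructed here.

```python
from typing import List, Sequence


def depth_histogram(
    depth: Sequence[Sequence[float]],
    bins: int,
    d_min: float,
    d_max: float,
) -> List[float]:
    """Return a normalized histogram of depth values within [d_min, d_max].

    Hypothetical helper: values outside the range are treated as invalid
    (for example, sensor dropouts) and are skipped.
    """
    if bins <= 0 or d_max <= d_min:
        raise ValueError("need bins > 0 and d_max > d_min")

    counts = [0] * bins
    width = (d_max - d_min) / bins
    for row in depth:
        for d in row:
            if d < d_min or d > d_max:
                continue  # skip invalid or out-of-range depth
            # Clamp so that d == d_max falls into the last bin.
            idx = min(int((d - d_min) / width), bins - 1)
            counts[idx] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * bins  # no valid pixels
    return [c / total for c in counts]


# Usage: a 2x2 depth map in metres, four bins over 0 to 4 m.
depth_map = [[0.5, 1.5], [2.5, 3.5]]
print(depth_histogram(depth_map, bins=4, d_min=0.0, d_max=4.0))
# [0.25, 0.25, 0.25, 0.25]
```

Normalizing by the number of valid pixels makes histograms comparable across images with different amounts of missing depth, which is one plausible reason to use them inside a loss term.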