Efficient Object Detection Method Based on Improved SSD
  • English title: Efficient Object Detection Method Based on Improved SSD
  • Authors: 王文光 (WANG Wenguang); 李强 (LI Qiang); 林茂松 (LIN Maosong); 贺贤珍 (HE Xianzhen)
  • Affiliation: School of Information Engineering, Southwest University of Science and Technology
  • Keywords: deep learning; object detection; feature fusion; sample imbalance; Convolutional Neural Network (CNN)
  • Journal: Computer Engineering and Applications (计算机工程与应用; CNKI code JSGG)
  • Publication date: 2019-04-28
  • Year: 2019
  • Volume/Issue: Vol. 55, No. 13 (cumulative No. 932)
  • Fund: Science and Technology Program of Sichuan Province (No. 2018GZ0095)
  • Language: Chinese
  • Pages: 34-41 (8 pages)
  • Record ID: JSGG201913006
Abstract
        To address the relatively poor detection accuracy of one-stage object detection algorithms, an efficient multi-object detection algorithm based on SSD, called FSD, is proposed. The algorithm improves on one-stage detection in two respects. First, it designs a more efficient dense residual network, R-DenseNet, which adopts a narrower dense network structure to preserve feature extraction capacity while reducing computational complexity, improving both the detection and the convergence performance of the algorithm. Second, it improves the loss function: by suppressing the weight of easily classified samples in the loss, the robustness of the algorithm is increased and the sample imbalance problem in object detection is alleviated. The network is deployed with the TensorFlow deep learning framework, and experiments are carried out on Ubuntu with an Nvidia Titan X GPU. The experiments show that FSD achieves the highest detection accuracy on both the COCO and PASCAL VOC object detection datasets; in particular, FSD300 improves detection accuracy by 3.7% over SSD300 and detection speed by 10.87% over SSD.
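The improved loss described above, which suppresses the weight of easy samples to counter sample imbalance, matches the mechanism of the widely used focal loss. The TensorFlow sketch below illustrates that general mechanism only; the function name and the hyper-parameters alpha and gamma are assumptions, not the paper's exact formulation.

```python
import tensorflow as tf

def focal_style_loss(labels, logits, alpha=0.25, gamma=2.0):
    """Illustrative focal-loss-style classification loss.

    Down-weights easy (well-classified) samples so that hard samples dominate
    training, which is the mechanism the abstract describes for mitigating
    sample imbalance. alpha and gamma are assumed values, not the paper's.
    """
    labels = tf.cast(labels, logits.dtype)
    # Per-element binary cross-entropy computed from raw logits.
    ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
    p = tf.sigmoid(logits)
    # p_t: probability the model assigns to the true class of each anchor.
    p_t = labels * p + (1.0 - labels) * (1.0 - p)
    # (1 - p_t)^gamma is close to 0 for easy samples, shrinking their weight.
    modulating = tf.pow(1.0 - p_t, gamma)
    alpha_t = labels * alpha + (1.0 - labels) * (1.0 - alpha)
    return tf.reduce_sum(alpha_t * modulating * ce)
```

Likewise, a "narrower" dense structure can be read as a dense block with a small growth rate (few feature maps added per layer) combined with a residual shortcut. The block below is a plausible sketch under that assumption; it is not the paper's published R-DenseNet architecture.

```python
from tensorflow.keras import layers

def narrow_dense_residual_block(x, num_layers=4, growth_rate=12):
    """Dense block with a small growth rate plus a residual shortcut (illustrative)."""
    shortcut = x
    for _ in range(num_layers):
        y = layers.BatchNormalization()(x)
        y = layers.ReLU()(y)
        # Narrow 3x3 convolution: only `growth_rate` new feature maps per layer.
        y = layers.Conv2D(growth_rate, 3, padding="same")(y)
        # Dense connectivity: concatenate new features onto all previous ones.
        x = layers.Concatenate()([x, y])
    # 1x1 convolution so the residual addition has matching channel counts.
    x = layers.Conv2D(shortcut.shape[-1], 1, padding="same")(x)
    return layers.Add()([shortcut, x])
```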
