Improved SSD Algorithm and Its Performance Analysis of Small Target Detection in Remote Sensing Images
  • Original title (Chinese):改进的SSD算法及其对遥感影像小目标检测性能的分析
  • Authors: Wang Junqiang (王俊强); Li Jiansheng (李建胜); Zhou Xuewen (周学文); Zhang Xu (张旭)
  • Affiliations: Institute of Geospatial Information, Information Engineering University; 78123 Troops
  • Keywords: remote sensing; small target detection; deep learning; multi-scale prediction; feature pyramid; mean average precision
  • Journal code: GXXB
  • Journal: Acta Optica Sinica (光学学报)
  • Publication date: 2019-03-19 09:09
  • Publisher: Acta Optica Sinica (光学学报)
  • Year: 2019
  • Volume/Issue: v.39; No.447
  • Funding: National Natural Science Foundation of China (41876105); National Key Research and Development Program of China (2017YFF0206000)
  • Language: Chinese
  • Record ID: GXXB201906044
  • Page count: 10
  • Issue number: 06
  • CN: 31-1252/O4
  • Pages: 373-382
Abstract
Candidate-box (region proposal) based detectors for remote sensing imagery, represented by Faster R-CNN (faster regions with convolutional neural network), are slow, while the existing single shot multibox detector (SSD) performs poorly on small targets. To address both problems, an improved SSD algorithm is proposed that combines the strengths of proposal-based detection and one-stage detection. The algorithm replaces the original VGGNet (visual geometry group network) backbone with a densely connected network and builds a feature pyramid between the densely connected modules in place of the original multi-scale feature maps. To verify the accuracy and performance of the proposed algorithm, an online sample acquisition system is designed and used to collect a sample set of aircraft and sports field (playground) targets. Training the improved SSD on this sample set shows that the network structure is stable: it achieves good results without transfer learning, and the training process does not diverge easily. Compared with the Faster R-CNN and R-FCN (region-based fully convolutional networks) algorithms, both using the 101-layer residual network (ResNet101) as the base network, the improved SSD raises the mean average precision (MAP) on the test set by 9.13% and 8.48%, respectively, and the MAP for small target detection by 14.46% and 13.92%, respectively. Detecting a single image takes 71.8 ms, which is 45.7 ms and 7.5 ms less than Faster R-CNN and R-FCN, respectively.
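The timing figures in the abstract are stated as differences, so the baseline latencies are only implied: about 117.5 ms per image for Faster R-CNN and 79.3 ms for R-FCN. The short Python sketch below makes that arithmetic explicit and also derives a relative reduction and an approximate throughput. Only the three millisecond values come from the abstract; the function and variable names are hypothetical, and the derived ratios are illustrative rather than results reported by the authors.

```python
# Figures quoted in the abstract (milliseconds per image).
IMPROVED_SSD_MS = 71.8
SAVED_MS = {"Faster R-CNN": 45.7, "R-FCN": 7.5}


def implied_baseline_ms(improved_ms, saved_ms):
    """Baseline latency implied by 'the improved method is saved_ms faster'."""
    return round(improved_ms + saved_ms, 1)


def summarize(improved_ms, saved_by_method):
    """Return implied baseline latency and relative time reduction per method."""
    rows = {}
    for name, saved in saved_by_method.items():
        baseline = implied_baseline_ms(improved_ms, saved)
        rows[name] = {
            "baseline_ms": baseline,
            # Fraction of the baseline's per-image time that is saved.
            "relative_reduction": round(saved / baseline, 3),
        }
    return rows


summary = summarize(IMPROVED_SSD_MS, SAVED_MS)
# Approximate throughput of the improved SSD, assuming images are
# processed one at a time at the quoted latency (an assumption).
images_per_second = round(1000.0 / IMPROVED_SSD_MS, 1)

for name, row in summary.items():
    print(name, row)
print("improved SSD images/s:", images_per_second)
```

Read this way, the gain over R-FCN is modest in time terms, while the gain over Faster R-CNN is substantial; the accuracy improvements quoted in the abstract are a separate matter and are not derived here.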
