Remote Sensing Image Multi-Target Detection Method Based on FD-SSD
  • Original title (Chinese): 基于FD-SSD的遥感图像多目标检测方法
  • Authors: Zhu Minchao (朱敏超); Feng Tao (冯涛); Zhang Yu (张钰)
  • Keywords: remote sensing image; target detection; feature fusion; dilated convolution; deep learning
  • Journal code: JYRJ
  • Journal: Computer Applications and Software
  • Affiliation: School of Electronics and Information, Hangzhou Dianzi University
  • Publication date: 2019-01-12
  • Publisher: Computer Applications and Software (计算机应用与软件)
  • Year: 2019
  • Volume: 36
  • Funding: National Natural Science Foundation of China (61372156)
  • Language: Chinese
  • Record ID: JYRJ201901043
  • Page count: 7
  • Issue: 01
  • CN: 31-1260/TP
  • Pages: 238-244
Abstract
To address the difficulty of detecting the very small objects found in remote sensing images, this paper proposes FD-SSD (Feature Fusion and Dilated Convolution Single Shot Multibox Detector), an improved SSD network. FD-SSD removes the random cropping step from the data preprocessing layer of SSD and, following FSSD, fuses high-resolution low-level feature maps with semantically rich high-level feature maps. Dilated convolution is used to enlarge the receptive field of the third-layer feature map, and the high-resolution low-level feature maps are used to predict small targets. The 1×1 top-level feature map is no longer used to generate bounding boxes. During training, the original remote sensing images undergo "secondary cutting" to increase the number of training samples. During prediction, the original image is first cut into sub-images that are predicted separately; the resulting boxes are then mapped back to the original image, and a second round of non-maximum suppression (NMS) over all boxes in the original image keeps the best ones. FD-SSD performs well on the DOTA dataset, improving mAP by 31% over the original SSD.
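The prediction procedure described in the abstract (cut the image, predict each piece, map the boxes back, then apply NMS a second time over the whole image) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tile size, the overlap, the IoU threshold, the single-class simplification, and the `detect_tile` callback are all assumptions made for illustration, since the record does not state these details.

```python
from typing import Callable, List, Tuple

# (x1, y1, x2, y2, score); a single class is assumed to keep the sketch short.
Box = Tuple[float, float, float, float, float]


def tile_origins(width: int, height: int, tile: int, overlap: int) -> List[Tuple[int, int]]:
    """Top-left corners of overlapping tiles covering the image (requires overlap < tile)."""
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Add one last tile flush with the right/bottom edge if the grid falls short.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y) for y in ys for x in xs]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_thresh: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept


def detect_large_image(
    width: int,
    height: int,
    detect_tile: Callable[[int, int, int], List[Box]],
    tile: int = 512,        # assumed tile size, not taken from the paper
    overlap: int = 112,     # assumed overlap, not taken from the paper
    iou_thresh: float = 0.5,
) -> List[Box]:
    """Predict per tile, shift boxes to image coordinates, then run NMS over the whole image."""
    merged: List[Box] = []
    for ox, oy in tile_origins(width, height, tile, overlap):
        # detect_tile is a hypothetical stand-in for the per-tile detector;
        # it returns boxes in tile-local coordinates.
        for x1, y1, x2, y2, score in detect_tile(ox, oy, tile):
            merged.append((x1 + ox, y1 + oy, x2 + ox, y2 + oy, score))
    # Second NMS pass: removes duplicates of objects seen in several overlapping tiles.
    return nms(merged, iou_thresh)
```

Overlapping tiles keep an object that straddles a tile border fully visible in at least one tile, at the cost of duplicate detections; the image-level NMS pass is what removes those duplicates.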
References
[1] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 21-37.
[2] Li Z X, Zhou F Q. FSSD: Feature fusion single shot multibox detector[EB]. arXiv:1712.00960v3, 2018.
[3] Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 40(4): 834-848.
[4] Xia G S, Bai X, Ding J, et al. DOTA: A large-scale dataset for object detection in aerial images[EB]. arXiv:1711.10398v2, 2017.
[5] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 39(6): 1137-1149.
[6] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[EB]. arXiv:1506.02640v5, 2015.
[7] Van Etten A. You only look twice: rapid multi-scale object detection in satellite imagery[EB]. arXiv:1805.09512v1, 2018.
[8] Mundhenk T N, Konjevod G, Sakla W A, et al. A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning[C]//European Conference on Computer Vision. Springer, Cham, 2016: 785-800.
[9] Lin T Y, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[EB]. arXiv:1612.03144v2, 2016.
[10] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[EB]. arXiv:1409.1556v6, 2014.
[11] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2016: 770-778.
[12] Cai Z, Fan Q, Feris R S, et al. A unified multi-scale deep convolutional neural network for fast object detection[C]//European Conference on Computer Vision (ECCV 2016). 2016: 354-370.
[13] Kingma D, Ba J. Adam: a method for stochastic optimization[C]//The 3rd International Conference for Learning Representations, San Diego, 2015.
[14] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015.
[15] Dai J, Li Y, He K, et al. R-FCN: Object detection via region-based fully convolutional networks[EB]. arXiv:1605.06409v2, 2016.
[16] Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//International Conference on Machine Learning. JMLR.org, 2015.
[17] Cheng G, Han J, Zhou P, et al. Multi-class geospatial object detection and geographic image classification based on collection of part detectors[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2014, 98(1): 119-132.
[18] Cheng G, Han J. A survey on object detection in optical remote sensing images[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2016, 117: 11-28.
[19] Cheng G, Zhou P, Han J. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(12): 7405-7415.
[20] Long Y, Gong Y, Xiao Z, et al. Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(5): 2486-2498.
[21] Xiao Z, Liu Q, Tang G, et al. Elliptic Fourier transformation-based histograms of oriented gradients for rotationally invariant object detection in remote-sensing images[J]. International Journal of Remote Sensing, 2015, 36(2): 27.
