Abstract
A decision-level fusion detection method for infrared and visible imagery based on deep learning is proposed. First, a parameter transfer model between deep learning models is introduced. Using it, a pretrained model for infrared object detection is extracted from a deep-learning-based visible object detection model and fine-tuned on an infrared dataset collected in the field by the research group, yielding a deep-learning-based infrared object detection model. On this basis, a deep-learning-based decision-level fusion detection model for infrared and visible imagery is proposed, and its model design, image registration, and decision-level fusion process are described in detail. Finally, single-band detection and dual-band fusion detection experiments are carried out under daytime and evening conditions. Qualitatively, because the two bands carry complementary information, dual-band fusion detection yields higher confidences and more accurate bounding boxes than single-band detection. Quantitatively, in the daytime the mAP of dual-band fusion detection is 86.0%, which is 9.9% and 5.3% higher than that of infrared detection and visible detection, respectively; in the evening the mAP of dual-band fusion detection is 89.4%, which is 3.1% and 14.4% higher, respectively. The experimental results show that the deep-learning-based dual-band fusion detection method achieves better detection performance and stronger robustness than the single-band detection methods, and they verify the effectiveness of the proposed method.
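To make the idea of decision-level fusion concrete, the following is a minimal sketch of how the outputs of two independent detectors might be combined once the infrared and visible images have been registered to a common coordinate frame. It is an illustration only, not the fusion rule of the paper, which the abstract does not specify. The `Detection` type, the `fuse` function, the IoU matching threshold of 0.5, the confidence-weighted box average, and the noisy-OR confidence combination are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in registered coordinates


@dataclass
class Detection:
    """Hypothetical container for one detector output."""
    box: Box
    label: str
    score: float  # detector confidence in [0, 1]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(ir: List[Detection], vis: List[Detection],
         iou_thr: float = 0.5) -> List[Detection]:
    """Fuse infrared and visible detections at the decision level.

    Assumed rule: detections with the same label and IoU >= iou_thr are
    treated as the same object; unmatched detections are kept unchanged.
    """
    fused: List[Detection] = []
    used = set()  # indices of visible detections already matched
    for d in ir:
        best_j, best_iou = -1, iou_thr
        for j, v in enumerate(vis):
            if j in used or v.label != d.label:
                continue
            overlap = iou(d.box, v.box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j < 0:
            fused.append(d)  # seen only in the infrared band
            continue
        v = vis[best_j]
        used.add(best_j)
        total = d.score + v.score
        # Confidence-weighted average of box coordinates (equal weights if both are 0).
        w_ir = d.score / total if total > 0 else 0.5
        box = tuple(w_ir * p + (1.0 - w_ir) * q for p, q in zip(d.box, v.box))
        # Noisy-OR: agreement between bands raises the fused confidence.
        score = 1.0 - (1.0 - d.score) * (1.0 - v.score)
        fused.append(Detection(box, d.label, score))
    # Keep detections seen only in the visible band.
    fused.extend(v for j, v in enumerate(vis) if j not in used)
    return fused


ir_dets = [Detection((10, 10, 50, 50), "car", 0.6)]
vis_dets = [Detection((12, 12, 52, 52), "car", 0.8),
            Detection((100, 100, 120, 120), "person", 0.7)]
result = fuse(ir_dets, vis_dets)
```

In this toy input the two "car" boxes overlap strongly, so they are merged into one detection whose confidence exceeds either single-band confidence, while the "person" detection, present only in the visible band, passes through unchanged. This mirrors the qualitative claim that complementary bands can raise confidence and refine boxes, under the assumed fusion rule.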