Improved Faster RCNN Algorithm for Pedestrian Detection in Underground Coal Mines

English title: Improved Faster RCNN Approach for Pedestrian Detection in Underground Coal Mine
Authors: LI Weishan (李伟山); WEI Chen (卫晨); WANG Lin (王琳)
Keywords: deep learning; Faster RCNN; pedestrian detection
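Journal (Chinese title code): JSGG
Journal (English title): Computer Engineering and Applications
Affiliations: School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications; School of Economics and Management, Xi'an University of Posts and Telecommunications
Publication date: 2018-05-24 08:50
Publisher: Computer Engineering and Applications (计算机工程与应用)
Year: 2019
Issue: v.55; No.923
Funding: Shaanxi Provincial Department of Science and Technology, Key Technology (Chain) Project for Resource-Led Industries, Industrial Field (No.2015KTCXSF-10-13)
Language: Chinese
Page: JSGG201904030
Number of pages: 8
CN: 04
Classification number: 205-212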

Abstract

To address the harsh environment, poor lighting, cluttered backgrounds, blurred pedestrians, and multi-scale pedestrians found in underground coal mines, this paper proposes a pedestrian detection method based on an improved Faster RCNN. A deep convolutional neural network replaces traditional hand-crafted features and extracts features automatically from images. Building on the general-purpose deep learning object detection framework Faster RCNN, the Region Proposal Network (RPN) structure is modified and a "pyramid RPN" structure is proposed to handle the multi-scale pedestrians encountered underground. Feature fusion is also added: the feature maps output by different convolutional layers are merged to improve detection of blurred, occluded, and small pedestrians. Experimental results show that the improved Faster RCNN effectively solves the underground pedestrian detection problem, reaching 90% detection accuracy on the underground pedestrian dataset. The improved algorithm is also validated on the public VOC 07 benchmark.
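
The abstract states that feature maps from different convolutional layers are fused, but this record does not say how. The sketch below is one common way to do it, not the authors' implementation: a coarse map from a deeper layer is upsampled by nearest-neighbour interpolation to the resolution of a shallower map, and the two are concatenated along the channel axis. The function names, the list-based data layout, and the choice of concatenation rather than element-wise addition are all assumptions made for illustration.

```python
def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one 2-D channel by an integer factor."""
    out = []
    for row in channel:
        # Repeat every value `factor` times horizontally...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times vertically.
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse_feature_maps(shallow, deep):
    """Fuse two feature maps (hypothetical helper, not from the paper).

    Each map is a list of channels; each channel is a list of rows.
    `deep` is upsampled to the spatial size of `shallow`, then the
    channels of both maps are concatenated.
    """
    h_s, w_s = len(shallow[0]), len(shallow[0][0])
    h_d, w_d = len(deep[0]), len(deep[0][0])
    if h_s % h_d or w_s % w_d or h_s // h_d != w_s // w_d:
        raise ValueError("shallow size must be an integer multiple of deep size")
    factor = h_s // h_d
    upsampled = [upsample_nearest(ch, factor) for ch in deep]
    return shallow + upsampled  # channel-wise concatenation


# Toy input: one 4x4 shallow channel and two 2x2 deep channels.
shallow = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
deep = [[[10, 20], [30, 40]], [[1, 0], [0, 1]]]
fused = fuse_feature_maps(shallow, deep)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real detector the same idea would run on framework tensors, and the fused map would usually pass through a further convolution to mix the channels before region proposals are computed.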

