Research on Pedestrian Detection in Real Scenes Based on Multi-layer Convolutional Features

English title: Research on pedestrian detection based on multi-layer convolution feature in real scene
Authors: WU Pengying (伍鹏瑛); ZHANG Jianming (张建明); PENG Jian (彭建); LU Chaoquan (陆朝铨)
Affiliations: Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology; School of Computer and Communication Engineering, Changsha University of Science and Technology
Keywords: pedestrian detection; convolutional neural network (CNN); single shot multibox detector (SSD); real scene; multi-scale features; object detection; small target pedestrians; pedestrian dataset
Chinese journal title: ZNXT
English journal title: CAAI Transactions on Intelligent Systems
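Publication date: 2018-04-16 15:41
Publisher: CAAI Transactions on Intelligent Systems (智能系统学报)
Year: 2019
Issue: v.14; No.76
Funding: National Natural Science Foundation of China (61402053); Key Scientific Research Project of the Hunan Provincial Department of Education (16A008); Science and Technology Project of the Hunan Provincial Department of Transportation (201446); Postgraduate Scientific Research Innovation Project of Changsha University of Science and Technology (CX2017SS19); Postgraduate Course Construction Project of Changsha University of Science and Technology (KC201611)
Language: Chinese
Pages: ZNXT201902016
Page count: 10
CN: 02
ISSN: 23-1538/TP
Classification number: 104-113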

Abstract

Pedestrian detection methods for real scenes suffer from high miss and false-detection rates and from low accuracy on small objects. To address these problems, a pedestrian detection model based on an improved SSD network (PDIS) is proposed. PDIS modifies the original SSD network by drawing out lower-level output feature maps, detects pedestrians separately on the abstract features output by different layers of the convolutional neural network, and fuses the per-layer detection results, which improves detection of small pedestrians. In addition, because sample diversity in a dataset can effectively improve the generalization ability of a detection algorithm, the authors collected pedestrian images covering different illumination, poses, and occlusion in complex scenes and used them to expand the INRIA pedestrian dataset, whose backgrounds are relatively complex. Training PDIS on the expanded dataset improves pedestrian detection precision in real scenes. Experiments show that PDIS achieves a precision of 93.8% on the INRIA test set, with a miss rate as low as 7.4%.
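The abstract does not say how the per-layer detection results are fused or exactly how the reported rates are computed. As a rough sketch only, the following assumes the fusion step is greedy non-maximum suppression (NMS) over the pooled detections, as in standard SSD, and that precision and miss rate follow their usual definitions (TP / (TP + FP) and FN / (TP + FN)). The function names, the 0.45 IoU threshold, and the sample boxes are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float]             # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_layer_detections(per_layer: List[List[Detection]],
                          iou_threshold: float = 0.45) -> List[Detection]:
    """Pool detections from every feature layer, then apply greedy NMS.

    Assumption: the threshold value is hypothetical, not from the paper.
    """
    pooled = sorted((d for layer in per_layer for d in layer),
                    key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for box, score in pooled:
        # Keep a box only if it does not heavily overlap a higher-scoring one.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


def precision_and_miss_rate(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); miss rate = FN / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    miss_rate = fn / (tp + fn) if (tp + fn) else 0.0
    return precision, miss_rate


# Hypothetical detections: a shallow layer finds a small pedestrian, a deeper
# layer finds a large pedestrian plus a near-duplicate of the small one.
shallow = [((10.0, 10.0, 30.0, 60.0), 0.8)]
deep = [((100.0, 20.0, 200.0, 300.0), 0.9), ((11.0, 11.0, 31.0, 61.0), 0.6)]
fused = fuse_layer_detections([shallow, deep])
```

In this made-up example the near-duplicate of the small pedestrian overlaps the higher-scoring shallow-layer box and is suppressed, so `fused` holds two detections, one per pedestrian.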

