Traffic Object Detection Based on Deep Learning with Region of Interest Selection
  • Authors: DING Song-tao; QU Shi-ru (School of Automation, Northwestern Polytechnical University)
  • Keywords: traffic engineering; traffic object detection; convolutional neural network; spatiotemporal interest point; region of interest
  • Journal: China Journal of Highway and Transport (ZGGL)
  • Publication date: 2018-09-15
  • Year/Issue: 2018, Vol. 31, No. 181 (Issue 09)
  • Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20096102110027); Aerospace Science and Technology Innovation Fund (CASC201104); Aeronautical Science Foundation of China (2012ZC53043)
  • Language: Chinese
  • Pages: 171-178 (8 pages)
  • CN: 61-1313/U
  • Record ID: ZGGL201809020
Abstract
Traffic object detection commonly suffers from complex backgrounds, illumination changes, and object occlusion, and deep-learning detectors that select candidate regions by exhaustive sliding-window search are slow. To improve both the real-time performance and the accuracy of traffic object detection, a fast region-of-interest detection algorithm for multiple traffic objects based on spatio-temporal interest points (STIP) is proposed. Pixel-level STIP detection is robust to object occlusion. Exploiting this property, background-point suppression and spatio-temporal constraints are first added to a conventional interest-point detector to reduce the interference of invalid interest points with the detection of valid ones. The mean shift algorithm is then improved so that the number of cluster centers adapts to the number of objects, and the candidate interest points near each detected object are clustered to obtain the position of each object's cluster center. Regions of interest are constructed from the relative positions of the cluster centers and the filtered interest points. A selective search is run on these regions of interest to generate 1 000 to 2 000 candidate regions, which are fed into a trained deep convolutional neural network for feature extraction. Finally, the extracted features are passed to a support vector machine for object classification, and a regressor refines the positions of the detected bounding boxes. Experimental results show that this preprocessing reduces the number of candidate regions fed into the model by 82% and the overall running time of the algorithm by 74%, meeting the practical requirements of intelligent traffic surveillance.
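As a rough illustration of the background-point suppression step described in the abstract, the sketch below keeps an interest point only if the pixel intensity at its location changed between consecutive frames. This is a minimal frame-differencing stand-in, not the paper's actual criterion (which combines background suppression with spatio-temporal constraints); the `filter_interest_points` helper and the `motion_thresh` value are illustrative assumptions.

```python
import numpy as np

def filter_interest_points(points, prev_frame, curr_frame, motion_thresh=15):
    """Keep only interest points lying on moving pixels.

    An interest point (x, y) survives only if the grey-level change
    between two consecutive frames at that location exceeds
    motion_thresh, suppressing points generated by the static
    background. (Illustrative sketch, not the authors' method.)
    """
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return [(x, y) for (x, y) in points if diff[y, x] > motion_thresh]

# synthetic frames: a bright blob appears in the second frame
prev = np.zeros((10, 10), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 2:5] = 100                        # the "moving object"
kept = filter_interest_points([(3, 3), (8, 8)], prev, curr)
print(kept)  # [(3, 3)] -- the background point (8, 8) is suppressed
```

A real detector would of course apply this on top of a corner/STIP response rather than raw pixels, but the effect is the same: points on the static background never reach the clustering stage.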
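The clustering step relies on a property of mean shift that the abstract highlights: the cluster centers are modes of the point density, so their number follows the number of objects rather than being fixed in advance. A generic flat-kernel mean shift showing this behavior is sketched below; it is not the paper's specific improvement, and the `bandwidth` value is an assumption.

```python
import numpy as np

def mean_shift(points, bandwidth=2.0, max_iter=50, tol=1e-3):
    """Basic mean-shift clustering with a flat (top-hat) kernel.

    Each point is shifted toward the mean of its neighbours within the
    bandwidth window until convergence; shifted points that end up
    close together are merged, so the number of cluster centres adapts
    to the data instead of being preset.
    """
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(max_iter):
        moved = 0.0
        for i, p in enumerate(shifted):
            dists = np.linalg.norm(points - p, axis=1)
            new_p = points[dists < bandwidth].mean(axis=0)
            moved = max(moved, np.linalg.norm(new_p - p))
            shifted[i] = new_p
        if moved < tol:
            break
    # merge shifted points that converged to (almost) the same mode
    centres = []
    for p in shifted:
        if not any(np.linalg.norm(p - c) < bandwidth / 2 for c in centres):
            centres.append(p)
    return np.array(centres)

# two well-separated groups of interest points -> two cluster centres
pts = [(0.0, 0.0), (0.5, 0.3), (0.2, 0.4),
       (10.0, 10.0), (10.3, 9.8), (9.9, 10.2)]
centres = mean_shift(pts, bandwidth=2.0)
print(len(centres))  # 2
```

With three objects in view the same code would return three centres, which is exactly the "cluster count varies with the number of objects" behavior the algorithm exploits.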
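The abstract then forms regions of interest from the relative positions of the cluster centers and the filtered interest points. One plausible reading of that step — the bounding box of the points assigned to each centre, padded by a margin — is sketched below; the nearest-centre assignment and the `margin` parameter are illustrative assumptions, not the paper's exact combination rule.

```python
import numpy as np

def rois_from_clusters(points, centres, margin=2.0):
    """One rectangular region of interest per cluster centre.

    Each interest point is assigned to its nearest cluster centre, and
    the ROI is the bounding box of the assigned points padded by a
    margin so the whole object is likely covered. (Illustrative sketch.)
    """
    points = np.asarray(points, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # distance of every point to every centre, then nearest-centre label
    dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    rois = []
    for k in range(len(centres)):
        owned = points[labels == k]
        if len(owned) == 0:
            continue
        x0, y0 = owned.min(axis=0) - margin
        x1, y1 = owned.max(axis=0) + margin
        rois.append((x0, y0, x1, y1))
    return rois

pts = [(0.0, 0.0), (0.5, 0.3), (10.0, 10.0), (10.3, 9.8)]
rois = rois_from_clusters(pts, centres=[(0.25, 0.15), (10.15, 9.9)])
print(len(rois))  # 2 -- one region of interest per detected object
```

Running selective search only inside these boxes, instead of over the whole frame, is what yields the 82% reduction in candidate regions reported in the abstract: the downstream CNN-plus-SVM stage is unchanged, it simply sees far fewer proposals.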
