Abstract
Traditional computer-vision object tracking algorithms struggle to adapt to complex tracking changes and show poor robustness and low accuracy when the target is occluded. To address these problems, a moving-object tracking algorithm based on You Only Look Once (YOLO) and a recurrent regression network (RRN) is presented: a structurally simple, fast, and accurate deep recurrent regression model. YOLO is responsible for learning the object's appearance features, and its detection results are fed as input to the long short-term memory (LSTM) networks of the RRN: the first LSTM learns the object's motion features, the second performs the regression, and the model is trained offline. On the Visual Object Tracking (VOT) 2016 benchmark, the YOLO+RRN model achieves a tracking precision of 68% at a tracking speed of 77 fps, meeting the real-time requirements of advanced driver assistance systems (ADAS). The model also adapts well to tracking under occlusion, providing an effective guarantee of ADAS tracking performance in complex environments.
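The described pipeline — per-frame detector features fed to a motion LSTM whose output drives a second, regression LSTM that emits a bounding box — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the layer sizes, the random stand-in for YOLO features, and the `(x, y, w, h)` output head are illustrative choices, not the paper's exact architecture.

```python
import numpy as np

def lstm_step(x, h, c, Wx, Wh, b):
    """One standard LSTM cell step: four gates computed from input x and state h."""
    z = Wx @ x + Wh @ h + b            # stacked pre-activations, shape (4H,)
    H = h.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))       # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))    # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))  # output gate
    g = np.tanh(z[3*H:])               # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 32, 16                          # feature dim and hidden dim (illustrative)

def make_params(in_dim, hid):
    return (rng.normal(0, 0.1, (4 * hid, in_dim)),
            rng.normal(0, 0.1, (4 * hid, hid)),
            np.zeros(4 * hid))

p1, p2 = make_params(D, H), make_params(H, H)
W_out = rng.normal(0, 0.1, (4, H))     # regression head -> (x, y, w, h) box

h1 = c1 = h2 = c2 = np.zeros(H)
for t in range(10):                    # ten frames of stand-in detector features
    feat = rng.normal(size=D)          # placeholder for YOLO appearance features
    h1, c1 = lstm_step(feat, h1, c1, *p1)   # first LSTM: motion features
    h2, c2 = lstm_step(h1, h2, c2, *p2)     # second LSTM: regression state
    box = W_out @ h2                   # predicted bounding box for frame t

print(box.shape)                       # (4,)
```

Because the recurrent state `(h, c)` persists across frames, the box prediction at each step can depend on the target's motion history, which is what lets the model keep tracking through short occlusions.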
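VOT-style accuracy is based on the overlap (intersection over union, IoU) between the predicted and ground-truth boxes. A minimal IoU computation is shown below; it assumes axis-aligned boxes in the common `(x, y, w, h)` convention, which is an assumption here rather than a detail taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes given as (x, y, w, h)."""
    ax1, ay1, aw, ah = a
    bx1, by1, bw, bh = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)                  # intersection corner
    ix2, iy2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)        # clamp to 0 if disjoint
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 2, 2)))  # 1/7 ≈ 0.142857
```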