Online Convolutional Network Tracking via Spatio-Temporal Context

Original title (Chinese): 一种结合时空上下文的在线卷积网络跟踪算法
Authors: Liu Peizhong; Wang Hongxiang; Luo Yanmin; Du Yongzhao
Affiliations: College of Engineering, Huaqiao University; Research Center for Applied Statistics and Big Data, Huaqiao University; College of Computer Science and Technology, Huaqiao University
Keywords: visual tracking; spatio-temporal context; convolutional neural network; particle filter; online update
Journal: Journal of Computer Research and Development (计算机研究与发展)
Journal code: JFYZ
Publication date: 2018-12-15
Year: 2018
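Volume: 55
Funding: National Natural Science Foundation of China (61203242, 61605048); Natural Science Foundation of Fujian Province (2016J01300, 2015J01256); Huaqiao University Postgraduate Research and Innovation Ability Cultivation Program (1511422004)
Language: Chinese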
File name: JFYZ201812021
Page count: 9
Issue: 12
CN: 11-1777/TP
Pages: 203-211

Abstract

Deep networks have been applied successfully to visual tracking by learning a generic representation offline from large numbers of training images. However, the abstract features extracted by convolutional neural networks lack spatio-temporal context information, and offline training is time-consuming. To address these issues, an online convolutional network tracking algorithm is proposed that uses the spatio-temporal context model as the filters at every layer of the network. First, the initial target is normalized and its confidence map is extracted. During tracking, the spatio-temporal information is updated to obtain the spatio-temporal context model. The first layer convolves the input with the updated model and extracts patches from the convolution result with a sliding window. The second layer convolves each patch with the spatio-temporal model to extract simple abstract features of the target, and the convolution results of the simple layer are then stacked to form a deep representation of the target. Finally, the target is tracked within a particle filter framework. The network has a lightweight structure and performs favorably against several state-of-the-art methods on OTB-2013 and OTB-2015. The experimental results show that the deep abstract features extracted by the online convolutional network combined with the spatio-temporal context model preserve the relevant spatio-temporal information and improve tracking efficiency against complex backgrounds.
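
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the filter `stc_filter` merely stands in for the spatio-temporal context model, and the patch size, stride, distance-based particle weighting, and every function name are assumptions chosen for illustration. The sketch uses plain Python lists so that it runs without third-party libraries.

```python
import math


def convolve_valid(image, kernel):
    """Valid-mode 2-D convolution of two lists of lists (the kernel is flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def sliding_patches(feature_map, size, stride):
    """Cut square patches of side `size` from a map, moving by `stride`."""
    rows = range(0, len(feature_map) - size + 1, stride)
    cols = range(0, len(feature_map[0]) - size + 1, stride)
    return [
        [row[c:c + size] for row in feature_map[r:r + size]]
        for r in rows
        for c in cols
    ]


def extract_features(image, stc_filter, patch_size=4, stride=2):
    """Layer 1 filters the input and cuts patches; layer 2 filters each patch."""
    layer1 = convolve_valid(image, stc_filter)
    patches = sliding_patches(layer1, patch_size, stride)
    layer2 = [convolve_valid(patch, stc_filter) for patch in patches]
    # Stack the layer-2 responses by flattening them into one feature vector.
    return [value for fmap in layer2 for row in fmap for value in row]


def particle_weights(candidates, template, stc_filter, scale=100.0):
    """Hypothetical particle scoring: closer features give a larger weight."""
    dists = [
        sum((a - b) ** 2 for a, b in zip(extract_features(c, stc_filter), template))
        for c in candidates
    ]
    nearest = min(dists)  # shift so the best particle gets exp(0) = 1
    raw = [math.exp(-(d - nearest) / scale) for d in dists]
    total = sum(raw)
    return [w / total for w in raw]


# Illustrative 8x8 "frame" and a symmetric 3x3 filter (both made up).
frame = [[(8 * r + c) % 5 for c in range(8)] for r in range(8)]
stc_filter = [[0, 1, 0], [1, 2, 1], [0, 1, 0]]
template = extract_features(frame, stc_filter)
candidates = [[[0] * 8 for _ in range(8)], frame, [[1] * 8 for _ in range(8)]]
weights = particle_weights(candidates, template, stc_filter)
print(len(template), weights.index(max(weights)))  # prints "16 1"
```

In this toy setting the candidate identical to the template frame receives the largest weight. A real tracker would draw candidates from a motion model around the previous target state and would update the filter online, neither of which is modeled here.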

