Research on Moving Object Detection in Close-Range Video from a Moving Imaging Platform
Abstract
With advances in science and technology, video has become ever easier to acquire and store, and the volume of digital video data has grown explosively. Moving object detection is the foundation of video content analysis and is of great significance to both scientific research and engineering applications. Among its variants, moving object detection in close-range video captured from a moving imaging platform is a difficult problem that deserves particular attention, owing to large variations in object depth, static and dynamic occlusions among multiple objects, unpredictable scene changes, and the combined motion of the objects and the imaging platform. Detection methods under a moving imaging platform fall into two groups: methods based on motion analysis and methods based on statistical learning. Focusing on this problem, this dissertation investigates methods that represent video data with local invariant features, analyze it with motion-analysis and statistical-learning strategies, and thereby discover and localize moving objects.
     On video content representation with local invariant features, the main contributions are: (1) a spatio-temporal context for local video features, defined by spatial neighborhood relations and temporal motion-similarity relations, which strengthens the descriptive power of local features; (2) a spatial image pyramid representation of video frames that describes the structural relations among object parts at multiple levels, effectively fusing local appearance with global structure; (3) a closed-loop feature matching method for video feature matching that improves matching reliability while preserving the number of matches.
     On moving object detection based on motion analysis, a method using multiview geometric constraints is proposed for close-range video from a moving platform. Its main innovations are: (1) for a converging binocular stereo rig, a multiview epipolar constraint over four views, which resolves the failure of the standard epipolar constraint when the camera and the object move in the same direction; (2) within a particle filter framework, an update strategy that combines adaptive state prediction with multiview-epipolar-constraint observations to detect and track multiple moving objects simultaneously, handling objects that enter or leave the field of view at different times.
     On moving object detection based on unsupervised statistical learning, an algorithm based on dynamic topic discovery is proposed. Building on a robust sparse spatio-temporal-context visual-word representation and an unsupervised learning strategy, a probabilistic topic model is used to model the moving objects in close-range video from a moving platform. The model essentially comprises two levels: (1) at the feature level, salient video patches that are insensitive to pose, scale and illumination changes are extracted; these patches carry local information about moving objects of several different classes; (2) at the object level, the similarity of structural and motion patterns across frames is exploited to build the object model.
     On moving object detection based on supervised statistical learning, taking infrared pedestrians as an example, a discriminative-model algorithm for detecting moving objects of a specific category is proposed. Its main innovations are: (1) a region-of-interest extraction method based on sliding-window search around feature points, which extracts candidate pedestrian regions stably under varying scene conditions and greatly reduces the number of candidate windows, and hence the computational cost, without sacrificing detection rate; (2) a pyramid binary pattern feature that describes infrared human targets with both local texture and global structure, extended to a three-dimensional dynamic pyramid binary pattern feature for pedestrians in infrared video.
With the development of science and technology, people can now easily acquire and store a variety of videos, and digital video data has grown rapidly as a result. Moving object detection is the foundation of video content analysis and plays an important role in both scientific research and engineering applications. In particular, moving object detection in close-range videos from a moving platform has received increasing attention: it is difficult because of the combined effects of large variations in object depth, static or dynamic occlusion among multiple objects, unpredictable scene structure, and the motion of both the objects and the camera. Methods for moving object detection from a moving platform fall into two typical groups: methods based on motion analysis and methods based on statistical learning. To discover and localize moving objects in close-range videos from a moving platform, this dissertation focuses on the use of local invariant features to represent the video data and on motion-analysis and statistical-learning strategies to analyze it.
     In video processing based on local invariant features, we put the emphasis on video content representation by local features. (1) We propose a novel spatio-temporal context, built from spatial neighborhoods and temporal motion similarity, that enhances the description of local video features. (2) We propose to describe local features within a spatial image pyramid so as to capture the global configuration of object parts; both local and global cues are then used for object description. (3) We propose a novel closed-loop feature matching method for video feature matching; at the same level of reliability, it obtains more matches.
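As an illustration of (3), a closed-loop check can be sketched as mutual-consistency matching; this is a minimal reading of the idea, assuming plain Euclidean nearest-neighbor matching rather than the thesis's exact scheme:

```python
import numpy as np

def mutual_matches(desc_a, desc_b):
    """Match descriptors A->B and B->A, and keep only the pairs on which
    both directions agree (the 'closed loop'). Trades a few matches for
    reliability by discarding one-sided correspondences."""
    # Pairwise squared Euclidean distances between the two descriptor sets.
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    fwd = d2.argmin(axis=1)   # best match in B for each descriptor of A
    bwd = d2.argmin(axis=0)   # best match in A for each descriptor of B
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

# Toy descriptors: three near-duplicates plus one outlier in B.
a = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
b = np.array([[0.1, 0.0], [0.0, 9.9], [10.2, 0.1], [50.0, 50.0]])
print(mutual_matches(a, b))   # the outlier b[3] attracts no mutual match
```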
     In moving object detection based on motion analysis, we propose a novel moving object detection method based on multiview geometric constraints. The major contributions are: (1) We propose a new multiview epipolar constraint built from consecutive positions of a binocular camera in a non-parallel (converging) configuration. It can detect moving objects even when the object and the camera move in the same direction, where the standard epipolar geometry fails. (2) We propose to detect and track multiple moving objects within a particle filter framework, so as to handle multiple moving objects entering or leaving the field of view.
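The failure mode named in (1) can be shown with a two-view sketch. This is a deliberate simplification (identity calibration, pure camera translation, no rotation) rather than the thesis's four-view constraint, which is precisely what repairs this degenerate case:

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ x == np.cross(t, x)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Two positions of a translating camera: P1 = [I | 0], P2 = [I | t_cam].
t_cam = np.array([1.0, 0.0, 0.0])   # camera moves along x between frames
F = skew(t_cam)                     # fundamental matrix for K = I, R = I

def residual(X1, X2):
    """Epipolar residual |x2^T F x1| for a 3D point at X1 in frame 1
    and X2 in frame 2, projected by P1 and P2 above."""
    x1 = X1            # P1 @ [X1; 1] (homogeneous image point)
    x2 = X2 + t_cam    # P2 @ [X2; 1]
    return abs(x2 @ F @ x1)

X = np.array([2.0, 1.0, 5.0])
print(residual(X, X))                    # static point:       0.0
print(residual(X, X + [0, 0.5, 0]))      # moves across:       2.5 (detected)
print(residual(X, X + 3 * t_cam))        # moves WITH camera:  0.0 (missed!)
```

The last line is the degenerate case: an object translating parallel to the camera stays on its epipolar line, so the two-view constraint cannot flag it.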
     In moving object detection based on unsupervised statistical learning, we propose a novel moving object detection algorithm based on dynamic topic discovery. Taking advantage of a robust representation of video by spatio-temporal context words and an unsupervised learning strategy, we use dynamic topic modeling to discover and localize moving objects in close-range videos from a moving platform. In essence, our model consists of two levels: (1) At the feature level, distinctive video patches that are robust to position, scale and lighting variations are extracted; these patches contain important information about different classes of moving objects. (2) At the object level, structure and motion similarity across frames are used to build the object model.
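A minimal sketch of the topic-model machinery behind such an algorithm, treating each frame as a bag of visual words and running a collapsed Gibbs sampler for a plain (static) LDA model; the toy corpus and vocabulary are invented for illustration, and the thesis's dynamic model adds temporal evolution on top of this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: each "frame" is a bag of visual-word indices over a 6-word
# vocabulary. Words 0-2 co-occur (one object pattern), words 3-5 another.
frames = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3],
          [4, 5, 3, 5, 4], [0, 2, 1], [3, 5, 4]]
V, K, n_iter, alpha, beta = 6, 2, 200, 0.5, 0.1

# Random initial topic for every word occurrence, plus count tables.
z = [[int(rng.integers(K)) for _ in doc] for doc in frames]
ndk = np.zeros((len(frames), K))   # frame-topic counts
nkw = np.zeros((K, V))             # topic-word counts
nk = np.zeros(K)                   # topic totals
for d, doc in enumerate(frames):
    for i, w in enumerate(doc):
        k = z[d][i]; ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

# Collapsed Gibbs sampling: resample each token's topic from its
# conditional given all other assignments.
for _ in range(n_iter):
    for d, doc in enumerate(frames):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

topic_of = ndk.argmax(1)   # dominant latent topic ("object") per frame
print(topic_of)            # frames {0,1,4} share one topic, {2,3,5} the other
```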
     In moving object detection based on supervised statistical learning, we propose a novel infrared pedestrian detection algorithm based on discriminative models, as an instance of category-specific moving object detection. The major innovations are: (1) We propose a novel feature-centric sliding-window method for region-of-interest extraction. It performs robustly across different scenarios while greatly reducing the computational cost. (2) We propose a novel pyramid binary pattern (PBP) feature for infrared person description. The PBP feature combines both local texture and global shape information, and is extended to a 3D form for pedestrian description in video.
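One plausible reading of the PBP feature in (2), not the thesis's exact definition, is a concatenation of local binary pattern (LBP) histograms over an image pyramid: fine levels keep local texture, coarse levels keep the global layout.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel."""
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code += (nb >= c).astype(int) << bit   # one bit per neighbour
    return code                                 # values in 0..255

def pyramid_lbp_hist(img, levels=3):
    """Concatenate normalised LBP histograms over a fine-to-coarse
    pyramid; each coarser level is a 2x2 mean-pooled version."""
    feats, cur = [], img.astype(np.float64)
    for _ in range(levels):
        h, _ = np.histogram(lbp_codes(cur), bins=256,
                            range=(0, 256), density=True)
        feats.append(h)
        H, W = cur.shape[0] // 2 * 2, cur.shape[1] // 2 * 2
        cur = cur[:H, :W].reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))
    return np.concatenate(feats)

img = np.random.default_rng(0).integers(0, 256, size=(16, 16))
feat = pyramid_lbp_hist(img)   # 3 levels x 256 bins = 768-D descriptor
```

The thesis's dynamic 3D extension would replace the 8 spatial neighbours with spatio-temporal neighbours across adjacent frames.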
