Research on Human Detection Based on the Collaboration of Multiple Cameras (多摄像机协同的行人检测技术研究)

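English title: Research on Human Detection Based on the Collaboration of Multiple Cameras
Author: Zeng Chengbin (曾成斌)
Degree level: Doctoral
Discipline: Computer Science and Technology
Keywords: human detection; multi-camera collaboration; feature extraction; multi-view fusion; people counting
Degree year: 2011
Supervisor: Ma Huadong (马华东)
Discipline code: 081203
Degree-granting institution: Beijing University of Posts and Telecommunications (北京邮电大学)
Thesis submission date: 2011-05-15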

Abstract

With surveillance cameras becoming increasingly widespread, there is an urgent need to bring intelligent recognition techniques into video sensing networks and so enable intelligent scene monitoring. This thesis studies human detection based on the collaboration of multiple cameras. Traditional approaches mostly work with a single camera and rely on image processing and machine learning; their accuracy and speed still need improvement, and they do not handle occlusion well. Some multi-view approaches have been proposed to deal with occlusion, but they assume that the objects in the scene move continuously and they do not classify the objects: they use multiple cameras only to track continuously moving objects, without judging whether those objects are humans.
     To address these problems, we propose a series of models and methods covering three aspects: multi-camera object search, extraction and classification of human features, and multi-view object fusion with occlusion handling. We also build a practical people counting system prototype. The basic idea is to first construct a 3D search space using multi-view geometry and project the candidate objects found there into the 2D images, then extract features from the projected objects and classify them as human or non-human, and finally fuse the objects detected in each camera and handle occlusion, so that the people in the scene are detected quickly and accurately. The main contributions of this thesis are as follows:
     (1) We propose an object search method based on 3D space that significantly reduces the search space for candidate humans. The first step of human detection is to search an image for candidate objects, i.e., sub-images that may contain people. Unlike traditional 2D search methods, we use multi-view geometry to reconstruct the 3D ground plane, and then use the spatial constraints between people and the ground plane, and among people, to locate candidate regions in the image through re-projection. Compared with traditional methods, this search not only reduces the search space markedly and improves the detection rate, but also works for both moving and stationary people.
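The re-projection step in contribution (1) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a calibrated pinhole camera given as a 3x4 projection matrix `P`, a world frame whose ground plane is z = 0, and a fixed person height and box aspect ratio. The names `project` and `candidate_boxes` and all default values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Point2 = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def project(P: Matrix, X: Sequence[float]) -> Optional[Point2]:
    """Project a 3D world point with a 3x4 camera matrix; None if behind the camera."""
    Xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    if w <= 0.0:
        return None
    return u / w, v / w


def candidate_boxes(
    P: Matrix,
    xs: Sequence[float],
    ys: Sequence[float],
    person_height: float = 1.7,  # metres (assumed value)
    aspect: float = 0.4,         # box width / box height (assumed value)
    image_size: Tuple[int, int] = (640, 480),
) -> List[Box]:
    """Hypothesise a standing person at each ground position (x, y, 0) and
    re-project the foot and head points to obtain one image box per position."""
    width, height = image_size
    boxes: List[Box] = []
    for x in xs:
        for y in ys:
            foot = project(P, (x, y, 0.0))
            head = project(P, (x, y, person_height))
            if foot is None or head is None:
                continue
            h = foot[1] - head[1]  # image y grows downwards
            if h <= 0.0:
                continue
            w = aspect * h
            cx = 0.5 * (foot[0] + head[0])
            box = (cx - 0.5 * w, head[1], cx + 0.5 * w, foot[1])
            # Keep only boxes that lie fully inside the image.
            if box[0] >= 0 and box[1] >= 0 and box[2] <= width and box[3] <= height:
                boxes.append(box)
    return boxes
```

Because each ground position yields a single box whose size follows from the geometry, the classifier sees one window per position instead of every scale at every pixel, which is one way to read the claimed reduction of the search space.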
     (2) We propose a multilevel edge and multilevel texture feature extraction method that is robust to variations in human pose. Each candidate region from the previous step must be classified as human or non-human. We describe humans with multilevel edge and multilevel texture features and reduce the dimensionality of this feature to discard redundant and noisy information. Experiments show that the method copes with large variations in human appearance and pose. To speed up the feature computation to real time, we adopt a two-stage cascade-of-rejectors method, and we propose an improved SVM (Support Vector Machine) training method that keeps the detection accuracy approximately the same as that of the full multilevel edge and texture feature.
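The abstract does not specify the cascade, so the following is only a generic sketch of the two-stage rejector idea: a cheap feature with a linear classifier discards most windows, and the expensive feature is computed only for the survivors. The feature extractors, the linear models and the threshold `reject_below` are hypothetical placeholders, not the thesis's multilevel edge and texture feature or its improved SVM.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Linear = Tuple[Vector, float]  # (weights, bias) of a linear classifier


def linear_score(model: Linear, x: Vector) -> float:
    """Signed decision value w . x + b of a linear classifier."""
    weights, bias = model
    return sum(w * v for w, v in zip(weights, x)) + bias


def two_stage_cascade(
    windows: Sequence[Vector],
    cheap_feature: Callable[[Vector], Vector],
    full_feature: Callable[[Vector], Vector],
    stage1: Linear,
    stage2: Linear,
    reject_below: float = 0.0,  # stage-1 rejection threshold (assumed value)
) -> List[int]:
    """Return the indices of the windows accepted by both stages."""
    accepted: List[int] = []
    for i, window in enumerate(windows):
        # Stage 1: cheap feature; a low score rejects the window immediately.
        if linear_score(stage1, cheap_feature(window)) < reject_below:
            continue
        # Stage 2: the expensive feature is computed only for survivors.
        if linear_score(stage2, full_feature(window)) > 0.0:
            accepted.append(i)
    return accepted
```

In practice the stage-1 threshold would be set so that very few true positives are lost, since a window rejected early can never be recovered by the second stage.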
     (3) We propose a homography constraint method with error tolerance that achieves multi-view object fusion and occlusion handling. For the objects detected in each camera, we use a re-projection-based fusion method to correct the detection results, which further improves detection accuracy. To use the detections for people counting, we need to know which of the people detected in different cameras are the same object. The error-tolerant homography constraint matches objects across views well, and thereby handles occlusion well.
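One plausible reading of an error-tolerant homography constraint is sketched below: foot points detected in view A are mapped into view B through the ground-plane homography, and a pair counts as the same person when the mapped point falls within a pixel tolerance of a detection in B. The greedy nearest-neighbour matching, the tolerance value and the function names are assumptions for illustration, not the matching rule used in the thesis.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Homography = Sequence[Sequence[float]]  # 3x3 matrix, view A -> view B


def apply_homography(H: Homography, pt: Point) -> Point:
    """Map an image point through a 3x3 homography (homogeneous division)."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def match_detections(
    H_ab: Homography,
    feet_a: Sequence[Point],
    feet_b: Sequence[Point],
    tol: float = 15.0,  # pixel tolerance (assumed value)
) -> List[Tuple[int, int]]:
    """Greedily pair each foot point in view A with the nearest unused
    foot point in view B that lies within `tol` pixels after mapping."""
    matches: List[Tuple[int, int]] = []
    used = set()
    for i, pa in enumerate(feet_a):
        mapped = apply_homography(H_ab, pa)
        best_j, best_d = None, tol
        for j, pb in enumerate(feet_b):
            if j in used:
                continue
            d = math.dist(mapped, pb)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches
```

Under this sketch, a detection left unmatched is treated as a person visible in only one view, for example because of occlusion in the other, so a two-view head count would be `len(feet_a) + len(feet_b) - len(matches)`.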
     (4) We implement a multi-camera collaborative people counting system. To meet the demand for automatic people counting in intelligent surveillance, we build on the detection method above and propose a border control method based on particle filter tracking that counts the people entering and leaving the scene, yielding a multi-camera people counting system prototype. Tests in real scenes show that the method achieves good counting accuracy for both crowded and sparse crowds.
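The counting rule of the border control method is not detailed in the abstract. As a rough sketch, suppose the tracker (a particle filter in the thesis, not implemented here) already supplies each person's track as a list of image positions. Entries and exits can then be counted as sign changes of the track's position relative to a border line. The border endpoints, the side convention and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def side(p: Point, a: Point, b: Point) -> float:
    """Signed side of point p relative to the directed line a -> b
    (2D cross product); zero means p lies on the line."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_border_crossings(
    tracks: Sequence[Sequence[Point]], a: Point, b: Point
) -> Tuple[int, int]:
    """Count crossings of the border a -> b over all tracks.

    A step from the negative side to the non-negative side counts as
    'entered'; the opposite step counts as 'left'."""
    entered = left = 0
    for track in tracks:
        for p, q in zip(track, track[1:]):
            sp, sq = side(p, a, b), side(q, a, b)
            if sp < 0 <= sq:
                entered += 1
            elif sq < 0 <= sp:
                left += 1
    return entered, left
```

With the border from (0, 100) to (200, 100) in image coordinates, points with y greater than 100 are on the non-negative side, so a track moving down the image across the line counts as one entry.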
