Research on Visual Loop Closure Detection for Mobile Robots Based on Scene Appearance Modeling
Abstract
A robot in an unknown environment must use its own pose estimates and sensor data to build a map of the environment while simultaneously using that map for autonomous localization and navigation. This problem, known as simultaneous localization and mapping (SLAM), is the key to truly autonomous mobile robots and has become a focal point, and a difficult one, of research in robotics and artificial intelligence. Loop closure detection is one of the fundamental problems in SLAM: accurately deciding whether the robot's current position lies in a previously visited region of the environment is essential for reducing the uncertainty of the robot pose and map state variables, and for avoiding the erroneous introduction of redundant variables or duplicated structures into the map.
     Owing to the many advantages of visual sensors, vision-based SLAM (vSLAM) has attracted wide attention in recent years. However, inherent model deficiencies and unavoidable computational errors in the key stages of acquiring, describing, and matching visual information prevent a mobile robot from extracting loop closures accurately, and thus from completing its SLAM task, so visual loop closure detection in large-scale unstructured environments remains one of the most challenging problems. This thesis studies the visual loop closure detection problem deeply and systematically, aiming to solve the main problems in current mainstream loop closure detection based on visual scene appearance modeling and to improve the efficiency and accuracy of detection. The main innovative results are as follows:
     First, the strengths and weaknesses of the various frame sampling techniques for visual scene sampling are analyzed and compared, establishing the rationale for keyframe sampling based on image content change as the preferred choice for vSLAM. Since the SLAM field has so far offered no quantitative evaluation or selection criteria for keyframe detection methods, this thesis studies the algorithmic mechanisms of the various techniques, proposes an unsupervised performance evaluation scheme and criteria, and builds a systematic experimental evaluation framework. Experiments on visual SLAM datasets show that, among the five classes of methods studied, keyframe detection based on feature matching gives the best results. This step is often overlooked in vSLAM research, and the study provides a reference for solving the scene sampling problem in vSLAM.
     For appearance modeling of robot scenes, the key problems of the bag-of-visual-words (BoVW) model are studied and a robust optimized construction strategy for the visual vocabulary is proposed, to overcome the impact that the sheer volume, high dimensionality, and instability of low-level features have on vocabulary generation. Condition number theory is first introduced to quantitatively assess the stability of the massive set of low-level features and to retain only the robust ones. A unified computational model of clustering and dimensionality reduction is then proposed, yielding an adaptive dimensionality reduction algorithm that preserves cluster structure. Using the neighborhood support found in the low-dimensional cluster information, the best initial visual words are selected adaptively, and the Silhouette index is chosen as the iterative objective function, remedying the popular LBG vocabulary generation algorithm's sensitivity to randomly chosen initial points and its tendency to reach only local optima. The new vocabulary generation algorithm unifies clustering and dimensionality reduction, is robust, optimizes adaptively, and describes scene images well.
     Improving the representational power of the visual vocabulary is the key to improving the accuracy of loop closure detection. Because most optimization strategies in image classification are supervised and rely on class information, whereas loop closure detection is unsupervised, this thesis proposes a fully unsupervised method for quantitatively evaluating and optimizing vocabulary quality, relying only on the data entities computed during loop closure extraction. First, traditional spectral clustering is improved by an eigenvector selection method based on entropy ranking, and the raw low-level features are clustered without supervision to generate the initial visual words. A quantitative evaluation algorithm for visual word discriminability based on the Mahalanobis distance is then proposed, computing each word's discriminative power from the image-word matrix, and an iterative update strategy for weakly representative words is designed. Finally, a rank reduction technique that characterizes the decomposition complexity of the image similarity matrix measures the representational power of the new vocabulary. In indoor and outdoor mobile robot experiments, the method improves the effectiveness of vocabulary modeling, achieves good loop closure detection, and shows good robustness to perceptual aliasing.
     To raise the efficiency of loop closure detection and meet its real-time computational demands, and to address the facts that appearance representation is limited by the number of visual words and that the algorithms are inefficient, this thesis quantizes the robot's visual features hierarchically into a visual vocabulary tree, computes each image's TF-IDF projection weights on the tree-node words, and generates an inverted image-word index. To eliminate the single-scale quantization error of a flat vocabulary, and to overcome the traditional flat matching scheme's failure to distinguish the discriminability of nodes at different levels, a pyramid score matching method is proposed that computes similarity increments between images level by level from the bottom up, fusing the strong representational power of low-level words with the strong robustness of high-level words.
     To remove the interference of false loop closures among the candidates, posterior verification steps are established, including temporal consistency, spatial consistency, and epipolar geometry constraints, which effectively suppress false closures. In mobile robot visual loop closure detection experiments, the proposed algorithms improve both the efficiency and the accuracy of loop closure extraction.
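     Of the three posterior checks, the epipolar geometry constraint lends itself to a compact illustration. The following is a minimal sketch, assuming OpenCV feature matches as input; the RANSAC parameters and the inlier threshold are illustrative choices, not values prescribed by the thesis.

```python
import cv2
import numpy as np

MIN_INLIERS = 20  # illustrative acceptance threshold

def epipolar_check(kp_query, kp_match, matches):
    """kp_query, kp_match: cv2.KeyPoint lists from the two images;
    matches: cv2.DMatch list between them. Returns True if the candidate
    loop closure survives the epipolar geometry constraint."""
    if len(matches) < 8:  # need at least 8 correspondences to estimate F
        return False
    pts1 = np.float32([kp_query[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp_match[m.trainIdx].pt for m in matches])
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)
    return F is not None and mask is not None and int(mask.sum()) >= MIN_INLIERS
```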
     This systematic study of visual loop closure detection not only improves the efficiency and accuracy of loop closure detection but also extends scene appearance modeling methods to the wider vSLAM system and enriches research on BoW methods in image processing, machine vision, and related fields.
Simultaneous Localization and Mapping (SLAM), the process of sensing, estimating the robot's own location and state, and at the same time charting a map of the environment, is a key technique for a mobile robot facing the problem of localization and navigation in an unknown environment. It is a necessary prerequisite for making mobile robots autonomous and has become one of the hot topics in robotics and artificial intelligence research. Because of errors in the vehicle's pose estimates, it is hard to assert correctly that the vehicle has returned to a previously visited location. This problem is called loop closure detection, an important component in making a SLAM solution reliable. It helps decide, when considering a new place, whether to add a new node to the map or update a previous one, which allows the robot to reduce the uncertainty associated with the state variables that define the robot pose and the map, and to avoid erroneously introducing duplicated variables or structures into the map.
     Due to the virtues of visual sensors, vision-based techniques, namely vSLAM and visual loop closure detection approaches, have recently received wide attention. However, modeling and computational errors in acquiring, describing, and matching visual information make it difficult for a mobile robot to extract precise loop closures and hence to complete SLAM tasks accurately. Visual loop closure detection thus remains one of the most challenging problems in large-scale unstructured environments. This thesis investigates the key techniques of visual loop closure detection systematically, particularly solutions in the space of appearance. Our goal is to resolve open issues in current research and to design new vision-based approaches that improve the performance of loop closure detection. The main contributions are as follows:
     Firstly, this thesis considers visual scene sampling while the robot moves, arguing why keyframe sampling based on visual content change is the best choice for SLAM. By investigating the underlying computational mechanisms of keyframe extraction, an unsupervised evaluation framework and a set of criteria are proposed. Experimental results on visual SLAM datasets show that the feature matching method offers the best performance among five representative methods at measuring the amount of visual content change between the robot's views. This study fills an important but missing step in current appearance-based SLAM research.
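     The abstract does not spell out the feature-matching keyframe detector, so the following is only a minimal sketch of the underlying idea, assuming an ORB detector and brute-force matcher from OpenCV: a frame becomes a keyframe once the fraction of the previous keyframe's features that can still be matched drops below a threshold. The feature count and threshold are illustrative.

```python
import cv2

ORB = cv2.ORB_create(nfeatures=500)           # illustrative detector choice
MATCHER = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
OVERLAP_THRESHOLD = 0.35                      # illustrative threshold

def select_keyframes(frames):
    """Keep a frame as a keyframe when feature matching indicates that
    the visual content has changed enough since the last keyframe."""
    keyframes, last_des = [], None
    for i, frame in enumerate(frames):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, des = ORB.detectAndCompute(gray, None)
        if des is None:
            continue                          # frame with no features
        if last_des is None:
            keyframes.append(i)               # first frame is a keyframe
            last_des = des
            continue
        matches = MATCHER.match(last_des, des)
        overlap = len(matches) / max(len(last_des), 1)
        if overlap < OVERLAP_THRESHOLD:       # content changed enough
            keyframes.append(i)
            last_des = des
    return keyframes
```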
     In appearance modeling of robot visual scenes, the principles and factors that govern the performance of the Bag-of-Visual-Words (BoVW) method are analyzed, and a robust optimization framework for visual vocabulary generation is proposed. Firstly, Condition Number Theory is applied to evaluate the stability of the initial visual features, and the well-conditioned features are preserved by eliminating the badly conditioned ones. Next, an adaptive algorithm for generating low-dimensional visual words is derived from a uniform framework of clustering and dimensionality reduction. To overcome the popular LBG algorithm's sensitivity to the initial solution and its tendency to converge to local optima, a neighborhood-support parameter is computed for each feature from the clustering structure and used to select the initial visual words adaptively. Finally, the distortion function is redefined using the Silhouette metric. Compared with traditional algorithms, the presented algorithm clusters and reduces dimensionality simultaneously and exhibits good robustness and adaptive optimization.
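     As a rough illustration of two of these ingredients, the sketch below seeds the clustering with the best-supported descriptors instead of random points and scores the resulting codebook with the Silhouette index, assuming scikit-learn. The joint clustering/dimensionality-reduction model and the condition-number filtering are not reproduced here, and the radius and vocabulary size are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors

def build_vocabulary(descriptors, k=64, radius=0.5):
    """descriptors: (n, d) array of local features; returns (words, quality)."""
    # Neighborhood support: how many descriptors fall within `radius`
    # of each descriptor; densely supported points make stable seeds.
    nn = NearestNeighbors(radius=radius).fit(descriptors)
    neighborhoods = nn.radius_neighbors(descriptors, return_distance=False)
    support = np.array([len(idx) for idx in neighborhoods])
    # Seed k-means/LBG with the k best-supported descriptors, removing
    # the dependence on a random initialization.
    seeds = descriptors[np.argsort(support)[-k:]]
    km = KMeans(n_clusters=k, init=seeds, n_init=1).fit(descriptors)
    # Silhouette index as the quality criterion for the codebook.
    quality = silhouette_score(descriptors, km.labels_)
    return km.cluster_centers_, quality
```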
     Improving the discriminative power of the vocabulary is a key problem in BoVW-based loop closure detection. By investigating a method to measure and improve this discriminative power without supervision, the thesis compensates for the BoVW method's neglect of image particularity in current appearance-based SLAM research. At first, to generate an initial vocabulary by spectral clustering, the most important eigenvectors are selected by an entropy ranking technique and words are generated on the feature set. A scheme is then presented to evaluate the discriminative power of each visual word quantitatively in terms of the Mahalanobis separability of the image-word matrix. Finally, a discriminative vocabulary is obtained unsupervisedly by iteratively updating the poor visual words. The discriminative power of the updated vocabulary corresponds to the decomposition complexity of the similarity matrix, which can be measured by an entropy metric. Experimental results on both indoor and outdoor image sequences show that the method is effective for image description and loop closure detection, and especially robust to perceptual aliasing.
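     The abstract leaves the Mahalanobis separability measure unspecified; one plausible reading, sketched below under that assumption, scores each word by the Mahalanobis distance between the mean profiles of the images that do and do not contain it, under the pooled covariance of the image-word matrix.

```python
import numpy as np

def word_discriminability(H):
    """H: (n_images, n_words) image-word matrix, H[i, j] = weight of
    word j in image i. Returns one separability score per word."""
    cov_inv = np.linalg.pinv(np.cov(H, rowvar=False))  # pooled covariance
    scores = np.zeros(H.shape[1])
    for j in range(H.shape[1]):
        present = H[:, j] > 0
        if present.all() or not present.any():
            continue  # word occurs everywhere or nowhere: no separation
        d = H[present].mean(axis=0) - H[~present].mean(axis=0)
        scores[j] = float(d @ cov_inv @ d)  # squared Mahalanobis distance
    return scores
```

Under this reading, the words with the lowest scores would be the weakly discriminative ones targeted by the iterative update.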
     Enhancing computational efficiency is indispensable for online loop closure detection, and with a conventional flat vocabulary, performance is restricted by the limited number of visual words and by high computational cost. A visual vocabulary tree is therefore constructed by clustering the visual features hierarchically, the weight of each visual word in the tree is computed from the TF-IDF entropy of its node, and an inverted image-word index is exploited. To avoid the quantization error of a single-scale vocabulary, and the neglect in tree-based matching of the differing discriminative power of words at different levels, the robustness of high-level words and the discriminability of low-level words are combined in a pyramid score-matching scheme. A posterior management step discards outliers by verifying that the two images of a loop closure satisfy hypothesis constraints. Loop closure detection experiments demonstrate that the scheme improves similarity calculation in both accuracy and efficiency, obtaining a higher precision-recall ratio at faster speed than traditional methods.
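     The bottom-up pyramid scoring can be illustrated compactly. The sketch below assumes each image has already been quantized into one L1-normalized histogram per tree level (root to leaves); matches found at a fine level are not re-counted at coarser levels, and deeper levels receive larger weights. The weighting scheme is illustrative rather than the thesis's exact formula.

```python
import numpy as np

def pyramid_score(hists_a, hists_b):
    """hists_a, hists_b: lists of L1-normalized histograms, one per tree
    level, ordered from root (coarse) to leaves (fine)."""
    depth = len(hists_a)
    score, finer_inter = 0.0, 0.0
    # Walk from the leaves up, crediting only the matches that are new
    # at each level (the similarity "increment").
    for level in range(depth - 1, -1, -1):
        inter = np.minimum(hists_a[level], hists_b[level]).sum()
        increment = max(inter - finer_inter, 0.0)  # new matches at this level
        weight = 2.0 ** (level - depth + 1)        # deeper level, larger weight
        score += weight * increment
        finer_inter = inter
    return score
```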
     The contributions of this work not only improve the efficiency and accuracy of loop closure detection, but also extend appearance-based modeling to the wider vSLAM system and enrich research on BoW methods in image processing, machine vision, and other related fields.
