Research on 3D Simultaneous Localization and Mapping Fusing Color and Depth Information
Abstract
Abstract: Simultaneous Localization and Mapping (SLAM) is the foundation for autonomous navigation of mobile robots in unknown environments, and one of the prerequisites for achieving robot autonomy and intelligence. In recent years, the theory and methods of autonomous 2D map building have been studied in depth, yielding rich results. With advances in sensor technology and the continuing development of SLAM computational theory, 3D map building for 6-DOF robots has attracted researchers' attention. The Kinect, an inexpensive RGB-D sensor released by Microsoft in June 2010, offers new possibilities for building environment maps rich in 3D spatial and color texture information.
     This dissertation investigates 3D simultaneous localization and mapping based on color and depth information in unknown indoor environments. Without any prior knowledge, a Kinect moves with 6 degrees of freedom through an indoor scene and perceives the surrounding environment, while stable environmental feature points are extracted to represent actual physical points in 3D space and serve as landmarks for building a 3D geometric map of the environment. The research comprises the following five parts:
     (1) The applications of the typical depth camera Kinect in computer vision are reviewed. To address the significant distortion of Kinect depth measurements that grows with distance, an unsupervised learning algorithm for a depth-multiplier map, requiring no human intervention, is proposed to achieve depth correction. The method first builds an environment map (with loop closure) from relatively accurate short-range measurements, using a common visual-odometry-plus-pose-graph-optimization RGB-D SLAM algorithm; the depth-multiplier map is then learned from the errors between this map and the depth measurements and progressively refined by maximum likelihood estimation. Unlike methods requiring human intervention, this method learns the depth correction automatically during SLAM, making it convenient to use.
     (2) To reduce the complexity of SLAM and improve the credibility of data association, image interest-point detection algorithms are investigated in depth. By analyzing how the two main parameters, the threshold t and the number of octaves o, affect the performance of the BRISK-AGAST detector in the OpenCV library, an adjustable adaptive feature detection method, the adjustable BRISK-AGAST detector, is proposed. Its advantages are that it strengthens the stability of the extracted environmental feature points, raises the probability and credibility of data association during SLAM, and avoids representing too many environmental features in the map, thereby reducing the complexity of SLAM.
     (3) To make full use of the depth information in RGB-D images for distinguishing environmental feature points more effectively, RGB-D image feature descriptors fusing appearance and depth information are studied, with emphasis on the mechanism of the BRAND descriptor. Experiments comparing BRAND with the typical descriptors EDVD, SURF, SIFT, CSHOT, and SPIN in terms of running time, memory consumption, and matching performance demonstrate its superiority.
     (4) To overcome the drawback that current graph-optimization-based RGB-D SLAM algorithms accumulate excessive error in the absence of large loop-closure constraints, making them unsuitable for online use, an RGB-D SLAM method based on visual dead reckoning and the extended information filter, abbreviated VO-EIF SLAM, is proposed. Using the pinhole camera model and a Gaussian-mixture-based model of depth uncertainty, a 3D uncertainty model of RGB-D feature observations is established, which yields the observation model of EIF SLAM. A visual dead-reckoning algorithm based on visual residuals is designed to estimate the motion control input, and the binary BRAND descriptor is adopted for feature matching, effectively reducing the complexity of data association. In addition, a unified model of spatial geometric uncertainty and binary descriptor uncertainty is established, avoiding explicit data association.
     (5) Fast feature association algorithms based on binary descriptors are studied in depth and applied to fast loop-closure detection in RGB-D SLAM. A locality-sensitive hashing search algorithm for binary descriptors and a fast binary feature search algorithm based on hierarchical clustering trees are designed and implemented to solve the fast association problem for a single feature; on this basis, a fast multi-feature matching algorithm incorporating local geometric constraints is proposed, achieving fast loop-closure detection for RGB-D SLAM. The algorithm uses the Hamming distance to measure match quality, effectively improving the speed and accuracy of loop-closure detection.
     Finally, the dissertation is summarized and directions for future research are discussed. The dissertation contains 53 figures, 3 tables, and 189 references.
Abstract: Simultaneous Localization and Mapping (SLAM) is the basis for autonomous navigation of mobile robots in unknown environments, and also one of the prerequisites for realizing their autonomy and intelligence. In recent years, the theories and methods of two-dimensional map building have been comprehensively studied and have achieved fruitful results. With advances in sensor technology and the continuous development of SLAM computation theory, three-dimensional map building for 6-DOF robots has attracted researchers' attention. In June 2010, Microsoft launched a cheap RGB-D sensor named Kinect, which provides new possibilities for creating environment maps with rich 3D spatial information and color texture information.
     This dissertation studies simultaneous localization and map building based on RGB-D color and depth information in unknown indoor environments. Without any prior knowledge, a Kinect undergoes 6-DOF motion in an indoor scene and perceives the surrounding environment; stable feature points of the environment are extracted to represent actual physical points in 3D space, which are used as landmarks to create a feature-based geometric map of the environment. The research work includes the following five parts:
     (1) The applications of the typical depth camera Kinect in computer vision processing are reviewed. To resolve the significant depth distortion of the Kinect, which worsens with increasing distance, an unsupervised learning algorithm for a depth-multiplier image, requiring no human intervention, is proposed to achieve depth correction. The method first builds an environment map from relatively accurate short-range measurements using a common visual-odometry-plus-pose-graph-optimization RGB-D SLAM algorithm (loop closure is required during this process). The depth-multiplier image is then learned from the errors between the map and the depth measurements, and is gradually optimized by maximum likelihood estimation. Unlike methods that require human intervention, this method completes the learning of the depth correction automatically during the SLAM process, making it easier to use.
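Under a simple Gaussian-noise assumption, the per-pixel multiplier described above has a closed-form maximum-likelihood estimate. The following is a minimal illustrative sketch, not the thesis's implementation: the array shapes, parameter names, and NaN encoding of invalid depths are all assumptions.

```python
import numpy as np

def learn_depth_multiplier_map(z_measured, z_map, min_samples=5):
    """Per-pixel ML estimate of a multiplier m such that
    m * z_measured ~= z_map (the depth predicted by the SLAM map).

    z_measured, z_map: arrays of shape (n_frames, H, W); invalid
    depths are encoded as NaN.  Under i.i.d. Gaussian residuals the
    ML multiplier is the per-pixel least-squares ratio
    argmin_m sum (m z_meas - z_map)^2 = sum(z_meas z_map) / sum(z_meas^2).
    """
    valid = ~(np.isnan(z_measured) | np.isnan(z_map))
    zm = np.where(valid, z_measured, 0.0)
    zp = np.where(valid, z_map, 0.0)
    num = (zm * zp).sum(axis=0)      # sum of z_meas * z_map per pixel
    den = (zm * zm).sum(axis=0)      # sum of z_meas^2 per pixel
    counts = valid.sum(axis=0)
    m = np.ones_like(num)            # default: no correction
    ok = (counts >= min_samples) & (den > 0)
    m[ok] = num[ok] / den[ok]
    return m
```

In practice the estimate would be refined incrementally as new map-versus-measurement residuals arrive during SLAM, rather than in one batch as shown here.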
     (2) In order to reduce the complexity of SLAM and increase the credibility of data association, interest-point detection algorithms are investigated in depth. By analyzing how the two main parameters, the threshold t and the number of octaves o, affect the performance of the BRISK-AGAST detection algorithm in OpenCV, an adjustable adaptive feature detection algorithm, the adjustable BRISK-AGAST detector, is proposed. The new detector enhances the stability of the extracted feature points, increases the probability and reliability of data association during SLAM, and avoids representing excessive environmental features in the map, thereby reducing the complexity of SLAM.
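One way to make the detection threshold adaptive is a simple feedback loop that keeps the keypoint count inside a target band. This is an illustrative sketch only: `detect` is a hypothetical callable (it could, for example, wrap OpenCV's `cv2.BRISK_create(thresh=t)` applied to the current frame), and the band limits, step size, and bounds are assumed values rather than the thesis's tuning.

```python
def adapt_threshold(detect, t_init=30, n_min=300, n_max=600,
                    t_lo=5, t_hi=120, step=5, max_iters=20):
    """Adjust the AGAST threshold t so the number of detected
    keypoints stays within [n_min, n_max].

    detect(t) -> number of keypoints found at threshold t.
    A higher threshold yields fewer but more stable corners.
    """
    t = t_init
    for _ in range(max_iters):
        n = detect(t)
        if n > n_max and t + step <= t_hi:
            t += step        # too many features: raise the threshold
        elif n < n_min and t - step >= t_lo:
            t -= step        # too few features: lower the threshold
        else:
            break            # count in band, or bound reached
    return t
```

Keeping the count bounded serves both goals stated above: enough features for reliable association, but not so many that the map representation and the SLAM state grow needlessly.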
     (3) In order to take full advantage of the depth information in RGB-D images to distinguish interest points more effectively, RGB-D image descriptors fusing appearance and geometric shape information are studied, focusing on the mechanism of the BRAND descriptor. Experimental results show that the BRAND descriptor is superior to EDVD, SURF, SIFT, CSHOT, and SPIN in processing time, memory consumption, and matching performance.
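The core idea of a BRAND-style descriptor is that each bit fuses an appearance test with a geometric test over a sampled pixel pair. The sketch below is a heavily simplified stand-in, not the actual BRAND algorithm: real BRAND compares smoothed intensities and tests surface-normal displacement, whereas here a plain depth comparison plays the geometric role, and the pair-sampling pattern is left to the caller.

```python
import numpy as np

def brand_like_bits(patch_intensity, patch_depth, pairs):
    """Simplified BRAND-style fused binary descriptor: bit k is set
    if either the appearance test or the geometric test fires for
    the k-th sampled pixel pair ((x1, y1), (x2, y2))."""
    bits = np.empty(len(pairs), dtype=np.uint8)
    for k, ((x1, y1), (x2, y2)) in enumerate(pairs):
        app = patch_intensity[y1, x1] < patch_intensity[y2, x2]
        geo = patch_depth[y1, x1] < patch_depth[y2, x2]
        bits[k] = np.uint8(app | geo)   # fuse appearance and shape
    return bits
```

Because the result is a bit string, two descriptors can be compared with the Hamming distance (a few machine instructions per word), which is what makes binary descriptors attractive for the running-time and memory comparisons reported above.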
     (4) Current graph-optimization-based RGB-D SLAM algorithms are not suitable for online applications because, in the absence of loop closure, the accumulated error can become very large. To overcome this defect, a new RGB-D SLAM method based on visual odometry and the extended information filter, referred to as VO-EIF SLAM, is proposed. Using the pinhole camera model and a Gaussian-mixture-based depth uncertainty model, a three-dimensional uncertainty model of RGB-D feature observations is established, which serves as the observation model of EIF SLAM. A visual dead-reckoning algorithm based on visual residuals is devised to estimate the motion control input. In addition, the observation model treats observations as sets of landmarks determined by their 3D positions and their BRAND descriptors; explicit data association is avoided by marginalizing the observation likelihood over all possible associations, thus overcoming the problems caused by incorrect correspondences between observed landmarks and those in the map.
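The observation-model construction above can be illustrated by back-projecting a pixel-plus-depth measurement through the pinhole model and propagating its uncertainty to first order. This is a sketch under stated assumptions: the simple quadratic depth-noise model `sigma_z = k_z * z**2` stands in for the thesis's Gaussian-mixture depth uncertainty model, and all parameter values are illustrative.

```python
import numpy as np

def backproject_with_cov(u, v, z, fx, fy, cx, cy,
                         sigma_uv=0.5, k_z=0.0035):
    """Back-project pixel (u, v) with depth z into camera coordinates
    and propagate its uncertainty.

    Pinhole model: X = (u - cx) z / fx, Y = (v - cy) z / fy, Z = z.
    First-order propagation: Sigma_p = J Sigma_m J^T, where Sigma_m
    is the diagonal measurement covariance and sigma_z = k_z * z^2
    is an assumed quadratic depth-noise model.
    """
    p = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
    J = np.array([                       # d p / d (u, v, z)
        [z / fx, 0.0,    (u - cx) / fx],
        [0.0,    z / fy, (v - cy) / fy],
        [0.0,    0.0,    1.0],
    ])
    sigma_z = k_z * z * z
    Sigma_m = np.diag([sigma_uv**2, sigma_uv**2, sigma_z**2])
    return p, J @ Sigma_m @ J.T
```

The resulting 3D Gaussian per feature is exactly the kind of observation an extended information filter consumes, and the quadratic growth of `sigma_z` explains why far-range Kinect landmarks contribute much weaker constraints than near-range ones.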
     (5) A fast feature-point association algorithm based on binary descriptors is studied in depth and applied to loop-closure detection for RGB-D SLAM. Two fast binary feature search algorithms are designed and implemented to solve the fast data-association problem for a single feature point: a locality-sensitive hashing search algorithm and a search algorithm based on hierarchical clustering trees. On this basis, a multi-feature-point matching algorithm fusing local geometric constraints is put forward, achieving fast loop-closure detection for RGB-D SLAM. These algorithms use the Hamming distance to measure matching quality and effectively improve the speed and accuracy of loop-closure detection.
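The locality-sensitive hashing idea for binary descriptors can be sketched compactly: each hash key is a random subset of descriptor bits, so descriptors that differ in few bits are likely to collide in at least one table, and only those collisions are checked with the exact Hamming distance. This is an illustrative toy (descriptors as 0/1 NumPy arrays, table count and key width assumed), not the thesis's implementation.

```python
import numpy as np

class BinaryLSH:
    """LSH index for binary descriptors under the Hamming distance."""

    def __init__(self, n_bits, n_tables=4, key_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        # Each table hashes on a random subset of bit positions.
        self.masks = [rng.choice(n_bits, key_bits, replace=False)
                      for _ in range(n_tables)]
        self.tables = [{} for _ in range(n_tables)]
        self.data = []

    def _keys(self, d):
        return [tuple(d[m]) for m in self.masks]

    def add(self, d):
        idx = len(self.data)
        self.data.append(d)
        for table, key in zip(self.tables, self._keys(d)):
            table.setdefault(key, []).append(idx)

    def query(self, d, max_dist=30):
        """Return (index, hamming) of the best colliding candidate,
        or None if no candidate is within max_dist bits."""
        cand = set()
        for table, key in zip(self.tables, self._keys(d)):
            cand.update(table.get(key, ()))
        best = None
        for i in cand:                       # exact check on candidates only
            h = int(np.count_nonzero(self.data[i] != d))
            if h <= max_dist and (best is None or h < best[1]):
                best = (i, h)
        return best
```

For loop-closure detection, each keyframe's descriptors would be indexed once, and a query frame's descriptors looked up in sub-linear time; local geometric constraints then filter the surviving candidate matches.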
     The conclusions and directions for future research are discussed in the last chapter. There are 53 figures, 3 tables, and 189 references.
References
[1]Nilsson N J. A mobile automaton:An application of artificial intelligence techniques:Proceedings of the 1st International Joint Conference on Artificial Intelligence, Washington, DC,1969[C]. IEEE.
    [2]Salvini P, Laschi C, Dario P. Do service robots need a driving license? [Industrial activities]. Robotics & Automation Magazine, IEEE,2011,18(2):12-13.
    [3]Kehoe B, Matsukawa A, Candido S, et al. Cloud-based robot grasping with the google object recognition engine:2013 IEEE International Conference on Robotics and Automation, Karlsruhe, 2013[C].IEEE.
    [4]Arumugam R, Enti V R, Liu B, et al. DAvinCi:A cloud computing framework for service robots: 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK,2010[C]. IEEE.
    [5]Strasdat H, Montiel JMM, Davison A J. Scale Drift-Aware large scale monocular SLAM:Proc. of Robotics:Science and Systems,2010[C]. The MIT Press.
    [6]Kerl C, Sturm J, Cremers D. Dense visual SLAM for RGB-D cameras:2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo,2013[C]. IEEE.
    [7]Henry P, Krainin M, Herbst E, et al. RGB-D mapping:Using Kinect-style depth cameras for dense 3D modeling of indoor environments. International Journal of Robotics Research, 2012,31(5):647-663.
    [8]Kummerle R, Grisetti G, Strasdat H, et al. G2o:A general framework for graph optimization: Robotics and Automation (ICRA),2011 IEEE International Conference on, Shanghai,2011 [C].
    [9]Thrun S, Burgard W, Fox D. Probabilistic robotics (Intelligent robotics and autonomous agents)[G]. The MIT Press,2005.
    [10]Leonard J J, Feder H J S. Decoupled stochastic mapping [for mobile robot & AUV navigation]. IEEE Journal of Oceanic Engineering,2001,26(4):561-571.
    [11]Leonard J, Durrant-Whyte H, Cox I J. Dynamic map building for autonomous mobile robot:IEEE International Workshop on Intelligent Robots and Systems'90, Ibaraki,1990[C]. IEEE.
    [12]Yu Hongshan, Wang Yaonan. Research progress of mobile robot localization and map building based on particle filters. Robot,2007(03):281-289. (in Chinese)
    [13]Montemerlo M, Thrun S, Koller D, et al. FastSLAM:A factored solution to the simultaneous localization and mapping problem:Eighteenth National Conference on Artificial Intelligence, Edmonton, Alberta, Canada,2002 [C]. American Association for Artificial Intelligence.
    [14]Montemerlo M, Thrun S, Koller D, et al. FastSLAM 2.0:An Improved Particle Filtering Algorithm for Simultaneous Localization and Mapping that Provably Converges:Proceedings of the 18th International Joint Conference on Artificial Intelligence, San Francisco, CA, USA, 2003 [C]. Morgan Kaufmann Publishers Inc.
    [15]Chanki K, Sakthivel R, Wan K C. Unscented FastSLAM:A robust and efficient solution to the SLAM problem. IEEE Transactions on Robotics,2008,24(4):808-820.
    [16]Eliazar A, Parr R. DP-SLAM:Fast, robust simultaneous localization and mapping without predetermined landmarks:Proceedings of the 18th International Joint Conference on Artificial Intelligence, San Francisco, CA, USA,2003 [C]. Morgan Kaufmann Publishers Inc.
    [17]Mitsou N, Tzafestas C. Maximum likelihood SLAM in dynamic environments:19th IEEE International Conference on Tools with Artificial Intelligence, Patras,2007[C]. IEEE.
    [18]Thrun S, Beetz M, Bennewitz M, et al. Probabilistic algorithms and the interactive museum Tour-Guide robot minerva[Z].2000:19,972-999.
    [19]Thrun S, Martin C, Liu Y, et al. A real-time expectation-maximization algorithm for acquiring multiplanar maps of indoor environments with mobile robots. IEEE Transactions on Robotics, 2004,20(3):433-443.
    [20]Smith R, Self M, Cheeseman P. Estimating uncertain spatial relationships in robotics:1987 IEEE International Conference on Robotics and Automation,1987[C]. IEEE.
    [21]Frese U. A discussion of simultaneous localization and mapping. Autonomous Robots, 2006,20(1):25-42.
    [22]Liu Y, Thrun S. Results for outdoor-SLAM using sparse extended information filters:2003 IEEE International Conference on Robotics and Automation,2003 [C]. IEEE.
    [23]Huang S, Wang Z, Dissanayake G. Sparse local submap joining filter for building Large-Scale maps. IEEE Transactions on Robotics,2008,24(5):1121-1130.
    [24]Walter M, Hover F, Leonard J. SLAM for ship hull inspection using exactly sparse extended information filters:2008 IEEE International Conference on Robotics and Automation, Pasadena, CA,2008[C]. IEEE.
    [25]Lu F, Milios E. Globally consistent range scan alignment for environment mapping. Autonomous Robots,1997,4(4):333-349.
    [26]Lu F, Milios E. Robot pose estimation in unknown environments by matching 2D range scans. Journal of Intelligent and Robotic Systems,1997,18(3):249-275.
    [27]Gutmann J, Konolige K. Incremental mapping of large cyclic environments:1999 IEEE International Symposium on Computational Intelligence in Robotics and Automation,1999[C]. IEEE.
    [28]Konolige K. Large-scale map-making:Proceedings of the 19th National Conference on Artifical Intelligence, San Jose, California,2004[C]. AAAI Press.
    [29]Dellaert F, Kaess M. Square Root SAM:Simultaneous localization and mapping via square root information smoothing. International Journal of Robotics Research,2006,25(12):1181-1203.
    [30]Eustice R M, Singh H, Leonard J J. Exactly sparse delayed-state filters for view-based SLAM. IEEE Transactions on Robotics,2006,22(6):1100-1114.
    [31]Newman P, Cole D, Ho K. Outdoor SLAM using visual appearance and laser ranging:Proceedings 2006 IEEE International Conference on Robotics and Automation, Orlando, FL,2006[C]. IEEE.
    [32]Henry P, Krainin M, Herbst E, et al. Rgbd mapping:Using depth cameras for dense 3d modeling of indoor environments:Proc. of the Intl. Symp. on Experimental Robotics,2010[C].
    [33]Grisetti G, Grzonka S, Stachniss C, et al. Efficient estimation of accurate maximum likelihood maps in 3D:2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA,2007[C]. IEEE.
    [34]Calonder M, Lepetit V, Fua P. Keypoint signatures for fast learning and recognition:Proceedings of the 10th European Conference on Computer Vision, Berlin, Heidelberg,2008[C]. Springer-Verlag.
    [35]Lourakis MIA, Argyros A A. SBA:A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software,2009,36(1):1-30.
    [36]Henry P, Fox D, Bhowmik A, et al. Patch volumes:Segmentation-based consistent mapping with RGB-D cameras:2013 International Conference on 3D Vision, Seattle, WA,2013[C]. IEEE.
    [37]Huang A S, Bachrach A, Henry P, et al. Visual odometry and mapping for autonomous flight using an RGB-D camera:Proc. of the Intl. Sym. of Robot. Research, Flagstaff, Arizona, USA,2011[C].
    [38]Audras C, Comport A, Meilland M, et al. Real-time dense appearance-based SLAM for RGB-D sensors. Proceedings of the 2011 Australasian Conference on Robotics and Automation,2011.
    [39]Steinbrucker F, Sturm J, Cremers D. Real-time visual odometry from dense RGB-D images:2011 IEEE International Conference on Computer Vision Workshops, Barcelona,2011 [C]. IEEE.
    [40]Whelan T, Johannsson H, Kaess M, et al. Robust real-time visual odometry for dense RGB-D mapping:2013 IEEE International Conference on Robotics and Automation, Karlsruhe,2013[C]. IEEE.
    [41]Whelan T, Kaess M, Fallon M, et al. Kintinuous:Spatially extended kinectfusion:RSS Workshop on RGB-D:Advanced Reasoning with Depth Cameras, Sydney, Australia,2012[C].
    [42]Hu G, Huang S, Zhao L, et al. A robust RGB-D SLAM algorithm:2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura,2012[C]. IEEE.
    [43]Lee D, Kim H, Myung H. GPU-based real-time RGB-D 3D SLAM:2012 9th International Conference on Ubiquitous Robots and Ambient Intelligence, Daejeon,2012[C]. IEEE.
    [44]Fioraio N, Konolige K. Realtime visual and point cloud slam:Proc. of the RGB-D workshop on advanced reasoning with depth cameras at robotics:Science and Systems Conf,2011 [C].
    [45]Endres F, Hess J, Engelhard N, et al. An evaluation of the RGB-D SLAM system:2012 IEEE International Conference on Robotics and Automation, Saint Paul, MN,2012[C]. IEEE.
    [46]Hornung A, Wurm K M, Bennewitz M. Humanoid robot localization in complex indoor environments:2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, 2010[C]. IEEE.
    [47]Sturm J, Engelhard N, Endres F, et al. A benchmark for the evaluation of RGB-D SLAM systems: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura,2012[C]. IEEE.
    [48]http://vision.in.tum.de/data/datasets/rgbd-dataset[EB/OL].
    [49]Tykkalaa T, Comportb A I, Kamarainenc J, et al. Live RGB-D camera tracking for television production studios. Journal of Visual Communication and Image Representation, 2014,25(1):207-217.
    [50]Whelan T, Kaess M, Leonard J J, et al. Deformation-based loop closure for large scale dense RGB-D SLAM:2013 IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems, Tokyo, Japan, 2013[C]. IEEE.
    [51]Endres F, Hess J, Sturm J, et al.3-D mapping with an RGB-D camera. IEEE Transactions on Robotics,2014,30(1):177-187.
    [52]Whelan T, Johannsson H, Kaess M, et al. Robust real-time visual odometry for dense RGB-D mapping:2013 IEEE International Conference on Robotics and Automation, Karlsruhe,2013[C]. IEEE.
    [53]Kerl C, Sturm J U R, Cremers D. Robust odometry estimation for rgb-d cameras:2013 IEEE International Conference on Robotics and Automation, Karlsruhe,2013[C]. IEEE.
    [54]Tykkala T, Audras C, Comport A I. Direct Iterative Closest Point for real-time visual odometry: 2011 IEEE International Conference on Computer Vision Workshops, Barcelona,2011 [C]. IEEE.
    [55]Ataer-Cansizoglu E, Taguchi Y, Ramalingam S, et al. Tracking an RGB-D camera using points and planes:2013 IEEE International Conference on Computer Vision Workshops, Sydney, NSW, 2013[C]. IEEE.
    [56]Yang Hong, Qian Kun, Dai Xianzhong, et al. 3D map building of an indoor environment for a mobile robot based on a Kinect sensor. Journal of Southeast University (Natural Science Edition),2013(S1):183-187. (in Chinese)
    [57]Jia Songmin, Wang Ke, Guo Bing, et al. 3D SLAM for mobile robots based on an RGB-D camera. Journal of Huazhong University of Science and Technology (Natural Science Edition),2014(01):103-109. (in Chinese)
    [58]Yang Dongfang, Wang Shicheng, Liu Huaping, et al. Scene modeling and autonomous navigation of robots based on the Kinect system. Robot,2012(05):581-589. (in Chinese)
    [59]Liang Mingjie, Min Huaqing, Luo Ronghua. A survey of graph-optimization-based simultaneous localization and mapping. Robot,2013(04):500-512. (in Chinese)
    [60]Bethencourt A, Jaulin L.3D reconstruction using interval methods on the kinect device coupled with an IMU. International Journal of Advanced Robotic Systems,2013,10:1-10.
    [61]Lazebnik S, Schmid C, Ponce J. Beyond bags of features:Spatial pyramid matching for recognizing natural scene categories:2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition,2006[C]. IEEE.
    [62]Kim S, Yoon K, Kweon I S. Object recognition using a generalized robust invariant feature and gestalt's law of proximity and similarity. Pattern Recognition,2008,41(2):726-741.
    [63]Morita T, Kanade T. A sequential factorization method for recovering shape and motion from image streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997,19(8):858-867.
    [64]Brown M, Lowe D G. Automatic panoramic image stitching using invariant features. International Journal of Computer Vision,2007,74(1):59-73.
    [65]Shi J, Tomasi C. Good features to track:1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA,1994[C]. IEEE.
    [66]Harris C, Stephens M. A combined corner and edge detector:Proceedings of the 4th Alvey Vision Conference, University of Manchester,1988[C].
    [67]Rosten E, Drummond T. Fusing points and lines for high performance tracking:Tenth IEEE International Conference Vision, Beijing,2005[C]. IEEE.
    [68]Rosten E, Drummond T. Machine learning for High-Speed corner detection//Leonardis A, Bischof H, Pinz A. Computer Vision-ECCV 2006. Springer Berlin Heidelberg,2006:430-443.
    [69]Leutenegger S, Chli M, Siegwart R Y. BRISK:Binary robust invariant scalable keypoints: Proceedings of the 2011 International Conference on Computer Vision, Washington, DC, USA, 2011[C]. IEEE Computer Society.
    [70]Lowe D G. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision,2004,60(2):91-110.
    [71]Bay H, Ess A, Tuytelaars T, et al. Speeded-Up robust features (SURF). Computer Vision and Image Understanding,2008,110(3):346-359.
    [72]Calonder M, Lepetit V, Strecha C, et al. BRIEF:Binary robust independent elementary features//Daniilidis K, Maragos P, Paragios N. Computer Vision-ECCV 2010. Springer Berlin Heidelberg,2010:778-792.
    [73]Rublee E, Rabaud V, Konolige K, et al. ORB:An efficient alternative to SIFT or SURF:2011 IEEE International Conference on Computer Vision, Barcelona,2011[C]. IEEE.
    [74]Saipullah K, Ismail N A, Anuar A, et al. Comparison of feature extractors for real-time object detection on android smartphone. Journal of Theoretical and Applied Information Technology, 2013,47(1):135-142.
    [75]Nourani-Vatani N, P. B, Roberts J. A study of feature extraction algorithms for optical flow tracking:Proceedings of Australasian Conference on Robotics and Automation (ACRA) 2012, Wellington, New Zealand,2012[C].
    [76]Senst T, Unger B, Keller I, et al. Performance evaluation of feature detection for local optical flow tracking:2012 International Conference on Pattern Recognition Applications and Methods, 2012[C]. SciTePress.
    [77]Rosten E, Porter R, Drummond T. Faster and better:A machine learning approach to corner detection. IEEE Transactions on Pattern Analysis and Machine Intelligence,2010,32(1):105-119.
    [78]Taylor S, Rosten E, Drummond T. Robust feature matching in 2.3 μs:IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Miami, FL,2009[C]. IEEE Computer Society.
    [79]Klein G, Murray D. Parallel tracking and mapping for small AR workspaces:6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara,2007[C]. IEEE.
    [80]Ke Y, Sukthankar R. PCA-SIFT:A more distinctive representation for local image descriptors: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA,2004[C]. IEEE Computer Society.
    [81]Mikolajczyk K, Schmid C. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(10):1615-1630.
    [82]Yu G, Morel J M. A fully affine invariant image comparison method:IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei,2009 [C]. IEEE.
    [83]Sirmacek B, Unsalan C. Urban-Area and building detection using SIFT keypoints and graph theory. IEEE Transactions on Geoscience and Remote Sensing,2009,47(4):1156-1167.
    [84]Se S, Lowe D G, Little J J. Vision-based global localization and mapping for mobile robots. IEEE Transactions on Robotics,2005,21(3):364-375.
    [85]Bay H, Tuytelaars T, Gool L. SURF:Speeded up robust features//Leonardis A, Bischof H, Pinz A. Computer Vision-ECCV 2006. Springer Berlin Heidelberg,2006:404-417.
    [86]Brown M, Lowe D G. Invariant Features from Interest Point Groups:Proceedings of the British Machine Vision Conference,2002[C]. British Machine Vision Association.
    [87]Ta D, Chen W, Gelfand N, et al. SURFTrac:Efficient tracking and continuous object recognition using local feature descriptors:2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL,2009[C]. IEEE.
    [88]Segundo M P, Gomes L, Bellon O R P, et al. Automating 3D reconstruction pipeline by surf-based alignment:2012 19th IEEE International Conference on Image Processing, Orlando, FL,2012[C]. IEEE.
    [89]Murali Y, Mahesh V. Image mosaic using speeded up robust feature detection. International Journal of Advanced Research in Electronics and Communication Engineering,2012,1(3):40-45.
    [90]Ojala T, Pietikäinen M, Harwood D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition,1996,29(1):51-59.
    [91]Cronje J. BFROST:Binary Features from Robust Orientation Segment Tests accelerated on the GPU:22nd Annual Symposium of the Pattern Recognition Association of South Africa, Emerald Casino and Resort, Vanderbijlpark, South Africa,2011[C].
    [92]Alahi A, Ortiz R, Vandergheynst P. FREAK:Fast retina keypoint:2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2012[C]. IEEE
    [93]Nascimento E R, Oliveira G L, Campos M F M, et al. BRAND:A robust appearance and depth descriptor for RGB-D images:2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura,2012[C]. IEEE.
    [94]Kinect camera, http://www.xbox.com/en-US/kinect/default.htm[EB/OL].
    [95]Stereo camera, http://en.wikipedia.org/wiki/Stereo_camera[EB/OL].
    [96]Gokturk S B, Yalcin H, Bamji C. A Time-Of-Flight depth sensor-system description, issues and solutions:Conference on Computer Vision and Pattern Recognition Workshop 2004,2004[C].
    [97]Smisek J, Jancosek M, Pajdla T.3D with kinect:2011 IEEE International Conference on Computer Vision Workshops, Barcelona,2011[C]. IEEE.
    [98]Stoyanov T, Mojtahedzadeh R, Andreasson H, et al. Comparative evaluation of range sensor accuracy for indoor mobile robotics and automated logistics applications. Robotics and Autonomous Systems,2013,61(10):1094-1105.
    [99]http://zh.wikipedia.org/wiki/Kinect[EB/OL].
    [100]http://www.primesense.com[EB/OL].
    [101]Yu Tao. Kinect Application Development in Practice: Communicating with Machines in the Most Natural Way[G]. China Machine Press,2012. (in Chinese)
    [102]OpenNI, http://www.openni.org/[EB/OL].
    [103]Microsoft Kinect SDK, http://www.microsoft.com/enus/kinectforwindows/[EB/OL].
    [104]OpenKinect, https://github.com/OpenKinect/libfreenect/[EB/OL].
    [105]Xia L, Chen C, Aggarwal J K. Human detection using depth information by Kinect:2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Colorado Springs, CO,2011[C]. IEEE.
    [106]Rougier C, Auvinet E, Rousseau J, et al. Fall Detection from Depth Map Video Sequences: Proceedings of the 9th International Conference on Toward Useful Services for Elderly and People with Disabilities:Smart Homes and Health Telematics, Berlin, Heidelberg,2011[C]. Springer-Verlag.
    [107]Zhong Z, Weihua L, Metsis V, et al. A viewpoint-independent statistical method for fall detection: 2012 21st International Conference on Pattern Recognition, Tsukuba,2012[C]. IEEE.
    [108]Han J, Pauwels E J, de Zeeuw P M, et al. Employing a RGB-D sensor for real-time tracking of humans across multiple re-entries in a smart environment. IEEE Transactions on Consumer Electronics,2012,58(2):255-263.
    [109]Liu W, Xia T, Wan J, et al. RGB-D based multi-attribute people search in intelligent visual surveillance:Proceedings of the 18th International Conference on Advances in Multimedia Modeling, Berlin, Heidelberg,2012[C]. Springer-Verlag.
    [110]Choi W, Pantofaru C, Savarese S. Detecting and tracking people using an RGB-D camera via multiple detector fusion:2011 IEEE International Conference on Computer Vision Workshops, Barcelona,2011[C]. IEEE.
    [111]Lai K, Bo L, Ren X, et al. A large-scale hierarchical multi-view rgb-d object dataset:2011 IEEE International Conference on Robotics and Automation, Shanghai,2011 [C]. IEEE.
    [112]Silberman N, Hoiem D, Kohli P, et al. Indoor Segmentation and Support Inference from RGBD Images:Proceedings of the 12th European conference on Computer Vision, Florence, Italy, 2012[C]. Springer-Verlag.
    [113]Lai K, Bo L, Ren X, et al. Sparse distance learning for object recognition combining RGB and depth information; 2011 IEEE International Conference on Robotics and Automation, Shanghai, 2011[C]. IEEE.
    [114]Bo L, Ren X, Fox D. Unsupervised feature learning for RGB-D based object recognition//Desai J P, Dudek G, Khatib O, et al. Experimental Robotics. Springer International Publishing, 2013:387-402.
    [115]Bo L, Lai K, Ren X, et al. Object recognition with hierarchical kernel descriptors:2011 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2011 [C]. IEEE.
    [116]Ren X, Bo L, Fox D. RGB-(D) scene labeling:Features and algorithms:2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2012[C]. IEEE.
    [117]Shotton J, Fitzgibbon A, Cook M, et al. Real-time human pose recognition in parts from single depth images:2011 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2011[C]. IEEE.
    [118]Girshick R B, Shotton J, Kohli P, et al. Efficient regression of general-activity human poses from depth images:2011 IEEE International Conference on Computer Vision, Barcelona,2011[C]. IEEE.
    [119]Ye M, Wang X, Yang R, et al. Accurate 3D pose estimation from a single depth image:2011 IEEE International Conference on Computer Vision, Barcelona,2011[C]. IEEE.
    [120]Shen W, Deng K, Bai X, et al. Exemplar-based human action pose correction and tagging:2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2012[C]. IEEE.
    [121]Yang X, Tian Y. EigenJoints-based action recognition using Naive-Bayes-Nearest-Neighbor:2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI,2012[C]. IEEE.
    [122]Lu X, Chia-Chih C, Aggarwal J K. View invariant human action recognition using histograms of 3D joints:2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI,2012[C]. IEEE.
    [123]Ni B, Wang G, Moulin P. RGBD-HuDaAct:A color-depth video database for human daily activity recognition:2011 IEEE International Conference on Computer Vision Workshops, Barcelona, 2011[C].IEEE.
    [124]Fothergill S, Mentis H, Kohli P, et al. Instructing people for training gestural interactive systems: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA,2012[C]. ACM.
    [125]Cheng Z, Qin L, Ye Y, et al. Human daily action analysis with multi-view and Color-Depth data//Fusiello A, Murino V, Cucchiara R. Computer Vision-ECCV 2012. Workshops and Demonstrations. Springer Berlin Heidelberg,2012:52-61.
    [126]Tara R, Santosa P, Adji T. Hand segmentation from depth image using anthropometric approach in natural interface development. International Journal of Computer Vision,2012,3(5):1-4.
    [127]Caputo M, Denker K, Dums B, et al.3D hand gesture recognition based on sensor fusion of commodity hardware:Proc. Conf. Mensch Comput, Munchen,2012[C]. Oldenbourg Verlag.
    [128]Oikonomidis I, Kyriazis N, Argyros A. Efficient model-based 3D tracking of hand articulations using Kinect:Proceedings of the British Machine Vision Conference, University of Dundee, UK, 2011[C]. BMVA Press.
    [129]Oikonomidis I, Kyriazis N, Argyros A A. Tracking the articulated motion of two strongly interacting hands:2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI,2012[C]. IEEE.
    [130]Keskin C, Kirac F, Kara Y E, et al. Real time hand pose estimation using depth sensors:2011 IEEE International Conference on Computer Vision Workshops, Barcelona,2011[C]. IEEE.
    [131]Tang M. Recognizing Hand Gestures with Microsoft's Kinect[R].Department of Electrical Engineering, Stanford University,2011.
    [132]Yi L. Hand gesture recognition using Kinect:2012 IEEE 3rd International Conference on Software Engineering and Service Science, Beijing,2012[C]. IEEE.
    [133]Engelhard N, Endres F, Hess J, et al. Real-time 3D visual SLAM with a hand-held camera:Proc. of the RGB-D Workshop on 3D Perception in Robotics at the European Robotics Forum, Vasteras, Sweden,2011[C].
    [134]Izadi S, Kim D, Hilliges O, et al. KinectFusion:Real-time 3D reconstruction and interaction using a moving depth camera:Proceedings of the 24th annual ACM symposium on User interface software and technology, Santa Barbara, California, USA,2011 [C]. ACM.
    [135]Newcombe R A, Izadi S, Hilliges O, et al. KinectFusion:Real-time dense surface mapping and tracking:2011 10th IEEE International Symposium on Mixed and Augmented Reality, Basel, 2011[C].IEEE.
    [136]Meister S, Izadi S, Kohli P, et al. When can we use KinectFusion for ground truth acquisition? Workshop on Color-Depth Camera Fusion in Robotics, IEEE International Conference on Intelligent Robots and Systems,2012[C].
    [137]Meilland M, Comport A I, Rives P. Real-time Dense Visual Tracking under Large Lighting Variations:British Machine Vision Conference, Dundee, UK,2011 [C]. BMVA Press.
    [138]Meilland M, Comport A, Rives P. Dense RGB-D mapping for Real-Time localisation and navigation:IV12 Workshop on Navigation Positioning and Mapping, Alcala de Henares, Spain, 2012[C].
    [139]Roth H, Vona M. Moving volume KinectFusion:Proc. British Machine Vision Conference, 2012[C].
    [140]Zhang Z. Flexible camera calibration by viewing a plane from unknown orientations:Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra,1999[C]. IEEE.
    [141]Teichman A, Miller S, Thrun S. Unsupervised Intrinsic Calibration of Depth Sensors via SLAM: Proceedings of Robotics:Science and Systems, Berlin, Germany,2013[C].
    [142]Herrera C. D, Kannala J, Heikkilä J, et al. Joint depth and color camera calibration with distortion correction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012,34(10):2058-2064.
    [143]Zhang Q, Pless R. Extrinsic calibration of a camera and laser range finder (improves camera calibration):Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems,2004[C]. IEEE.
    [144]Unnikrishnan R, Hebert M. Fast extrinsic calibration of a laser rangefinder to a camera[R]. Pittsburgh, PA:Robotics Institute,2005.
    [145]Scaramuzza D, Harati A, Siegwart R. Extrinsic self calibration of a camera and a 3D laser range finder from natural scenes:2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA,2007[C]. IEEE.
    [146]Fuchs S, Hirzinger G. Extrinsic and depth calibration of ToF-cameras:2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK,2008[C]. IEEE.
    [147]Lindner M, Kolb A. Calibration of the intensity-related distance error of the PMD TOF-camera: Proc. SPIE 6764, Intelligent Robots and Computer Vision XXV:Algorithms, Techniques, and Active Vision,2007[C].
    [148]Lichti D D, Kim C. A comparison of three geometric Self-Calibration methods for range cameras. Remote Sensing,2011,3(5):1014-1028.
    [149]Zhu J, Pan Z, Xu G, et al. Virtual avatar enhanced nonverbal communication from mobile phones to PCs//Pan Z, Zhang X, Rhalibi A, et al. Technologies for E-Learning and Digital Entertainment. Springer Berlin Heidelberg,2008:551-561.
    [150]Burrus N. Kinect calibration[EB/OL]. (2011-11). http://nicolas.burrus.name/index.php/Research/KinectCalibration.
    [151]Herrera C. D, Kannala J, Heikkila J. Accurate and practical calibration of a depth and color camera pair//Real P, Diaz-Pernil D, Molina-Abril H, et al. Computer Analysis of Images and Patterns. Springer Berlin Heidelberg,2011:437-445.
    [152]Zhang C, Zhang Z. Calibration between depth and color sensors for commodity depth cameras: 2011 IEEE International Conference on Multimedia and Expo, Barcelona,2011 [C]. IEEE.
    [153]MSDN Library:Kinect[EB/OL]. http://msdn.microsoft.com/en-us/library/jj131028.aspx.
    [154]Cui Y, Schuon S, Chan D, et al. 3D shape scanning with a time-of-flight camera:2010 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA,2010[C]. IEEE.
    [155]Barry D A, Parlange J Y, Li L, et al. Analytical approximations for real values of the lambert w-function. Math. Comput. Simul.,2000,53(1-2):95-103.
    [156]Levinson J, Thrun S. Unsupervised calibration for multi-beam lasers//Khatib O, Kumar V, Sukhatme G. Experimental Robotics. Springer Berlin Heidelberg,2014:179-193.
    [157]Sheehan M, Harrison A, Newman P. Self-calibration for a 3D laser. International Journal of Robotics Research,2012,31(5):675-687.
    [158]Mair E, Hager G, Burschka D, et al. Adaptive and generic corner detection based on the accelerated segment test//Daniilidis K, Maragos P, Paragios N. Computer Vision-ECCV 2010. Springer Berlin Heidelberg,2010:183-196.
    [159]De Barros M A. A high level approach to design and implementation of real time low-level image processing operators:Proceedings of the 38th Midwest Symposium on Circuits and Systems, Rio de Janeiro,1995[C]. IEEE.
    [160]Lowe D G. Object Recognition from Local Scale-Invariant Features:Proceedings of the International Conference on Computer Vision, Kerkyra,1999[C]. IEEE.
    [161]Zaharescu A, Boyer E, Varanasi K, et al. Surface feature detection and description with applications to mesh matching.2009 IEEE Conference on Computer Vision and Pattern Recognition,2009:373-380.
    [162]Tombari F, Salti S, Di Stefano L. A combined texture-shape descriptor for enhanced 3D feature matching:2011 18th IEEE International Conference on Image Processing, Brussels,2011[C]. IEEE.
    [163]Tombari F, Salti S, Di Stefano L. Unique signatures of histograms for local surface description//Daniilidis K, Maragos P, Paragios N. Computer Vision-ECCV 2010. Springer Berlin Heidelberg,2010:356-369.
    [164]Kanezaki A, Marton Z, Pangercic D, et al. Voxelized shape and color histograms for RGB-D: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Workshop on Active Semantic Perception and Object Search in the Real World, San Francisco, CA, USA, 2011[C]. IEEE.
    [165]Nascimento E R, Schwartz W R, Campos M F M. EDVD-Enhanced descriptor for visual and depth data:2012 21st International Conference on Pattern Recognition, Tsukuba,2012[C]. IEEE.
    [166]Agrawal M, Konolige K, Blas M R. CenSurE:Center surround extremas for realtime feature detection and matching//Forsyth D, Torr P, Zisserman A. Computer Vision-ECCV 2008. Springer Berlin Heidelberg,2008:102-115.
    [167]Nascimento E R, Oliveira G L, Vieira A W, et al. On the development of a robust, fast and lightweight keypoint descriptor. Neurocomputing,2013,120:141-155.
    [168]Sturm J, Magnenat S, Engelhard N, et al. Towards a benchmark for RGB-D SLAM evaluation:Proc. of the RGB-D Workshop on Advanced Reasoning with Depth Cameras at Robotics:Science and Systems Conf., Los Angeles, USA,2011[C].
    [169]Thrun S, Liu Y, Koller D, et al. Simultaneous localization and mapping with sparse extended information filters. International Journal of Robotics Research,2004,23 (7-8):693-716.
    [170]Khoshelham K, Elberink S O. Accuracy and resolution of Kinect depth data for indoor mapping applications. Sensors,2012,12(2):1437-1454.
    [171]Engel J, Sturm J, Cremers D. Camera-based navigation of a low-cost quadrocopter:2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura,2012[C]. IEEE.
    [172]Kerl C, Sturm J, Cremers D. Robust odometry estimation for RGB-D cameras:2013 IEEE International Conference on Robotics and Automation, Karlsruhe,2013[C]. IEEE.
    [173]Weiss S, Achtelik M W, Chli M, et al. Versatile distributed pose estimation and sensor self-calibration for an autonomous MAV:2012 IEEE International Conference on Robotics and Automation, Saint Paul, MN,2012[C]. IEEE.
    [174]Valenti R G, Dryanovski I, Jaramillo C, et al. Autonomous quadrotor flight using onboard RGB-D visual odometry[Z].2014.
    [175]Lucas B D, Kanade T. An iterative image registration technique with an application to stereo vision:Proceedings of the 7th International Joint Conference on Artificial Intelligence, San Francisco, CA, USA,1981 [C]. Morgan Kaufmann Publishers Inc.
    [176]Audras C, Comport A, Meilland M, et al. Real-time dense RGB-D localisation and mapping: Australian Conference on Robotics and Automation, Monash University, Australia,2011[C].
    [177]Ma Y, Soatto S, Kosecka J, et al. An invitation to 3-D vision:From images to geometric models. Interdisciplinary Applied Mathematics (IAM #26), Springer,2003.
    [178]Lange K L, Little R J A, Taylor J M G. Robust statistical modeling using the t distribution. Journal of the American Statistical Association,1989,84(408):881-896.
    [179]Liu C, Rubin D B. ML estimation of the t distribution using EM and its extensions, ECM and ECME. Statistica Sinica,1995,5(1):19-39.
    [180]Clemente L A, Davison A J, Reid I D, et al. Mapping large loops with a single hand-held camera:Proceedings of Robotics:Science and Systems, Atlanta, GA, USA,2007[C]. The MIT Press.
    [181]Neira J, Tardos J D, Castellanos J A. Linear time vehicle relocation in SLAM:2003 IEEE International Conference on Robotics and Automation, Los Alamitos,2003[C]. IEEE.
    [182]Cummins M J, Newman P M. FAB-MAP:Probabilistic localization and mapping in the space of appearance. International Journal of Robotics Research,2008,27(6):647-665.
    [183]Sivic J, Zisserman A. Video Google:A text retrieval approach to object matching in videos: Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France,2003[C]. IEEE.
    [184]Williams B, Cummins M, Neira J, et al. An image-to-map loop closing method for monocular SLAM:2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, 2008[C]. IEEE.
    [185]Williams B, Klein G, Reid I. Real-Time SLAM relocalisation:IEEE 11th International Conference on Computer Vision, Rio de Janeiro,2007[C]. IEEE.
    [186]Datar M, Indyk P. Locality-sensitive hashing scheme based on p-stable distributions:Proceedings of the Twentieth Annual Symposium on Computational Geometry, New York, NY, USA,2004[C]. ACM Press.
    [187]Indyk P, Motwani R. Approximate nearest neighbors:Towards removing the curse of dimensionality:Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, Dallas, Texas, USA,1998[C]. ACM Press.
    [188]Silpa-Anan C, Hartley R. Optimised KD-trees for fast image descriptor matching:2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK,2008[C]. IEEE.
    [189]Muja M, Lowe D G. Fast approximate nearest neighbors with automatic algorithm configuration: Proceedings of the Fourth International Conference on Computer Vision Theory and Applications, 2009[C]. INSTICC Press.
