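Integrated Local Path Planning for Mobile Robots Based on Laser Radar and Neural Networks

Author: 赵慧 (Zhao Hui)
Degree level: Master's
Discipline: Pattern Recognition and Intelligent Systems
Keywords (Chinese): 人工势场 ; Q学习 ; 神经网络 ; 环境分类 ; 激光雷达
Keywords (English): artificial potential field ; Q-learning ; neural network ; environment classifier ; laser radar
Degree year: 2004
Advisor: 蔡自兴 (Cai Zixing)
Discipline code: 081104
Degree-granting institution: 中南大学 (Central South University)

Abstract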

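Path planning is one of the central problems in mobile robot research. This thesis studies local path planning for a mobile robot in uncertain environments.
     The artificial potential field method borrows the concept of a potential field from physics: the environment is imagined to exert forces on the robot, and the direction of the force is the direction in which the robot advances. The method is computationally simple, but it suffers from local minima, fails to find a passage between closely spaced obstacles, and oscillates in corridor environments. For local minima, a force pointing toward the free area can be added and used as the robot's heading.
     Q-learning gives an intelligent system the ability to learn to choose the optimal action in a Markov environment from the action sequences it has experienced. Random action selection uses a simulated annealing algorithm, and the Q-value of each action is adjusted according to the similarity between actions, which improves the robot's ability to adapt to its environment. The method is computationally complex, however: its planning cycle is 1.5 times that of the artificial potential field method, and it occupies 4 times the storage.
     This thesis proposes an integrated local path planning method in which the two methods complement each other. A BP network is introduced to classify the environment into four classes: closely spaced obstacles, corridor, U-shaped region, and other. The "other" class is planned with the artificial potential field method, to exploit its convenience and flexibility. Q-learning is used for closely spaced obstacles and corridors, where the artificial potential field method does not work properly. For U-shaped regions, local path planning cannot guarantee that the robot gets past the obstacle, so the robot is made to leave it by following the nearer U-shaped boundary.
     Laser ranging offers a wide measuring range, high precision, and fast data transfer, which suits real-time obstacle avoidance. This thesis uses a laser radar to acquire environment information in real time as the input to the BP classification network, the artificial potential field, and the Q-learning network.
     Experiments and simulations show that the method lets the artificial potential field method and Q-learning complement each other. By combining the two, the robot can find a near-optimal path in an uncertain environment, avoid collisions with static obstacles and moving objects, and reach its destination safely.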
Path planning is one of the most important problems in mobile robotics. This thesis concentrates on local path planning for a mobile robot in uncertain environments.
    The artificial potential field method imitates the concept of a potential field in physics. The environment is assumed to exert forces on the robot, and the direction of the force is the robot's direction of advance. The algorithm is easy to implement, but it has inherent limitations, such as trap situations due to local minima (cyclic behavior), no passage between closely spaced obstacles, and oscillations in narrow passages. For local minima, an additional force toward the free area is used as the robot's heading.
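
The force-direction idea in the paragraph above can be illustrated with a minimal sketch. It is not the thesis's implementation: the linear attractive force, the Khatib-style repulsive term, the point obstacles, and the names and default values of `k_att`, `k_rep`, `d0`, `free_dir`, and `k_free` are all assumptions made for illustration.

```python
import math


def apf_heading(robot, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0,
                free_dir=None, k_free=1.0):
    """Return the heading (radians) of the total potential-field force."""
    rx, ry = robot
    # Attractive force: proportional to the vector from robot to goal.
    fx = k_att * (goal[0] - rx)
    fy = k_att * (goal[1] - ry)
    for ox, oy in obstacles:
        dx, dy = rx - ox, ry - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Repulsive force: grows near the obstacle, vanishes beyond d0.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    if free_dir is not None:
        # Assumed form of the extra force toward the free area,
        # the remedy the abstract names for local minima.
        fx += k_free * math.cos(free_dir)
        fy += k_free * math.sin(free_dir)
    return math.atan2(fy, fx)
```

With the robot at (0, 0), the goal at (2, 0), and one obstacle at (1, -0.5), the returned heading is slightly positive, so the robot steers away from the side on which the obstacle lies.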
    Q-learning gives an intelligent system the ability to learn, from the action sequences it has experienced, to choose the optimal action in a Markov environment. The robot's action is selected by a simulated annealing algorithm, and the value of each action is adjusted according to the similarity among actions, which improves the robot's ability to adapt to the environment. However, Q-learning has disadvantages: it is computationally complex, its planning cycle is longer (1.5 times that of the artificial potential field), and it needs more memory (4 times that of the artificial potential field).
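
The two ingredients named above, annealed action selection and similarity-based value adjustment, might look roughly like the following. This is a sketch under stated assumptions, not the thesis's algorithm: it uses a plain Q-table where the thesis refers to a Q-learning network, a Metropolis acceptance rule for the annealing step, and a hypothetical similarity that decays with the index distance between actions (as if actions were ordered by steering angle). The names `sa_select`, `q_update`, and `spread` are illustrative.

```python
import math
import random


def sa_select(q_row, temperature, rng=random):
    """Metropolis-style choice: accept a random action with prob. exp(dQ / T)."""
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    delta = q_row[candidate] - q_row[greedy]  # always <= 0
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9, spread=0.5):
    """One Q-learning step; similar actions receive a share of the correction."""
    td_error = reward + gamma * max(q[s_next]) - q[s][a]
    for b in range(len(q[s])):
        # Hypothetical similarity: 1 for the chosen action,
        # decaying geometrically with index distance.
        similarity = spread ** abs(b - a)
        q[s][b] += alpha * similarity * td_error
```

Lowering `temperature` over time makes `sa_select` increasingly greedy, which is the usual annealing schedule: broad exploration early, exploitation later.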
    This thesis proposes an integrated local path planning method in which the two algorithms complement each other. A BP neural network is introduced to classify the robot's environment into four classes: closely spaced obstacles, corridor, U-shaped area, and other. In the last class, the artificial potential field algorithm is used for its computational simplicity. Q-learning handles closely spaced obstacles and corridors, where the artificial potential field does not work well enough. For a U-shaped area, local path planning cannot guarantee that the robot passes the obstacle, so the robot leaves it by following the nearer U-shaped border.
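
The class-to-planner assignment described above can be written as a small dispatch table. The sketch assumes the classifier produces one score per class and that the highest score wins; the enum `Env`, the planner labels, and `choose_planner` are hypothetical names, and the BP network itself is not shown.

```python
from enum import Enum


class Env(Enum):
    CLOSE_OBSTACLES = 0
    CORRIDOR = 1
    U_SHAPE = 2
    OTHER = 3


# Planner assigned to each environment class, following the abstract.
PLANNER_FOR = {
    Env.CLOSE_OBSTACLES: "q_learning",
    Env.CORRIDOR: "q_learning",
    Env.U_SHAPE: "follow_nearer_boundary",
    Env.OTHER: "potential_field",
}


def choose_planner(class_scores):
    """Pick the planner for the class with the highest classifier output."""
    if len(class_scores) != len(Env):
        raise ValueError("expected one score per environment class")
    best = max(range(len(class_scores)), key=lambda i: class_scores[i])
    return PLANNER_FOR[Env(best)]
```

Keeping the mapping in a table rather than in branching code makes the division of labor explicit: the potential field is the default, and Q-learning is called only for the two classes where the potential field is known to fail.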
    The laser range finder has a long range, high precision, and fast transmission, and is well suited to real-time obstacle avoidance. Environment information is acquired in real time by laser radar and serves as the input to the BP classification network, the artificial potential field, and the Q-learning network.
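
The abstract does not say how a scan is turned into network inputs. One common approach, shown here purely as an assumption, is to split the scan into angular sectors and keep the nearest reading of each, scaled to [0, 1]. The function name `sector_minima`, the sector count, and the `max_range` value are hypothetical and not taken from the thesis.

```python
def sector_minima(ranges, n_sectors=8, max_range=8.0):
    """Collapse a scan into the nearest reading per sector, scaled to [0, 1]."""
    n = len(ranges)
    if n_sectors <= 0 or n < n_sectors:
        raise ValueError("need at least one reading per sector")
    features = []
    for k in range(n_sectors):
        # Integer arithmetic keeps every reading in exactly one sector.
        start = k * n // n_sectors
        end = (k + 1) * n // n_sectors
        nearest = min(min(ranges[start:end]), max_range)  # clip far returns
        features.append(nearest / max_range)
    return features
```

A fixed-length feature vector like this could feed the classifier and both planners from the same scan, whatever the number of raw readings per sweep.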
    Experiments and simulations show that the method lets the artificial potential field and Q-learning complement each other. By combining the two algorithms, the robot can find a near-optimal route in an uncertain environment, avoid collisions with static and dynamic obstacles, and reach its destination safely.
