Target-Directed Locomotion of a Snake-Like Robot Based on Path Integral Reinforcement Learning (基于路径积分强化学习方法的蛇形机器人目标导向运动)
  • English title: Target-Directed Locomotion of a Snake-Like Robot Based on Path Integral Reinforcement Learning
  • Authors: FANG Yongchun (方勇纯); ZHU Wei (朱威); GUO Xian (郭宪)
  • Keywords: Path Integral; Reinforcement Learning; Stochastic Optimal Control; Snake-Like Robot; Target-Directed
  • Journal code (Chinese title): MSSB
  • Journal (English title): Pattern Recognition and Artificial Intelligence
  • Affiliation: Institute of Robotics and Automatic Information System, College of Artificial Intelligence, Nankai University
  • Publication date: 2019-01-15
  • Publisher: Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
  • Year: 2019
  • Issue: v.32; No.187
  • Funding: Supported by the National Natural Science Foundation of China (No. 61603200, U1613210)
  • Language: Chinese
  • Page: MSSB201901002
  • Number of pages: 9
  • CN: 01
  • ISSN: 34-1089/TP
  • Classification number: 7-15
Abstract
The path integral method originates from stochastic optimal control. It is a numerical iterative method that solves optimal control problems for continuous nonlinear systems without relying on a system model, and it converges quickly. In this paper, a policy improvement method based on path integral reinforcement learning is applied to the target-directed locomotion of a snake-like robot. The method learns the parameters of the robot's gait (serpentine) equation, so that in simulation the robot avoids obstacles and reaches the target point. Using the prior knowledge gained in simulation, the robot also completes the same task quickly in a real environment. Experimental results verify the validity of the proposed method.
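For illustration only, the following is a minimal sketch of a path-integral-style policy improvement step, in the spirit of the PI² family of algorithms: parameters are perturbed with Gaussian noise, each perturbation is scored by a cost, and the update is the softmax-weighted average of the perturbations. It is not the paper's implementation. The three-parameter gait form (amplitude, phase shift, offset), the function names, the toy quadratic cost, and every numeric setting are assumptions chosen so that the example runs on its own. In the paper's setting the cost would come from robot rollouts rather than a closed-form function.

```python
import numpy as np


def serpenoid_angles(params, n_joints=8, t=0.0):
    """Joint angles for an assumed serpenoid-style gait (hypothetical form)."""
    amplitude, phase_shift, offset = params
    joint_index = np.arange(n_joints)
    return amplitude * np.sin(2.0 * np.pi * t + joint_index * phase_shift) + offset


def toy_cost(params):
    """Hypothetical stand-in for a rollout cost; NOT the paper's cost function."""
    target = np.array([0.6, 0.8, 0.0])  # arbitrary "good" gait parameters
    return float(np.sum((params - target) ** 2))


def pi2_update(params, cost_fn, rng, n_rollouts=20, sigma=0.1, temperature=10.0):
    """One simplified update: softmax-weighted average of parameter perturbations."""
    noise = rng.normal(0.0, sigma, size=(n_rollouts, params.size))
    costs = np.array([cost_fn(params + eps) for eps in noise])
    # Rescale costs to [0, 1] so the temperature has a consistent meaning.
    scaled = (costs - costs.min()) / (costs.max() - costs.min() + 1e-12)
    weights = np.exp(-temperature * scaled)
    weights /= weights.sum()
    # Low-cost perturbations dominate the parameter update.
    return params + weights @ noise


def learn(n_iters=100, seed=0):
    """Iterate the update from an all-zero initial parameter vector."""
    rng = np.random.default_rng(seed)
    params = np.zeros(3)
    for _ in range(n_iters):
        params = pi2_update(params, toy_cost, rng)
    return params


if __name__ == "__main__":
    learned = learn()
    print("learned parameters:", learned)
    print("final cost:", toy_cost(learned))
    print("joint angles at t=0:", serpenoid_angles(learned))
```

The update uses only sampled costs, never a gradient or a dynamics model, which is the model-free property the abstract attributes to the path integral method.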