Adaptive Optimal Tracking Control for Continuous-Time Systems Using Identifier-Critic Based Dynamic Programming
Abstract
This paper proposes an optimal tracking control method for completely unknown affine nonlinear systems, based on a system augmentation scheme and a modified discounted performance function, so that the feedforward and feedback control actions are obtained simultaneously. We first present an adaptive identifier to estimate the unknown dynamics; its parameters are updated by a new adaptive law that drives the estimates to a small neighborhood of their true values. An augmented system consisting of the tracking error and the generator of the reference trajectory is then constructed, which yields an augmented tracking Hamilton-Jacobi-Bellman (HJB) equation. Finally, an online critic neural network (NN) approximates the optimal value function of the HJB equation, from which the optimal control action is calculated. This leads to an identifier-critic based adaptive dynamic programming (ADP) structure. Simulation results demonstrate the effectiveness of the proposed method.
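The augmentation step described in the abstract can be illustrated with a short worked formulation. This is only a sketch in the standard discounted-tracking notation; every symbol below (f, g, h_d, Q, R, the discount factor \gamma, the critic weights W and basis \phi) is an assumption introduced for illustration, and the cost shown is a commonly used discounted form, not necessarily the paper's modified performance function.

```latex
% Assumed plant and reference generator (f, g unknown; h_d generates x_d)
\dot{x} = f(x) + g(x)\,u, \qquad \dot{x}_d = h_d(x_d), \qquad e = x - x_d .

% Augmented state and dynamics
X = \begin{bmatrix} e \\ x_d \end{bmatrix}, \qquad
\dot{X} = F(X) + G(X)\,u, \qquad
F(X) = \begin{bmatrix} f(e + x_d) - h_d(x_d) \\ h_d(x_d) \end{bmatrix}, \quad
G(X) = \begin{bmatrix} g(e + x_d) \\ 0 \end{bmatrix}.

% Discounted value function with Q_T = diag(Q, 0), Q \ge 0, R > 0, \gamma > 0
V(X(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)}
          \left( X^{\top} Q_T X + u^{\top} R\, u \right) d\tau .

% Augmented tracking HJB equation and the resulting control
0 = \min_{u} \left[ X^{\top} Q_T X + u^{\top} R\, u - \gamma V^{*}(X)
    + \nabla V^{*}(X)^{\top} \bigl( F(X) + G(X)\,u \bigr) \right],
\qquad
u^{*} = -\tfrac{1}{2} R^{-1} G(X)^{\top} \nabla V^{*}(X).

% Critic approximation (assumed linear-in-weights form)
V^{*}(X) \approx W^{\top} \phi(X)
\;\;\Longrightarrow\;\;
\hat{u} = -\tfrac{1}{2} R^{-1} \hat{G}(X)^{\top} \nabla\phi(X)^{\top} \hat{W}.
```

Because the single control u acts on the augmented state X, it carries both the feedforward part (driven by x_d) and the feedback part (driven by e), which is consistent with the abstract's statement that the two are obtained simultaneously. In this sketch, \hat{G} stands for the identifier's estimate of the input dynamics and \hat{W} for the online critic weights.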
