Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization


Authors: Amanda Lampton ; John Valasek ; Mrinal Kumar
Keywords: Reinforcement learning ; Morphing ; Random grid
Journal: Control Theory and Technology
Publication year: 2011
Publication date: August 2011
Volume: 9
Issue: 3
Pages: 431-439
Full-text size: 481 KB
Author affiliations: Amanda Lampton (1)
John Valasek (2)
Mrinal Kumar (3)

1. Systems Technology, Inc., 13766 S. Hawthorne Blvd, Hawthorne, CA, 90250, USA
2. Department of Aerospace Engineering, Texas A&M University, 3141 TAMU, College Station, TX, 77843-3141, USA
3. Department of Mechanical & Aerospace Engineering, University of Florida, 306 MAE-A, Gainesville, FL, 32611-6250, USA
Journal category: Control; Systems Theory, Control; Optimization; Computational Intelligence; Complexity; Control, Rob
Journal subject: Control; Systems Theory, Control; Optimization; Computational Intelligence; Complexity; Control, Robotics, Mechatronics
Publisher: South China University of Technology and Academy of Mathematics and Systems Science, CAS
ISSN：2198-0942

Abstract

A multiresolution state-space discretization method with pseudorandom gridding is developed for Q-learning, an episodic unsupervised learning method. Q-learning serves as the learning agent for closed-loop control of morphing or highly reconfigurable systems. In the method developed here, the state-space is adaptively discretized by progressively finer pseudorandom grids around the regions of interest within the state or learning space, in an effort to break the Curse of Dimensionality. The utility of the method is demonstrated on the problem of a morphing airfoil, which is simulated by a computationally intensive computational fluid dynamics model. When the multiresolution method defines the region of interest by the goal the agent seeks, the method with the pseudorandom grid is shown to learn a specific goal to within ±0.001 while reducing the total number of state-action pairs needed for this level of specificity to fewer than 3000.
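
The gridding idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function name `multiresolution_grids`, the number of points per level, the shrink factor, and the use of Python's seeded `random.Random` generator are all assumptions made for illustration, and the Q-learning update and the fluid dynamics model are left out entirely. The sketch only shows how progressively finer pseudorandom grids can be drawn around a goal until the region of interest is as narrow as a chosen tolerance.

```python
import random


def multiresolution_grids(lo, hi, goal, n_points=20, shrink=0.25,
                          tol=1e-3, seed=0):
    """Return one (region_lo, region_hi, points) tuple per resolution level.

    Hypothetical sketch: every level draws n_points pseudorandom states in
    the current region, then the region shrinks around the goal. Refinement
    stops once the region's half-width is at most tol.
    Assumes lo <= goal <= hi and 0 < shrink < 1.
    """
    rng = random.Random(seed)  # fixed seed gives a reproducible grid
    levels = []
    while True:
        # Pseudorandom grid for the current region of interest.
        points = sorted(rng.uniform(lo, hi) for _ in range(n_points))
        levels.append((lo, hi, points))

        half_width = (hi - lo) / 2
        if half_width <= tol:
            break

        # Shrink the region around the goal, clipped to the current region
        # so that the levels stay nested.
        new_half = half_width * shrink
        lo, hi = max(lo, goal - new_half), min(hi, goal + new_half)
    return levels


if __name__ == "__main__":
    levels = multiresolution_grids(0.0, 1.0, goal=0.37)
    total_states = sum(len(points) for _, _, points in levels)
    print("levels:", len(levels), "total states:", total_states)
    for region_lo, region_hi, _ in levels:
        print(f"  region [{region_lo:.5f}, {region_hi:.5f}]")
```

Because each level adds a fixed number of states, the total grows with the number of levels rather than with the inverse of the final resolution, whereas a single uniform grid fine enough to resolve ±0.001 over the whole interval would need several hundred points per dimension.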
