Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization
  • Authors: Amanda Lampton; John Valasek; Mrinal Kumar
  • Keywords: Reinforcement learning; Morphing; Random grid
  • Journal: Control Theory and Technology
  • Publication year: 2011
  • Publication date: August 2011
  • Volume: 9
  • Issue: 3
  • Pages: 431-439
  • Full-text size: 481 KB
  • Author affiliations: Amanda Lampton (1)
    John Valasek (2)
    Mrinal Kumar (3)

    1. Systems Technology, Inc., 13766 S. Hawthorne Blvd, Hawthorne, CA, 90250, USA
    2. Department of Aerospace Engineering, Texas A&M University, 3141 TAMU, College Station, TX, 77843-3141, USA
    3. Department of Mechanical & Aerospace Engineering, University of Florida, 306 MAE-A, Gainesville, FL, 32611-6250, USA
  • Journal category: Control; Systems Theory, Control; Optimization; Computational Intelligence; Complexity; Control, Rob
  • Journal subject: Control; Systems Theory, Control; Optimization; Computational Intelligence; Complexity; Control, Robotics, Mechatronics
  • Publisher: South China University of Technology and Academy of Mathematics and Systems Science, CAS
  • ISSN: 2198-0942
Abstract
A multiresolution state-space discretization method with pseudorandom gridding is developed for the episodic unsupervised learning method of Q-learning. It is used as the learning agent for closed-loop control of morphing or highly reconfigurable systems. This paper develops a method whereby a state-space is adaptively discretized by progressively finer pseudorandom grids around the regions of interest within the state or learning space in an effort to break the Curse of Dimensionality. Utility of the method is demonstrated with application to the problem of a morphing airfoil, which is simulated by a computationally intensive computational fluid dynamics model. By setting the multiresolution method to define the region of interest by the goal the agent seeks, it is shown that this method with the pseudorandom grid can learn a specific goal within ±0.001 while reducing the total number of state-action pairs needed to achieve this level of specificity to less than 3000.
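
The idea summarized in the abstract can be illustrated with a small one-dimensional toy. The sketch below is not the authors' algorithm or their airfoil model: the function names, the reward (1 on reaching the node nearest the goal, 0 otherwise), the uniformly random exploration policy, and the refinement rule (re-grid the two cells adjacent to the goal node) are all assumptions chosen for brevity. It only shows how learning on successive coarse pseudorandom grids keeps the number of state-action pairs small while the region of interest shrinks.

```python
import random

ACTIONS = (-1, +1)  # move one grid node left or right


def pseudorandom_grid(lo, hi, n, rng):
    """Sorted grid on [lo, hi]: both endpoints plus n - 2 pseudorandom interior nodes."""
    return sorted([lo, hi] + [rng.uniform(lo, hi) for _ in range(n - 2)])


def nearest_node(grid, goal):
    """Index of the grid node closest to the goal value."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - goal))


def q_learning(n, goal_idx, rng, episodes=2000, alpha=0.5, gamma=0.9):
    """Tabular Q-learning on a chain of n nodes with a uniformly random behaviour policy."""
    q = {(s, a): 0.0 for s in range(n) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(n)              # random start node for each episode
        for _ in range(50):               # cap on episode length
            if s == goal_idx:
                break
            a = rng.choice(ACTIONS)
            s2 = min(max(s + a, 0), n - 1)  # clamp at the ends of the grid
            if s2 == goal_idx:
                target = 1.0              # terminal reward, no bootstrap
            else:
                target = gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q


def multiresolution_q_learning(lo, hi, goal, levels=3, n=8, seed=0):
    """Learn on a coarse grid, then re-grid the two cells around the goal node."""
    rng = random.Random(seed)
    history = []
    for _ in range(levels):
        grid = pseudorandom_grid(lo, hi, n, rng)
        g = nearest_node(grid, goal)
        q = q_learning(n, g, rng)
        history.append((grid, g, q))
        # Assumed refinement rule: the next region of interest spans the
        # neighbours of the goal node, which always brackets the goal.
        lo, hi = grid[max(g - 1, 0)], grid[min(g + 1, n - 1)]
    return history, (lo, hi)


history, region = multiresolution_q_learning(0.0, 1.0, goal=0.3)
total_pairs = sum(len(q) for _, _, q in history)  # 3 levels * 8 nodes * 2 actions
```

With these assumed settings the agent stores 48 state-action pairs in total, whereas a single uniform grid fine enough to resolve the final region would need many more nodes across the whole interval. The figures quoted in the abstract (±0.001, fewer than 3000 pairs) refer to the authors' morphing-airfoil experiments and are not reproduced by this toy.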
