Projective Simulation for Classical Learning Agents: A Comprehensive Investigation
Details
  • Authors: Julian Mautner; Adi Makmal; Daniel Manzano; Markus Tiersch…
  • Keywords: Artificial Intelligence; Reinforcement Learning; Embodied Agent; Projective Simulation
  • Journal: New Generation Computing
  • Year of publication: 2015
  • Publication date: January 2015
  • Year: 2015
  • Volume: 33
  • Issue: 1
  • Pages: 69-114
  • Full-text size: 2,395 KB
  • References: 1. Adam, S., Busoniu, L. and Babuska, R., "Experience Replay for Real-Time Reinforcement Learning Control," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42, pp. 201-212, 2012.
    2. Briegel, H. J. and De las Cuevas, G., "Projective simulation for artificial intelligence," Sci. Rep., 2, 400, 2012.
    3. Bull, L. and Kovacs, T. (Eds.), Foundations of Learning Classifier Systems, Studies in Fuzziness and Soft Computing, 183, Springer, Berlin-Heidelberg, 2005.
    4. Butz, M. V., Shirinov, E. and Reif, K. L., "Self-Organizing Sensorimotor Maps Plus Internal Motivations Yield Animal-Like Behavior," Adaptive Behavior, 18, pp. 315-337, 2010.
    5. Butz, M. V. and Wilson, S. W., "An Algorithmic Description of XCS," Proc. IWLCS '00: Revised Papers from the Third International Workshop on Advances in Learning Classifier Systems, pp. 253-272, Springer-Verlag, London, U.K., 2001.
    6. Dietterich, T. G., "Hierarchical reinforcement learning with the MAXQ value function decomposition," Journal of Artificial Intelligence Research, 13, pp. 227-303, 2000.
    7. Floreano, D. and Mattiussi, C., Bio-inspired Artificial Intelligence: Theories, Methods, and Technologies, Intelligent Robotics and Autonomous Agents, MIT Press, Cambridge, Massachusetts, 2008.
    8. Holland, J. H., Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
    9. Lin, L.-J., "Self-improving reactive agents based on reinforcement learning, planning and teaching," Machine Learning, 8, pp. 293-321, 1992.
    10. Ormoneit, D. and Sen, S., "Kernel-based reinforcement learning," Machine Learning, 49, pp. 161-178, 2002.
    11. Pfeifer, R. and Scheier, C., Understanding Intelligence (First ed.), MIT Press, Cambridge, Massachusetts, 1999.
    12. Poole, D., Mackworth, A. and Goebel, R., Computational Intelligence: A Logical Approach, Oxford University Press, 1998.
    13. Parr, R. and Russell, S., "Reinforcement Learning with Hierarchies of Abstract Machines," Advances in Neural Information Processing Systems 10, pp. 1043-1049, MIT Press, 1997.
    14. Russell, S. J. and Norvig, P., Artificial Intelligence: A Modern Approach (Second ed.), Prentice Hall, New Jersey, 2003.
    15. Sutton, R. S., Temporal Credit Assignment in Reinforcement Learning, PhD thesis, University of Massachusetts at Amherst, 1984.
    16. Sutton, R. S., "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," Proc. of the Seventh International Conference on Machine Learning, pp. 216-224, Morgan Kaufmann, 1990.
    17. Sutton, R. S. and Barto, A. G., Reinforcement Learning: An Introduction (First ed.), MIT Press, Cambridge, Massachusetts, 1998.
    18. Sutton, R. S., Precup, D. and Singh, S., "Between MDPs and semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning," Artificial Intelligence, 112, pp. 181-211, 1999.
    19. Sutton, R. S., Szepesvari, C., Geramifard, A. and Bowling, M., "Dyna-style planning with linear function approximation and prioritized sweeping," Proc. of the 24th Conference on Uncertainty in Artificial Intelligence, pp. 528-536, 2008.
    20. Toussaint, M., "A sensorimotor map: Modulating lateral interactions for anticipation and planning," Neural Computation, 18, pp. 1132-1155, 2006.
    21. Urbanowicz, R. J. and Moore, J. H., "Learning Classifier Systems: A Complete Introduction, Review, and Roadmap," Journal of Artificial Evolution and Applications, 2009, Article ID 736398, 2009. doi:10.1155/2009/736398.
    22. Watkins, C. J. C. H., Learning from Delayed Rewards, PhD thesis, University of Cambridge, England, 1989.
    23. Watkins, C. J. C. H. and Dayan, P., "Q-learning," Machine Learning, 8, pp. 279-292, 1992.
    24. Wilson, S. W., "Classifier Fitness Based on Accuracy," Evol. Comput., 3(2), pp. 149-175, 1995.
  • Journal category: Computer Science
  • Journal subjects: Artificial Intelligence and Robotics
    Computer Hardware
    Computer Systems Organization and Communication Networks
    Software Engineering, Programming and Operating Systems
    Computing Methodologies
  • Publisher: Ohmsha, Ltd.
  • ISSN: 1882-7055
Abstract
We study the model of projective simulation (PS), a recently introduced approach to artificial intelligence based on the stochastic processing of episodic memory.2) Here we provide a detailed analysis of the model and examine its performance, including its achievable efficiency, its learning times, and the way both properties scale with the dimension of the problem. In addition, we situate the PS agent in different learning scenarios and study its learning abilities. A variety of new scenarios is considered, demonstrating the model's flexibility. Furthermore, to put the PS scheme in context, we compare its performance with that of Q-learning and learning classifier systems, two popular models in the field of reinforcement learning. We show that PS is a competitive artificial intelligence model with unique properties and strengths.
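The "stochastic processing of episodic memory" referred to in the abstract can be illustrated with the simplest PS setup described in reference 2: a two-layer clip network in which percept clips connect to action clips via weighted edges (h-values). Deliberation is a random walk that hops along an edge with probability proportional to its h-value; learning strengthens the edge traversed on a rewarded round, while all h-values are slowly damped back toward their initial value (forgetting). The following is a minimal sketch under those assumptions; class and parameter names are illustrative, not taken from the paper.

```python
import random


class PSAgent:
    """Minimal two-layer projective-simulation sketch (illustrative names).

    h[p][a] is the edge weight between percept clip p and action clip a;
    all edges start at 1.0, so every action is initially equally likely.
    """

    def __init__(self, n_percepts, n_actions, damping=0.01):
        self.n_actions = n_actions
        self.damping = damping  # gamma: rate at which h-values decay toward 1
        self.h = [[1.0] * n_actions for _ in range(n_percepts)]

    def act(self, percept):
        """One random-walk step: hop from the percept clip to an action clip
        with probability proportional to the h-value of the connecting edge."""
        weights = self.h[percept]
        r = random.random() * sum(weights)
        for action, w in enumerate(weights):
            r -= w
            if r <= 0:
                return action
        return self.n_actions - 1  # guard against floating-point rounding

    def learn(self, percept, action, reward):
        """Damp every h-value toward 1, then reinforce the traversed edge."""
        for row in self.h:
            for a in range(self.n_actions):
                row[a] -= self.damping * (row[a] - 1.0)
        self.h[percept][action] += reward


# Toy usage: percept 0 rewards action 0, percept 1 rewards action 1.
agent = PSAgent(n_percepts=2, n_actions=2, damping=0.001)
for _ in range(500):
    percept = random.randrange(2)
    action = agent.act(percept)
    agent.learn(percept, action, reward=1.0 if action == percept else 0.0)
```

After training, the h-value of each correct edge dominates its row, so the agent selects the rewarded action with high probability; the damping term is what lets the agent relearn if the rewarding rule later changes.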
