Application of Deep Reinforcement Learning to Atari Video Games

English title: The Application of Deep Reinforcement Learning in Video Games
Authors: 石征锦 (Shi Zhengjin); 王康 (Wang Kang)
English authors: Shi Zhengjin; Wang Kang; School of Automation and Electrical Engineering, Shenyang Ligong University
Keywords: reinforcement learning; deep learning; neural network; video game
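Chinese journal name: ELEW
English journal name: Electronics World
Institution: School of Automation and Electrical Engineering, Shenyang Ligong University
Publication date: 2017-08-23
Publisher: Electronics World (电子世界)
Year: 2017
Issue: No. 526
Language: Chinese
Page: ELEW201716096
Page count: 3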
CN：16
ISSN：11-2086/TN
Classification number: 107-108+111

Abstract

Given the strengths of deep learning in image feature extraction, and in order to improve the stability of deep learning on Atari games, this paper proposes a deep neural network architecture based on model fusion, built on the combination of a convolutional neural network with an improved Q-learning algorithm from reinforcement learning. Experiments show that the new model learns the control policy well and matches or exceeds the scores of an ordinary deep reinforcement learning model on Atari games, which verifies the stability and superiority of model-fusion-based deep reinforcement learning in video games.
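
The abstract does not say how the models are fused or how Q-learning is modified. As a rough illustration only, the sketch below assumes that fusion means averaging the Q-value estimates of several independently trained networks, and it uses the standard one-step Q-learning target. The function names `fuse_q_values`, `greedy_action`, and `q_learning_target`, the sample numbers, and the discount factor are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_q_values(per_model_q: Sequence[Sequence[float]]) -> List[float]:
    """Average per-action Q-value estimates across models.

    Assumption: simple averaging stands in for the paper's fusion rule,
    which the abstract does not describe.
    """
    n_models = len(per_model_q)
    n_actions = len(per_model_q[0])
    return [
        sum(q[a] for q in per_model_q) / n_models
        for a in range(n_actions)
    ]


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_learning_target(reward: float, next_q: Sequence[float],
                      done: bool, gamma: float = 0.99) -> float:
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * max(next_q)


# Hypothetical Q-value outputs of three models for one state, three actions.
per_model_q = [
    [1.0, 2.0, 0.5],
    [0.0, 4.0, 1.5],
    [2.0, 0.0, 1.0],
]
fused = fuse_q_values(per_model_q)
action = greedy_action(fused)
target = q_learning_target(reward=1.0, next_q=fused, done=False, gamma=0.5)
```

Averaging is only one possible fusion rule; voting over each model's greedy action or weighting models by validation score would fit the same interface.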

References

[1] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[2] SILVER D, HUANG A, MADDISON C, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[3] ZHAO Dongbin, SHAO Kun, ZHU Yuanheng, LI Dong, CHEN Yaran, et al. Review of deep reinforcement learning and discussions on the development of computer Go[J]. Control Theory & Applications, DOI: 10.7641/CTA.2016.60173.
[4] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[C]//Proceedings of the NIPS Workshop on Deep Learning. Lake Tahoe: MIT Press, 2013.
[5] WATKINS C J C H. Learning from delayed rewards[D]. Cambridge: University of Cambridge, 1989.
[6] RIEDMILLER M. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method[C]//Proceedings of the 16th European Conference on Machine Learning. Porto, Portugal: Springer, 2005.
[7] BELLEMARE M G, NADDAF Y, VENESS J, BOWLING M. The arcade learning environment: an evaluation platform for general agents[J]. Journal of Artificial Intelligence Research, 2013, 47: 253-279.
