Value set iteration for two-person zero-sum Markov games
Abstract
We present a novel exact algorithm, “value set iteration” (VSI), for solving two-person zero-sum Markov games (MGs). VSI generalizes value iteration (VI) and serves as a general framework for combining multiple solution methods. We introduce a novel operator on the value function space and apply it iteratively with an arbitrary sequence of sets of policies, extending Chang’s VSI for MDPs to the MG setting. We show that VSI for MGs converges to the equilibrium value function at a rate that is at least linear, and that a suitable choice of the sequence of policy sets can potentially reduce the number of iterations needed for convergence.
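
For orientation, the baseline that VSI generalizes can be sketched in a few lines. The block below is a minimal sketch of classical value iteration for a discounted zero-sum Markov game (Shapley's recursion), not the VSI operator itself, which the abstract does not specify. It assumes a discount factor `gamma` below 1 and exactly two actions per player, so that each stage game can be solved in closed form. The names `matrix_game_value_2x2`, `shapley_value_iteration`, `reward`, and `trans` are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matrix_game_value_2x2(m: Matrix) -> float:
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    maximin = max(min(row) for row in m)
    minimax = min(max(m[0][j], m[1][j]) for j in range(2))
    if maximin == minimax:
        # A pure-strategy saddle point exists.
        return maximin
    # No saddle point: use the closed-form mixed-strategy value.
    (a, b), (c, d) = m
    return (a * d - b * c) / (a + d - b - c)


def shapley_value_iteration(
    reward: List[Matrix],             # reward[s][a][b], paid to the maximizer
    trans: List[List[List[List[float]]]],  # trans[s][a][b][t] = P(t | s, a, b)
    gamma: float,                     # discount factor, assumed in [0, 1)
    iters: int,
) -> List[float]:
    """Repeatedly apply the VI operator, starting from the zero function."""
    n = len(reward)
    v = [0.0] * n
    for _ in range(iters):
        new_v = []
        for s in range(n):
            # Stage game at state s: immediate reward plus discounted
            # expected continuation value under the current estimate v.
            stage = [
                [
                    reward[s][a][b]
                    + gamma * sum(trans[s][a][b][t] * v[t] for t in range(n))
                    for b in range(2)
                ]
                for a in range(2)
            ]
            new_v.append(matrix_game_value_2x2(stage))
        v = new_v
    return v
```

Each sweep of this recursion contracts the sup-norm error by a factor of `gamma`, which is the linear rate that the abstract takes as its reference point. For more than two actions per player, the stage game would have to be solved with a linear program rather than the closed form used here.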
