The value functions of Markov decision processes
Abstract
It is known that the value function of a Markov decision process, as a function of the discount factor λ, is the maximum of finitely many rational functions in λ. Moreover, each root of the denominators of these rational functions either lies outside the unit ball in the complex plane, or is a unit root with multiplicity 1. We prove the converse of this result: every function that is the maximum of finitely many rational functions in λ, where each root of the denominators either lies outside the unit ball in the complex plane or is a unit root with multiplicity 1, is the value function of some Markov decision process. We thereby provide a characterization of the set of value functions of Markov decision processes.
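
The known direction of the result can be illustrated with a minimal sketch. It assumes the unnormalized discounted payoff (the sum of λ^t times the stage reward) and uses the standard fact that, for a fixed initial state, the value is attained by a stationary deterministic policy. Each such policy π has payoff vector (I - λP_π)^{-1} r_π, which is rational in λ, and the value is the maximum over the finitely many policies. The two-state MDP and the names `actions`, `solve`, `policy_value`, and `value` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from fractions import Fraction
from itertools import product

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Hypothetical 2-state MDP: actions[s] lists (reward, transition row) pairs.
actions = [
    [(Fraction(1), [Fraction(1), Fraction(0)]),   # state 0, "stay": reward 1
     (Fraction(0), [Fraction(0), Fraction(1)])],  # state 0, "move": reward 0
    [(Fraction(2), [Fraction(0), Fraction(1)])],  # state 1, absorbing: reward 2
]

def policy_value(policy, lam):
    """Payoff vector of a stationary deterministic policy: (I - lam*P)^(-1) r."""
    n = len(actions)
    r = [actions[s][policy[s]][0] for s in range(n)]
    P = [actions[s][policy[s]][1] for s in range(n)]
    A = [[(1 if i == j else 0) - lam * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, r)

def value(lam, state=0):
    """Value at `state`: maximum over all stationary deterministic policies."""
    policies = product(*(range(len(a)) for a in actions))
    return max(policy_value(p, lam)[state] for p in policies)

print(value(Fraction(1, 4)))  # "stay" is optimal: 1 / (1 - 1/4)
print(value(Fraction(3, 4)))  # "move" is optimal: 2 * (3/4) / (1 - 3/4)
```

In this example the value at state 0 is max(1/(1-λ), 2λ/(1-λ)) for 0 ≤ λ < 1, with the two branches crossing at λ = 1/2. Both denominators have the single root λ = 1, a unit root of multiplicity 1, which is consistent with the root condition stated in the abstract.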
