The value functions of Markov decision processes

Authors: Ehud Lehrer^{a,b} (lehrer@post.tau.ac.il); Eilon Solan^a (eilons@post.tau.ac.il); Omri N. Solan^a (omrisola@post.tau.ac.il)
Keywords: Markov decision problems; Value function; Characterization
Journal: Operations Research Letters
Publication year: 2016
Publication date: September 2016
Volume: 44
Issue: 5
Pages: 587-591
Full-text size: 432 K

Abstract

It is known that the value function of a Markov decision process, as a function of the discount factor λ, is the maximum of finitely many rational functions in λ. Moreover, each root of the denominators of the rational functions either lies outside the unit ball in the complex plane, or is a unit root with multiplicity 1. We prove the converse of this result, namely, every function that is the maximum of finitely many rational functions in λ, satisfying the property that each root of the denominators of the rational functions either lies outside the unit ball in the complex plane, or is a unit root with multiplicity 1, is the value function of some Markov decision process. We thereby provide a characterization of the set of value functions of Markov decision processes.
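
The known direction of this result can be illustrated on a toy example. The sketch below is not taken from the paper: the two-state MDP, the names `policy_value` and `value`, and the convention that the payoff is the discounted sum of stage rewards (the sum over t of λ^t r_t) are all assumptions made for illustration. For a fixed stationary deterministic policy with transition matrix P and reward vector r, the value solves (I - λP) v = r, so each coordinate is a rational function of λ; maximizing over the finitely many such policies gives the value function.

```python
from fractions import Fraction
from itertools import product


def policy_value(P, r, lam, start=0):
    """Solve (I - lam * P) v = r by Gauss-Jordan elimination; return v[start]."""
    n = len(r)
    # Augmented matrix [I - lam * P | r], kept in exact rational arithmetic.
    A = [
        [Fraction(int(i == j)) - lam * P[i][j] for j in range(n)] + [Fraction(r[i])]
        for i in range(n)
    ]
    for col in range(n):
        # I - lam * P is invertible for 0 <= lam < 1, so a pivot always exists.
        pivot = next(row for row in range(col, n) if A[row][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for row in range(n):
            if row != col and A[row][col] != 0:
                factor = A[row][col]
                A[row] = [x - factor * y for x, y in zip(A[row], A[col])]
    return A[start][n]


# Hypothetical MDP: mdp[state][action] = (stage reward, transition row).
# In state 0, "stay" pays 1 forever, while "go" pays 0 once and then
# moves to the absorbing state 1, which pays 2 forever.
MDP = [
    {"stay": (1, [1, 0]), "go": (0, [0, 1])},
    {"stay": (2, [0, 1])},
]


def value(mdp, lam, start=0):
    """Maximum over all stationary deterministic policies of the policy value."""
    best = None
    for choice in product(*(state.items() for state in mdp)):
        r = [reward for _, (reward, _) in choice]
        P = [row for _, (_, row) in choice]
        v = policy_value(P, r, lam, start)
        best = v if best is None else max(best, v)
    return best


if __name__ == "__main__":
    for lam in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
        print(lam, value(MDP, lam))
```

In this example the two policies yield 1/(1 - λ) and 2λ/(1 - λ) from state 0, so the value function is their maximum, with the switch at λ = 1/2. Both denominators have the single root λ = 1 with multiplicity 1, which is consistent with the root condition stated in the abstract.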
