Semi-Markov decision processes with variance minimization criterion
  • Authors: Qingda Wei (1)
    Xianping Guo (2)

    1. School of Economics and Finance, Huaqiao University, Quanzhou 362021, People's Republic of China
    2. School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou 510275, People's Republic of China
  • Keywords: Semi-Markov decision processes; State-dependent discount factors; Discount optimality equation; Discount variance minimal policy; 93E20; 90C40
  • Journal: 4OR: A Quarterly Journal of Operations Research
  • Publication year: 2015
  • Publication date: March 2015
  • Volume: 13
  • Issue: 1
  • Pages: 59-79
  • Full-text size: 268 KB
  • Journal category: Business and Economics
  • Journal subjects: Economics
    Operations Research and Decision Theory
    Optimization
    Industrial and Production Engineering
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1614-2411
Abstract
We consider a variance minimization problem for semi-Markov decision processes with state-dependent discount factors in Borel spaces. The reward function may be unbounded from below and from above. Under suitable conditions, we first prove that the discount variance minimization criterion can be transformed into an equivalent expected discount criterion, and then show the existence of a discount variance minimal policy over the class of expected discount optimal stationary policies. Furthermore, we also give a value iteration algorithm for calculating the expected discount optimal value function. Finally, two examples are used to illustrate our results.
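
The value iteration step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a finite, discrete-time model instead of the semi-Markov, Borel-space setting of the paper, and the names `value_iteration`, `r`, `P`, and `alpha` are hypothetical. It only shows how a state-dependent discount factor `alpha[x]` enters the usual fixed-point iteration for the expected discounted value.

```python
def value_iteration(r, P, alpha, tol=1e-10, max_iter=10_000):
    """Illustrative value iteration with state-dependent discount factors.

    Assumed (hypothetical) inputs for a finite model:
      r[x][a]     reward for action a in state x
      P[x][a][y]  probability of moving from x to y under action a
      alpha[x]    discount factor attached to state x, each in [0, 1)
    """
    n_states = len(r)
    v = [0.0] * n_states

    def q_value(x, a, values):
        # One-step reward plus the discounted expected continuation value.
        expected_next = sum(p * values[y] for y, p in enumerate(P[x][a]))
        return r[x][a] + alpha[x] * expected_next

    for _ in range(max_iter):
        v_new = [
            max(q_value(x, a, v) for a in range(len(r[x])))
            for x in range(n_states)
        ]
        gap = max(abs(new - old) for new, old in zip(v_new, v))
        v = v_new
        if gap < tol:  # successive iterates are close enough
            break

    # Greedy stationary policy with respect to the final value estimate.
    policy = [
        max(range(len(r[x])), key=lambda a: q_value(x, a, v))
        for x in range(n_states)
    ]
    return v, policy


# Toy two-state example (made-up numbers, not taken from the paper).
r = [[1.0, 0.0], [2.0, 2.0]]
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1: absorbing under both actions
]
alpha = [0.5, 0.9]
v, policy = value_iteration(r, P, alpha)
```

In this toy model state 1 is absorbing with reward 2 and discount 0.9, so its value is 2 / (1 - 0.9) = 20; from state 0, moving to state 1 yields 0.5 × 20 = 10, which beats staying (value 2), so the iteration converges to values close to (10, 20).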
