Multi-agent reinforcement learning based maintenance policy for a resource constrained flow line system
  • Authors: Xiao Wang; Hongwei Wang; Chao Qi
  • Keywords: Multiple yield deterioration; Semi-Markov decision process; Constrained resource; Multi-agent reinforcement learning; Two-machine flow line
  • Journal: Journal of Intelligent Manufacturing
  • Publication year: 2016
  • Publication date: April 2016
  • Volume: 27
  • Issue: 2
  • Pages: 325-333
  • Full-text size: 737 KB
  • Author affiliations: Xiao Wang (1) (2)
    Hongwei Wang (1)
    Chao Qi (1)

    1. Institute of Systems Engineering/The State Key Laboratory of Education Ministry for Image Processing and Intelligent Control, Huazhong University of Science and Technology, Wuhan, 430074, People’s Republic of China
    2. College of Safety Engineering, Shenyang Aerospace University, Shenyang, People’s Republic of China
  • Journal category: Business and Economics
  • Journal subjects: Economics
    Production and Logistics
    Manufacturing, Machines and Tools
    Automation and Robotics
  • Publisher: Springer Netherlands
  • ISSN: 1572-8145
Abstract
This paper investigates the maintenance problem for a flow line system consisting of two machines in series separated by a finite intermediate buffer. Each machine deteriorates independently as it operates, which produces multiple yield levels. Imperfect preventive maintenance actions, which are subject to resource constraints, may return a machine to a better state. The problem is modeled as a semi-Markov decision process. A distributed multi-agent reinforcement learning algorithm is proposed to solve the problem and to obtain a control-limit maintenance policy for each machine, where the observed state is represented by the yield level and the buffer level. An asynchronous updating rule is used in the learning process because the state transitions of the two machines are not synchronized. An experimental study is conducted to evaluate the efficiency of the proposed algorithm.
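
The abstract does not give the update rule, so the following is only a minimal illustrative sketch of how one learning agent per machine might be updated asynchronously in a semi-Markov setting. It uses a generic discounted SMDP Q-learning rule with a lump-sum reward, which is not necessarily the rule used in the paper. The class name MachineAgent, the action names, the parameters alpha and beta, and the reward values are all hypothetical.

```python
import math
from collections import defaultdict


class MachineAgent:
    """Hypothetical learner for one machine.

    A state is a (yield_level, buffer_level) tuple, following the
    state description in the abstract.
    """

    def __init__(self, actions=("produce", "maintain"), alpha=0.1, beta=0.05):
        self.actions = actions
        self.alpha = alpha  # learning rate (assumed value)
        self.beta = beta    # continuous-time discount rate (assumed value)
        self.q = defaultdict(float)  # maps (state, action) to a value estimate

    def greedy_action(self, state):
        # Pick the action with the highest current value estimate.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, sojourn_time, next_state):
        # In a semi-Markov decision process the time between decisions is
        # random, so the discount depends on the observed sojourn time.
        discount = math.exp(-self.beta * sojourn_time)
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + discount * best_next
        key = (state, action)
        self.q[key] += self.alpha * (target - self.q[key])


# Asynchronous use: only the agent whose machine has just completed a
# transition is updated; the other agent's table is left unchanged.
upstream = MachineAgent(alpha=0.5, beta=0.0)
downstream = MachineAgent(alpha=0.5, beta=0.0)
upstream.update(state=(0, 1), action="produce", reward=2.0,
                sojourn_time=3.0, next_state=(1, 1))
```

Keeping a separate table per machine is what makes the scheme distributed in this sketch: each agent learns from its own transitions and sojourn times, and a control-limit policy could then be read off by finding, for each buffer level, the yield level at which "maintain" becomes the greedy action.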
