A reinforcement-learning approach for admission control in distributed network service systems


Authors: Xiaonong Lu ; Baoqun Yin ; Haipeng Zhang
Keywords: Distributed network service system ; Admission control ; SMDP ; Reinforcement learning ; Policy switching mechanism
Journal: Journal of Combinatorial Optimization
Publication year: 2016
Publication date: April 2016
Volume: 31
Issue: 3
Pages: 1241-1268
Full-text size: 1,236 KB
Author affiliations: Xiaonong Lu (1)
Baoqun Yin (1)
Haipeng Zhang (1)

1. Department of Automation, University of Science and Technology of China, Hefei, 230027, China
Journal category: Mathematics and Statistics
Journal subjects: Mathematics
Combinatorics
Convex and Discrete Geometry
Mathematical Modeling and Industrial Mathematics
Theory of Computation
Optimization
Operations Research and Decision Theory
Publisher: Springer Netherlands
ISSN: 1573-2886

Abstract

In distributed network service systems with multiple service nodes, such as streaming-media and resource-sharing systems, admission control (AC) is essential for enhancing performance. Model-based optimization is a natural way to analyze the system and compute the optimal AC policy. However, because of the "curse of dimensionality", computing such a policy for practical systems is difficult. In this paper, we consider a general model of distributed network service systems and address the problem of designing an optimal AC policy. For the system with fixed parameters, we present an analytical model based on a semi-Markov decision process (SMDP). We design an event-driven AC policy that takes the stationary randomized policy as its structure. To solve the SMDP, we apply both a state aggregation approach and a reinforcement-learning (RL) method with an online policy optimization algorithm. We then extend the problem to a system with time-varying parameters, in which the arrival rates of requests at each service node may change over time. For this case, we present an AC policy switching mechanism that lets the system decide, according to a policy switching rule, whether to adjust its AC policy. To maximize the system gain, that is, to obtain the optimal policy switching rule, we apply another RL-based algorithm. Numerical experiments assess the effectiveness of the SMDP-based AC policy and the policy switching mechanism. We compare the optimal policies obtained by the proposed methods with other classical AC policies. The simulation results show that the SMDP model and RL-based algorithms proposed in this paper can achieve higher performance and computational efficiency.
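To make the idea of an event-driven, stationary randomized AC policy more concrete, the following is a minimal sketch and not the authors' model or algorithm. It assumes a single service node with a fixed number of servers, Poisson arrivals, exponential service times, a fixed reward per admitted request, and a holding cost per busy server; the function name `simulate` and all parameter values are hypothetical. The policy is a list `admit_prob`, where `admit_prob[n]` is the probability of admitting a request that arrives when `n` servers are busy, and decisions are made only at arrival events. The sketch only estimates the long-run average reward rate of a given policy by simulation; it does not perform state aggregation or any RL-based optimization.

```python
import random


def simulate(admit_prob, arrival_rate=2.0, service_rate=1.0,
             reward=1.0, holding_cost=0.2, n_events=20000, seed=0):
    """Estimate the average reward rate of a randomized admission policy.

    admit_prob[n] is the probability of admitting a request that arrives
    when n servers are busy; len(admit_prob) is the number of servers.
    All rates, rewards, and costs here are illustrative assumptions.
    """
    rng = random.Random(seed)
    capacity = len(admit_prob)
    busy = 0          # number of busy servers (the system state)
    clock = 0.0       # simulated time
    total = 0.0       # accumulated reward minus holding cost
    admitted = 0      # number of admitted requests

    for _ in range(n_events):
        # Time to the next event is exponential with the total event rate.
        rate = arrival_rate + busy * service_rate
        dt = rng.expovariate(rate)
        clock += dt
        total -= holding_cost * busy * dt

        if rng.random() < arrival_rate / rate:
            # Arrival event: the only decision epoch of the policy.
            if busy < capacity and rng.random() < admit_prob[busy]:
                busy += 1
                admitted += 1
                total += reward
        else:
            # Departure event: one busy server finishes its request.
            busy -= 1

    return total / clock, admitted


if __name__ == "__main__":
    for name, policy in [("admit all", [1.0, 1.0, 1.0]),
                         ("randomized", [1.0, 0.8, 0.3]),
                         ("reject all", [0.0, 0.0, 0.0])]:
        avg, count = simulate(policy)
        print(f"{name}: average reward rate {avg:.3f}, admitted {count}")
```

An online policy optimization method of the kind the abstract mentions would adjust the entries of `admit_prob` from observed events rather than evaluate fixed policies as done here.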
