Abstract
To address the multi-objective optimization problem of resource allocation in cellular networks, a resource allocation algorithm based on deep reinforcement learning is proposed. First, a deep neural network (DNN) is constructed to optimize the transmission rate of the cellular system, which completes the forward pass of the algorithm. Then, with energy efficiency as the reward, the Q-learning mechanism is used to construct the error function, and gradient descent is used to train the DNN weights, which completes the backward training pass. Simulation results show that the proposed algorithm can autonomously set the degree of emphasis in the resource allocation scheme, converges quickly, and clearly outperforms the other algorithms compared in optimizing transmission rate and system energy consumption.
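The two passes described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no network size, state definition, or parameter values, so everything here is a hypothetical choice made for illustration. The sketch assumes a single link whose state is a scalar channel gain, a discrete set of transmit power levels as actions, a one-hidden-layer network, and the constants `POWER_LEVELS`, `NOISE`, `CIRCUIT_POWER`, `GAMMA`, `LR`, and `HIDDEN`. The forward pass produces one Q-value per power level, the reward is energy efficiency (rate per watt), and the backward pass takes one gradient-descent step on the squared Q-learning error.

```python
import math
import random

# All constants below are hypothetical values chosen for illustration.
POWER_LEVELS = [0.1, 0.5, 1.0, 2.0]  # candidate transmit powers (W)
NOISE = 0.1                          # noise power (W)
CIRCUIT_POWER = 0.5                  # static circuit power (W)
GAMMA = 0.9                          # Q-learning discount factor
LR = 0.01                            # gradient-descent step size
HIDDEN = 8                           # hidden-layer width


def energy_efficiency(gain, power):
    """Reward: spectral efficiency (bit/s/Hz) per watt consumed."""
    rate = math.log2(1.0 + gain * power / NOISE)
    return rate / (power + CIRCUIT_POWER)


class TinyQNet:
    """One-hidden-layer network: scalar channel gain -> one Q-value per power level."""

    def __init__(self, rng):
        n_out = len(POWER_LEVELS)
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
        self.b1 = [0.0] * HIDDEN
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, gain):
        """Forward pass: returns hidden activations and the list of Q-values."""
        hidden = [math.tanh(w * gain + b) for w, b in zip(self.w1, self.b1)]
        q = [sum(w * h for w, h in zip(row, hidden)) + b
             for row, b in zip(self.w2, self.b2)]
        return hidden, q

    def train_step(self, gain, action, target):
        """Backward pass: one gradient step on 0.5 * (Q(s, a) - target)^2.

        Returns the error Q(s, a) - target measured before the update.
        """
        hidden, q = self.forward(gain)
        err = q[action] - target
        for j in range(HIDDEN):
            # Gradient w.r.t. the pre-activation of hidden unit j,
            # computed before w2 is modified.
            grad_pre = err * self.w2[action][j] * (1.0 - hidden[j] ** 2)
            self.w2[action][j] -= LR * err * hidden[j]
            self.w1[j] -= LR * grad_pre * gain
            self.b1[j] -= LR * grad_pre
        self.b2[action] -= LR * err
        return err


def train(steps=3000, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning over randomly drawn channel gains."""
    rng = random.Random(seed)
    net = TinyQNet(rng)
    gain = rng.uniform(0.1, 1.0)
    for _ in range(steps):
        _, q = net.forward(gain)
        if rng.random() < epsilon:
            action = rng.randrange(len(POWER_LEVELS))
        else:
            action = max(range(len(q)), key=q.__getitem__)
        reward = energy_efficiency(gain, POWER_LEVELS[action])
        next_gain = rng.uniform(0.1, 1.0)
        _, q_next = net.forward(next_gain)
        # Q-learning target: immediate reward plus discounted best next value.
        target = reward + GAMMA * max(q_next)
        net.train_step(gain, action, target)
        gain = next_gain
    return net


if __name__ == "__main__":
    net = train()
    _, q_values = net.forward(0.5)
    best = max(range(len(q_values)), key=q_values.__getitem__)
    print("Preferred power level at gain 0.5:", POWER_LEVELS[best])
```

In this toy setting the transmission rate enters only through the reward. How the proposed algorithm weights rate against energy consumption is not specified in the abstract, so the sketch does not model it.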