Paper Title
Centralized & Distributed Deep Reinforcement Learning Methods for Downlink Sum-Rate Optimization
Paper Authors
Paper Abstract
For a multi-cell, multi-user cellular network, downlink sum-rate maximization through power allocation is a nonconvex and NP-hard optimization problem. In this paper, we present an effective approach to solving this problem through single- and multi-agent actor-critic deep reinforcement learning (DRL). Specifically, we use finite-horizon trust region optimization. Through extensive simulations, we show that we can simultaneously achieve higher spectral efficiency than state-of-the-art optimization algorithms such as weighted minimum mean-squared error (WMMSE) and fractional programming (FP), while offering execution times more than two orders of magnitude faster than these approaches. Additionally, the proposed trust region methods demonstrate superior performance and convergence properties compared to the Advantage Actor-Critic (A2C) DRL algorithm. In contrast to prior approaches, the proposed decentralized DRL methods allow for distributed optimization with limited channel state information (CSI) and controllable information exchange between base stations (BSs), while offering competitive performance and reduced training times.
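To make the objective in the abstract concrete, the sketch below computes the downlink sum-rate under a standard SINR-based interference-channel model; this is the quantity a DRL power-allocation agent would use as its reward. The variable names, the single-user-per-cell setup, and the noise level are illustrative assumptions and not taken from the paper itself.

```python
import numpy as np

def downlink_sum_rate(p, g, noise_power=1e-3):
    """Standard SINR-based downlink sum-rate (bits/s/Hz).

    p[i]    : transmit power of base station i toward its scheduled user
    g[i, j] : channel power gain from base station j to the user served by BS i
    (Illustrative formulation only; the exact system model follows the paper.)
    """
    signal = p * np.diag(g)                       # desired-link received power
    interference = g @ p - signal                 # sum of cross-link (interfering) powers
    sinr = signal / (interference + noise_power)  # per-user SINR
    return np.sum(np.log2(1.0 + sinr))            # Shannon sum-rate

# Example: 3 single-user cells with random channel gains
rng = np.random.default_rng(0)
g = rng.exponential(scale=1.0, size=(3, 3))       # Rayleigh-fading power gains
p = np.array([1.0, 0.5, 0.8])                     # candidate power allocation
print(downlink_sum_rate(p, g))
```

In this reading, each agent (centralized or per-BS) selects the entries of p subject to per-BS power limits, and the nonconvexity stems from the coupling of powers through the interference term in the SINR.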