Hierarchical Multi-Agent DRL-Based Framework for Joint Multi-RAT Assignment and Dynamic Resource Allocation in Next-Generation HetNets

Paper Title

Hierarchical Multi-Agent DRL-Based Framework for Joint Multi-RAT Assignment and Dynamic Resource Allocation in Next-Generation HetNets

Authors

Abdulmalik Alwarafy, Bekir Sait Ciftler, Mohamed Abdallah, Mounir Hamdi, Naofal Al-Dhahir

Abstract

This paper considers the problem of cost-aware downlink sum-rate maximization via joint optimal radio access technology (RAT) assignment and power allocation in next-generation heterogeneous wireless networks (HetNets). We consider a future HetNet comprising multiple RATs and serving multi-connectivity edge devices (EDs), and we formulate the problem as a mixed-integer non-linear programming (MINP) problem. Because of the high complexity and combinatorial nature of this problem, and the difficulty of solving it with conventional methods, we propose a hierarchical multi-agent deep reinforcement learning (DRL)-based framework, called DeepRAT, to solve it efficiently and learn the system dynamics. In particular, the DeepRAT framework decomposes the problem into two main stages: the RATs-EDs assignment stage, which implements a single-agent Deep Q Network (DQN) algorithm, and the power allocation stage, which uses a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm. Using simulations, we demonstrate how the various DRL agents interact efficiently to learn the system dynamics and derive the global optimal policy. Furthermore, our simulation results show that the proposed DeepRAT algorithm outperforms existing state-of-the-art heuristic approaches in terms of network utility. Finally, we quantitatively show the ability of the DeepRAT model to adapt quickly and dynamically to abrupt changes in network dynamics, such as ED mobility.
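To make the "cost-aware downlink sum-rate" objective concrete, the following is a minimal sketch of how such a utility could be evaluated for a given RAT assignment and power allocation. It is an illustration only, not the formulation from the paper: the Shannon-rate model, the linear power cost, and every name here (`cost_aware_sum_rate`, `assign`, `power`, `gain`, `bandwidth_hz`, `noise_w`, `cost_per_watt`) are assumptions, and interference between links is ignored.

```python
import math


def cost_aware_sum_rate(assign, power, gain, bandwidth_hz, noise_w, cost_per_watt):
    """Hypothetical utility: total downlink rate minus a linear power cost.

    assign[r][e] is 1 if RAT r serves ED e, else 0 (the discrete decision).
    power[r][e] is the transmit power in watts (the continuous decision).
    All modelling choices are assumptions made for illustration.
    """
    utility = 0.0
    for r, row in enumerate(assign):        # r indexes RATs
        for e, linked in enumerate(row):    # e indexes edge devices
            if not linked:
                continue
            snr = power[r][e] * gain[r][e] / noise_w
            rate = bandwidth_hz[r] * math.log2(1.0 + snr)  # Shannon rate, bit/s
            utility += rate - cost_per_watt[r] * power[r][e]
    return utility


# Toy usage: two RATs, two EDs, each ED served by one RAT.
example_utility = cost_aware_sum_rate(
    assign=[[1, 0], [0, 1]],
    power=[[1.0, 0.0], [0.0, 3.0]],
    gain=[[1.0, 1.0], [1.0, 1.0]],
    bandwidth_hz=[10.0, 5.0],
    noise_w=1.0,
    cost_per_watt=[2.0, 1.0],
)
```

In this sketch the binary `assign` matrix plays the role of the discrete assignment decision and the real-valued `power` matrix the continuous one. That mix of variable types is what makes the joint problem mixed-integer, and it parallels the two-stage split described in the abstract, with DQN handling the assignment and DDPG the power allocation.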
