Paper Title
Deep Learning for Wireless Coded Caching with Unknown and Time-Variant Content Popularity
Authors
Abstract
Coded caching is effective in leveraging the accumulated storage size in wireless networks by distributing different coded segments of each file in multiple cache nodes. This paper aims to find a wireless coded caching policy to minimize the total discounted network cost, which involves both transmission delay and cache replacement cost, using tools from deep learning. The problem is known to be challenging due to the unknown, time-variant content popularity as well as the continuous, high-dimensional action space. We first propose a clustering based long short-term memory (C-LTSM) approach to predict the number of content requests using historical request information. This approach exploits the correlation of the historical request information between different files through clustering. Based on the predicted results, we then propose a supervised deep deterministic policy gradient (SDDPG) approach. This approach, on one hand, can learn the caching policy in continuous action space by using the actor-critic architecture. On the other hand, it accelerates the learning process by pre-training the actor network based on the solution of an approximate problem that minimizes the per-slot cost. Real-world trace-based numerical results show that the proposed prediction and caching policy using deep learning outperform the considered existing methods.
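As a rough illustration of the objective described in the abstract, the total discounted network cost can be sketched as a discounted sum of per-slot costs, where each slot's cost combines transmission delay and cache replacement cost. The function name, the discount factor `gamma`, and the trade-off weight `weight` below are assumptions made for illustration only; the abstract does not specify how the two cost terms are combined.

```python
from typing import Sequence


def discounted_network_cost(
    delays: Sequence[float],
    replacement_costs: Sequence[float],
    gamma: float = 0.9,    # assumed discount factor, not given in the abstract
    weight: float = 1.0,   # assumed weight on replacement cost, hypothetical
) -> float:
    """Return the sum over slots t of gamma**t * (delay_t + weight * replacement_t)."""
    if len(delays) != len(replacement_costs):
        raise ValueError("delays and replacement_costs must have the same length")
    total = 0.0
    for t, (delay, replacement) in enumerate(zip(delays, replacement_costs)):
        # Per-slot cost: transmission delay plus weighted cache replacement cost.
        per_slot_cost = delay + weight * replacement
        total += (gamma ** t) * per_slot_cost
    return total
```

With `gamma` close to 0, the sum is dominated by the current slot, which corresponds to the per-slot approximation the abstract mentions for pre-training the actor network. Values closer to 1 give more weight to future slots.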