Decentralized Coordination in Partially Observable Queueing Networks

Paper Title

Decentralized Coordination in Partially Observable Queueing Networks

Authors

Jiekai Jia, Anam Tahir, Heinz Koeppl

Abstract

We consider communication in a fully cooperative multi-agent system, where the agents have only partial observations of the environment and must act jointly to maximize the overall reward. The setting is a discrete-time queueing network in which agents route packets to queues based only on partial information about the current queue lengths. The queues have limited buffer capacity, so a packet is dropped when it is sent to a full queue. In this work, we implement a communication channel that lets the agents share their information in order to reduce the packet drop rate. For efficient information sharing, we use an attention-based communication model, called ATVC, to select informative messages from other agents. The agents then infer the state of the queues using a combination of variational auto-encoder (VAE) and product-of-experts (PoE) models. Ultimately, the agents learn what they need to communicate and with whom, instead of communicating all the time with everyone. We also show empirically that ATVC is able to infer the true state of the queues and leads to a policy that outperforms existing baselines.
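
The queueing setting described in the abstract can be illustrated with a minimal sketch of one time slot. This is not the paper's model: the function name `step`, the single shared `capacity`, the "arrivals first, then service" ordering, and the deterministic `served` counts are all assumptions made only to show how routing a packet to a full queue produces a drop.

```python
def step(queues, capacity, routes, served):
    """Advance a toy queueing network by one time slot (illustrative only).

    queues   : list of current queue lengths
    capacity : buffer size, assumed identical for every queue
    routes   : routes[i] is the queue index agent i sends its packet to
    served   : served[j] is how many packets queue j completes this slot
    Returns (new_queue_lengths, number_of_dropped_packets).
    """
    new_queues = list(queues)
    dropped = 0
    for target in routes:
        if new_queues[target] >= capacity:
            # The chosen queue is full, so this packet is lost.
            dropped += 1
        else:
            new_queues[target] += 1
    # Assumption: service is applied after all arrivals of the slot.
    new_queues = [max(0, q - s) for q, s in zip(new_queues, served)]
    return new_queues, dropped


# Two agents pick the already full queue 0 and lose their packets;
# the third picks the empty queue 1 and its packet is accepted.
queues, dropped = step([2, 0], capacity=2, routes=[0, 0, 1], served=[1, 0])
print(queues, dropped)  # prints [1, 1] 2
```

In this toy run, the two agents that chose queue 0 would have avoided the drops had they known it was full, which is the kind of information the communication channel in the abstract is meant to supply.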
