Paper Title
Similarity-based cooperative equilibrium
Paper Authors
Paper Abstract
As machine learning agents act more autonomously in the world, they will increasingly interact with each other. Unfortunately, in many social dilemmas like the one-shot Prisoner's Dilemma, standard game theory predicts that ML agents will fail to cooperate with each other. Prior work has shown that one way to enable cooperative outcomes in the one-shot Prisoner's Dilemma is to make the agents mutually transparent to each other, i.e., to allow them to access one another's source code (Rubinstein 1998, Tennenholtz 2004) -- or weights in the case of ML agents. However, full transparency is often unrealistic, whereas partial transparency is commonplace. Moreover, it is challenging for agents to learn their way to cooperation in the full transparency setting. In this paper, we introduce a more realistic setting in which agents only observe a single number indicating how similar they are to each other. We prove that this allows for the same set of cooperative outcomes as the full transparency setting. We also demonstrate experimentally that cooperation can be learned using simple ML methods.
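The setting the abstract describes can be illustrated with a toy sketch (a hypothetical model for intuition only, not the paper's actual construction): each agent commits to a threshold policy "cooperate iff similarity ≥ τ", where the similarity signal is a single scalar in [0, 1] derived from the distance between the agents' parameters (an illustrative choice). Two copies of the same agent then cooperate, while sufficiently dissimilar agents defect.

```python
import math

def similarity(theta_a, theta_b):
    # Toy scalar similarity signal in [0, 1]: 1 for identical
    # parameter vectors, decaying with Euclidean distance.
    # (Illustrative choice; the paper's signal may differ.)
    return math.exp(-math.dist(theta_a, theta_b))

def play(theta_a, theta_b, tau_a, tau_b):
    # Each agent observes only the scalar similarity and follows a
    # threshold policy: cooperate ("C") iff similarity >= its threshold.
    s = similarity(theta_a, theta_b)
    action_a = "C" if s >= tau_a else "D"
    action_b = "C" if s >= tau_b else "D"
    return action_a, action_b

theta = [1.0, -2.0, 0.5]
other = [4.0, 1.0, 3.5]  # far from theta, so similarity is near 0

print(play(theta, theta, 0.9, 0.9))  # identical agents -> ('C', 'C')
print(play(theta, other, 0.9, 0.9))  # dissimilar agents -> ('D', 'D')
```

This mirrors the intuition behind program-equilibrium-style constructions: conditioning on mutual similarity makes cooperation safe against exploiting defectors, because a defector that differs from the cooperator also lowers the similarity signal.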