Optimality of Independently Randomized Symmetric Policies for Exchangeable Stochastic Teams with Infinitely Many Decision Makers

Paper Title

Optimality of Independently Randomized Symmetric Policies for Exchangeable Stochastic Teams with Infinitely Many Decision Makers

Authors

Sina Sanjari, Naci Saldi, Serdar Yüksel

Abstract

We study stochastic team problems (also known as decentralized stochastic control or identical-interest stochastic dynamic games) with a large or countably infinite number of decision makers, and characterize the existence and structural properties of (globally) optimal policies. We consider both static and dynamic non-convex team problems where the cost function and dynamics satisfy an exchangeability condition. To arrive at existence and structural results on optimal policies, we first introduce a topology on control policies, which involves various relaxations given the decentralized information structure. This is then used to arrive at a de Finetti type representation theorem for exchangeable policies, which in turn leads to a representation theorem for policies that admit an infinite exchangeability condition. For a general setup of stochastic team problems with $N$ decision makers, under exchangeability of the observations of the decision makers and of the cost function, we show that, without loss of global optimality, the search for optimal policies can be restricted to those that are $N$-exchangeable. Then, by extending $N$-exchangeable policies to infinitely exchangeable ones, establishing a convergence argument for the induced costs, and using the presented de Finetti type theorem, we establish the existence of an optimal decentralized policy for static and dynamic teams with a countably infinite number of decision makers, which turns out to be symmetric (i.e., identical) and randomized. In particular, unlike prior work, convexity of the cost in policies is not assumed. Finally, we show near optimality of symmetric independently randomized policies for finite $N$-decision maker team problems and thus establish approximation results for $N$-decision maker weakly coupled stochastic teams.
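
As a hedged illustration of the two notions the abstract relies on, the display below writes out the classical forms of $N$-exchangeability and of a de Finetti type mixture representation. The notation is assumed for this sketch and is not taken from the paper: $\gamma^i$ denotes the (possibly randomized) policy of decision maker $i$, $P_\pi$ the joint law of the randomized policies, $\sigma$ a permutation of $\{1,\dots,N\}$, and $\mu$ a mixing measure over single-agent randomized policies $\theta$. The paper's own statements additionally account for the decentralized information structure, which this sketch omits.

```latex
% N-exchangeability: the joint law of the policies is invariant
% under any relabeling of the decision makers.
\[
P_\pi\bigl(\gamma^{1} \in A^{1}, \dots, \gamma^{N} \in A^{N}\bigr)
  = P_\pi\bigl(\gamma^{\sigma(1)} \in A^{1}, \dots, \gamma^{\sigma(N)} \in A^{N}\bigr)
  \quad \text{for all permutations } \sigma \text{ and measurable } A^{1}, \dots, A^{N}.
\]

% Classical de Finetti form: an infinitely exchangeable law is a
% mixture of i.i.d. (symmetric, independently randomized) laws.
\[
P_\pi\bigl(\gamma^{1} \in A^{1}, \dots, \gamma^{N} \in A^{N}\bigr)
  = \int \prod_{i=1}^{N} \theta\bigl(A^{i}\bigr)\, \mu(d\theta)
  \quad \text{for every } N \ge 1.
\]
```

One way to read the second display: if the expected cost is linear in the mixing measure $\mu$, then a mixture can do no better than its best component, which is consistent with the abstract's conclusion that an optimal policy can be taken to be symmetric and independently randomized.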
