Convergence of Deep Fictitious Play for Stochastic Differential Games

Paper Title

Convergence of Deep Fictitious Play for Stochastic Differential Games

Authors

Jiequn Han, Ruimeng Hu, Jihao Long

Abstract

Stochastic differential games have been used extensively to model agents' competition in finance, for instance in P2P lending platforms from the Fintech industry, in the banking system for systemic risk, and in insurance markets. The recently proposed machine learning algorithm, deep fictitious play, provides a novel and efficient tool for finding Markovian Nash equilibria of large $N$-player asymmetric stochastic differential games [J. Han and R. Hu, Mathematical and Scientific Machine Learning Conference, pages 221-245, PMLR, 2020]. By incorporating the idea of fictitious play, the algorithm decouples the game into $N$ sub-optimization problems and identifies each player's optimal strategy with the deep backward stochastic differential equation (BSDE) method, in parallel and repeatedly. In this paper, we prove the convergence of deep fictitious play (DFP) to the true Nash equilibrium. We also show that the strategy based on DFP forms an $\epsilon$-Nash equilibrium. We generalize the algorithm by proposing a new approach to decouple the games, and present numerical results for large population games showing the empirical convergence of the algorithm beyond the technical assumptions in the theorems.
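
The decoupling idea in the abstract can be illustrated on a much simpler problem. The following is a minimal sketch, not the paper's algorithm: it replaces the stochastic differential game with a hypothetical static quadratic game, and replaces the deep BSDE solver with a closed-form best response. The cost function, the parameters `b` and `c`, and all function names are assumptions made for illustration only. At each stage, every player solves their own optimization problem against the other players' previous-stage actions, and the $N$ problems are independent of one another.

```python
import numpy as np

# Hypothetical toy game (an assumption, not taken from the paper):
# player i picks a_i to minimize
#   J_i = 0.5 * a_i**2 - a_i * (b_i + c * mean of the other players' actions).
# The minimizer is a_i = b_i + c * mean_{j != i} a_j.

def best_response(a_prev, b, c):
    """Solve all N decoupled problems against the previous-stage actions."""
    n = len(b)
    others_mean = (a_prev.sum() - a_prev) / (n - 1)
    return b + c * others_mean

def fictitious_play(b, c, stages=60):
    """Repeat the decoupled best-response step for a fixed number of stages."""
    a = np.zeros_like(b)
    for _ in range(stages):
        a = best_response(a, b, c)
    return a

def nash_equilibrium(b, c):
    """Exact equilibrium of the toy game from the linear fixed-point system."""
    n = len(b)
    k = c / (n - 1)
    # a_i * (1 + k) - k * sum(a) = b_i for every player i
    A = (1 + k) * np.eye(n) - k * np.ones((n, n))
    return np.linalg.solve(A, b)

b = np.array([1.0, 2.0, 3.0, 4.0])
a_fp = fictitious_play(b, c=0.5)
a_ne = nash_equilibrium(b, c=0.5)
gap = float(np.max(np.abs(a_fp - a_ne)))
print("max deviation from equilibrium:", gap)
```

In this toy setting the iteration is a linear contraction whenever $|c| < 1$, which is why the staged best responses approach the equilibrium. The paper's convergence results concern the far harder dynamic setting and rely on their own technical assumptions.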
