Paper Title

Fragility and Robustness in Mean-Payoff Adversarial Stackelberg Games

Authors

Mrudula Balachander, Shibashis Guha, Jean-François Raskin

Abstract

Two-player mean-payoff Stackelberg games are nonzero-sum, infinite-duration games played on a bi-weighted graph by Leader (Player 0) and Follower (Player 1). Such games are played sequentially: first, Leader announces her strategy; second, Follower chooses his best response. If we cannot impose which best response Follower chooses, we say that Follower, though strategic, is adversarial towards Leader. The maximal value that Leader can obtain in this nonzero-sum game is called the adversarial Stackelberg value (ASV) of the game. We study the robustness of Leader's strategies in these games against two types of deviations: (i) modeling imprecision: the weights on the edges of the game arena may not be exactly correct and may be delta-away from the true ones; (ii) sub-optimal response: Follower may play epsilon-optimal best responses instead of perfect best responses. First, we show that if the game is zero-sum then robustness is guaranteed, whereas in the nonzero-sum case optimal strategies for ASV are fragile. Second, we provide a solution concept that yields strategies for Leader that are robust both to modeling imprecision and to the epsilon-optimal responses of Follower, and we study several properties and algorithmic problems related to this solution concept.
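
As a rough illustration of the modeling-imprecision setting only (not of the paper's algorithms), the sketch below computes the mean-payoff of an ultimately periodic play on a bi-weighted graph and the range that payoff can take when every edge weight is off by at most delta. The cycle, its weights, and the helper names `mean_payoff` and `perturbed_range` are hypothetical and chosen for this example.

```python
from fractions import Fraction


def mean_payoff(cycle_weights):
    """Mean-payoff of an ultimately periodic play.

    For a play that eventually repeats one cycle forever, the mean-payoff
    equals the average weight along that cycle (the finite prefix is
    irrelevant in the limit).
    """
    return Fraction(sum(cycle_weights), len(cycle_weights))


def perturbed_range(cycle_weights, delta):
    """Lowest and highest mean-payoff when each weight is off by at most delta."""
    lo = mean_payoff([w - delta for w in cycle_weights])
    hi = mean_payoff([w + delta for w in cycle_weights])
    return lo, hi


# Hypothetical cycle of three edges; each edge carries a pair
# (weight for Leader, weight for Follower), as in a bi-weighted graph.
cycle = [(2, 0), (0, 3), (1, 0)]

leader_mp = mean_payoff([w0 for w0, _ in cycle])
follower_mp = mean_payoff([w1 for _, w1 in cycle])

# Assumed imprecision bound delta = 1/2 on Leader's weights.
leader_lo, leader_hi = perturbed_range([w0 for w0, _ in cycle], Fraction(1, 2))
```

On a fixed play, a delta-perturbation of the weights moves each player's mean-payoff by at most delta. The fragility described in the abstract is a separate effect: a small change in Follower's payoffs can change which responses count as best responses, and this sketch does not model that.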
