Paper title
Fixed Points of the Set-Based Bellman Operator
Paper authors
Abstract
Motivated by uncertain parameters encountered in Markov decision processes (MDPs), we study the effect of parameter uncertainty on Bellman operator-based methods. Specifically, we consider a family of MDPs whose cost parameters are drawn from a given compact set. We then define a Bellman operator that acts on an input set of value functions and produces, as output, a new set of value functions under all possible variations in the cost parameters. Finally, we prove the existence of a fixed point of this set-based Bellman operator by showing that it is a contractive operator on a complete metric space.
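
As a rough illustration of the idea in the abstract, the sketch below applies a set-based Bellman operator to a finite set of value functions and checks the contraction property numerically. Everything specific here is an assumption made for the example and is not taken from the paper: the 2-state, 2-action transition table `P`, the discount factor `GAMMA`, the use of a finite list `COSTS` as the compact cost set, the min-cost form of the Bellman update, and the choice of the Hausdorff distance induced by the sup norm as the metric on sets.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed for this sketch)

# Hypothetical MDP: P[a][s][t] = Pr(next state t | state s, action a).
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.9], [0.6, 0.4]],
]

# A finite (hence compact) set of cost parameters: COSTS[k][s][a].
COSTS = [
    [[1.0, 2.0], [0.5, 1.5]],
    [[1.2, 1.8], [0.7, 1.1]],
]


def bellman(cost, v):
    """Standard Bellman operator for one fixed cost parameter."""
    n_states, n_actions = len(v), len(P)
    return tuple(
        min(
            cost[s][a] + GAMMA * sum(P[a][s][t] * v[t] for t in range(n_states))
            for a in range(n_actions)
        )
        for s in range(n_states)
    )


def set_bellman(value_set):
    """Image of a finite set of value functions under every cost in COSTS."""
    return {bellman(c, v) for c, v in product(COSTS, value_set)}


def sup_dist(u, w):
    """Sup-norm distance between two value functions."""
    return max(abs(x - y) for x, y in zip(u, w))


def hausdorff(U, W):
    """Hausdorff distance between two finite sets of value functions."""
    def one_sided(X, Y):
        return max(min(sup_dist(x, y) for y in Y) for x in X)
    return max(one_sided(U, W), one_sided(W, U))


# Two arbitrary input sets of value functions.
U = {(0.0, 0.0), (1.0, -1.0)}
W = {(5.0, 3.0)}

before = hausdorff(U, W)
after = hausdorff(set_bellman(U), set_bellman(W))

# Contraction check: one application shrinks the distance by at least GAMMA.
print(after <= GAMMA * before + 1e-12)
```

Note that with a finite cost set the output set can grow by a factor of `len(COSTS)` per application, so this sketch applies the operator only once rather than iterating toward the fixed point.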