Paper title
Efficient Contextual Bandits with Continuous Actions
Paper authors
Abstract
We create a computationally tractable algorithm for contextual bandits with continuous actions having unknown structure. Our reduction-style algorithm composes with most supervised learning representations. We prove that it works in a general sense and verify the new functionality with large-scale experiments.