Paper Title
When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans
Paper Authors
Paper Abstract
In order to collaborate safely and efficiently, robots need to anticipate how their human partners will behave. Some of today's robots model humans as if they were also robots, and assume users are always optimal. Other robots account for human limitations, and relax this assumption so that the human is noisily rational. Both of these models make sense when the human receives deterministic rewards: i.e., gaining either $100 or $130 with certainty. But in real-world scenarios, rewards are rarely deterministic. Instead, we must make choices subject to risk and uncertainty--and in these settings, humans exhibit a cognitive bias towards suboptimal behavior. For example, when deciding between gaining $100 with certainty or $130 only 80% of the time, people tend to make the risk-averse choice--even though it leads to a lower expected gain! In this paper, we adopt a well-known Risk-Aware human model from behavioral economics called Cumulative Prospect Theory and enable robots to leverage this model during human-robot interaction (HRI). In our user studies, we offer supporting evidence that the Risk-Aware model more accurately predicts suboptimal human behavior. We find that this increased modeling accuracy results in safer and more efficient human-robot collaboration. Overall, we extend existing rational human models so that collaborative robots can anticipate and plan around suboptimal human behavior during HRI.
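To make the abstract's gamble concrete, here is a minimal sketch (not from the paper itself) contrasting a rational expected-value model with a Cumulative Prospect Theory valuation of the same choice. It assumes the standard Tversky-Kahneman (1992) functional forms; the parameter values alpha = 0.88 and gamma = 0.61 are their published median estimates for gains, not numbers taken from this work.

```python
# Sketch: why CPT predicts the risk-averse choice in the abstract's example
# ($100 for sure vs. $130 with probability 0.8). Assumes the Tversky-Kahneman
# (1992) forms; alpha=0.88 and gamma=0.61 are their median estimates, not
# parameters reported in this paper.

def value(x, alpha=0.88):
    """CPT value function for gains: diminishing sensitivity to larger amounts."""
    return x ** alpha

def weight(p, gamma=0.61):
    """CPT probability weighting: underweights moderate-to-high probabilities."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

sure_amount, risky_amount, p_win = 100.0, 130.0, 0.8

# A rational (expected-value) model prefers the gamble...
ev_risky = p_win * risky_amount                  # 0.8 * 130 = 104 > 100
print(f"Expected value of gamble: {ev_risky:.2f}")

# ...but the CPT valuation reverses that preference.
cpt_sure = value(sure_amount)                    # ~57.6
cpt_risky = weight(p_win) * value(risky_amount)  # ~0.61 * 72.5 ~ 44.0
print(f"CPT value, sure $100:  {cpt_sure:.2f}")
print(f"CPT value, risky $130: {cpt_risky:.2f}")  # lower -> take the sure $100
```

Under these widely used parameter estimates, the gamble's higher expected value ($104) is outweighed by probability underweighting and diminishing sensitivity, reproducing the risk-averse preference the abstract describes.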