Paper Title

Expressing Diverse Human Driving Behavior with Probabilistic Rewards and Online Inference

Paper Authors

Liting Sun, Zheng Wu, Hengbo Ma, Masayoshi Tomizuka

Paper Abstract

In human-robot interaction (HRI) systems, such as autonomous vehicles, understanding and representing human behavior are important. Human behavior is naturally rich and diverse. Cost/reward learning, as an efficient way to learn and represent human behavior, has been successfully applied in many domains. Most traditional inverse reinforcement learning (IRL) algorithms, however, cannot adequately capture the diversity of human behavior, since they assume that all behavior in a given dataset is generated by a single cost function. In this paper, we propose a probabilistic IRL framework that directly learns a distribution of cost functions in the continuous domain. Evaluations are conducted on both synthetic data and real human driving data. Both the quantitative and subjective results show that our proposed framework can better express diverse human driving behaviors and extract different driving styles that match human participants' interpretations in our user study.
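To make the core idea concrete, the sketch below shows one possible realization of probabilistic IRL over cost functions, not the authors' exact formulation: a linear cost c_theta(xi) = theta · phi(xi), a maximum-entropy trajectory likelihood whose partition function is approximated over a finite candidate set, and a soft-EM loop over a K-component mixture of cost weights to capture distinct driving styles. The feature setup, the mixture model, and the EM/gradient scheme are all illustrative assumptions.

```python
# Illustrative sketch only: probabilistic IRL as a mixture of linear cost functions
# fit by soft EM with a max-entropy trajectory likelihood (assumed, not from the paper).
import numpy as np

def maxent_loglik(theta, feats, cand_feats):
    """log p(trajectory | theta) under a max-entropy model, with the partition
    function approximated over a finite set of candidate trajectories."""
    costs = cand_feats @ theta            # cost of each candidate trajectory
    neg = -costs
    log_z = neg.max() + np.log(np.exp(neg - neg.max()).sum())  # stable log-sum-exp
    return -(feats @ theta) - log_z

def em_mixture_irl(demo_feats, cand_feats, n_styles=2, iters=50, lr=0.1, seed=0):
    """Soft EM over a K-component mixture of cost-function weights.

    demo_feats: (N, d) features of demonstrated trajectories
    cand_feats: (M, d) features of candidate trajectories (shared across demos)
    """
    rng = np.random.default_rng(seed)
    N, d = demo_feats.shape
    thetas = rng.normal(size=(n_styles, d))           # one weight vector per style
    weights = np.full(n_styles, 1.0 / n_styles)       # mixing proportions
    for _ in range(iters):
        # E-step: responsibility of each style for each demonstration
        ll = np.array([[maxent_loglik(th, f, cand_feats) for th in thetas]
                       for f in demo_feats])                      # (N, K)
        log_resp = np.log(weights) + ll
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and take one gradient step on each theta
        weights = resp.mean(axis=0)
        for k in range(n_styles):
            grad = np.zeros(d)
            for n in range(N):
                costs = cand_feats @ thetas[k]
                p = np.exp(-(costs - costs.min()))
                p /= p.sum()
                # gradient of log-likelihood: expected features minus demo features
                grad += resp[n, k] * (p @ cand_feats - demo_feats[n])
            thetas[k] += lr * grad / max(resp[:, k].sum(), 1e-8)
    return thetas, weights

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    demos = rng.normal(size=(20, 3))   # stand-in features for 20 demonstrations
    cands = rng.normal(size=(100, 3))  # stand-in candidate-trajectory features
    thetas, weights = em_mixture_irl(demos, cands, n_styles=2)
    print("style weights:", weights)
```

In this toy setup, each mixture component plays the role of one driving style, and the responsibilities computed in the E-step are a stand-in for the kind of online inference over styles the abstract alludes to.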
