Paper Title
A Unified Multi-scale and Multi-task Learning Framework for Driver Behaviors Reasoning
Paper Authors
Paper Abstract
Mutual understanding between driver and vehicle is critically important to the design of intelligent vehicles and customized interaction interfaces. In this study, a unified driver behavior reasoning system for multi-scale and multi-task behavior recognition is proposed. Specifically, a multi-scale driver behavior recognition system is designed to recognize both the driver's physical and mental states based on a deep encoder-decoder framework. This system can jointly recognize three driver behaviors with different time scales using a shared encoder network. Driver body postures and mental behaviors, including intention and emotion, are studied and identified. The encoder network is based on a deep convolutional neural network (CNN), and several decoders for estimating the different driver states are proposed, built from fully connected (FC) layers and long short-term memory (LSTM) based recurrent neural networks (RNNs). Joint feature learning with the CNN encoder increases computational efficiency and feature diversity, while the customized decoders enable efficient multi-task inference. The proposed framework can also be used to exploit the relationship between different driver states: it is found that when drivers form a lane change intention, their emotions usually remain neutral and they focus more on the task. Two naturalistic datasets are used to investigate model performance: a local highway dataset, CranData, and a public dataset from Brain4Cars. Test results on these two datasets show that the framework recognizes driver postures, intention, and emotion accurately and outperforms existing methods.
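The shared-encoder, multi-decoder layout described in the abstract can be illustrated with a small dependency-free Python sketch. It is not the authors' implementation: a single linear layer with ReLU stands in for the CNN encoder, a plain tanh recurrence stands in for the LSTM, the weights are random rather than trained, and the feature sizes, window lengths, and class counts in `HEADS` are hypothetical values chosen only for illustration. What it shows is the control flow: every frame is encoded once, and each task decoder reads the shared features over its own time window.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def init(rows, cols):
    """Small random weight matrix; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in w]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

FRAME_DIM, FEAT_DIM = 8, 4  # hypothetical input and feature sizes

# task -> (window length in frames, number of classes); all values hypothetical
HEADS = {
    "posture": (1, 3),     # short time scale: latest frame only
    "intention": (5, 2),   # medium time scale
    "emotion": (10, 3),    # long time scale
}

W_enc = init(FEAT_DIM, FRAME_DIM)
W_in = {t: init(FEAT_DIM, FEAT_DIM) for t in HEADS}
W_rec = {t: init(FEAT_DIM, FEAT_DIM) for t in HEADS}
W_out = {t: init(n, FEAT_DIM) for t, (_, n) in HEADS.items()}

def encode(frame):
    """Shared encoder: one linear layer + ReLU, standing in for the CNN."""
    return [max(0.0, v) for v in matvec(W_enc, frame)]

def decode(task, feats):
    """Task decoder: tanh recurrence (LSTM stand-in) over the task window,
    followed by a fully connected layer and softmax."""
    window, _ = HEADS[task]
    h = [0.0] * FEAT_DIM
    for f in feats[-window:]:
        a = matvec(W_in[task], f)
        b = matvec(W_rec[task], h)
        h = [math.tanh(x + y) for x, y in zip(a, b)]
    return softmax(matvec(W_out[task], h))

def infer(frames):
    """Encode each frame once, then run every task decoder on the shared features."""
    feats = [encode(f) for f in frames]
    return {t: decode(t, feats) for t in HEADS}

# Ten random "frames" stand in for a short video clip.
frames = [[rng.random() for _ in range(FRAME_DIM)] for _ in range(10)]
outputs = infer(frames)
for task, probs in outputs.items():
    print(task, [round(p, 3) for p in probs])
```

With a window of one frame the recurrence starts from a zero state and reduces to a feed-forward layer, so the posture head behaves like an FC decoder, while the intention and emotion heads accumulate context over longer windows. Computing `feats` once and reusing it across heads is the source of the efficiency gain that the abstract attributes to joint feature learning.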