Paper Title

Multi-view and Multi-modal Event Detection Utilizing Transformer-based Multi-sensor Fusion

Authors

Masahiro Yasuda, Yasunori Ohishi, Shoichiro Saito, Noboru Harada

Abstract

We tackle a challenging task: multi-view and multi-modal event detection, which detects events in a wide-range real environment by utilizing data from distributed cameras and microphones and their weak labels. In this task, distributed sensors are utilized complementarily to capture events that are difficult to capture with a single sensor, such as a series of actions of people moving in an intricate room, or communication between people located far apart in a room. For sensors to cooperate effectively in such a situation, the system should be able to exchange information among sensors and combine information that is useful for identifying events in a complementary manner. For such a mechanism, we propose a Transformer-based multi-sensor fusion (MultiTrans), which combines multi-sensor data on the basis of the relationships between features of different viewpoints and modalities. In experiments using a dataset newly collected for this task, our proposed method using MultiTrans improved event detection performance and outperformed the comparison methods.
