SwinUNet3D -- A Hierarchical Architecture for Deep Traffic Prediction using Shifted Window Transformers

Paper title

SwinUNet3D -- A Hierarchical Architecture for Deep Traffic Prediction using Shifted Window Transformers

Authors

Bojesomo, Alabi; Marzouqi, Hasan Al; Liatsis, Panos

Abstract

Traffic forecasting is an important element of mobility management and a key driver of the logistics industry. Over the years, much work has been done in traffic forecasting using time series as well as spatiotemporal dynamic forecasting. In this paper, we explore the use of vision transformers in a UNet setting. We completely remove all convolution-based building blocks in UNet, using 3D shifted window transformers in both the encoder and decoder branches instead. In addition, we experiment with feature mixing just before patch encoding to control the inter-relationship of the features while avoiding contraction of the depth dimension of our spatiotemporal input. The proposed network is tested on the data provided by the Traffic Map Movie Forecasting Challenge 2021 (Traffic4cast2021), held in the competition track of Neural Information Processing Systems (NeurIPS). The Traffic4cast2021 task is to predict an hour (6 frames) of traffic conditions (volume and average speed) from one hour of given traffic state (12 frames, each averaged over a 5-minute time span). Source code is available online at https://github.com/bojesomo/Traffic4Cast2021-SwinUNet3D.
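
To make the "3D shifted window" idea in the abstract more concrete, the following is a minimal NumPy sketch of how a spatiotemporal tensor can be cut into non-overlapping 3D windows, and how a cyclic shift changes which positions share a window. It is an illustration only, not the authors' implementation: the function names, the (T, H, W, C) layout, the window size (2, 4, 4) and the shift (1, 2, 2) are all assumptions chosen for the example, and the attention computation and the masking of wrapped-around positions are omitted.

```python
import numpy as np


def window_partition_3d(x, window):
    """Split a (T, H, W, C) array into non-overlapping 3D windows.

    Returns an array of shape (num_windows, wt * wh * ww, C), i.e. one
    row of tokens per window. (Hypothetical helper for illustration.)
    """
    T, H, W, C = x.shape
    wt, wh, ww = window
    assert T % wt == 0 and H % wh == 0 and W % ww == 0, "dims must divide evenly"
    # Separate each axis into (number of windows, position inside window).
    x = x.reshape(T // wt, wt, H // wh, wh, W // ww, ww, C)
    # Bring the three window-index axes to the front.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, wt * wh * ww, C)


def shifted_window_partition_3d(x, window, shift):
    """Cyclically shift the volume, then partition it into windows.

    Rolling by -shift moves content toward the origin, so each window
    now covers positions that straddled window borders before the shift.
    """
    shifted = np.roll(x, shift=tuple(-s for s in shift), axis=(0, 1, 2))
    return window_partition_3d(shifted, window)


# Toy input: 4 frames of an 8x8 grid with 2 channels (assumed sizes).
x = np.arange(4 * 8 * 8 * 2, dtype=np.float32).reshape(4, 8, 8, 2)
regular = window_partition_3d(x, (2, 4, 4))
shifted = shifted_window_partition_3d(x, (2, 4, 4), (1, 2, 2))
print(regular.shape, shifted.shape)  # (8, 32, 2) (8, 32, 2)
```

In a shifted window transformer, self-attention is computed within each row of tokens, and alternating between the regular and the shifted partition in successive layers lets information flow across window borders.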
