SkyScapes -- Fine-Grained Semantic Understanding of Aerial Scenes

Paper Title

SkyScapes -- Fine-Grained Semantic Understanding of Aerial Scenes

Authors

Azimi, Seyed Majid; Henry, Corentin; Sommer, Lars; Schumann, Arne; Vig, Eleonora

Abstract

Understanding the complex urban infrastructure with centimeter-level accuracy is essential for many applications from autonomous driving to mapping, infrastructure monitoring, and urban management. Aerial images provide valuable information over a large area instantaneously; nevertheless, no current dataset captures the complexity of aerial scenes at the level of granularity required by real-world applications. To address this, we introduce SkyScapes, an aerial image dataset with highly-accurate, fine-grained annotations for pixel-level semantic labeling. SkyScapes provides annotations for 31 semantic categories ranging from large structures, such as buildings, roads and vegetation, to fine details, such as 12 (sub-)categories of lane markings. We have defined two main tasks on this dataset: dense semantic segmentation and multi-class lane-marking prediction. We carry out extensive experiments to evaluate state-of-the-art segmentation methods on SkyScapes. Existing methods struggle to deal with the wide range of classes, object sizes, scales, and fine details present. We therefore propose a novel multi-task model, which incorporates semantic edge detection and is better tuned for feature extraction from a wide range of scales. This model achieves notable improvements over the baselines in region outlines and level of detail on both tasks.

Paper Title