Paper Title

Task Specific Attention is one more thing you need for object detection

Paper Author

Lee, Sang Yon

Paper Abstract

Various models have been proposed to perform object detection. However, most require many hand-designed components, such as anchors and non-maximum suppression (NMS), to demonstrate good performance. To mitigate these issues, the Transformer-based DETR and its variant, Deformable DETR, were proposed. These resolve many of the complex issues involved in designing a head for object detection models; however, doubts remain about whether Transformer-based models can be considered state-of-the-art in object detection, because other models that depend on anchors and NMS have reported better results. Furthermore, it has been unclear whether an end-to-end pipeline could be built from attention modules alone, because the Transformer method adopted by DETR uses a convolutional neural network (CNN) as its backbone. In this study, we show that combining several attention modules with our new Task Specific Split Transformer (TSST) is a powerful method for producing state-of-the-art performance on COCO without traditional hand-designed components. By splitting the general-purpose attention module into two separate goal-specific attention modules, the proposed method allows simpler object detection models to be designed. Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. Code is available at https://github.com/navervision/tsst
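The abstract's core idea is splitting one general-purpose attention module into two goal-specific ones. The sketch below is a minimal, hypothetical illustration of that split in PyTorch, assuming a DETR-style setup where object queries attend to encoder features and the two branches feed classification and box-regression heads; the class name, shapes, and structure are our assumptions, not the authors' TSST implementation (see their repository for the actual code).

import torch
import torch.nn as nn

class TaskSpecificSplitAttention(nn.Module):
    """Toy sketch of the 'split' idea: one shared input, two
    goal-specific attention branches (classification vs. box
    regression) instead of a single general-purpose module.
    Hypothetical illustration only, not the paper's TSST."""

    def __init__(self, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        # Separate attention modules per task (the "split").
        self.cls_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.box_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, queries: torch.Tensor, memory: torch.Tensor):
        # queries: (B, num_queries, d_model) object queries
        # memory:  (B, seq_len, d_model) flattened encoder features
        cls_feat, _ = self.cls_attn(queries, memory, memory)  # classification branch
        box_feat, _ = self.box_attn(queries, memory, memory)  # localization branch
        return cls_feat, box_feat

# Usage with random tensors (2 images, 100 queries, 1024 feature tokens).
model = TaskSpecificSplitAttention()
q = torch.randn(2, 100, 256)
mem = torch.randn(2, 1024, 256)
cls_feat, box_feat = model(q, mem)

Under these assumptions, each task gets its own attention weights over the image features, which is one plausible reading of "two separated goal-specific attention modules"; the paper's actual decomposition may differ.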
