Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation

Paper Title

Focal-UNet: UNet-like Focal Modulation for Medical Image Segmentation

Authors

MohammadReza Naderi, MohammadHossein Givkashi, Fatemeh Piri, Nader Karimi, Shadrokh Samavi

Abstract

Recently, many attempts have been made to construct transformer-based U-shaped architectures, and new methods have been proposed that outperform CNN-based rivals. However, serious problems such as blockiness and cropped edges in predicted masks remain because of transformers' patch partitioning operations. In this work, we propose a new U-shaped architecture for medical image segmentation built on the recently introduced focal modulation mechanism. The proposed architecture has asymmetric depths for the encoder and decoder. Because the focal module can aggregate local and global features, our model benefits simultaneously from the wide receptive field of transformers and the local viewing of CNNs. This helps the proposed method balance local and global feature usage and outperform Swin-UNet, one of the most powerful transformer-based U-shaped models. On the Synapse dataset, we achieved a 1.68% higher DICE score and a 0.89 better HD metric. With extremely limited data, we also achieved a 4.25% higher DICE score on the NeoPolyp dataset. Our implementation is available at: https://github.com/givkashi/Focal-UNet
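For readers unfamiliar with the DICE score reported above, the following is a minimal sketch of how the metric is commonly computed for a pair of binary masks. It is not taken from the authors' repository; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks (hypothetical helper).

    Dice = 2 * |A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Count pixels that are foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_score(predicted, reference))
```

In multi-organ benchmarks the score is typically computed per class and then averaged; the sketch above covers only the single-class case.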

Paper Title