PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation

Paper Title

PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation

Authors

Xie, Yutong, Zhang, Jianpeng, Liao, Zehui, Xia, Yong, Shen, Chunhua

Abstract

It has been widely recognized that the success of deep learning in image segmentation relies overwhelmingly on a large amount of densely annotated training data, which is difficult to obtain because of the tremendous labor and expertise required, particularly for annotating 3D medical images. Although self-supervised learning (SSL) has shown great potential to address this issue, most SSL approaches focus only on image-level global consistency and ignore local consistency, which plays a pivotal role in capturing structural information for dense prediction tasks such as segmentation. In this paper, we propose a Prior-Guided Local (PGL) self-supervised model that learns region-wise local consistency in the latent feature space. Specifically, we use the spatial transformations that produce different augmented views of the same image as a prior to deduce the location relation between the two views. This relation is then used to align the feature maps of the same local region extracted from the two views. Next, we construct a local consistency loss to minimize the voxel-wise discrepancy between the aligned feature maps. Thus, our PGL model learns distinctive representations of local regions and is therefore able to retain structural information, an ability that is conducive to downstream segmentation tasks. We conducted an extensive evaluation on four public computerized tomography (CT) datasets that cover 11 kinds of major human organs and two tumors. The results indicate that using a pre-trained PGL model to initialize a downstream network leads to a substantial performance improvement over both random initialization and initialization with global consistency-based models. Code and pre-trained weights will be made available at: https://git.io/PGL.
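
The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the only spatial transformation is a random crop (the abstract does not list the transformations used), it uses an identity stand-in for the encoder, and it uses a mean squared difference as the voxel-wise discrepancy. The names `crop_overlap`, `local_consistency_loss`, and `make_crop` are hypothetical.

```python
from itertools import product


def crop_overlap(start_a, start_b, size):
    """Use the known crop offsets (the prior) to locate the region shared by
    two equally sized crops. Returns per-axis (lo, hi) index ranges in each
    crop's local coordinates, or None if the crops do not overlap."""
    ranges_a, ranges_b = [], []
    for a, b, s in zip(start_a, start_b, size):
        lo, hi = max(a, b), min(a + s, b + s)
        if lo >= hi:
            return None
        ranges_a.append((lo - a, hi - a))
        ranges_b.append((lo - b, hi - b))
    return ranges_a, ranges_b


def local_consistency_loss(feat_a, feat_b, ranges_a, ranges_b):
    """Mean squared voxel-wise difference over the aligned region.
    Assumption: MSE stands in for the discrepancy measure."""
    iter_a = product(*[range(lo, hi) for lo, hi in ranges_a])
    iter_b = product(*[range(lo, hi) for lo, hi in ranges_b])
    total, count = 0.0, 0
    for (za, ya, xa), (zb, yb, xb) in zip(iter_a, iter_b):
        diff = feat_a[za][ya][xa] - feat_b[zb][yb][xb]
        total += diff * diff
        count += 1
    return total / count


def make_crop(start, size):
    """Toy 3D 'volume' whose voxel value encodes its global position."""
    z0, y0, x0 = start
    return [[[100 * (z0 + z) + 10 * (y0 + y) + (x0 + x)
              for x in range(size[2])]
             for y in range(size[1])]
            for z in range(size[0])]


size = (4, 4, 4)
start_a, start_b = (0, 0, 0), (1, 2, 0)
view_a, view_b = make_crop(start_a, size), make_crop(start_b, size)

# Identity "encoder": the features are the crops themselves (assumption).
ranges_a, ranges_b = crop_overlap(start_a, start_b, size)
loss = local_consistency_loss(view_a, view_b, ranges_a, ranges_b)
```

With the identity stand-in, the aligned regions contain the same voxels, so the loss is zero. Comparing the two crops without the alignment step would pair voxels from different locations and give a nonzero value, which is the mismatch the prior is meant to avoid.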
