Paper title
The Intrinsic Manifolds of Radiological Images and their Role in Deep Learning
Paper authors
Paper abstract
The manifold hypothesis is a core mechanism behind the success of deep learning, so understanding the intrinsic manifold structure of image data is central to studying how neural networks learn from data. Intrinsic dataset manifolds and their relationship to learning difficulty have recently begun to be studied for the common domain of natural images, but little such research has been attempted for radiological images. We address this here. First, we compare the intrinsic manifold dimensionality of radiological and natural images. We also investigate the relationship between intrinsic dimensionality and generalization ability over a wide range of datasets. Our analysis shows that natural image datasets generally have a higher intrinsic dimensionality than radiological image datasets. However, the relationship between generalization ability and intrinsic dimensionality is much stronger for medical images, which could be explained by radiological images having intrinsic features that are more difficult to learn. These results give a more principled underpinning for the intuition that radiological images can be more challenging to apply deep learning to than the natural image datasets common in machine learning research. We believe that, rather than directly applying models developed for natural images to the radiological imaging domain, more care should be taken in developing architectures and algorithms that are tailored to the specific characteristics of this domain. The research shown in our paper, demonstrating these characteristics and their differences from natural images, is an important first step in this direction.
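To make the notion of intrinsic dimensionality concrete, the sketch below estimates it from nearest-neighbor distances. The abstract does not say which estimator the paper uses, so the choice here, the Levina-Bickel maximum-likelihood estimator, is an assumption made purely for illustration. The function names (`mle_intrinsic_dim`, `sample_linear_manifold`), the default `k=20`, and the synthetic data are likewise hypothetical and not taken from the paper.

```python
import numpy as np


def mle_intrinsic_dim(points: np.ndarray, k: int = 20) -> float:
    """Levina-Bickel MLE of intrinsic dimension (illustrative assumption,
    not necessarily the estimator used in the paper).

    points: array of shape (n_samples, n_features), assumed free of duplicates.
    k: number of nearest neighbors used per point.
    """
    n = points.shape[0]
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < n_samples")

    # Pairwise squared Euclidean distances via the Gram-matrix identity.
    sq_norms = np.sum(points ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (points @ points.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)     # a point is not its own neighbor

    # Sorted distances T_1 <= ... <= T_k to the k nearest neighbors.
    knn = np.sqrt(np.sort(np.partition(d2, k - 1, axis=1)[:, :k], axis=1))

    # Per-point estimate: (k - 1) / sum_{j=1}^{k-1} log(T_k / T_j).
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 1) / np.sum(log_ratios, axis=1)

    # Dataset-level estimate: mean of the per-point estimates.
    return float(np.mean(per_point))


def sample_linear_manifold(n: int, intrinsic_dim: int, ambient_dim: int,
                           seed: int = 0) -> np.ndarray:
    """Hypothetical test data: uniform points on a flat intrinsic_dim-dimensional
    subspace embedded isometrically in ambient_dim dimensions."""
    rng = np.random.default_rng(seed)
    latent = rng.uniform(size=(n, intrinsic_dim))
    # Orthonormal columns give an isometric embedding into the ambient space.
    basis, _ = np.linalg.qr(rng.standard_normal((ambient_dim, intrinsic_dim)))
    return latent @ basis.T


if __name__ == "__main__":
    for d in (2, 5):
        data = sample_linear_manifold(n=1000, intrinsic_dim=d, ambient_dim=50)
        print(f"true dimension {d}: estimate {mle_intrinsic_dim(data):.2f}")
```

On real image datasets each image would be flattened into one row of `points`. Estimates of this kind depend on `k` and on the sample size, so they are best read as relative comparisons between datasets rather than as exact dimensions.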