Paper Title

Cancer Subtyping via Embedded Unsupervised Learning on Transcriptomics Data

Authors

Yang, Ziwei; Zhu, Lingwei; Chen, Zheng; Huang, Ming; Ono, Naoaki; Altaf-Ul-Amin, MD; Kanaya, Shigehiko

Abstract

Cancer is one of the deadliest diseases worldwide. Accurate diagnosis and classification of cancer subtypes are indispensable for effective clinical treatment. With the emergence of various deep learning methods, promising results on automatic cancer subtyping systems have been published recently. However, such automatic systems often overfit because the data are high-dimensional and scarce. In this paper, we propose to investigate automatic subtyping from an unsupervised learning perspective by directly constructing the underlying data distribution itself, so that sufficient data can be generated to alleviate overfitting. Specifically, we use vector quantization to bypass the strong Gaussianity assumption that is common in the unsupervised subtyping literature but fails on small samples. As demonstrated by extensive experimental results, our proposed method better captures the latent space features and models the manifestation of cancer subtypes on a molecular basis.
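
The abstract names vector quantization but does not describe the model itself, so the following is only a minimal sketch of the generic quantization step: each latent vector is replaced by its nearest entry in a codebook. It is not the authors' implementation. The function names, the toy latent vectors, and the two-entry codebook are all hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each latent vector to its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Pick the code with the smallest squared distance to z.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Hypothetical toy data: three 2-D latent vectors and a two-entry codebook.
toy_latents = [[0.1, 0.2], [0.9, 1.1], [0.4, 0.3]]
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_indices, toy_quantized = quantize(toy_latents, toy_codebook)
```

Because every latent vector is snapped to one of a finite set of codes, the latent space is described by a discrete distribution over codebook indices rather than by a Gaussian, which is the general idea behind avoiding a Gaussianity assumption. How the codebook is learned and how samples are generated from it are not stated in the abstract and are left out here.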
