Paper Title

Analysis of Estimating the Bayes Rule for Gaussian Mixture Models with a Specified Missing-Data Mechanism

Author

Lyu, Ziyang

Abstract

Semi-supervised learning (SSL) approaches have been successfully applied in a wide range of engineering and scientific fields. This paper investigates the generative model framework with a missingness mechanism for unclassified observations, as introduced by Ahfock and McLachlan (2020). We show that, in a partially classified sample, a classifier using the Bayes rule of allocation with a missing-data mechanism can surpass a fully supervised classifier in a two-class normal homoscedastic model, especially with moderate to low overlap and a moderate to low proportion of missing class labels, or with large overlap but few missing labels. It also outperforms a classifier with no missing-data mechanism, regardless of the overlap region or the proportion of missing class labels. Our exploration of two- and three-component normal mixture models with unequal covariances through simulations further corroborates these findings. Finally, we illustrate the use of the proposed classifier with a missing-data mechanism on interneuronal and skin lesion datasets.
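
For readers unfamiliar with the Bayes rule of allocation in the two-class normal homoscedastic setting, the following is a minimal sketch for the univariate case with known parameters. It is not the paper's estimator: it does not model the missing-data mechanism, and the function names and parameter values (`posterior_class1`, `bayes_rule`, `mu1`, `mu2`, `sigma`, `pi1`) are illustrative assumptions rather than anything taken from the paper.

```python
import math


def posterior_class1(y, mu1, mu2, sigma, pi1=0.5):
    """Posterior probability that y belongs to class 1.

    Assumes a univariate two-class normal model with class means mu1 and
    mu2, a common standard deviation sigma (homoscedastic), and prior
    probability pi1 for class 1. All parameters are treated as known.
    """
    pi2 = 1.0 - pi1
    # With equal variances the log posterior odds are linear in y.
    beta1 = (mu1 - mu2) / sigma**2
    beta0 = math.log(pi1 / pi2) - (mu1**2 - mu2**2) / (2.0 * sigma**2)
    log_odds = beta0 + beta1 * y
    return 1.0 / (1.0 + math.exp(-log_odds))


def bayes_rule(y, mu1, mu2, sigma, pi1=0.5):
    """Allocate y to the class with the larger posterior probability."""
    return 1 if posterior_class1(y, mu1, mu2, sigma, pi1) >= 0.5 else 2


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    for y in (-2.0, 0.3, 2.0):
        print(y, bayes_rule(y, mu1=1.0, mu2=-1.0, sigma=1.0))
```

In practice the parameters are unknown and must be estimated from the partially classified sample. The abstract's comparison concerns how well the estimated rule performs when the missing-label mechanism is or is not included in the likelihood.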
