Paper title
Probabilistic K-means Clustering via Nonlinear Programming
Paper authors
Paper abstract
K-means is a classical clustering algorithm with wide applications. However, soft K-means, i.e. fuzzy c-means at m=1, has remained unsolved since 1981. To address this challenging open problem, we propose a novel clustering model, Probabilistic K-Means (PKM), which is a nonlinear programming model subject to linear equality and linear inequality constraints. In theory, the model can be solved by active gradient projection, albeit inefficiently. We therefore further propose maximum-step active gradient projection and fast maximum-step active gradient projection to solve it more efficiently. In experiments, we evaluate the performance of PKM and how well the proposed methods solve it in five aspects: initialization robustness, clustering performance, descent stability, number of iterations, and convergence speed.
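
To make the problem shape in the abstract concrete, the following is a minimal sketch of a soft K-means objective at m=1 together with a check of its linear constraints. It assumes the standard fuzzy c-means form: each point i carries membership probabilities p_ij that are nonnegative and sum to 1 over clusters, and each center is replaced by the probability-weighted mean of the points. The abstract does not state the exact PKM formulation, so this form and the names `pkm_objective` and `is_feasible` are illustrative assumptions, not the authors' model or solver.

```python
def pkm_objective(points, probs):
    """Soft K-means objective at m=1 with centers eliminated (assumed form).

    points: list of n points, each a list of d floats.
    probs:  n x k list of membership probabilities; each row is assumed
            to be nonnegative and to sum to 1 (the linear constraints).
    """
    n, k, d = len(points), len(probs[0]), len(points[0])
    total = 0.0
    for j in range(k):
        mass = sum(probs[i][j] for i in range(n))
        if mass == 0.0:
            continue  # a cluster with zero total membership contributes nothing
        # Center of cluster j: probability-weighted mean of all points.
        center = [
            sum(probs[i][j] * points[i][t] for i in range(n)) / mass
            for t in range(d)
        ]
        for i in range(n):
            sq_dist = sum((points[i][t] - center[t]) ** 2 for t in range(d))
            total += probs[i][j] * sq_dist
    return total


def is_feasible(probs, tol=1e-9):
    """Check the linear constraints: p_ij >= 0 and every row sums to 1."""
    return all(
        min(row) >= -tol and abs(sum(row) - 1.0) <= tol for row in probs
    )


# Hypothetical toy data: two well-separated pairs of points on a line.
points = [[0.0], [1.0], [10.0], [11.0]]
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
uniform = [[0.5, 0.5] for _ in points]
print(pkm_objective(points, hard), pkm_objective(points, uniform))
```

Because the centers depend on the memberships, the objective is nonlinear in p_ij while the constraints stay linear, which is the setting in which gradient projection methods of the kind named in the abstract apply. On this toy data the hard assignment gives a far lower objective than the uniform one.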