肯德基：$ k $ center fair集群的可扩展近似算法

论文标题

肯德基：$ k $ center fair集群的可扩展近似算法

KFC: A Scalable Approximation Algorithm for $k$-center Fair Clustering

论文作者

Harb, Elfarouk, Lam, Ho Shan

论文摘要

在本文中，我们研究了$ k-$中心目标上公平聚类的问题。在公平的聚类中，输入为$ n $点，每个点至少属于$ l $保护组之一，例如男性，女性，亚洲，西班牙裔。目的是将$ n $点聚集到$ k $簇中，以最大程度地减少经典的聚类目标函数。但是，在某些公平概念下，还有一个额外的约束，即每个集群都需要公平。这样可以确保在任何集群中没有任何群体“过分代表”或“代表性不足”。我们的作品建立在Chierichetti等人的工作基础上。（NIPS 2017），Bera等。（Neurips 2019），Ahmadian等。（KDD 2019）和Bercea等。（大约2019年）。我们获得了$ k-$中心目标功能的随机$ 3- $近似算法，击败了先前的最新状态（$ 4- $近似）。我们在实际数据集上测试了算法，并表明我们的算法在没有过度代表或代表性不足的情况下有效地找到良好的群集，在运行时速度，集群成本，同时实现类似的公平性违规，超过了当前的艺术水平。

In this paper, we study the problem of fair clustering on the $k-$center objective. In fair clustering, the input is $N$ points, each belonging to at least one of $l$ protected groups, e.g. male, female, Asian, Hispanic. The objective is to cluster the $N$ points into $k$ clusters to minimize a classical clustering objective function. However, there is an additional constraint that each cluster needs to be fair, under some notion of fairness. This ensures that no group is either "over-represented" or "under-represented" in any cluster. Our work builds on the work of Chierichetti et al. (NIPS 2017), Bera et al. (NeurIPS 2019), Ahmadian et al. (KDD 2019), and Bercea et al. (APPROX 2019). We obtain a randomized $3-$approximation algorithm for the $k-$center objective function, beating the previous state of the art ($4-$approximation). We test our algorithm on real datasets, and show that our algorithm is effective in finding good clusters without over-representation or under-representation, surpassing the current state of the art in runtime speed, clustering cost, while achieving similar fairness violations.

下载PDF全文

下载文献需遵守相关版权规定

论文标题