Paper Title

Oblivious Data for Fairness with Kernels

Authors

Steffen Grünewälder, Azadeh Khaleghi

Abstract

We investigate the problem of algorithmic fairness in the case where sensitive and non-sensitive features are available and one aims to generate new "oblivious" features that closely approximate the non-sensitive features and are only minimally dependent on the sensitive ones. We study this question in the context of kernel methods. We analyze a relaxed version of the Maximum Mean Discrepancy criterion which does not guarantee full independence but makes the optimization problem tractable. We derive a closed-form solution for this relaxed optimization problem and complement the result with a study of the dependencies between the newly generated features and the sensitive ones. Our key ingredient for generating such oblivious features is a Hilbert-space-valued conditional expectation, which needs to be estimated from data. We propose a plug-in approach and demonstrate how the estimation errors can be controlled. While our techniques help reduce the bias, we would like to point out that no post-processing of any dataset could possibly serve as an alternative to well-designed experiments.
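
For readers unfamiliar with the Maximum Mean Discrepancy mentioned in the abstract, the following is a minimal sketch of the standard biased two-sample MMD estimate with a Gaussian kernel. It is not the relaxed criterion analyzed in the paper, and the names `gaussian_kernel`, `mmd_squared`, and `sample`, as well as the bandwidth and sample sizes, are assumptions chosen for illustration only.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(us, vs):
        # Average kernel value over all pairs (u, v), including u == v.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample(mean, n, rng):
    """Draw n one-dimensional points from a unit-variance normal (illustrative data)."""
    return [(rng.gauss(mean, 1.0),) for _ in range(n)]


rng = random.Random(0)
same = mmd_squared(sample(0.0, 200, rng), sample(0.0, 200, rng))
shifted = mmd_squared(sample(0.0, 200, rng), sample(2.0, 200, rng))
print(f"same distribution:    {same:.4f}")
print(f"shifted distribution: {shifted:.4f}")
```

The estimate stays near zero when both samples come from the same distribution and grows when they differ, which is the sense in which an MMD-type criterion can quantify how strongly generated features still depend on sensitive ones.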
