Paper Title
When OT meets MoM: Robust estimation of Wasserstein Distance
Paper Authors
Paper Abstract
Originating from Optimal Transport, the Wasserstein distance has gained importance in Machine Learning due to its appealing geometrical properties and the increasing availability of efficient approximations. In this work, we consider the problem of estimating the Wasserstein distance between two probability distributions when the observations are contaminated by outliers. To that end, we investigate how to leverage Medians of Means (MoM) estimators to robustify the estimation of the Wasserstein distance. Exploiting the dual Kantorovich formulation of the Wasserstein distance, we introduce and discuss novel MoM-based robust estimators, study their consistency under a data contamination model, and provide convergence rates. These MoM estimators make Wasserstein Generative Adversarial Networks (WGANs) robust to outliers, as shown by an empirical study on two benchmarks, CIFAR10 and Fashion MNIST. Finally, we discuss how to combine MoM with the entropy-regularized approximation of the Wasserstein distance and propose a simple MoM-based re-weighting scheme that could be used in conjunction with the Sinkhorn algorithm.
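For readers unfamiliar with the Medians of Means idea mentioned in the abstract, the following is a minimal sketch of the basic scalar MoM estimator of a mean: split the sample into blocks, average each block, and take the median of the block averages. It is not the estimator proposed in the paper, which applies MoM to the dual Kantorovich formulation; the function name `median_of_means`, the parameters `n_blocks` and `rng`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def median_of_means(values, n_blocks, rng=None):
    """Basic Median-of-Means estimate of the mean of a 1-D sample.

    Illustrative sketch only: `n_blocks` and `rng` are hypothetical
    parameter names, not taken from the paper.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 1:
        raise ValueError("values must be one-dimensional")
    if not 1 <= n_blocks <= values.size:
        raise ValueError("n_blocks must be between 1 and len(values)")

    # Shuffle so that the blocks do not depend on the input ordering.
    generator = np.random.default_rng(rng)
    shuffled = generator.permutation(values)

    # Split into (nearly) equal-sized disjoint blocks and average each one.
    blocks = np.array_split(shuffled, n_blocks)
    block_means = [block.mean() for block in blocks]

    # The median of the block means is insensitive to a few corrupted blocks.
    return float(np.median(block_means))


# Toy data (assumed for illustration): 99 clean points and one large outlier.
data = np.concatenate([np.ones(99), [1e6]])
plain_mean = float(np.mean(data))
mom_estimate = median_of_means(data, n_blocks=5, rng=0)
print(plain_mean, mom_estimate)
```

In this toy example the single outlier can corrupt at most one of the five blocks, so the median of the block means stays at the value of the clean points, whereas the plain average is pulled far away. The number of blocks governs the trade-off: more blocks tolerate more outliers but each block mean becomes noisier.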