约翰逊 - 林登斯特劳斯的引理聚类和子空间近似：从降低尺寸。

论文标题

约翰逊 - 林登斯特劳斯的引理聚类和子空间近似：从降低尺寸。

The Johnson-Lindenstrauss Lemma for Clustering and Subspace Approximation: From Coresets to Dimension Reduction

论文作者

Charikar, Moses, Waingarten, Erik

论文摘要

我们研究了约翰逊·林顿斯（Johnson-Lindenstrauss）在各种投射聚类问题中转变的影响，从而推广了仅适用于中心聚类的最新结果[MMR19]。我们提出一个总体问题：对于欧几里得优化问题和准确的参数$ε\ in（0，1）$，什么是最小的目标维度$ t \ in \ mathbb {n} $，以至于约翰逊·林斯特拉斯（Johnson-Lindenstrauss $（1+ε）$ - 因子。我们提供了一种新技术，该技术使用核心结构来分析Johnson-Lindenstrauss变换的效果。除了适用于基于中心的聚类，我们的技术还改善了（或第一个解决）其他欧几里得优化问题，包括： $ \ bullet $ for $（k，z）$ - 子空间近似：我们表明$ t = \ tilde {o}（zk^2 /ε^3）$足够，而先前的最佳绑定为$ O（k /ε^2）$，仅适用于case $ z = 2 $ z = 2 $ [CEMMP15]。 $ \ bullet $ for $（k，z）$ - 平面近似值：我们显示$ t = \ tilde {o}（zk^2/ε^3）$足够，完全消除了先前绑定的$ \ tilde {o}（zk^2 \ log n/ε^3）$ [kkk^2 \ log n/ε^3）$的依赖。 $ \ bullet $ for $（k，z）$ - 行近似：我们显示$ t = o（（（k \ log \ log \ log \ log n + z + \ log（1 /ε）） /ε^3）$足够，而我们的是第一个给出任何降低尺寸的结果。

We study the effect of Johnson-Lindenstrauss transforms in various projective clustering problems, generalizing recent results which only applied to center-based clustering [MMR19]. We ask the general question: for a Euclidean optimization problem and an accuracy parameter $ε\in (0, 1)$, what is the smallest target dimension $t \in \mathbb{N}$ such that a Johnson-Lindenstrauss transform $Π\colon \mathbb{R}^d \to \mathbb{R}^t$ preserves the cost of the optimal solution up to a $(1+ε)$-factor. We give a new technique which uses coreset constructions to analyze the effect of the Johnson-Lindenstrauss transform. Our technique, in addition applying to center-based clustering, improves on (or is the first to address) other Euclidean optimization problems, including: $\bullet$ For $(k,z)$-subspace approximation: we show that $t = \tilde{O}(zk^2 / ε^3)$ suffices, whereas the prior best bound, of $O(k/ε^2)$, only applied to the case $z = 2$ [CEMMP15]. $\bullet$ For $(k,z)$-flat approximation: we show $t = \tilde{O}(zk^2/ε^3)$ suffices, completely removing the dependence on $n$ from the prior bound $\tilde{O}(zk^2 \log n/ε^3)$ of [KR15]. $\bullet$ For $(k,z)$-line approximation: we show $t = O((k \log \log n + z + \log(1/ε)) / ε^3)$ suffices, and ours is the first to give any dimension reduction result.

下载PDF全文

下载文献需遵守相关版权规定

论文标题