论文标题

随机HADAMARD变换的均匀近似值与应用

Uniform Approximations for Randomized Hadamard Transforms with Applications

论文作者

Cherapanamjeri, Yeshwanth, Nelson, Jelani

论文摘要

随机的Hadamard变换(RHT)已成为计算有效的替代方法,用于在计算机科学和机器学习中使用一系列范围的密集的非结构化随机矩阵。对于诸如降低降低和压缩感应等多种应用,基于RHT的方法的理论保证与使用具有I.I.D. \条目的密集随机矩阵的方法可比。但是,几个这样的应用在低维度中,其中从矩阵中采样的行数很少。先前的参数不适用于在内核近似等机器学习应用中经常发现的高维度。给定一个带有高斯对角线的Rhts的合奏,$ \ {m^i \} _ {i = 1}^m $,以及任何$ 1 $ -lipschitz函数,$ f:\ sathbb {r} \ to \ mathbb {r} 1}^m $在$ \ |上均匀地收敛到其期望。 v \ | \ leq 1 $以与使用真正高斯矩阵获得的速率相当。然后,我们使用不平等来得出在高维状态下两种应用的提高保证:1)内核近似和2)距离估计。对于内核近似,我们证明了第一个\ emph {统一}的近似保证,用于通过RHTS构建的随机特征,在距离估计中构建的理论上是借用理论上的理由,而对于距离估计,我们的收敛结果意味着数据结构具有作者对先前工作的改善的运行时保证。我们认为,我们的普遍不平等可能会在其他应用中找到使用。

Randomized Hadamard Transforms (RHTs) have emerged as a computationally efficient alternative to the use of dense unstructured random matrices across a range of domains in computer science and machine learning. For several applications such as dimensionality reduction and compressed sensing, the theoretical guarantees for methods based on RHTs are comparable to approaches using dense random matrices with i.i.d.\ entries. However, several such applications are in the low-dimensional regime where the number of rows sampled from the matrix is rather small. Prior arguments are not applicable to the high-dimensional regime often found in machine learning applications like kernel approximation. Given an ensemble of RHTs with Gaussian diagonals, $\{M^i\}_{i = 1}^m$, and any $1$-Lipschitz function, $f: \mathbb{R} \to \mathbb{R}$, we prove that the average of $f$ over the entries of $\{M^i v\}_{i = 1}^m$ converges to its expectation uniformly over $\| v \| \leq 1$ at a rate comparable to that obtained from using truly Gaussian matrices. We use our inequality to then derive improved guarantees for two applications in the high-dimensional regime: 1) kernel approximation and 2) distance estimation. For kernel approximation, we prove the first \emph{uniform} approximation guarantees for random features constructed through RHTs lending theoretical justification to their empirical success while for distance estimation, our convergence result implies data structures with improved runtime guarantees over previous work by the authors. We believe our general inequality is likely to find use in other applications.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源