Paper Title
High Dimensional Statistical Estimation under Uniformly Dithered One-bit Quantization
Paper Authors
Paper Abstract
In this paper, we propose a uniformly dithered 1-bit quantization scheme for high-dimensional statistical estimation. The scheme consists of three typical steps: truncation, dithering, and quantization. As canonical examples, we apply the scheme to sparse covariance matrix estimation, sparse linear regression (i.e., compressed sensing), and matrix completion. We study both sub-Gaussian and heavy-tailed regimes, where the underlying distribution of heavy-tailed data is assumed to have bounded moments of some order. We propose new estimators based on 1-bit quantized data. In the sub-Gaussian regime, our estimators achieve near-minimax rates, indicating that our quantization scheme costs very little. In the heavy-tailed regime, the rates of our estimators are essentially slower, but these results are either the first in a 1-bit quantized and heavy-tailed setting or already improve on existing comparable results in some respect. Under the observations in our setting, the rates are almost tight in compressed sensing and matrix completion. Our 1-bit compressed sensing results allow general sensing vectors that are sub-Gaussian or even heavy-tailed. We are also the first to investigate a novel setting where both the covariate and the response are quantized. In addition, our approach to 1-bit matrix completion does not rely on likelihood and represents the first method robust to pre-quantization noise with unknown distribution. Experimental results on synthetic data are presented to support our theoretical analysis.
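The three steps named in the abstract (truncation, dithering, quantization) can be illustrated on a toy problem. The sketch below is not one of the paper's estimators; it only estimates a scalar mean from 1-bit data. It assumes one common reading of the scheme: clip each sample to [-threshold, threshold], add a dither tau drawn uniformly from [-delta, delta], and keep only the sign. Under that assumption, when |x| <= delta the rescaled bit delta * sign(x + tau) has expectation x, so averaging the rescaled bits recovers the mean. The function names, the threshold, and the dither scale are all hypothetical choices made for illustration.

```python
import random


def truncate(x, threshold):
    """Clip x to the interval [-threshold, threshold]."""
    return max(-threshold, min(threshold, x))


def dithered_one_bit(x, delta, rng):
    """Return sign(x + tau) as +1.0 or -1.0, with tau ~ Uniform[-delta, delta]."""
    tau = rng.uniform(-delta, delta)
    return 1.0 if x + tau >= 0 else -1.0


def estimate_mean(samples, threshold, delta, rng):
    """Estimate the mean from 1-bit observations only.

    Assumes threshold <= delta, so that every truncated sample lies in
    [-delta, delta] and delta * bit is an unbiased proxy for it.
    """
    bits = [dithered_one_bit(truncate(x, threshold), delta, rng) for x in samples]
    return delta * sum(bits) / len(bits)


# Toy data: Gaussian samples with true mean 0.5 (an illustrative choice).
rng = random.Random(0)
samples = [rng.gauss(0.5, 1.0) for _ in range(200_000)]
est = estimate_mean(samples, threshold=4.0, delta=4.0, rng=rng)
print(f"1-bit estimate of the mean: {est:.3f}")
```

A larger delta keeps more samples inside the unbiased range but inflates the variance of each rescaled bit, which is roughly delta squared, so in this sketch the dither scale trades truncation bias against variance.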