论文标题
改善了从缺失渠道恢复人口的算法
Improved Algorithms for Population Recovery from the Deletion Channel
论文作者
论文摘要
人口恢复问题要求一个人恢复$ n $ n $ bit字符串上未知的分布,允许从分布中汲取独立的噪音样本。最近,Ban等。 [BCF+19]研究了通过删除通道诱导噪声的问题。这个问题概括了著名的痕量重建问题,其中人们希望在删除频道下学习一个字符串。 Ban等。展示了如何使用$ \ exp \ big(n^{1/2} \ cdot(\ log n)^{o(\ ell)} \ big)$ samples $ \ exp \ big(n^{1/2} \ cdot(n^{1/2} \ big big)$样本。在这项工作中,我们仅使用$ \ exp \ big(\ tilde {o}(n^{1/3})\ cdot \ ell^2 \ big)$样品,通过开发[dos17,np17]的算法的高音类似物,哪些算法的较高词, $ \ exp \ big(\ tilde {o}(n^{1/3})\ big)$样本。我们还为第一个算法提供了$ n $中的运行时次指数,以$ \ exp \ big(\ tilde {o}(n^{1/3})(n^{1/3})\ cdot \ ell^3 \ big)$ splice和时间。 值得注意的是,当$ \ ell = o(1)$时,我们对$ n $的依赖几乎与[DOS17,NP17]的上限匹配,并且我们将对$ \ ell $的依赖从doubly降低到单个指数。因此,我们能够学习大量的弦乐混合物:虽然Ban等人的算法只能学习$ o(\ log n/\ log log \ log \ log \ log n)$ strings的混合物,并具有较高的样品数量,但我们能够学习$ n^{o(1)$ \ exp $ \ exp $ \ big big big(1/3^y^lig)的混合物 时间。
The population recovery problem asks one to recover an unknown distribution over $n$-bit strings given access to independent noisy samples of strings drawn from the distribution. Recently, Ban et al. [BCF+19] studied the problem where the noise is induced through the deletion channel. This problem generalizes the famous trace reconstruction problem, where one wishes to learn a single string under the deletion channel. Ban et al. showed how to learn $\ell$-sparse distributions over strings using $\exp\big(n^{1/2} \cdot (\log n)^{O(\ell)}\big)$ samples. In this work, we learn the distribution using only $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^2\big)$ samples, by developing a higher-moment analog of the algorithms of [DOS17, NP17], which solve trace reconstruction in $\exp\big(\tilde{O}(n^{1/3})\big)$ samples. We also give the first algorithm with a runtime subexponential in $n$, solving population recovery in $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^3\big)$ samples and time. Notably, our dependence on $n$ nearly matches the upper bound of [DOS17, NP17] when $\ell = O(1)$, and we reduce the dependence on $\ell$ from doubly to singly exponential. Therefore, we are able to learn large mixtures of strings: while Ban et al.'s algorithm can only learn a mixture of $O(\log n/\log \log n)$ strings with a subexponential number of samples, we are able to learn a mixture of $n^{o(1)}$ strings in $\exp\big(n^{1/3 + o(1)}\big)$ samples and time.