改善了从缺失渠道恢复人口的算法

论文标题

改善了从缺失渠道恢复人口的算法

Improved Algorithms for Population Recovery from the Deletion Channel

论文作者

Narayanan, Shyam

论文摘要

人口恢复问题要求一个人恢复$ n $ n $ bit字符串上未知的分布，允许从分布中汲取独立的噪音样本。最近，Ban等。 [BCF+19]研究了通过删除通道诱导噪声的问题。这个问题概括了著名的痕量重建问题，其中人们希望在删除频道下学习一个字符串。 Ban等。展示了如何使用$ \ exp \ big（n^{1/2} \ cdot（\ log n）^{o（\ ell）} \ big）$ samples $ \ exp \ big（n^{1/2} \ cdot（n^{1/2} \ big big）$样本。在这项工作中，我们仅使用$ \ exp \ big（\ tilde {o}（n^{1/3}）\ cdot \ ell^2 \ big）$样品，通过开发[dos17，np17]的算法的高音类似物，哪些算法的较高词， $ \ exp \ big（\ tilde {o}（n^{1/3}）\ big）$样本。我们还为第一个算法提供了$ n $中的运行时次指数，以$ \ exp \ big（\ tilde {o}（n^{1/3}）（n^{1/3}）\ cdot \ ell^3 \ big）$ splice和时间。值得注意的是，当$ \ ell = o（1）$时，我们对$ n $的依赖几乎与[DOS17，NP17]的上限匹配，并且我们将对$ \ ell $的依赖从doubly降低到单个指数。因此，我们能够学习大量的弦乐混合物：虽然Ban等人的算法只能学习$ o（\ log n/\ log log \ log \ log \ log n）$ strings的混合物，并具有较高的样品数量，但我们能够学习$ n^{o（1）$ \ exp $ \ exp $ \ big big big（1/3^y^lig）的混合物时间。

The population recovery problem asks one to recover an unknown distribution over $n$-bit strings given access to independent noisy samples of strings drawn from the distribution. Recently, Ban et al. [BCF+19] studied the problem where the noise is induced through the deletion channel. This problem generalizes the famous trace reconstruction problem, where one wishes to learn a single string under the deletion channel. Ban et al. showed how to learn $\ell$-sparse distributions over strings using $\exp\big(n^{1/2} \cdot (\log n)^{O(\ell)}\big)$ samples. In this work, we learn the distribution using only $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^2\big)$ samples, by developing a higher-moment analog of the algorithms of [DOS17, NP17], which solve trace reconstruction in $\exp\big(\tilde{O}(n^{1/3})\big)$ samples. We also give the first algorithm with a runtime subexponential in $n$, solving population recovery in $\exp\big(\tilde{O}(n^{1/3}) \cdot \ell^3\big)$ samples and time. Notably, our dependence on $n$ nearly matches the upper bound of [DOS17, NP17] when $\ell = O(1)$, and we reduce the dependence on $\ell$ from doubly to singly exponential. Therefore, we are able to learn large mixtures of strings: while Ban et al.'s algorithm can only learn a mixture of $O(\log n/\log \log n)$ strings with a subexponential number of samples, we are able to learn a mixture of $n^{o(1)}$ strings in $\exp\big(n^{1/3 + o(1)}\big)$ samples and time.

下载PDF全文

下载文献需遵守相关版权规定

论文标题