Paper Title
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Paper Authors
Paper Abstract
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data and are used in a wide range of applications. In practice, VAEs usually require multiple training runs to choose how much information the latent variables should retain. This trade-off between the reconstruction error (distortion) and the KL divergence (rate) is typically parameterized by a hyperparameter $\beta$. In this paper, we introduce the Multi-Rate VAE (MR-VAE), a computationally efficient framework for learning the optimal parameters corresponding to various values of $\beta$ in a single training run. The key idea is to explicitly formulate a response function that maps $\beta$ to the optimal parameters using hypernetworks. MR-VAEs construct a compact response hypernetwork in which the pre-activations are conditionally gated based on $\beta$. We justify the proposed architecture by analyzing linear VAEs and showing that it can exactly represent their response functions. With the learned hypernetwork, MR-VAEs can construct the rate-distortion curve without additional training and can be deployed with significantly less hyperparameter tuning. Empirically, our approach is competitive with, and often exceeds, the performance of training multiple $\beta$-VAEs, with minimal computation and memory overhead.
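To make the conditional-gating idea in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: the module name `BetaGatedLinear`, the one-layer sigmoid gate, and the log-$\beta$ sampling range are all assumptions made for illustration. It shows the core mechanism, a tiny hypernetwork that maps $\log\beta$ to one multiplicative gate per output unit, so a single set of shared weights can serve many values of $\beta$.

```python
import torch
import torch.nn as nn


class BetaGatedLinear(nn.Module):
    """Linear layer whose pre-activations are gated as a function of beta.

    Hypothetical sketch of the conditional gating described in the abstract:
    a small hypernetwork maps log(beta) to a per-unit gate that rescales the
    pre-activations of a shared linear layer.
    """

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        # Assumed gate architecture: one affine layer followed by a sigmoid.
        self.gate = nn.Sequential(nn.Linear(1, out_dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor, log_beta: torch.Tensor) -> torch.Tensor:
        # log_beta has shape (batch, 1); each pre-activation is rescaled
        # by its beta-dependent gate before any nonlinearity is applied.
        return self.linear(x) * self.gate(log_beta)


# Usage sketch: sample beta per example in log-space (range is an assumption),
# then weight the rate (KL) term of the ELBO by the sampled beta.
layer = BetaGatedLinear(8, 16)
x = torch.randn(4, 8)
log_beta = torch.empty(4, 1).uniform_(-4.0, 4.0)
h = layer(x, log_beta)  # shape: (4, 16)
```

Under this reading, one training run sweeps over sampled $\beta$ values, and at deployment a chosen $\beta$ is simply fed to the gates to pick out a point on the rate-distortion curve without retraining.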