Paper Title
Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the JKO Scheme
Paper Authors
Abstract
A normalizing flow (NF) is a mapping that transforms a chosen probability distribution into a normal distribution. Such flows are a common technique for data generation and density estimation in machine learning and data science. The density estimate obtained with an NF requires a change-of-variables formula that involves computing the Jacobian determinant of the NF transformation. In order to compute this determinant tractably, continuous normalizing flows (CNFs) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been used successfully to assist in finding CNFs by formulating them as OT problems, with a soft penalty enforcing the standard normal distribution as the target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, $α$, that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm that solves OT-based CNFs without the need to tune $α$. This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning $α$, we repeatedly solve the optimization problem for a fixed $α$, effectively performing a JKO update with time step $α$. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with a large $α$.
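The JKO iteration described in the abstract — repeatedly minimizing an energy plus a Wasserstein proximal term with step size $α$ — can be illustrated on a toy case. The sketch below is not the paper's neural-ODE method: it restricts to 1-D Gaussians (where both the KL divergence to $N(0,1)$ and the 2-Wasserstein distance have closed forms) and solves each JKO subproblem with plain gradient descent over the mean and standard deviation; all function names and step sizes here are illustrative assumptions.

```python
import numpy as np

def kl_to_std_normal(m, s):
    # Closed form: KL( N(m, s^2) || N(0, 1) )
    return 0.5 * (s**2 + m**2 - 1.0) - np.log(s)

def w2_sq_gaussian(m1, s1, m2, s2):
    # Closed form: squared 2-Wasserstein distance between 1-D Gaussians
    return (m1 - m2)**2 + (s1 - s2)**2

def jko_step(m_prev, s_prev, alpha, lr=0.05, iters=500):
    # One JKO update with time step alpha:
    #   argmin_{m,s}  KL(N(m,s^2) || N(0,1)) + W2^2((m,s),(m_prev,s_prev)) / (2*alpha)
    # solved here by gradient descent on the closed-form objective.
    m, s = m_prev, s_prev
    for _ in range(iters):
        grad_m = m + (m - m_prev) / alpha            # d/dm of KL + proximal term
        grad_s = s - 1.0 / s + (s - s_prev) / alpha  # d/ds of KL + proximal term
        m -= lr * grad_m
        s -= lr * grad_s
    return m, s

m, s = 3.0, 2.0   # initial Gaussian, far from N(0, 1)
alpha = 0.5       # fixed JKO time step (the role alpha plays in JKO-Flow)
for k in range(20):
    m, s = jko_step(m, s, alpha)
print(m, s)       # the flow drives (m, s) toward (0, 1)
```

Each outer iteration is one "simple" fixed-$α$ problem; chaining them traces the Wasserstein gradient flow toward the standard normal, which is the divide-and-conquer structure the abstract describes.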