Paper Title
Expected Information Maximization: Using the I-Projection for Mixture Density Estimation
Paper Authors
Paper Abstract
Modelling highly multi-modal data is a challenging problem in machine learning. Most algorithms are based on maximizing the likelihood, which corresponds to the M(oment)-projection of the data distribution onto the model distribution. The M-projection forces the model to average over modes it cannot represent. In contrast, the I(nformation)-projection ignores such modes in the data and concentrates on the modes the model can represent. Such behavior is appealing whenever we deal with highly multi-modal data where modelling single modes correctly is more important than covering all the modes. Despite this advantage, the I-projection is rarely used in practice due to the lack of algorithms that can efficiently optimize it based on data. In this work, we present a new algorithm called Expected Information Maximization (EIM) for computing the I-projection solely based on samples for general latent variable models, where we focus on Gaussian mixture models and Gaussian mixtures of experts. Our approach applies a variational upper bound to the I-projection objective, which decomposes the original objective into single objectives for each mixture component as well as for the coefficients, allowing efficient optimization. Similar to GANs, our approach employs discriminators, but it uses a more stable optimization procedure based on a tight upper bound. We show that our algorithm is much more effective in computing the I-projection than recent GAN approaches, and we illustrate the effectiveness of our approach for modelling multi-modal behavior on two pedestrian and traffic prediction datasets.
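
The contrast between the two projections described in the abstract can be illustrated with a small numerical sketch. This is not the EIM algorithm and does not come from the paper: it assumes a hypothetical one-dimensional bimodal target density with modes at -3 and 3, fits a single Gaussian to it, and uses a brute-force grid search rather than sample-based optimization. The names gauss_pdf and reverse_kl and all numerical values are illustrative choices.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Integration grid and a hypothetical bimodal "data" density p(x).
x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]
p = 0.5 * gauss_pdf(x, -3.0, 0.5) + 0.5 * gauss_pdf(x, 3.0, 0.5)

# M-projection onto a single Gaussian: argmin_q KL(p || q), which for a
# Gaussian family reduces to matching the mean and variance of p.
m_mu = np.sum(x * p) * dx
m_sigma = np.sqrt(np.sum((x - m_mu) ** 2 * p) * dx)

def reverse_kl(mu, sigma):
    """KL(q || p) for q = N(mu, sigma^2), by a Riemann sum on the grid."""
    q = gauss_pdf(x, mu, sigma)
    support = q > 0.0  # skip points where q underflows to zero (0 * log 0 = 0)
    return np.sum(q[support] * (np.log(q[support]) - np.log(p[support]))) * dx

# I-projection: argmin_q KL(q || p), found here by brute-force grid search
# over candidate means and standard deviations (an illustrative shortcut).
mus = np.linspace(-4.0, 4.0, 81)
sigmas = np.linspace(0.2, 4.0, 39)
kl = np.array([[reverse_kl(mu, s) for s in sigmas] for mu in mus])
i, j = np.unravel_index(np.argmin(kl), kl.shape)
i_mu, i_sigma = mus[i], sigmas[j]

print(f"M-projection: mu={m_mu:.2f}, sigma={m_sigma:.2f}")
print(f"I-projection: mu={i_mu:.2f}, sigma={i_sigma:.2f}")
```

Under these assumptions, the moment-matched Gaussian is centred between the two modes with a standard deviation of roughly 3, so it places most of its mass where the target has almost none. The reverse-KL fit instead settles on one of the two modes with a standard deviation close to 0.5. Which mode it picks is arbitrary here, since the target is symmetric.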