Paper Title
Bayesian Sparse Factor Analysis with Kernelized Observations
Paper Authors
Paper Abstract
Multi-view problems can be addressed with latent variable models, since these can find low-dimensional projections that fairly capture the correlations among the multiple views that characterise each datum. On the other hand, high dimensionality and nonlinearity are traditionally handled by kernel methods, which induce a (non)linear function between the latent projection and the data itself. However, kernel methods usually come with scalability issues and exposure to overfitting. Here, we propose merging both approaches into a single model so that we can exploit the best features of multi-view latent models and kernel methods and, moreover, overcome their limitations. In particular, we combine probabilistic factor analysis with what we refer to as kernelized observations, in which the model focuses on reconstructing not the data itself, but its relationship with other data points as measured by a kernel function. This model can combine several types of views (kernelized or not), handle heterogeneous data, and work in semi-supervised settings. Additionally, by including adequate priors, it can provide compact solutions for the kernelized observations, based on an automatic selection of Bayesian Relevance Vectors (RVs), and can include feature selection capabilities. Using several public databases, we demonstrate the potential of our approach (and its extensions) with respect to common multi-view learning models such as kernel canonical correlation analysis and manifold relevance determination.
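To make the idea of a kernelized observation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an RBF kernel, a hypothetical bandwidth parameter `gamma`, and hypothetical helper names (`rbf_kernel`, `kernelized_observations`). Each data point is represented by its kernel similarities to a set of reference points; a factor analysis model would then reconstruct these rows instead of the raw features. The fixed subset `X[:3]` only illustrates what a compact representation looks like. In the abstract, the relevant points are selected automatically through Bayesian priors, which this sketch does not do.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    # Squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernelized_observations(X, X_ref, gamma=1.0):
    """Represent each row of X by its kernel similarity to each row of X_ref.

    The resulting matrix (n_samples x n_reference) plays the role of the
    observed view, replacing the raw features X.
    """
    return rbf_kernel(X, X_ref, gamma)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))            # toy data: 6 points, 3 features

# Full kernelized view: every point compared with every other point
K_full = kernelized_observations(X, X, gamma=0.5)

# Compact view: compare only against a small reference subset
# (assumption: an arbitrary subset stands in for selected relevance vectors)
K_compact = kernelized_observations(X, X[:3], gamma=0.5)

print(K_full.shape, K_compact.shape)
```

The compact view has one column per reference point, so keeping fewer reference points directly reduces the size of the observation that the latent model must reconstruct.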