Paper Title
Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling
Authors
Abstract
Sparse modeling for signal processing and machine learning has been a focus of scientific research for over two decades. Supervised sparsity-aware learning, among others, follows two major paths: a) discriminative methods and b) generative methods. The latter, more widely known as Bayesian methods, enable uncertainty evaluation with respect to the performed predictions. Furthermore, they can better exploit related prior information and naturally introduce robustness into the model, owing to their unique capacity to marginalize out uncertainties related to the parameter estimates. Moreover, hyper-parameters associated with the adopted priors can be learned from the training data. To implement sparsity-aware learning, the crucial point lies in the choice of the regularizer function for discriminative methods and the choice of the prior distribution for Bayesian learning. Over the last decade or so, due to the intense research on deep learning, emphasis has been put on discriminative techniques. However, a comeback of Bayesian methods is taking place that sheds new light on the design of deep neural networks, establishes firm links between them and Bayesian models, and inspires new paths for unsupervised learning, such as Bayesian tensor decomposition. The goal of this article is two-fold. First, to review, in a unified way, some recent advances in incorporating sparsity-promoting priors into three highly popular data modeling tools, namely deep neural networks, Gaussian processes, and tensor decomposition. Second, to review their associated inference techniques from different aspects, including evidence maximization via optimization and variational inference methods. Challenges such as the small-data dilemma, automatic model structure search, and natural prediction uncertainty evaluation are also discussed. Typical signal processing and machine learning tasks are demonstrated.