Paper title
Second Order PAC-Bayesian Bounds for the Weighted Majority Vote
Paper authors
Paper abstract
We present a novel analysis of the expected risk of the weighted majority vote in multiclass classification. The analysis takes the correlation of predictions by ensemble members into account and provides a bound that is amenable to efficient minimization, which yields an improved weighting for the majority vote. We also provide a specialized version of our bound for binary classification, which makes it possible to exploit additional unlabeled data for tighter risk estimation. In experiments, we apply the bound to improve the weighting of trees in random forests and show that, in contrast to the commonly used first order bound, minimization of the new bound typically does not degrade the test error of the ensemble.
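The abstract does not state the bound itself, so the following is only an illustrative sketch of the first order versus second order distinction, not the paper's algorithm. It relies on one elementary fact: whenever a weighted majority vote errs, members carrying at least half of the total weight must have erred, so the empirical majority-vote error is at most twice the weighted average error (first order) and at most four times the weighted average of pairwise joint errors (second order, which is where correlation between members enters). The function names, the toy predictions, and the weights are all hypothetical, and the PAC-Bayesian step that turns these empirical quantities into a generalization bound is omitted.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight among ensemble members."""
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    # Ties go to the label seen first; this is an arbitrary choice.
    return max(score, key=score.get)


def empirical_quantities(pred_matrix, labels, weights):
    """Compute majority-vote error, first order term, and second order term.

    pred_matrix[i][j] is the prediction of member j on example i.
    weights are assumed nonnegative and to sum to 1.
    """
    n = len(labels)
    mv_err = first_order = second_order = 0.0
    for preds, y in zip(pred_matrix, labels):
        # Total weight of the members that are wrong on this example.
        wrong = sum(w for p, w in zip(preds, weights) if p != y)
        mv_err += weighted_majority_vote(preds, weights) != y
        first_order += wrong            # weighted average error
        second_order += wrong * wrong   # weighted average joint error of pairs
    return mv_err / n, first_order / n, second_order / n


# Toy data (hypothetical): 4 examples, 3 ensemble members, 3 classes.
pred_matrix = [[0, 0, 1], [1, 1, 1], [2, 0, 2], [0, 1, 1]]
labels = [0, 1, 2, 0]
weights = [0.4, 0.3, 0.3]

mv, first, second = empirical_quantities(pred_matrix, labels, weights)
print("majority-vote error:", mv)
print("first order quantity  2 * avg error:", 2 * first)
print("second order quantity 4 * avg joint error:", 4 * second)
```

On this toy sample both quantities upper-bound the majority-vote error, and the second order one is smaller because the members rarely err on the same examples. Minimizing either quantity over the weights is a separate step that this sketch does not attempt.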