Paper title
Conformal prediction beyond exchangeability
Paper authors
Paper abstract
Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use a nonsymmetric algorithm that treats recent observations as more relevant. This paper generalizes conformal prediction to deal with both aspects: we employ weighted quantiles to introduce robustness against distribution drift, and design a new randomization technique to allow for algorithms that do not treat data points symmetrically. Our new methods are provably robust, with substantially less loss of coverage when exchangeability is violated due to distribution drift or other challenging features of real data, while also achieving the same coverage guarantees as existing conformal prediction methods if the data points are in fact exchangeable. We demonstrate the practical utility of these new tools with simulations and real-data experiments on electricity and election forecasting.
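The weighted-quantile idea mentioned in the abstract can be illustrated with a small sketch. It assumes a split-conformal setting with absolute residuals from a held-out calibration set and fixed, user-chosen weights in [0, 1]; the function name `weighted_conformal_radius` and the example weights are hypothetical and not taken from the paper. The test point is given weight 1 at +infinity, so that equal weights reduce to the usual split-conformal quantile.

```python
import numpy as np

def weighted_conformal_radius(residuals, weights, alpha):
    """Weighted (1 - alpha) quantile of calibration residuals.

    Hypothetical helper: the weights are fixed in advance (not data-dependent),
    and a point mass with weight 1 is placed at +inf for the test point.
    """
    residuals = np.asarray(residuals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Total mass includes the weight 1 reserved for the test point.
    total = weights.sum() + 1.0
    order = np.argsort(residuals)
    sorted_residuals = residuals[order]
    # Cumulative normalized weight of the sorted calibration residuals.
    cumulative = np.cumsum(weights[order]) / total
    # First index whose cumulative weight reaches 1 - alpha.
    idx = np.searchsorted(cumulative, 1.0 - alpha, side="left")
    if idx >= len(sorted_residuals):
        # The quantile falls on the +inf mass: the interval is unbounded.
        return np.inf
    return sorted_residuals[idx]

# Usage: weights that decay with age, so recent points count more
# (an illustrative choice; the decay rate 0.99 is an assumption).
rng = np.random.default_rng(0)
calibration_residuals = np.abs(rng.normal(size=200))
decay_weights = 0.99 ** np.arange(200, 0, -1)  # most recent point has weight 0.99
radius = weighted_conformal_radius(calibration_residuals, decay_weights, alpha=0.1)
# Prediction interval for a new point: [prediction - radius, prediction + radius]
```

With all weights equal to 1, the returned value matches the standard split-conformal quantile, which is consistent with the abstract's statement that the guarantees of existing methods are retained when the data are exchangeable.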