When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank

Paper title

When Inverse Propensity Scoring does not Work: Affine Corrections for Unbiased Learning to Rank

Authors

Ali Vardasbi, Harrie Oosterhuis, Maarten de Rijke

Abstract

Besides position bias, which has been well-studied, trust bias is another type of bias prevalent in user interactions with rankings: users are more likely to click incorrectly w.r.t. their preferences on highly ranked items because they trust the ranking system. While previous work has observed this behavior in users, we prove that existing Counterfactual Learning to Rank (CLTR) methods do not remove this bias, including methods specifically designed to mitigate this type of bias. Moreover, we prove that Inverse Propensity Scoring (IPS) is principally unable to correct for trust bias under non-trivial circumstances. Our main contribution is a new estimator based on affine corrections: it both reweights clicks and penalizes items displayed on ranks with high trust bias. Our estimator is the first estimator that is proven to remove the effect of both trust bias and position bias. Furthermore, we show that our estimator is a generalization of the existing CLTR framework: if no trust bias is present, it reduces to the original IPS estimator. Our semi-synthetic experiments indicate that by removing the effect of trust bias in addition to position bias, CLTR can approximate the optimal ranking system even closer than previously possible.
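
The abstract describes the estimator only in words: it reweights clicks and penalizes items shown at ranks with high trust bias. The following is a minimal sketch of what such an affine correction could look like, not the authors' implementation. It assumes that the click probability at rank k is affine in relevance, P(click) = alpha_k * relevance + beta_k. The names `alpha_k`, `beta_k`, `theta_k` and all function names are illustrative assumptions that do not appear in the abstract.

```python
# Sketch under an ASSUMED click model: P(click | rank k) = alpha_k * R + beta_k,
# where R in [0, 1] is the item's relevance. alpha_k and beta_k are hypothetical
# per-rank parameters; the abstract does not name them.

def click_probability(relevance, alpha_k, beta_k):
    """Click probability at rank k under the assumed affine click model."""
    return alpha_k * relevance + beta_k


def affine_estimate(click, alpha_k, beta_k):
    """Affine correction: subtract the rank's offset (the penalty), then reweight."""
    return (click - beta_k) / alpha_k


def ips_estimate(click, theta_k):
    """Plain inverse propensity scoring: reweight the click only."""
    return click / theta_k


def expected_estimate(estimator, p_click):
    """Exact expectation of an estimator over a Bernoulli(p_click) click."""
    return p_click * estimator(1) + (1 - p_click) * estimator(0)


if __name__ == "__main__":
    alpha_k, beta_k = 0.5, 0.2  # illustrative values, not taken from the paper
    for relevance in (0.0, 1.0):
        p = click_probability(relevance, alpha_k, beta_k)
        affine = expected_estimate(lambda c: affine_estimate(c, alpha_k, beta_k), p)
        ips = expected_estimate(lambda c: ips_estimate(c, alpha_k), p)
        print(f"relevance={relevance}: affine={affine:.3f}, ips={ips:.3f}")
```

Under this assumed model, the expected affine estimate equals the relevance itself, while a pure reweighting gives a non-relevant item positive credit whenever `beta_k > 0`. With `beta_k = 0` the affine estimate coincides with the IPS estimate, which mirrors the abstract's statement that the estimator reduces to IPS when no trust bias is present.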
