Causal Estimation of Position Bias in Recommender Systems Using Marketplace Instruments

Paper Title

Causal Estimation of Position Bias in Recommender Systems Using Marketplace Instruments

Authors

Rina Friedberg, Karthik Rajkumar, Jialiang Mao, Qian Yao, YinYin Yu, Min Liu

Abstract

Information retrieval systems, such as online marketplaces, news feeds, and search engines, are ubiquitous in today's digital society. They facilitate information discovery by ranking retrieved items on predicted relevance, i.e. likelihood of interaction (click, share) between users and items. Typically modeled using past interactions, such rankings have a major drawback: interaction depends on the attention items receive. A highly-relevant item placed outside a user's attention could receive little interaction. This discrepancy between observed interaction and true relevance is termed the position bias. Position bias degrades relevance estimation and when it compounds over time, it can silo users into false relevant items, causing marketplace inefficiencies. Position bias may be identified with randomized experiments, but such an approach can be prohibitive in cost and feasibility. Past research has also suggested propensity score methods, which do not adequately address unobserved confounding; and regression discontinuity designs, which have poor external validity. In this work, we address these concerns by leveraging the abundance of A/B tests in ranking evaluations as instrumental variables. Historical A/B tests allow us to access exogenous variation in rankings without manually introducing them, harming user experience and platform revenue. We demonstrate our methodology in two distinct applications at LinkedIn - feed ads and the People-You-May-Know (PYMK) recommender. The marketplaces comprise users and campaigns on the ads side, and invite senders and recipients on PYMK. By leveraging prior experimentation, we obtain quasi-experimental variation in item rankings that is orthogonal to user relevance. Our method provides robust position effect estimates that handle unobserved confounding well, greater generalizability, and easily extends to other information retrieval systems.
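The instrumental-variable idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: it uses a single binary instrument (a hypothetical A/B assignment `z`) and the textbook Wald ratio, on synthetic data whose coefficients are all invented for illustration. The unobserved relevance `u` drives both the item's position and the engagement outcome, so a naive regression of outcome on position is confounded, while the randomized assignment shifts position independently of `u`.

```python
import random


def simulate(n, true_effect=-0.3, seed=0):
    """Generate synthetic (z, position, y) rows. All coefficients are assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)               # unobserved relevance (confounder)
        z = 1 if rng.random() < 0.5 else 0    # hypothetical A/B assignment (instrument)
        # More relevant items are ranked nearer the top; the test variant
        # pushes items about two slots down regardless of relevance.
        position = 5.0 - 1.5 * u + 2.0 * z + rng.gauss(0.0, 1.0)
        # Engagement depends on relevance and, causally, on position.
        y = 1.0 + 0.8 * u + true_effect * position + rng.gauss(0.0, 1.0)
        rows.append((z, position, y))
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def naive_slope(rows):
    """OLS slope of y on position, ignoring the confounder."""
    d = [r[1] for r in rows]
    y = [r[2] for r in rows]
    md, my = mean(d), mean(y)
    cov = sum((di - md) * (yi - my) for di, yi in zip(d, y))
    var = sum((di - md) ** 2 for di in d)
    return cov / var


def wald_estimate(rows):
    """Wald IV estimate: outcome difference over position difference across arms."""
    y1 = mean([y for z, _, y in rows if z == 1])
    y0 = mean([y for z, _, y in rows if z == 0])
    d1 = mean([d for z, d, _ in rows if z == 1])
    d0 = mean([d for z, d, _ in rows if z == 0])
    return (y1 - y0) / (d1 - d0)


rows = simulate(100_000)
naive = naive_slope(rows)
iv = wald_estimate(rows)
print(f"naive OLS slope: {naive:.3f}")
print(f"Wald IV estimate: {iv:.3f} (simulated true effect: -0.3)")
```

In this toy setup the naive slope overstates the position effect because relevant items are both ranked higher and engaged with more, whereas the Wald ratio uses only the variation in position induced by the randomized assignment. A real application would need multiple instruments, covariates, and standard errors, none of which are shown here.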


Paper Title