Paper Title
Kalman Filtering Attention for User Behavior Modeling in CTR Prediction
Paper Authors
Paper Abstract
Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite extensive study, these attention mechanisms still suffer from two limitations. First, conventional attention mostly restricts the attention field to a single user's behaviors, which is unsuitable in e-commerce, where users often pursue new demands that are unrelated to any historical behavior. Second, attention is usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle the two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that treats the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a prior, KFAtt falls back on global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both a benchmark and a 10-billion-scale real production dataset, together with an online A/B test, show that KFAtt outperforms all compared state-of-the-art methods. KFAtt has been deployed in the ranking system of a leading e-commerce website, serving the main traffic of hundreds of millions of active users every day.
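To make the idea of "weighted pooling as a MAP estimation" more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian prior over the user's interest vector and independent Gaussian noise on each observed behavior, in which case the MAP estimate is a precision-weighted average of the prior mean and the behavior values. The abstract does not give the formulas, so the function names (`map_pooling`, `cap_frequency`), the use of scalar precisions, and the grouping rule used for frequency capping are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def map_pooling(
    values: Sequence[Sequence[float]],
    precisions: Sequence[float],
    prior_mean: Sequence[float],
    prior_precision: float,
) -> List[float]:
    """Precision-weighted average of a prior mean and behavior values.

    Assumption: Gaussian prior and Gaussian observation noise, so the
    MAP estimate has this closed form. With no behaviors, or with only
    low-precision (irrelevant) ones, the result stays near the prior mean.
    """
    total = prior_precision + sum(precisions)
    pooled = [prior_precision * m for m in prior_mean]
    for value, precision in zip(values, precisions):
        for i, x in enumerate(value):
            pooled[i] += precision * x
    return [p / total for p in pooled]


def cap_frequency(
    values: Sequence[Sequence[float]],
    precisions: Sequence[float],
    group_ids: Sequence[Hashable],
) -> Tuple[List[List[float]], List[float]]:
    """Collapse repeated behaviors so each group contributes only once.

    Assumption: behaviors sharing a group id are repeats of one underlying
    interest; each group is replaced by its mean value and mean precision.
    """
    grouped = defaultdict(list)
    for value, precision, gid in zip(values, precisions, group_ids):
        grouped[gid].append((value, precision))
    capped_values: List[List[float]] = []
    capped_precisions: List[float] = []
    for members in grouped.values():
        n = len(members)
        dim = len(members[0][0])
        capped_values.append(
            [sum(v[i] for v, _ in members) / n for i in range(dim)]
        )
        capped_precisions.append(sum(p for _, p in members) / n)
    return capped_values, capped_precisions
```

In this sketch, calling `map_pooling` with an empty behavior list returns the prior mean unchanged, which mirrors the abstract's claim that the method resorts to global statistics when few behaviors are relevant. Passing the behaviors through `cap_frequency` first keeps a behavior that was repeated many times from dominating the pooled result.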