Paper Title

CONQ: CONtinuous Quantile Treatment Effects for Large-Scale Online Controlled Experiments

Authors

Wang, Weinan; Zhang, Xi

Abstract

In many industry settings, online controlled experimentation (A/B testing) has been broadly adopted as the gold standard for measuring product or feature impact. Most research has focused on user engagement metrics, measuring treatment effects at the mean (average treatment effects, ATE); only a few studies have addressed performance metrics (e.g., latency), where treatment effects are measured at quantiles. Measuring quantile treatment effects (QTE) is challenging because of difficulties such as dependency introduced by clustered samples, scalability issues, and density bandwidth choices. In addition, previous literature has mainly focused on QTE at a few predefined locations, such as P50 or P90, which does not always convey the full picture. In this paper, we propose a novel, scalable, non-parametric solution that provides a continuous range of QTE with point-wise confidence intervals while circumventing density estimation altogether. Numerical results show high consistency with traditional methods that rely on asymptotic normality. An end-to-end pipeline has been implemented at Snap Inc., providing daily insights on key performance metrics at a distributional level.
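To make the idea of a QTE curve with point-wise confidence intervals concrete, the following is a minimal sketch and not the CONQ algorithm itself, which the abstract does not spell out. It swaps in a generic cluster (user-level) percentile bootstrap, which also avoids density estimation and respects the dependency within clustered samples. The function name `cluster_bootstrap_qte`, its parameters, and the synthetic latency data are all assumptions made for illustration.

```python
import numpy as np


def cluster_bootstrap_qte(y_c, id_c, y_t, id_t, quantiles,
                          n_boot=500, alpha=0.05, seed=0):
    """Hypothetical helper: QTE(q) = Q_treatment(q) - Q_control(q) on a grid
    of quantiles, with point-wise percentile-bootstrap confidence intervals.
    Whole clusters (e.g. users) are resampled, so within-cluster dependence
    is preserved. This is a generic technique, not the paper's method."""
    rng = np.random.default_rng(seed)
    quantiles = np.asarray(quantiles, dtype=float)

    def group(y, ids):
        # Split the observations into one array per cluster.
        y, ids = np.asarray(y, dtype=float), np.asarray(ids)
        order = np.argsort(ids, kind="stable")
        _, starts = np.unique(ids[order], return_index=True)
        return np.split(y[order], starts[1:])

    def resampled_quantiles(groups):
        # Draw clusters with replacement, then pool their observations.
        picks = rng.integers(0, len(groups), size=len(groups))
        pooled = np.concatenate([groups[i] for i in picks])
        return np.quantile(pooled, quantiles)

    groups_c, groups_t = group(y_c, id_c), group(y_t, id_t)
    point = np.quantile(y_t, quantiles) - np.quantile(y_c, quantiles)
    boots = np.array([
        resampled_quantiles(groups_t) - resampled_quantiles(groups_c)
        for _ in range(n_boot)
    ])
    lower, upper = np.quantile(boots, [alpha / 2, 1 - alpha / 2], axis=0)
    return point, lower, upper


# Synthetic example (assumed data): 200 users per arm, 5 latency samples each,
# with a shared per-user effect that induces within-user correlation.
rng = np.random.default_rng(1)
ids = np.repeat(np.arange(200), 5)
control = np.exp(5 + rng.normal(0, 0.3, 200)[ids] + rng.normal(0, 0.5, ids.size))
treatment = 0.9 * np.exp(5 + rng.normal(0, 0.3, 200)[ids] + rng.normal(0, 0.5, ids.size))

grid = np.linspace(0.05, 0.95, 19)  # a dense grid instead of only P50 or P90
point, lower, upper = cluster_bootstrap_qte(control, ids, treatment, ids, grid, n_boot=200)
for q, p, lo, hi in zip(grid, point, lower, upper):
    print(f"P{round(q * 100):02d}: QTE = {p:8.2f}  95% CI [{lo:8.2f}, {hi:8.2f}]")
```

Evaluating the effect on a dense grid rather than at one or two fixed percentiles is what gives the distribution-level view the abstract describes. A plain bootstrap like this one is simple but costly at scale, which is the kind of scalability concern the abstract raises.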
