Covariate Balancing Based on Kernel Density Estimates for Controlled Experiments

Paper title

Covariate Balancing Based on Kernel Density Estimates for Controlled Experiments

Authors

Yiou Li, Lulu Kang, Xiao Huang

Abstract

Controlled experiments are widely used to investigate the causal relationship between input factors and experimental outcomes. A completely randomized design is usually used to randomly assign treatment levels to experimental units. When covariates of the experimental units are available, the experimental design should achieve covariate balance among the treatment groups, so that the statistical inference of the treatment effects is not confounded with any possible effects of the covariates. However, covariate imbalance often exists, because the experiment is carried out based on a single realization of the complete randomization. Imbalance is more likely to occur, and to be more severe, when the number of experimental units is small or moderate. In this paper, we introduce a new covariate balancing criterion, which measures the differences between kernel density estimates of the covariates of the treatment groups. To achieve covariate balance before the treatments are randomly assigned, we partition the experimental units by minimizing the criterion, then randomly assign the treatment levels to the partitioned groups. Through numerical examples, we show that the proposed partition approach can improve the accuracy of the difference-in-mean estimator and outperforms the complete randomization and rerandomization approaches.
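
The partition-then-randomize idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact criterion or optimizer, so everything specific here is an assumption made for illustration: two treatment groups of equal size, an isotropic Gaussian kernel with a fixed bandwidth `h`, the integrated squared difference between the two groups' kernel density estimates as the imbalance measure, and a greedy pairwise-swap search with random restarts. The function names (`gram_matrix`, `kde_imbalance`, `partition_units`, `assign_treatment`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def gram_matrix(X, h):
    """Pairwise integrals of two Gaussian kernels with bandwidth h (assumed kernel).

    The integral of K_h(x - x_i) * K_h(x - x_j) over x equals the
    N(0, 2 h^2 I) density evaluated at x_i - x_j.
    """
    d = X.shape[1]
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (4 * h**2)) / (4 * np.pi * h**2) ** (d / 2)


def kde_imbalance(G, labels):
    """Integrated squared difference between the two groups' KDEs (assumed criterion)."""
    n1 = (labels == 1).sum()
    n0 = (labels == 0).sum()
    # f1 - f0 is a signed kernel mixture with weights +1/n1 and -1/n0.
    w = np.where(labels == 1, 1.0 / n1, -1.0 / n0)
    return float(w @ G @ w)


def partition_units(X, h=0.5, n_restarts=5, seed=0):
    """Greedy swap search for an equal-size split with small KDE imbalance."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    G = gram_matrix(X, h)
    best_labels, best_val = None, np.inf
    for _ in range(n_restarts):
        # Start from a completely randomized equal-size split.
        labels = np.zeros(n, dtype=int)
        labels[rng.permutation(n)[: n // 2]] = 1
        val = kde_imbalance(G, labels)
        improved = True
        while improved:
            improved = False
            for i in np.flatnonzero(labels == 1):
                for j in np.flatnonzero(labels == 0):
                    labels[i], labels[j] = 0, 1  # tentatively swap units i and j
                    new_val = kde_imbalance(G, labels)
                    if new_val < val - 1e-12:
                        val, improved = new_val, True
                        break  # keep the swap, move on to the next unit
                    labels[i], labels[j] = 1, 0  # undo the swap
        if val < best_val:
            best_labels, best_val = labels.copy(), val
    return best_labels, best_val


def assign_treatment(labels, rng):
    """Randomly decide which of the two groups is treated (1 = treated)."""
    return labels if rng.integers(2) == 1 else 1 - labels


# Usage on synthetic covariates (20 units, 2 covariates).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
labels, value = partition_units(X, h=0.5)
treated = assign_treatment(labels, rng)
```

The weight-vector form keeps each evaluation of the criterion to a single quadratic form over a precomputed matrix, which makes an exhaustive pass over swaps cheap for small or moderate numbers of units. A greedy search only reaches a local minimum, so the restarts are a pragmatic hedge rather than a guarantee of the best partition.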
