Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Paper Title

Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Authors

Ham, Dae Woong; Imai, Kosuke; Janson, Lucas

Abstract

Conjoint analysis is a popular experimental design used to measure multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, influences decision-making. Currently, there exist two methodological approaches to analyzing data from a conjoint experiment. The first focuses on estimating the average marginal effects of each factor while averaging over the other factors. Although this allows for straightforward design-based estimation, the results critically depend on the distribution of other factors and how interaction effects are aggregated. An alternative model-based approach can compute various quantities of interest, but requires researchers to correctly specify the model, a challenging task for conjoint analysis with many factors and possible interactions. In addition, a commonly used logistic regression has poor statistical properties even with a moderate number of factors when incorporating interactions. We propose a new hypothesis testing approach based on the conditional randomization test to answer the most fundamental question of conjoint analysis: Does a factor of interest matter in any way given the other factors? Our methodology is solely based on the randomization of factors, and hence is free from assumptions. Yet, it allows researchers to use any test statistic, including those based on complex machine learning algorithms. As a result, we are able to combine the strengths of the existing design-based and model-based approaches. We illustrate the proposed methodology through conjoint analysis of immigration preferences and political candidate evaluation. We also extend the proposed approach to test for regularity assumptions commonly used in conjoint analysis. An open-source software package is available for implementing the proposed methodology.
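
The testing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' software package or their test statistic: it assumes a single categorical factor of interest whose levels were assigned uniformly at random and independently of the other factors, and it swaps in a deliberately simple statistic (the spread of mean responses across levels) where the paper allows machine-learning-based statistics. The names `mean_spread` and `crt_pvalue` are hypothetical.

```python
import numpy as np


def mean_spread(x, z, y, n_levels):
    """Illustrative statistic: max minus min of the mean response across levels of x.

    z (the other factors) is accepted so that richer statistics can use it;
    this simple statistic ignores it.
    """
    overall = y.mean()
    means = np.array(
        [y[x == k].mean() if np.any(x == k) else overall for k in range(n_levels)]
    )
    return means.max() - means.min()


def crt_pvalue(x, z, y, n_levels, statistic=mean_spread, n_resamples=999, seed=0):
    """Randomization p-value for H0: the response does not depend on x given z.

    Assumption: x was drawn uniformly from {0, ..., n_levels - 1}, independently
    of z, so fresh copies of x can be drawn from the same known design.
    """
    rng = np.random.default_rng(seed)
    t_obs = statistic(x, z, y, n_levels)
    t_null = np.empty(n_resamples)
    for b in range(n_resamples):
        # Redraw the factor of interest from its randomization distribution.
        x_new = rng.integers(0, n_levels, size=x.shape[0])
        t_null[b] = statistic(x_new, z, y, n_levels)
    # The +1 terms count the observed statistic among the draws.
    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)


# Simulated data (hypothetical): level 2 of x raises the response probability.
rng = np.random.default_rng(1)
n = 3000
x = rng.integers(0, 3, size=n)
z = rng.integers(0, 4, size=n)
y = (rng.random(n) < 0.4 + 0.2 * (x == 2)).astype(float)
print(crt_pvalue(x, z, y, n_levels=3))
```

Because the reference distribution comes only from redrawing the factor under the known design, any function of `(x, z, y)` can be passed as `statistic` without changing how the p-value is computed.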
