Title
Overlapping-sample Mendelian randomisation with multiple exposures: A Bayesian approach
Authors
Abstract
Background: Mendelian randomization (MR) has been widely applied to causal inference in medical research. It uses genetic variants as instrumental variables (IVs) to investigate a putative causal relationship between an exposure and an outcome. Traditional MR methods have predominantly focussed on a two-sample setting in which the IV-exposure association study and the IV-outcome association study are independent. However, it is not uncommon for participants from the two studies to overlap fully (one-sample) or partly (overlapping-sample). Methods: We proposed a method that is applicable to all three sample settings. In essence, we converted a two-sample or overlapping-sample problem into a one-sample problem in which the data of some or all of the individuals were incomplete. We assumed that all individuals were drawn from the same population and that unmeasured data were missing at random. The unobserved data were then treated on a par with the model parameters as unknown quantities, and could therefore be imputed iteratively, conditioning on the observed data and the current parameter estimates, using Markov chain Monte Carlo. We generalised our model to allow for pleiotropy and multiple exposures and assessed its performance in a number of simulations using four metrics: mean, standard deviation, coverage and power. Results: A higher sample overlap rate and stronger instruments led to estimates with higher precision and power. Pleiotropy had a notably negative impact on the estimates. Nevertheless, coverage was high overall and our model performed well in all the sample settings. Conclusions: Our model offers the flexibility of being applicable to any of the sample settings, an important addition to the MR literature, which has been restricted to one-sample or two-sample scenarios. Given the nature of Bayesian inference, the model can be easily extended to more complex MR analyses in medical research.
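The imputation idea described in the Methods can be illustrated with a minimal sketch. This is not the model from the paper: it is a deliberately simplified toy version that assumes a single instrument, unit error variances, flat priors, and no confounder or pleiotropy. The function names `simulate_overlap` and `gibbs_mr`, the allele frequency of 0.3, and all parameter values are hypothetical choices made for illustration. Individuals in only one study have their unmeasured exposure or outcome stored as NaN, and a Gibbs sampler alternates between drawing the parameters and re-imputing the missing values.

```python
import numpy as np


def simulate_overlap(n, overlap, a, b, rng):
    """Simulate one population, then hide X or Y to mimic two partly
    overlapping studies. Toy model (an assumption, not the paper's):
    X = a*Z + noise, Y = b*X + noise, both with unit variance."""
    z = rng.binomial(2, 0.3, size=n).astype(float)  # allele count of one variant
    x = a * z + rng.normal(size=n)                  # exposure
    y = b * x + rng.normal(size=n)                  # outcome
    n_both = int(round(overlap * n))                # individuals in both studies
    n_x_only = (n - n_both) // 2                    # exposure study only
    x_seen = np.ones(n, dtype=bool)
    y_seen = np.ones(n, dtype=bool)
    y_seen[n_both:n_both + n_x_only] = False        # outcome not measured
    x_seen[n_both + n_x_only:] = False              # exposure not measured
    return z, np.where(x_seen, x, np.nan), np.where(y_seen, y, np.nan)


def gibbs_mr(z, x, y, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler with data augmentation for the toy model above.
    Returns the retained posterior draws of the causal effect b."""
    rng = np.random.default_rng(seed)
    x_mis, y_mis = np.isnan(x), np.isnan(y)
    x = np.where(x_mis, 0.0, x)                     # crude starting values
    y = np.where(y_mis, 0.0, y)
    szz = z @ z
    draws = []
    for it in range(n_iter):
        # a | X, Z: normal posterior of a unit-variance regression, flat prior
        a = rng.normal((z @ x) / szz, 1.0 / np.sqrt(szz))
        # b | X, Y: same form, regressing Y on the completed X
        sxx = x @ x
        b = rng.normal((x @ y) / sxx, 1.0 / np.sqrt(sxx))
        # missing X | Z, Y, a, b: combine N(a*Z, 1) with the likelihood of Y
        prec = 1.0 + b * b
        mean = (a * z[x_mis] + b * y[x_mis]) / prec
        x[x_mis] = rng.normal(mean, 1.0 / np.sqrt(prec))
        # missing Y | X, b: draw from the outcome model
        y[y_mis] = rng.normal(b * x[y_mis], 1.0)
        if it >= burn_in:
            draws.append(b)
    return np.array(draws)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z, x, y = simulate_overlap(n=2000, overlap=0.5, a=1.0, b=0.3, rng=rng)
    draws = gibbs_mr(z, x, y)
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"posterior mean {draws.mean():.3f}, sd {draws.std():.3f}, "
          f"95% interval ({lo:.3f}, {hi:.3f})")
```

Setting `overlap=1.0` reproduces the one-sample setting and `overlap=0.0` the two-sample setting, so the same sampler covers all three designs. Repeating the simulation over many seeds and recording whether each 95% interval contains the true `b`, and whether it excludes zero, would give rough analogues of the coverage and power metrics mentioned in the abstract.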