Title
Robust discrete choice models with t-distributed kernel errors
Authors
Abstract
Outliers in discrete choice response data may result from misclassification and misreporting of the response variable and from choice behaviour that is inconsistent with modelling assumptions (e.g. random utility maximisation). In the presence of outliers, standard discrete choice models produce biased estimates and suffer from compromised predictive accuracy. Robust statistical models are less sensitive to outliers than standard non-robust models. This paper analyses two robust alternatives to the multinomial probit (MNP) model. The two models are robit models whose kernel error distributions are heavy-tailed t-distributions, which moderate the influence of outliers. The first model is the multinomial robit (MNR) model, in which a generic degrees of freedom parameter controls the heavy-tailedness of the kernel error distribution. The second model, the generalised multinomial robit (Gen-MNR) model, is more flexible than MNR, as it allows for distinct heavy-tailedness in each dimension of the kernel error distribution. For both models, we derive Gibbs samplers for posterior inference. In a simulation study, we illustrate the excellent finite sample properties of the proposed Bayes estimators and show that MNR and Gen-MNR produce more accurate estimates when the choice data contain outliers as viewed through the lens of the non-robust MNP model. In a case study on transport mode choice behaviour, MNR and Gen-MNR outperform MNP by substantial margins in terms of in-sample fit and out-of-sample predictive accuracy. The case study also highlights differences in elasticity estimates across models.
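To make the distinction between MNP-style and MNR-style kernel errors concrete, the following sketch simulates choices from a linear random utility model under both error assumptions. It is an illustration of the general idea only, not the paper's estimation procedure (the paper derives Gibbs samplers for posterior inference, which are not reproduced here); the function name, toy data, and degrees-of-freedom value are all hypothetical. The t-distributed errors are generated via the standard normal/chi-square scale-mixture representation, with one shared mixing variable per decision maker, mirroring MNR's single generic degrees of freedom parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_choices(X, beta, nu=None):
    """Simulate discrete choices under a linear random utility model.

    X    : (N, J, K) array of alternative-specific covariates
    beta : (K,) coefficient vector
    nu   : degrees of freedom of the t-distributed kernel errors;
           None gives Gaussian errors (an MNP-style kernel), a finite
           nu gives heavy-tailed errors (an MNR-style kernel).
    Returns the index of the utility-maximising alternative for each
    decision maker.
    """
    N, J, K = X.shape
    V = X @ beta                        # systematic utilities, shape (N, J)
    eps = rng.standard_normal((N, J))   # Gaussian kernel errors
    if nu is not None:
        # Scale-mixture representation of the multivariate t-distribution:
        # dividing a normal vector by sqrt(chi2(nu)/nu), with one mixing
        # draw per decision maker, yields t-distributed errors with nu
        # degrees of freedom shared across dimensions (as in MNR; Gen-MNR
        # would instead use a separate nu per error dimension).
        w = rng.chisquare(nu, size=(N, 1)) / nu
        eps = eps / np.sqrt(w)
    U = V + eps
    return U.argmax(axis=1)

# Hypothetical toy data: 1000 decision makers, 3 alternatives, 2 covariates
X = rng.standard_normal((1000, 3, 2))
beta = np.array([1.0, -0.5])
choices_mnp = simulate_choices(X, beta)           # Gaussian kernel
choices_mnr = simulate_choices(X, beta, nu=4.0)   # heavy-tailed t kernel
```

A small nu (e.g. 4) produces occasional very large error draws, so some simulated choices deviate sharply from the systematic utilities; under the Gaussian kernel such extreme deviations are far rarer, which is why fitting an MNP model to heavy-tailed data lets outliers distort the estimates.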