Paper Title
Robust Distributional Regression with Automatic Variable Selection
Authors
Abstract
Datasets with extreme observations and/or heavy-tailed error distributions are commonly encountered and should be analyzed with careful consideration of these features from a statistical perspective. Small deviations from an assumed model, such as the presence of outliers, can cause classical regression procedures to break down, potentially leading to unreliable inferences. Other distributional deviations, such as heteroscedasticity, can be handled by going beyond the mean and modelling the scale parameter in terms of covariates. We propose a method that accounts for heavy tails and heteroscedasticity through the use of a generalized normal distribution (GND). The GND contains a kurtosis-characterizing shape parameter that moves the model smoothly between the normal distribution and the heavier-tailed Laplace distribution, thus covering both classical and robust regression. A key component of statistical inference is determining the set of covariates that influence the response variable. While correctly accounting for kurtosis and heteroscedasticity is crucial to this endeavour, a procedure for variable selection is still required. For this purpose, we use a novel penalized estimation procedure that avoids the typically computationally demanding grid search for tuning parameters. This is particularly valuable in the distributional regression setting where the location and scale parameters depend on covariates, since the standard approach would require multiple tuning parameters (one for each distributional parameter). We achieve this by using a "smooth information criterion" that can be optimized directly, where the tuning parameters are fixed at log(n) in the BIC case.
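To illustrate the shape-parameter interpolation the abstract describes, the sketch below uses SciPy's `gennorm` distribution, which is a standard implementation of the generalized normal family (this is an illustration only, not the paper's own code, and `beta` is SciPy's name for the shape parameter). Under SciPy's parameterization, `beta = 1` recovers the Laplace density exactly, while `beta = 2` recovers a normal density with standard deviation 1/sqrt(2); intermediate values move smoothly between the two tail behaviours.

```python
from scipy.stats import gennorm, laplace, norm

x = 0.5

# beta = 1: the GND density coincides with the standard Laplace density.
gnd_laplace = gennorm.pdf(x, beta=1)
print(gnd_laplace, laplace.pdf(x))

# beta = 2: the GND density coincides with a normal density whose
# standard deviation is 1/sqrt(2) in SciPy's parameterization.
gnd_normal = gennorm.pdf(x, beta=2)
print(gnd_normal, norm.pdf(x, scale=2 ** -0.5))

# Intermediate beta gives tail weight between the Laplace and normal cases.
print(gennorm.pdf(3.0, beta=1.0), gennorm.pdf(3.0, beta=1.5), gennorm.pdf(3.0, beta=2.0))
```

In a regression setting, maximizing the GND log-likelihood with the shape parameter estimated from the data lets the fit itself decide how close to classical (normal) or robust (Laplace) regression to be.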