Paper Title
Long-Term Missing Value Imputation for Time Series Data Using Deep Neural Networks
Paper Authors
Paper Abstract
We present an approach that uses a deep learning model, specifically a multilayer perceptron (MLP), to estimate the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than individual, randomly missing observations. Our proposed gap-filling algorithm uses an automated method to determine the optimal MLP architecture, allowing for the best prediction performance on the given time series. We tested our approach by filling gaps of various lengths (three months to three years) in three environmental datasets with different time series characteristics: daily groundwater levels, daily soil moisture, and hourly net ecosystem exchange. We compared the accuracy of the gap-filled values obtained with our approach to that of the widely used R-based time series gap-filling methods imputeTS and mtsdi. The results indicate that using an MLP to fill a large gap leads to better results, especially when the data behave nonlinearly. Our approach thus enables the use of datasets that have a large gap in one variable, a situation common in many long-term environmental monitoring records.
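To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the gappy variable can be predicted from other variables observed at the same time steps, trains a small one-hidden-layer MLP in plain NumPy on the rows where the target is observed, and picks the hidden-layer size from a short candidate list by hold-out error. The function names (`train_mlp`, `predict`, `fill_gap`), the candidate sizes, the synthetic data, and the mean-fill baseline are all illustrative assumptions; the abstract does not specify how the architecture search works.

```python
import numpy as np

def train_mlp(X, y, hidden, epochs=3000, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh MLP by full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0 / np.sqrt(X.shape[1]), (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)               # hidden activations, shape (n, hidden)
        err = H @ W2 + b2 - y                  # prediction error, shape (n,)
        dH = np.outer(err, W2) * (1.0 - H**2)  # backpropagate through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

def fill_gap(X, y, candidates=(2, 8, 16)):
    """Fill NaNs in y from covariates X; hidden size is chosen by hold-out RMSE.

    The candidate list is a hypothetical stand-in for an architecture search.
    """
    obs = ~np.isnan(y)
    # Standardize using statistics from the observed rows only.
    Xs = (X - X[obs].mean(axis=0)) / X[obs].std(axis=0)
    y_mu, y_sd = y[obs].mean(), y[obs].std()
    ys = (y - y_mu) / y_sd
    idx = np.flatnonzero(obs)
    cut = int(0.8 * len(idx))
    tr, va = idx[:cut], idx[cut:]              # contiguous hold-out block at the end

    def val_rmse(h):
        p = train_mlp(Xs[tr], ys[tr], h)
        return np.sqrt(np.mean((predict(p, Xs[va]) - ys[va]) ** 2))

    best = min(candidates, key=val_rmse)
    params = train_mlp(Xs[idx], ys[idx], best)  # refit on all observed rows
    filled = y.copy()
    filled[~obs] = predict(params, Xs[~obs]) * y_sd + y_mu
    return filled, best

# Synthetic example: a nonlinear target with a 200-step gap (assumed data).
rng = np.random.default_rng(0)
t = np.arange(1000)
x1 = np.sin(2 * np.pi * t / 365)
x2 = np.cos(2 * np.pi * t / 50)
X = np.column_stack([x1, x2])
truth = x1**2 + 0.5 * x2 + rng.normal(0.0, 0.05, t.size)
gap = slice(400, 600)
y_gappy = truth.copy()
y_gappy[gap] = np.nan

filled, best_hidden = fill_gap(X, y_gappy)
rmse_mlp = np.sqrt(np.mean((filled[gap] - truth[gap]) ** 2))
rmse_mean = np.sqrt(np.mean((np.nanmean(y_gappy) - truth[gap]) ** 2))
print(f"hidden={best_hidden}  MLP RMSE={rmse_mlp:.3f}  mean-fill RMSE={rmse_mean:.3f}")
```

Holding out a contiguous block rather than random rows is a deliberate choice here: it evaluates each candidate on a stretch of consecutive missing steps, which is closer to the long-gap setting than scattered missing points would be.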