Paper Title


Exploring Bayesian Surprise to Prevent Overfitting and to Predict Model Performance in Non-Intrusive Load Monitoring

Authors

Richard Jones, Christoph Klemenjak, Stephen Makonin, Ivan V. Bajic

Abstract


Non-Intrusive Load Monitoring (NILM) is a field of research focused on segregating constituent electrical loads in a system based only on their aggregated signal. Significant computational resources and research time are spent training models, often using as much data as possible, perhaps driven by the preconception that more data equates to more accurate models and better performing algorithms. When has enough prior training been done? When has a NILM algorithm encountered new, unseen data? This work applies the notion of Bayesian surprise to answer these questions which are important for both supervised and unsupervised algorithms. We quantify the degree of surprise between the predictive distribution (termed postdictive surprise), as well as the transitional probabilities (termed transitional surprise), before and after a window of observations. We compare the performance of several benchmark NILM algorithms supported by NILMTK, in order to establish a useful threshold on the two combined measures of surprise. We validate the use of transitional surprise by exploring the performance of a popular Hidden Markov Model as a function of surprise threshold. Finally, we explore the use of a surprise threshold as a regularization technique to avoid overfitting in cross-dataset performance. Although the generality of the specific surprise threshold discussed herein may be suspect without further testing, this work provides clear evidence that a point of diminishing returns of model performance with respect to dataset size exists. This has implications for future model development, dataset acquisition, as well as aiding in model flexibility during deployment.
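The core quantity the abstract describes — the degree of surprise between a model's belief before and after absorbing a window of observations — can be illustrated with a KL divergence between two discrete state distributions. The sketch below is a minimal illustration under assumed conventions, not the authors' implementation: the function names, the Dirichlet pseudo-count update, and the direction of the divergence (posterior relative to prior) are all assumptions made for clarity.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p[i] == 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def postdictive_surprise(prior_counts, window_states, n_states):
    """Illustrative "postdictive surprise": how much the state
    distribution shifts after absorbing a window of observed
    appliance states.

    prior_counts  -- Dirichlet pseudo-counts per state (before the window)
    window_states -- sequence of discrete state labels seen in the window
    """
    prior_counts = np.asarray(prior_counts, dtype=float)
    post_counts = prior_counts + np.bincount(window_states, minlength=n_states)
    p_before = prior_counts / prior_counts.sum()
    p_after = post_counts / post_counts.sum()
    return kl_divergence(p_after, p_before)
```

A window that matches the prior belief yields zero surprise, while a window dominated by one state yields a positive value; thresholding this value is, in spirit, how the abstract proposes deciding when further training data stops being informative.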
