Paper Title

A Kernel Two-sample Test for Dynamical Systems

Authors

Friedrich Solowjow, Dominik Baumann, Christian Fiedler, Andreas Jocham, Thomas Seel, Sebastian Trimpe

Abstract

Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems. This is particularly relevant for data generated by dynamical systems since such systems are essential for many real-world processes in biomedical, economic, or engineering systems. While kernel two-sample tests are powerful for comparing independent and identically distributed random variables, no established method exists for comparing dynamical systems. The main problem is the inherently violated independence assumption. We propose a two-sample test for dynamical systems by addressing three core challenges: we (i) introduce a novel notion of mixing that captures autocorrelations in a relevant metric, (ii) propose an efficient way to estimate the speed of mixing relying purely on data, and (iii) integrate these into established kernel two-sample tests. The result is a data-driven method that is straightforward to use in practice and comes with sound theoretical guarantees. In an example application to anomaly detection from human walking data, we show that the test is readily applicable without any human expert knowledge and feature engineering.
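
To make the idea in the abstract concrete, the following is a minimal sketch of a standard kernel two-sample test (a biased MMD estimate with a Gaussian kernel and a permutation p-value) applied to thinned trajectories. It is not the authors' method: the paper estimates the speed of mixing from data, whereas here the thinning gap `gap` is a hand-picked, hypothetical parameter, and the permutation test is only approximately valid if the thinned samples are close to independent. The helper names (`simulate_ar1`, `thin`, `mmd2_biased`, `permutation_test`) and the AR(1) toy systems are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared maximum mean discrepancy."""
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


def permutation_test(x, y, bandwidth=1.0, n_perm=200, seed=0):
    """Return (statistic, p-value); assumes the samples are exchangeable."""
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        stat = mmd2_biased(pooled[idx[:n]], pooled[idx[n:]], bandwidth)
        count += int(stat >= observed)
    return observed, (count + 1) / (n_perm + 1)


def simulate_ar1(coeff, n, seed):
    """Toy dynamical system: x[t] = coeff * x[t-1] + standard normal noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = coeff * x[t - 1] + noise[t]
    return x[:, None]  # shape (n, 1): one row per time step


def thin(trajectory, gap):
    """Keep every gap-th sample to weaken autocorrelation (gap is hand-picked)."""
    return trajectory[::gap]


if __name__ == "__main__":
    # Two trajectories from systems with different dynamics.
    traj_a = simulate_ar1(0.9, 4000, seed=1)
    traj_b = simulate_ar1(0.2, 4000, seed=2)
    stat, p_value = permutation_test(thin(traj_a, 20), thin(traj_b, 20))
    print(f"MMD^2 = {stat:.4f}, p-value = {p_value:.4f}")
```

Running the test on the raw, unthinned trajectories would treat strongly autocorrelated samples as independent, which is the violated assumption the abstract describes; choosing the gap from data rather than by hand is the part this sketch leaves out.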
