Model Evaluation in Medical Datasets Over Time

Paper Title

Model Evaluation in Medical Datasets Over Time

Authors

Zhou, Helen; Chen, Yuwen; Lipton, Zachary C.

Abstract

Machine learning models deployed in healthcare systems face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, with train and test splits sampling patients throughout the entire study period. We introduce the Evaluation on Medical Datasets Over Time (EMDOT) framework and Python package, which evaluates the performance of a model class over time. Across five medical datasets and a variety of models, we compare two training strategies: (1) using all historical data, and (2) using a window of the most recent data. We note changes in performance over time, and identify possible explanations for these shocks.
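The two training strategies compared in the abstract can be illustrated with a minimal sketch. This is not the EMDOT package's API: the function name `temporal_splits`, the `regime` and `window` parameters, and the choice to test only on the period immediately after the training cutoff are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def temporal_splits(
    periods: List[int],
    regime: str = "all_historical",
    window: int = 3,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_periods, test_periods) pairs ordered by time.

    Hypothetical helper, not part of EMDOT. Each split trains on
    periods up to a cutoff and tests on the next period only
    (a simplifying assumption).
    """
    ordered = sorted(set(periods))
    for i in range(len(ordered) - 1):
        if regime == "all_historical":
            # Strategy (1): every period up to and including the cutoff.
            train = ordered[: i + 1]
        elif regime == "sliding_window":
            # Strategy (2): only the `window` most recent periods.
            train = ordered[max(0, i + 1 - window): i + 1]
        else:
            raise ValueError(f"unknown regime: {regime!r}")
        test = [ordered[i + 1]]
        yield train, test


if __name__ == "__main__":
    years = [2010, 2011, 2012, 2013]
    for train, test in temporal_splits(years, "sliding_window", window=2):
        print("train:", train, "test:", test)
```

In contrast, a time-agnostic evaluation would pool all periods and sample train and test patients from the whole study period, so the test set never lies strictly after the training data.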
