Paper Title
Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach
Paper Authors
Abstract
Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems, given the impact that such infections have on patient mortality and healthcare costs. This work focuses on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making aimed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed with different resampling methods. The results were analyzed using classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient than the other approaches.
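The abstract does not spell out the undersampling procedure, so the following is only a minimal sketch of one common clustering-based variant, not the authors' method. It assumes the majority class is grouped with a plain k-means and that rows are then drawn from every cluster in proportion to its size. The function names (`kmeans`, `cluster_undersample`), the parameter values, and the synthetic two-feature data are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        groups = defaultdict(list)
        for p, lab in zip(points, labels):
            groups[lab].append(p)
        for c, members in groups.items():
            centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_undersample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority rows, sampled from every cluster
    in proportion to the cluster's size (assumed strategy)."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    clusters = defaultdict(list)
    for row, lab in zip(majority, labels):
        clusters[lab].append(row)
    kept = []
    for members in clusters.values():
        quota = max(1, round(n_keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


# Synthetic, hypothetical data: 200 majority rows versus 20 minority rows.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(20)]

reduced = cluster_undersample(majority, n_keep=len(minority))
balanced = [(x, 0) for x in reduced] + [(x, 1) for x in minority]
print(len(majority), "->", len(reduced), "majority rows kept")
```

Because of per-cluster rounding, the number of retained majority rows can differ from `n_keep` by a few rows. In a pipeline like the one the abstract describes, the balanced set would then be passed to an ensemble classifier; that step is omitted here.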