Paper title
Ensemble model for pre-discharge ICD10 coding prediction
Paper authors
Paper abstract
The translation of medical diagnoses into clinical codes has a wide range of applications in billing, aetiology analysis, and auditing. Currently, coding is a manual effort, and automating the task is not straightforward. Among the challenges are messy and noisy clinical records, case complexity, and the huge ICD10 code space. Previous work mainly relied on discharge notes for prediction and was applied to data of very limited scale. We propose an ensemble model that incorporates multiple clinical data sources for accurate code prediction. We further propose an assessment mechanism that provides confidence rates for the predicted outcomes. Extensive experiments were performed on two new real-world clinical datasets (inpatient and outpatient) with unaltered case-mix distributions from Maharaj Nakorn Chiang Mai Hospital. For multi-label classification, we obtain average precision of 0.73 and 0.58 and F1-scores of 0.56 and 0.35 on the inpatient and outpatient datasets respectively; for predicting the principal diagnosis, we obtain accuracies of 0.71 and 0.4 respectively.