Paper Title

Prediction of ICD Codes with Clinical BERT Embeddings and Text Augmentation with Label Balancing using MIMIC-III

Authors

Brent Biseda, Gaurav Desai, Haifeng Lin, Anish Philip

Abstract

This paper achieves state-of-the-art results on the ICD code prediction task using the MIMIC-III dataset. This was achieved by combining Clinical BERT embeddings (Alsentzer et al., 2019) with text augmentation and label balancing to improve F1 scores for both ICD chapter and ICD disease codes. We attribute the improved performance mainly to a novel text augmentation that shuffles the order of sentences during training. In comparison to the top-32 ICD code prediction of Keyang Xu et al., which reports an F1 score of 0.76, we achieve a final F1 score of 0.75 on the larger set of the top 50 ICD codes.
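
The sentence-shuffling augmentation named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `shuffle_sentences` and `augment`, the regex-based sentence splitter, and the idea of producing a fixed number of shuffled copies per note are all assumptions made for illustration. The abstract does not say how augmentation and label balancing are combined.

```python
import random
import re
from typing import List


def shuffle_sentences(note: str, rng: random.Random) -> str:
    """Return the note with its sentences in a random order."""
    # Naive split on whitespace that follows '.', '!' or '?'.
    # Real clinical notes would likely need a proper sentence splitter.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", note.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def augment(note: str, n_copies: int, seed: int = 0) -> List[str]:
    """Return the original note plus n_copies sentence-shuffled variants."""
    # A seeded generator keeps the augmentation reproducible across runs.
    rng = random.Random(seed)
    return [note] + [shuffle_sentences(note, rng) for _ in range(n_copies)]


if __name__ == "__main__":
    example = (
        "Patient admitted with chest pain. "
        "ECG showed ST elevation. "
        "Started on aspirin."
    )
    for variant in augment(example, n_copies=2):
        print(variant)
```

Each variant keeps the same sentences and the same labels as the original note; only the order changes. One plausible use, again an assumption, is to generate more variants for notes carrying rare ICD codes so that label frequencies are more even during training.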
