Multimodal Model with Text and Drug Embeddings for Adverse Drug Reaction Classification

Paper Title

Multimodal Model with Text and Drug Embeddings for Adverse Drug Reaction Classification

Authors

Andrey Sakhovskiy, Elena Tutubalina

Abstract

In this paper, we focus on the classification of tweets as sources of potential signals for adverse drug effects (ADEs) or drug reactions (ADRs). Following the intuition that text and drug structure representations are complementary, we introduce a multimodal model with two components. These components are state-of-the-art BERT-based models for language understanding and molecular property prediction. Experiments were carried out on multilingual benchmarks of the Social Media Mining for Health Research and Applications (#SMM4H) initiative. Our models obtained state-of-the-art results of 0.61 F1 and 0.57 F1 on #SMM4H 2021 Shared Tasks 1a and 2 in English and Russian, respectively. On the classification of French tweets from SMM4H 2020 Task 1, our approach pushes the state of the art by an absolute gain of 8% F1. Our experiments show that the molecular information obtained from neural networks is more beneficial for ADE classification than traditional molecular descriptors. The source code for our models is freely available at https://github.com/Andoree/smm4h_2021_classification.
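The two-component idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the text and drug representations are combined, so plain concatenation followed by a logistic layer is assumed here. The function name, vector sizes, and random placeholder embeddings are hypothetical and are not taken from the authors' repository.

```python
import math
import random


def fuse_and_classify(text_vec, drug_vec, weights, bias):
    """Concatenate the two modality vectors and apply a logistic layer.

    Assumption: concatenation fusion; the paper may combine modalities differently.
    """
    fused = list(text_vec) + list(drug_vec)  # simple concatenation of both modalities
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused vector length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # score in (0, 1) for "tweet reports an ADE"


# Placeholder embeddings; in a real system these would come from the
# BERT-based text encoder and the BERT-based molecular encoder.
rng = random.Random(0)
text_vec = [rng.uniform(-1, 1) for _ in range(8)]      # stands in for a tweet embedding
drug_vec = [rng.uniform(-1, 1) for _ in range(4)]      # stands in for a drug embedding
weights = [rng.uniform(-0.5, 0.5) for _ in range(12)]  # untrained, illustrative only
prob = fuse_and_classify(text_vec, drug_vec, weights, bias=0.0)
```

With untrained weights the score carries no meaning; the sketch only shows where the drug embedding would enter the classifier alongside the text embedding.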
