Paper Title
CALM: Continuous Adaptive Learning for Language Modeling
Paper Authors
Paper Abstract
Training large language representation models has become a standard in the natural language processing community. This allows for fine-tuning on any number of specific tasks; moreover, these large, high-capacity models can continue to train on domain-specific unlabeled data to make the initialization even more robust for supervised tasks. We demonstrate that in practice these pre-trained models exhibit performance deterioration in the form of catastrophic forgetting when evaluated on tasks from a general domain such as GLUE. In this work we propose CALM, Continuous Adaptive Learning for Language Modeling: techniques to render models that retain knowledge across multiple domains. With these methods, we are able to reduce the performance gap on supervised tasks introduced by task-specific models, which we demonstrate using a continual learning setting in the biomedical and clinical domains.
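The setup the abstract describes, continuing a general-domain model's pretraining objective on unlabeled in-domain text, can be sketched as below. This is an illustrative sketch, not the authors' code: the checkpoint name, the toy clinical sentences, and the hyperparameters are assumptions.

```python
# Illustrative sketch of continued domain-adaptive pretraining (not the
# paper's implementation): keep training a general-domain BERT with its
# masked-language-modeling objective on unlabeled in-domain text.
# "bert-base-uncased", the toy sentences, and the hyperparameters are
# assumptions chosen for illustration only.
import torch
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.train()

# Unlabeled in-domain text (placeholder for a biomedical/clinical corpus).
corpus = [
    "The patient was administered 5 mg of warfarin daily.",
    "MRI revealed a small lesion in the left temporal lobe.",
]
encoded = tokenizer(corpus, padding=True, truncation=True, return_tensors="pt")

# Standard MLM objective: randomly mask 15% of tokens and predict them.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
masked = collator([{"input_ids": ids} for ids in encoded["input_ids"]])

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(
    input_ids=masked["input_ids"],
    attention_mask=encoded["attention_mask"],
    labels=masked["labels"],
)
outputs.loss.backward()  # one continued-pretraining step on domain data
optimizer.step()
```

Repeating such steps over an in-domain corpus adapts the initialization to that domain; the catastrophic forgetting the abstract refers to is the resulting drop on general-domain benchmarks such as GLUE, which the CALM techniques aim to mitigate.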