Paper Title
Low-Resource Adaptation of Neural NLP Models
Paper Authors
Paper Abstract
Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data. These resources are often drawn from language data available in large quantities, such as English newswire. In real-world applications of NLP, however, textual resources vary across several dimensions, such as language, dialect, topic, and genre, and finding annotated data of sufficient quantity and quality is challenging. The objective of this thesis is to investigate methods for dealing with such low-resource scenarios in information extraction and natural language understanding. To this end, we study distant supervision and sequential transfer learning in various low-resource settings. We develop and adapt neural NLP models to explore a number of research questions concerning NLP tasks with minimal or no training data.
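The abstract names distant supervision as one strategy without detailing it. As a rough illustration only, the sketch below shows distant supervision in its simplest form for named entity recognition: a gazetteer of known entities projects noisy labels onto unlabeled text, yielding training data without manual annotation. The gazetteer entries, BIO tag set, and example sentence are hypothetical placeholders, not taken from the thesis.

```python
# Minimal sketch (illustrative, not the thesis's method): distant
# supervision for NER via gazetteer matching. All entries below are
# hypothetical placeholders.

GAZETTEER = {
    "Oslo": "LOC",
    "Norway": "LOC",
    "Google": "ORG",
}

def distantly_label(tokens):
    """Assign BIO tags by exact match against the gazetteer.

    The labels are noisy by construction: unlisted or ambiguous
    entities are silently missed, which is the trade-off accepted
    when no annotated corpus is available.
    """
    return [f"B-{GAZETTEER[tok]}" if tok in GAZETTEER else "O"
            for tok in tokens]

tokens = "The model was evaluated on newswire from Oslo , Norway .".split()
print(list(zip(tokens, distantly_label(tokens))))
# -> [('The', 'O'), ..., ('Oslo', 'B-LOC'), (',', 'O'), ('Norway', 'B-LOC'), ('.', 'O')]
```

Sequential transfer learning, the second strategy named in the abstract, would typically complement this by fine-tuning a pretrained encoder on such noisy or small labeled sets rather than training a model from scratch.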