Paper Title
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Paper Authors
Paper Abstract
We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a variety of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis. Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without any additional training. Our approach converts zero-shot learning into multiple-choice tasks, avoiding problems in commonly used large-scale generative models such as FLAN. This not only improves the model's generalization ability but also significantly reduces the number of parameters. Our method offers the further merits of efficient training and deployment. Our approach shows state-of-the-art performance on several benchmarks and produces satisfactory results on tasks such as natural language inference and text classification. Our model achieves this success with only 235M parameters, which is substantially smaller than state-of-the-art models with billions of parameters. The code and pre-trained models are available at https://github.com/IDEA-CCNL/Fengshenbang-LM.
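As a rough illustration of the unified multiple-choice idea described in the abstract, the sketch below recasts a sentiment-analysis example as a question with candidate options and lets a multiple-choice head score the options. This is a minimal sketch under assumptions, not the paper's UniMC implementation: the backbone name, prompt wording, and `answer` helper are all illustrative, and a freshly loaded `bert-base-uncased` multiple-choice head is untrained, so a checkpoint fine-tuned on multiple-choice data would be needed for meaningful predictions. The official code is at https://github.com/IDEA-CCNL/Fengshenbang-LM.

```python
# Illustrative sketch only: recasting a task as multiple choice and scoring
# the candidate options with a Hugging Face multiple-choice head. Not the
# paper's UniMC code; model name and prompts are placeholder assumptions.
import torch
from transformers import AutoModelForMultipleChoice, AutoTokenizer

MODEL = "bert-base-uncased"  # placeholder backbone; the head here is randomly
                             # initialized, so real use needs a fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMultipleChoice.from_pretrained(MODEL)
model.eval()

def answer(context: str, question: str, options: list[str]) -> str:
    """Hypothetical helper: score each (context+question, option) pair
    and return the highest-scoring option."""
    prompts = [f"{context} {question}"] * len(options)
    enc = tokenizer(prompts, options, return_tensors="pt", padding=True)
    # The multiple-choice head expects shape (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, num_choices)
    return options[logits.argmax(dim=-1).item()]

# Sentiment analysis recast as a multiple-choice question:
print(answer("The movie was a waste of two hours.",
             "What is the sentiment of this review?",
             ["positive", "negative"]))
```

Under this framing, swapping the question and option strings covers other tasks from the abstract (e.g., natural language inference or coreference resolution) without changing the model, which is what makes the perspective format agnostic.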