论文标题

解决语言模型的定量推理问题

Solving Quantitative Reasoning Problems with Language Models

论文作者

Lewkowycz, Aitor, Andreassen, Anders, Dohan, David, Dyer, Ethan, Michalewski, Henryk, Ramasesh, Vinay, Slone, Ambrose, Anil, Cem, Schlag, Imanol, Gutman-Solo, Theo, Wu, Yuhuai, Neyshabur, Behnam, Gur-Ari, Guy, Misra, Vedant

论文摘要

语言模型在需要自然语言理解的各种任务上取得了非凡的表现。然而,最先进的模型通常在需要定量推理的任务上挣扎,例如在大学一级解决数学,科学和工程问题。为了帮助缩小这一差距,我们介绍了Minerva,Minerva是一种在一般自然语言数据上鉴定的大型语言模型,并进一步培训了技术内容。该模型在不使用外部工具的情况下实现了技术基准的最新性能。我们还评估了我们在需要定量推理的物理学,生物学,化学,经济学和其他科学方面的200多个本科生问题上评估我们的模型,并发现该模型可以正确回答其中的三分之一。

Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without the use of external tools. We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning, and find that the model can correctly answer nearly a third of them.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源