Paper title
Non-neural Models Matter: A Re-evaluation of Neural Referring Expression Generation Systems
Authors
Abstract
In recent years, neural models have often outperformed rule-based and classic machine learning approaches in NLG. These classic approaches are now often disregarded, for example when new neural models are evaluated. We argue that they should not be overlooked, since, for some tasks, well-designed non-neural approaches achieve better performance than neural ones. In this paper, the task of generating referring expressions in linguistic context (referring expression generation, REG) is used as an example. We examined two very different English datasets (WEBNLG and WSJ), and evaluated each algorithm using both automatic and human evaluations. Overall, the results of these evaluations suggest that rule-based systems with simple rule sets achieve performance on par with or better than state-of-the-art neural REG systems on both datasets. In the case of the more realistic dataset, WSJ, a machine learning-based system with well-designed linguistic features performed best. We hope that our work can encourage researchers to consider non-neural models in the future.