Paper Title

Sparse Text Generation

Authors

Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Abstract

Current state-of-the-art text generators build on powerful language models such as GPT-2, achieving impressive performance. However, to avoid degenerate text, they require sampling from a modified softmax, via temperature parameters or ad-hoc truncation techniques, as in top-$k$ or nucleus sampling. This creates a mismatch between training and testing conditions. In this paper, we use the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding this mismatch. The result is a text generator with favorable performance in terms of fluency and consistency, fewer repetitions, and n-gram diversity closer to human text. In order to evaluate our model, we propose three new metrics for comparing sparse or truncated distributions: $\epsilon$-perplexity, sparsemax score, and Jensen-Shannon divergence. Human-evaluated experiments in story completion and dialogue generation show that entmax sampling leads to more engaging and coherent stories and conversations.
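
For context, the entmax transformation the abstract refers to generalizes softmax: as defined in the sparse-transformations literature the paper builds on, $\alpha\text{-entmax}(\boldsymbol{z}) = \operatorname{argmax}_{\boldsymbol{p} \in \triangle^{d}} \boldsymbol{p}^{\top}\boldsymbol{z} + \mathsf{H}_{\alpha}(\boldsymbol{p})$, where $\mathsf{H}_{\alpha}$ is the Tsallis $\alpha$-entropy; $\alpha = 1$ recovers softmax, while $\alpha > 1$ can assign exactly zero probability to unlikely tokens. The sketch below contrasts nucleus (top-$p$) sampling, which truncates and renormalizes a softmax at test time, with entmax sampling, which draws directly from the natively sparse distribution. It is a minimal illustration under stated assumptions, not the authors' released code: it assumes PyTorch and the open-source entmax package, and the helper names nucleus_sample and entmax_sample are ours.

```python
# Minimal illustrative sketch (not the paper's released code), assuming
# PyTorch and the open-source `entmax` package (pip install entmax).
import torch
from entmax import entmax15  # alpha = 1.5 entmax transformation

def nucleus_sample(logits: torch.Tensor, p: float = 0.9) -> int:
    """Nucleus (top-p) sampling: truncate the softmax to the smallest set
    of tokens whose cumulative mass reaches p, renormalize, then sample.
    This test-time truncation is the train/test mismatch the paper avoids."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Zero out every token whose preceding cumulative mass already exceeds p.
    sorted_probs[cumulative - sorted_probs > p] = 0.0
    sorted_probs = sorted_probs / sorted_probs.sum()
    return sorted_idx[torch.multinomial(sorted_probs, 1)].item()

def entmax_sample(logits: torch.Tensor) -> int:
    """Entmax sampling: the transformation itself assigns exact zeros to
    implausible tokens, so the model distribution is sampled from as-is."""
    probs = entmax15(logits, dim=-1)
    return torch.multinomial(probs, 1).item()

if __name__ == "__main__":
    logits = torch.randn(100)  # stand-in for vocabulary-sized model logits
    print(nucleus_sample(logits), entmax_sample(logits))
```

Because the same entmax transformation is used at training time, sampling from it at test time involves no truncation step, which is the mismatch-free property the abstract highlights.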
