Paper Title

Gaussian Multi-head Attention for Simultaneous Machine Translation

Paper Authors

Shaolei Zhang, Yang Feng

Paper Abstract

Simultaneous machine translation (SiMT) outputs translation while receiving the streaming source inputs, and hence needs a policy to determine where to start translating. The alignment between target and source words often implies the most informative source word for each target word, and hence provides the unified control over translation quality and latency, but unfortunately the existing SiMT methods do not explicitly model the alignment to perform the control. In this paper, we propose Gaussian Multi-head Attention (GMA) to develop a new SiMT policy by modeling alignment and translation in a unified manner. For SiMT policy, GMA models the aligned source position of each target word, and accordingly waits until its aligned position to start translating. To integrate the learning of alignment into the translation model, a Gaussian distribution centered on predicted aligned position is introduced as an alignment-related prior, which cooperates with translation-related soft attention to determine the final attention. Experiments on En-Vi and De-En tasks show that our method outperforms strong baselines on the trade-off between translation and latency.
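
The mechanism described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' exact formulation: an alignment-related Gaussian prior centered on the predicted aligned source position cooperates with translation-related soft attention by adding the log-prior to the attention scores before normalization (equivalent, up to normalization, to multiplying the prior into the attention weights). The function name, the `sigma` default, and the single-head, single-query shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gaussian_prior_attention(query, keys, aligned_pos, sigma=1.0):
    """Hypothetical sketch of Gaussian-prior attention for one target word.

    query:       (d,)   decoder state for the current target word
    keys:        (S, d) encoder states for the S source words received so far
    aligned_pos: predicted aligned source position (0 <= aligned_pos < S)
    sigma:       std. dev. of the Gaussian prior (illustrative default)
    """
    S, d = keys.shape
    # Translation-related soft attention: scaled dot-product scores.
    scores = keys @ query / d ** 0.5                      # (S,)
    # Alignment-related prior: Gaussian centered on the aligned position.
    positions = torch.arange(S, dtype=torch.float32)
    log_prior = -((positions - aligned_pos) ** 2) / (2 * sigma ** 2)
    # Cooperate: add the log-prior to the scores, then normalize.
    return F.softmax(scores + log_prior, dim=-1)          # (S,)
```

In the simultaneous setting, only the source words received so far are attended to, and the wait-until policy of the abstract delays emitting a target word until its predicted aligned position has been read.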
