Paper title
On Learning Text Style Transfer with Direct Rewards
Paper authors
Paper abstract
In most cases, the lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task. In this paper, we explore training algorithms that instead optimize reward functions that explicitly consider different aspects of the style-transferred outputs. In particular, we leverage semantic similarity metrics originally used for fine-tuning neural machine translation models to explicitly assess the preservation of content between system outputs and input texts. We also investigate the potential weaknesses of existing automatic metrics and propose efficient strategies for using these metrics in training. The experimental results show that our model provides significant gains in both automatic and human evaluation over strong baselines, indicating the effectiveness of our proposed methods and training strategies.