Contextual Text Style Transfer

Paper Title

Contextual Text Style Transfer

Authors

Yu Cheng, Zhe Gan, Yizhe Zhang, Oussama Elachqar, Dianqi Li, Jingjing Liu

Abstract

We introduce a new task, Contextual Text Style Transfer: translating a sentence into a desired style while taking its surrounding context into account. This poses two key challenges for existing style transfer approaches: ($i$) how to preserve the semantic meaning of the target sentence and its consistency with the surrounding context during transfer; ($ii$) how to train a robust model with limited labeled data accompanied by context. To realize high-quality style transfer with natural context preservation, we propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders, one for the input sentence and one for its surrounding context. A classifier is further trained to ensure contextual consistency of the generated sentence. To compensate for the lack of parallel data, additional self-reconstruction and back-translation losses are introduced to leverage non-parallel data in a semi-supervised fashion. Two new benchmarks, Enron-Context and Reddit-Context, are introduced for formality and offensiveness style transfer. Experimental results on these datasets demonstrate the effectiveness of the proposed CAST model over state-of-the-art methods across style accuracy, content preservation, and contextual consistency metrics.
