Paper title
Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task
Paper authors
Paper abstract
Text segmentation aims to divide text into contiguous, semantically coherent segments, while segment labeling deals with producing labels for each segment. Past work has shown success in tackling segmentation and labeling for documents and conversations. This has been possible with a combination of task-specific pipelines and supervised and unsupervised learning objectives. In this work, we propose a single encoder-decoder neural network that can handle long documents and conversations, trained simultaneously for both segmentation and segment labeling using only standard supervision. We successfully show a way to solve the combined task as a pure generation task, which we refer to as structured summarization. We apply the same technique to both document and conversational data, and we show state-of-the-art performance across datasets for both segmentation and labeling, under both high- and low-resource settings. Our results establish a strong case for considering text segmentation and segment labeling as a whole, and moving towards general-purpose techniques that don't depend on domain expertise or task-specific components.
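To make the idea of treating segmentation and labeling as one generation task more concrete, the sketch below shows one possible way to serialize segment boundaries and labels into a single target string for an encoder-decoder model, and to parse a generated string back. The abstract does not specify the output format, so the separators, the function names (`linearize`, `parse`, `boundary_flags`), and the choice to mark each segment by the index of its last sentence are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical separators: the abstract does not define an output format.
FIELD_SEP = " => "   # between a boundary index and its label
SEGMENT_SEP = " ; "  # between consecutive segments

# A segment is (index of its last sentence, label). This is an assumption.
Segment = Tuple[int, str]


def linearize(segments: List[Segment]) -> str:
    """Serialize (boundary, label) pairs into one target string."""
    return SEGMENT_SEP.join(f"{end}{FIELD_SEP}{label}" for end, label in segments)


def parse(target: str) -> List[Segment]:
    """Invert linearize(), skipping malformed chunks a model might generate."""
    segments: List[Segment] = []
    for chunk in target.split(SEGMENT_SEP.strip()):
        end_text, sep, label = chunk.partition(FIELD_SEP.strip())
        end_text = end_text.strip()
        if sep and end_text.isdigit():
            segments.append((int(end_text), label.strip()))
    return segments


def boundary_flags(segments: List[Segment], num_sentences: int) -> List[int]:
    """Per-sentence flags: 1 if the sentence closes a segment, else 0."""
    ends = {end for end, _ in segments}
    return [1 if i in ends else 0 for i in range(num_sentences)]


if __name__ == "__main__":
    # Toy document with 5 sentences and two labeled segments.
    gold = [(1, "History"), (4, "Geography")]
    target = linearize(gold)
    print(target)
    print(parse(target))
    print(boundary_flags(gold, num_sentences=5))
```

Under this assumed format, a single generated string carries both the segmentation (the boundary indices) and the labels, so standard sequence-to-sequence training could in principle supervise both at once. Labels containing the separator characters would need escaping, which this sketch does not handle.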