Paper Title
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Authors
Abstract
Speech representations learned from self-supervised learning (SSL) models can benefit various speech processing tasks. However, using SSL representations usually requires fine-tuning the pre-trained models or designing task-specific downstream models and loss functions, which incurs substantial memory usage and human labor. Recently, prompting in Natural Language Processing (NLP) has been found to be an efficient technique for leveraging pre-trained language models (LMs). Specifically, prompt tuning optimizes a limited number of task-specific parameters while keeping the pre-trained model fixed; as a result, only a small set of parameters needs to be stored for each task. Prompt tuning improves computation and memory efficiency by leveraging the pre-trained LM's prediction ability. Nevertheless, this paradigm has been little studied in the speech community. In this paper, we report the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM). Experimental results show that prompt tuning achieves competitive performance on speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models. We further study the technique on challenging sequence generation tasks, where prompt tuning also demonstrates its potential; its limitations and possible research directions are discussed in this paper. The source code is available at https://github.com/ga642381/SpeechPrompt.
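The core idea of prompt tuning, optimizing a few task-specific parameters while the pre-trained model stays fixed, can be illustrated with a toy sketch. This is not the authors' implementation and does not use GSLM: the "frozen model" here is a hypothetical mean-pooling layer followed by a fixed linear classifier, and all names (`frozen_w`, `prompt_tuning_step`, the dimensions, the learning rate) are assumptions chosen only for illustration.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(frozen_w, prompts, inputs):
    """Toy frozen model: mean-pool [prompts + inputs], then a fixed linear layer."""
    seq = prompts + inputs
    dim = len(seq[0])
    pooled = [sum(vec[i] for vec in seq) / len(seq) for i in range(dim)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in frozen_w]
    return softmax(logits)


def prompt_tuning_step(frozen_w, prompts, inputs, target, lr):
    """One gradient step on the prompt vectors only; frozen_w is never modified."""
    probs = forward(frozen_w, prompts, inputs)
    loss = -math.log(probs[target])
    # Gradient of cross-entropy w.r.t. the logits: softmax output minus one-hot target.
    dlogits = [p - (1.0 if c == target else 0.0) for c, p in enumerate(probs)]
    dim = len(prompts[0])
    # Backpropagate through the frozen linear layer: d(loss)/d(pooled) = W^T dlogits.
    dpooled = [
        sum(dlogits[c] * frozen_w[c][i] for c in range(len(frozen_w)))
        for i in range(dim)
    ]
    # Mean pooling gives every sequence position the same 1/length share of the gradient.
    scale = 1.0 / (len(prompts) + len(inputs))
    new_prompts = [
        [p_i - lr * scale * g_i for p_i, g_i in zip(p, dpooled)] for p in prompts
    ]
    return new_prompts, loss


random.seed(0)
DIM, NUM_CLASSES, NUM_PROMPTS = 4, 3, 2  # hypothetical toy sizes

# "Pre-trained" weights: random here, and kept fixed throughout training.
frozen_w = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
frozen_copy = [row[:] for row in frozen_w]

# One toy input sequence of 5 feature vectors with target class 2.
inputs = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
target = 2

# The only trainable parameters: NUM_PROMPTS vectors prepended to the input.
prompts = [[0.0] * DIM for _ in range(NUM_PROMPTS)]

losses = []
for _ in range(200):
    prompts, loss = prompt_tuning_step(frozen_w, prompts, inputs, target, lr=0.5)
    losses.append(loss)

trainable = NUM_PROMPTS * DIM
print(f"trainable parameters: {trainable}")
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this sketch only the `NUM_PROMPTS * DIM` prompt values would need to be stored per task, while the frozen weights are shared across tasks, which mirrors the storage argument made in the abstract.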