Title

Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages

Author

Feng, Siyuan

Abstract

(Short version of abstract) This thesis describes an investigation on unsupervised acoustic modeling (UAM) for automatic speech recognition (ASR) in the zero-resource scenario, where only untranscribed speech data is assumed to be available. UAM is not only important in addressing the general problem of data scarcity in ASR technology development, but also essential to many non-mainstream applications, for example, language protection, language acquisition and pathological speech assessment. The present study is focused on two research problems. The first problem concerns unsupervised discovery of basic (subword-level) speech units in a given language. Under the zero-resource condition, the speech units can be inferred only from the acoustic signals, without requiring or involving any linguistic direction and/or constraints. The second problem is referred to as unsupervised subword modeling. In its essence, a frame-level feature representation needs to be learned from untranscribed speech. The learned feature representation is the basis of subword unit discovery. It is desired to be linguistically discriminative and robust to non-linguistic factors. Particularly, extensive use of cross-lingual knowledge in subword unit discovery and modeling is a focus of this research.
