On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

Paper Title

On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

Authors

Sidiropoulos, Georgios, Vakulenko, Svitlana, Kanoulas, Evangelos

Abstract

Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models introduce, the passage retrieval part remains unexplored. However, such errors can affect the performance of passage retrieval, leading to inferior end-to-end performance. To address this gap, we augment two existing large-scale passage ranking and open-domain QA datasets with synthetic ASR noise and study the robustness of lexical and dense retrievers against questions with ASR noise. Furthermore, we study the generalizability of data augmentation techniques across different domains, with each domain being a different language dialect or accent. Finally, we create a new dataset with questions voiced by human users and use their transcriptions to show that the retrieval performance can further degrade when dealing with natural ASR noise instead of synthetic ASR noise.
