Hybrid Autoregressive Transducer (HAT)

Paper Title

Hybrid Autoregressive Transducer (HAT)

Authors

Ehsan Variani, David Rybach, Cyril Allauzen, Michael Riley

Abstract

This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a time-synchronous encoder-decoder model that preserves the modularity of conventional automatic speech recognition systems. The HAT model provides a way to measure the quality of its internal language model, and that measurement can be used to decide whether inference with an external language model is beneficial. The paper also presents a finite-context version of the HAT model that addresses the exposure bias problem and significantly simplifies overall training and inference. We evaluate the proposed model on a large-scale voice search task. Our experiments show significant improvements in WER compared to state-of-the-art approaches.
