Paper Title

Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

Paper Authors

Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

Paper Abstract

False triggers in voice assistants are unintended invocations of the assistant, which not only degrade the user experience but may also compromise privacy. False trigger mitigation (FTM) is the process of detecting false trigger events and responding appropriately to the user. In this paper, we propose a novel solution to the FTM problem by introducing a parallel ASR decoding process with a special language model trained from "out-of-domain" data sources. Such a language model is complementary to the existing language model optimized for the assistant task. A bidirectional lattice RNN (Bi-LRNN) classifier trained from the lattices generated by the complementary language model shows a $38.34\%$ relative reduction of the false trigger (FT) rate at a fixed rate of $0.4\%$ false suppression (FS) of correct invocations, compared to the current Bi-LRNN model. In addition, we propose to train a parallel Bi-LRNN model based on the decoding lattices from both language models, and examine various ways of implementing it. The resulting model leads to a further $10.8\%$ reduction in the false trigger rate.
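The abstract gives no implementation details, so the following is only an illustrative sketch: a minimal PyTorch model that encodes features from two decoding lattices (one from the assistant-task LM, one from the complementary LM) with separate bidirectional RNNs and fuses the two summaries in a single false-trigger classifier, in the spirit of the parallel Bi-LRNN described above. Every name and dimension, and the simplification of lattices to padded arc-feature sequences, is an assumption of this sketch, not the authors' implementation.

```python
# Hypothetical sketch only: a "parallel Bi-LRNN"-style false-trigger classifier.
# Each decoding lattice is approximated by a padded sequence of per-arc feature
# vectors; the real Bi-LRNN propagates states over the lattice graph itself.
import torch
import torch.nn as nn


class ParallelBiLatticeClassifier(nn.Module):
    def __init__(self, feat_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        # One bidirectional encoder per decoding pass: the assistant-task LM
        # lattice and the complementary ("out-of-domain") LM lattice.
        self.task_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.comp_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Classifier over the concatenated summaries:
        # class 1 = intended invocation, class 0 = false trigger.
        self.head = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),
        )

    @staticmethod
    def _summarize(encoder: nn.GRU, arcs: torch.Tensor) -> torch.Tensor:
        # Concatenate the final forward and backward hidden states.
        _, h_n = encoder(arcs)                       # h_n: (2, batch, hidden_dim)
        return torch.cat([h_n[0], h_n[1]], dim=-1)   # (batch, 2 * hidden_dim)

    def forward(self, task_arcs: torch.Tensor, comp_arcs: torch.Tensor) -> torch.Tensor:
        task_vec = self._summarize(self.task_encoder, task_arcs)
        comp_vec = self._summarize(self.comp_encoder, comp_arcs)
        return self.head(torch.cat([task_vec, comp_vec], dim=-1))  # (batch, 2) logits


if __name__ == "__main__":
    model = ParallelBiLatticeClassifier()
    task_arcs = torch.randn(8, 20, 64)   # 8 utterances, 20 arcs, 64-dim arc features
    comp_arcs = torch.randn(8, 25, 64)   # lattices from the complementary LM may differ in size
    logits = model(task_arcs, comp_arcs)
    print(logits.shape)                  # torch.Size([8, 2])
```

In the actual Bi-LRNN, hidden states follow the lattice topology rather than a flat sequence; the sketch only illustrates one way the encodings of the two parallel lattices could be fused before classification.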
