Paper Title

Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

Paper Authors

Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

Paper Abstract

False triggers in voice assistants are unintended invocations of the assistant, which not only degrade the user experience but may also compromise privacy. False trigger mitigation (FTM) is the process of detecting false trigger events and responding appropriately to the user. In this paper, we propose a novel solution to the FTM problem by introducing a parallel ASR decoding process with a special language model trained from "out-of-domain" data sources. Such a language model is complementary to the existing language model optimized for the assistant task. A bidirectional lattice RNN (Bi-LRNN) classifier trained from the lattices generated by the complementary language model shows a $38.34\%$ relative reduction of the false trigger (FT) rate at a fixed rate of $0.4\%$ false suppression (FS) of correct invocations, compared to the current Bi-LRNN model. In addition, we propose to train a parallel Bi-LRNN model based on the decoding lattices from both language models, and examine various ways of implementing it. The resulting model leads to a further $10.8\%$ reduction in the false trigger rate.
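The abstract gives no implementation details, so the following is only an illustrative sketch: a minimal PyTorch model that encodes features from two decoding lattices (one from the assistant-task LM, one from the complementary LM) with separate bidirectional RNNs and fuses the two summaries in a single false-trigger classifier, in the spirit of the parallel Bi-LRNN described above. Every name and dimension, and the simplification of lattices to padded arc-feature sequences, is an assumption of this sketch, not the authors' implementation.

```python
# Hypothetical sketch only: a "parallel Bi-LRNN"-style false-trigger classifier.
# Each decoding lattice is approximated by a padded sequence of per-arc feature
# vectors; the real Bi-LRNN propagates states over the lattice graph itself.
import torch
import torch.nn as nn


class ParallelBiLatticeClassifier(nn.Module):
    def __init__(self, feat_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        # One bidirectional encoder per decoding pass: the assistant-task LM
        # lattice and the complementary ("out-of-domain") LM lattice.
        self.task_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.comp_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Classifier over the concatenated summaries:
        # class 1 = intended invocation, class 0 = false trigger.
        self.head = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),
        )

    @staticmethod
    def _summarize(encoder: nn.GRU, arcs: torch.Tensor) -> torch.Tensor:
        # Concatenate the final forward and backward hidden states.
        _, h_n = encoder(arcs)                       # h_n: (2, batch, hidden_dim)
        return torch.cat([h_n[0], h_n[1]], dim=-1)   # (batch, 2 * hidden_dim)

    def forward(self, task_arcs: torch.Tensor, comp_arcs: torch.Tensor) -> torch.Tensor:
        task_vec = self._summarize(self.task_encoder, task_arcs)
        comp_vec = self._summarize(self.comp_encoder, comp_arcs)
        return self.head(torch.cat([task_vec, comp_vec], dim=-1))  # (batch, 2) logits


if __name__ == "__main__":
    model = ParallelBiLatticeClassifier()
    task_arcs = torch.randn(8, 20, 64)   # 8 utterances, 20 arcs, 64-dim arc features
    comp_arcs = torch.randn(8, 25, 64)   # lattices from the complementary LM may differ in size
    logits = model(task_arcs, comp_arcs)
    print(logits.shape)                  # torch.Size([8, 2])
```

In the actual Bi-LRNN, hidden states follow the lattice topology rather than a flat sequence; the sketch only illustrates one way the encodings of the two parallel lattices could be fused before classification.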
