Paper Title

Surprisal-Triggered Conditional Computation with Neural Networks

Authors

Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer

Abstract

Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network or a big, slow network is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used, with 15% fewer FLOPs.
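The gating rule the abstract describes can be sketched in a few lines: compute the surprisal of the current observation under the autoregressive model's predictive distribution, and route to the big network only when it exceeds a threshold. This is a minimal illustrative sketch, not the paper's implementation; the threshold value, the toy stand-in networks, and the example distribution are all assumptions.

```python
import numpy as np

def surprisal(probs, observation):
    """Surprisal of an observation: its negative log-likelihood
    under the autoregressive model's predictive distribution."""
    return -np.log(probs[observation])

def route(probs, observation, threshold, small_net, big_net, features):
    """Use the big, slow network only when the current observation
    is surprising (difficult); otherwise use the small, fast one."""
    s = surprisal(probs, observation)
    net = big_net if s > threshold else small_net
    return net(features), s

# Hypothetical stand-ins for the two networks.
small_net = lambda x: ("small", float(x.mean()))
big_net = lambda x: ("big", float(x.mean()))

# Example predictive distribution over 4 possible observations
# and some extracted features (both illustrative).
probs = np.array([0.7, 0.1, 0.1, 0.1])
features = np.ones(3)

# Expected observation -> low surprisal -> small network.
out_easy, s_easy = route(probs, 0, 1.0, small_net, big_net, features)
# Unexpected observation -> high surprisal -> big network.
out_hard, s_hard = route(probs, 1, 1.0, small_net, big_net, features)
```

With these numbers, `-log(0.7) ≈ 0.36` falls below the threshold while `-log(0.1) ≈ 2.30` exceeds it, so only the surprising observation triggers the expensive network, which is how the FLOP savings in the abstract arise.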
