Paper Title

Surprisal-Triggered Conditional Computation with Neural Networks

Authors

Loren Lugosch, Derek Nowrouzezahrai, Brett H. Meyer

Abstract

Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network or a big, slow network is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used, with 15% fewer FLOPs.
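The gating rule the abstract describes can be sketched in a few lines: compute the surprisal of the current observation under the autoregressive model's predictive distribution, and route to the big network only when it exceeds a threshold. This is a minimal illustrative sketch, not the paper's implementation; the threshold value, the toy stand-in networks, and the example distribution are all assumptions.

```python
import numpy as np

def surprisal(probs, observation):
    """Surprisal of an observation: its negative log-likelihood
    under the autoregressive model's predictive distribution."""
    return -np.log(probs[observation])

def route(probs, observation, threshold, small_net, big_net, features):
    """Use the big, slow network only when the current observation
    is surprising (difficult); otherwise use the small, fast one."""
    s = surprisal(probs, observation)
    net = big_net if s > threshold else small_net
    return net(features), s

# Hypothetical stand-ins for the two networks.
small_net = lambda x: ("small", float(x.mean()))
big_net = lambda x: ("big", float(x.mean()))

# Example predictive distribution over 4 possible observations
# and some extracted features (both illustrative).
probs = np.array([0.7, 0.1, 0.1, 0.1])
features = np.ones(3)

# Expected observation -> low surprisal -> small network.
out_easy, s_easy = route(probs, 0, 1.0, small_net, big_net, features)
# Unexpected observation -> high surprisal -> big network.
out_hard, s_hard = route(probs, 1, 1.0, small_net, big_net, features)
```

With these numbers, `-log(0.7) ≈ 0.36` falls below the threshold while `-log(0.1) ≈ 2.30` exceeds it, so only the surprising observation triggers the expensive network, which is how the FLOP savings in the abstract arise.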
