Paper Title
Achieving Online Regression Performance of LSTMs with Simple RNNs
Paper Authors
Paper Abstract
Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. Among RNN models, Long Short-Term Memory networks (LSTMs) are commonly preferred in practice because they can learn long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, training LSTMs takes considerably longer than training simple RNNs (SRNNs). In this paper, we efficiently achieve the online regression performance of LSTMs with SRNNs. To this end, we introduce a first-order training algorithm whose time complexity is linear in the number of parameters. We show that SRNNs trained with our algorithm provide regression performance very similar to that of LSTMs in two to three times shorter training time. We support our experimental results with a strong theoretical analysis by deriving regret bounds on the convergence rate of our algorithm. Through an extensive set of experiments, we verify our theoretical work and demonstrate significant improvements of our algorithm over LSTMs and other state-of-the-art learning models.
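
To make the setting concrete, below is a minimal sketch of online regression with an SRNN trained by a plain first-order (one-step truncated) gradient update, so that each step costs time linear in the number of parameters. This is an illustrative assumption rather than the specific algorithm proposed in the paper; the dimensions, learning rate, and synthetic target are made up for the example.

```python
import numpy as np

# Minimal sketch: online regression with a simple RNN (SRNN) updated by a
# first-order gradient step at every time instant. Hypothetical setup; not
# the paper's algorithm.

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8           # input and hidden dimensions (made up)
lr = 0.01                    # learning rate for the first-order update

# SRNN: h_t = tanh(W_h h_{t-1} + W_x x_t), prediction y_hat_t = w^T h_t
W_h = rng.normal(scale=0.1, size=(n_hid, n_hid))
W_x = rng.normal(scale=0.1, size=(n_hid, n_in))
w   = rng.normal(scale=0.1, size=n_hid)

h = np.zeros(n_hid)
for t in range(1000):
    x_t = rng.normal(size=n_in)              # incoming feature vector
    y_t = np.sin(0.1 * t)                    # synthetic target (placeholder)

    # Forward pass and squared-loss residual
    h_new = np.tanh(W_h @ h + W_x @ x_t)
    y_hat = w @ h_new
    err = y_hat - y_t

    # One-step truncated gradients: cost is linear in the number of parameters
    grad_w  = err * h_new
    d_pre   = err * w * (1.0 - h_new ** 2)   # backprop through tanh, one step only
    grad_Wh = np.outer(d_pre, h)
    grad_Wx = np.outer(d_pre, x_t)

    # First-order parameter updates
    w   -= lr * grad_w
    W_h -= lr * grad_Wh
    W_x -= lr * grad_Wx

    h = h_new                                # carry the hidden state online
```

In the usual online-learning sense, such a sequential learner is evaluated by its regret, i.e., its accumulated loss minus that of the best fixed parameter choice in hindsight, which is the standard framework in which regret bounds of the kind mentioned above are stated.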