Paper Title
Event-Based Control for Online Training of Neural Networks
Paper Authors
Paper Abstract
Convolutional Neural Networks (CNNs) have become the most widely used method for image classification tasks. During training, the learning rate and the gradient are two key factors to tune for influencing the convergence speed of the model. Usual learning rate strategies are time-based, i.e., monotonic decay over time. Recent state-of-the-art techniques focus on adaptive gradient algorithms, i.e., Adam and its variants. In this paper, we consider an online learning scenario and propose two Event-Based control loops to adjust the learning rate of a classical algorithm, E (Exponential)/PD (Proportional-Derivative) control. The first Event-Based control loop prevents a sudden drop of the learning rate when the model is approaching the optimum. The second Event-Based control loop decides, based on the learning speed, when to switch to the next data batch. An experimental evaluation is provided on two state-of-the-art machine learning image datasets (CIFAR-10 and CIFAR-100). Results show that Event-Based E/PD outperforms the original algorithm (higher final accuracy, lower final loss value), and that Double-Event-Based E/PD can accelerate the training process, saving up to 67% of training time compared to state-of-the-art algorithms while even improving performance.
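The two control loops described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the gains (`k_p`, `k_d`), the drop ratio `max_drop`, and the speed threshold are hypothetical placeholders chosen for illustration, assuming a PD-style rule that maps the training loss and its change to a learning rate, an event guard that blocks a sudden learning-rate drop, and an event that switches batches once the loss stops decreasing fast enough.

```python
# Illustrative sketch of event-based learning-rate control in the spirit of
# E/PD control. All constants below are hypothetical, not from the paper.

def pd_learning_rate(loss_prev, loss_curr, k_p=0.1, k_d=0.05, lr_min=1e-4):
    """PD-style rule: the learning rate grows with the current loss (P term)
    and with the loss trend (D term), floored at lr_min."""
    error = loss_curr                 # proportional term: current loss
    d_error = loss_curr - loss_prev   # derivative term: loss change
    lr = k_p * error + k_d * d_error
    return max(lr, lr_min)

def event_guard(lr_prev, lr_new, max_drop=0.5):
    """Event 1: prevent a sudden learning-rate drop near the optimum.
    If the new rate falls below max_drop * previous rate, keep the old rate."""
    return lr_prev if lr_new < max_drop * lr_prev else lr_new

def should_switch_batch(loss_history, window=3, speed_threshold=1e-3):
    """Event 2: switch to the next data batch once the learning speed
    (average loss decrease per step over a window) drops below a threshold."""
    if len(loss_history) < window + 1:
        return False
    speed = (loss_history[-window - 1] - loss_history[-1]) / window
    return speed < speed_threshold
```

A training loop would call `pd_learning_rate` each epoch, pass the result through `event_guard`, and poll `should_switch_batch` to decide when to advance the online data stream.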