Title

Deep Learning in Memristive Nanowire Networks

Authors

Kendall, Jack D., Pantone, Ross D., Nino, Juan C.

Abstract

Analog crossbar architectures for accelerating neural network training and inference have made tremendous progress over the past several years. These architectures are ideal for dense layers with fewer than roughly a thousand neurons. However, for large sparse layers, crossbar architectures are highly inefficient. A new hardware architecture, dubbed the MN3 (Memristive Nanowire Neural Network), was recently described as an efficient architecture for simulating very wide, sparse neural network layers, on the order of millions of neurons per layer. The MN3 utilizes a high-density memristive nanowire mesh to efficiently connect large numbers of silicon neurons with modifiable weights. Here, in order to explore the MN3's ability to function as a deep neural network, we describe an algorithm for training deep MN3 models and benchmark simulations of the architecture on two deep learning tasks. We utilize a simple piecewise linear memristor model, since we seek to demonstrate that training is, in principle, possible for randomized nanowire architectures. In future work, we intend to utilize more realistic memristor models, and we will adapt the presented algorithm appropriately. We show that the MN3 is capable of performing composition, gradient propagation, and weight updates, which together allow it to function as a deep neural network. We show that a simulated multilayer perceptron (MLP), built from MN3 networks, can obtain a 1.61% error rate on the popular MNIST dataset, comparable to an equivalently sized software-based network. This work represents, to the authors' knowledge, the first randomized nanowire architecture capable of reproducing the backpropagation algorithm.
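To make the "piecewise linear memristor model" concrete, the sketch below shows one common way such a model is written down: conductance changes linearly with the applied voltage once a programming threshold is exceeded, and is clipped to a fixed range. This is an illustrative assumption, not the authors' actual model; the threshold `v_th`, learning rate `eta`, and conductance bounds are hypothetical parameters.

```python
def update_conductance(g, v, v_th=0.5, eta=0.1, g_min=0.0, g_max=1.0):
    """Piecewise linear memristor update (illustrative sketch only).

    Conductance g changes linearly with the applied voltage v once |v|
    exceeds the threshold v_th; sub-threshold pulses leave g unchanged.
    The result is clipped to the device's conductance range.
    """
    if v > v_th:
        g += eta * (v - v_th)      # potentiation (increase conductance)
    elif v < -v_th:
        g += eta * (v + v_th)      # depression (decrease conductance)
    # |v| <= v_th: read-only regime, no change
    return min(max(g, g_min), g_max)
```

In a weight-update step of the kind the abstract describes, each memristive junction would receive a voltage pulse proportional to its gradient term, and a rule like this one maps that pulse to a conductance (weight) change.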
