Paper Title
Signed Binary Weight Networks
Paper Authors
Paper Abstract
Efficient inference of Deep Neural Networks (DNNs) is essential to making AI ubiquitous. Two important algorithmic techniques have shown promise for enabling efficient inference: sparsity and binarization. At the hardware-software level, these techniques translate into weight sparsity and weight repetition, enabling the deployment of DNNs under very low power and latency budgets. We propose a new method, called signed-binary networks, that further improves efficiency (by exploiting weight sparsity and weight repetition together) while maintaining similar accuracy. Our method achieves accuracy comparable to binary networks on the ImageNet and CIFAR10 datasets and can lead to 69% sparsity. We observe real speedups when deploying these models on general-purpose devices and show that this high percentage of unstructured sparsity can lead to a further reduction in energy consumption on ASICs.
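To make the interplay of sparsity and weight repetition concrete, here is a minimal, hypothetical sketch (not the paper's actual training procedure) of a signed-binary-style quantizer for one filter: each filter keeps only weights matching its dominant sign, and all kept weights share a single scaled value. The zeros give weight sparsity, and the single repeated nonzero value per filter gives weight repetition; the sign rule and scale `alpha` here are illustrative assumptions.

```python
import numpy as np

def signed_binary_quantize(w):
    """Illustrative signed-binary quantization of one filter's weights.

    Hypothetical rule (an assumption, not the paper's exact algorithm):
    pick the filter's dominant sign s, zero out weights of the opposite
    sign, and replace kept weights with a single shared value s * alpha,
    where alpha is the mean magnitude of the kept weights.
    """
    s = 1.0 if w.sum() >= 0 else -1.0          # per-filter sign choice
    keep = np.sign(w) == s                     # weights matching that sign
    alpha = np.abs(w[keep]).mean() if keep.any() else 0.0
    q = np.where(keep, s * alpha, 0.0)         # {0, s*alpha}: sparse + repeated
    sparsity = 1.0 - keep.mean()               # fraction of zeroed weights
    return q, sparsity

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3))                    # toy 3x3 filter
q, sparsity = signed_binary_quantize(w)
```

Because every nonzero weight in a filter equals the same value, a multiply-accumulate over that filter reduces to an accumulation of the inputs at nonzero positions followed by one multiply, and the zeroed positions can be skipped entirely.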