Paper Title
Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks
Paper Authors
Paper Abstract
Convolutional neural networks (CNNs) achieve state-of-the-art performance at the cost of becoming deeper and larger. Although quantization (both fixed-point and floating-point) has proven effective for reducing storage and memory access, two challenges -- 1) accuracy loss caused by quantization without calibration, fine-tuning, or re-training for deep CNNs and 2) hardware inefficiency caused by floating-point quantization -- prevent processors from fully leveraging the benefits. In this paper, we propose a low-precision floating-point quantization oriented processor, named Phoenix, to address the above challenges. We make three key observations: 1) 8-bit floating-point quantization incurs less error than 8-bit fixed-point quantization; 2) without any calibration, fine-tuning, or re-training, normalization before quantization further reduces accuracy degradation; 3) an 8-bit floating-point multiplier achieves higher hardware efficiency than an 8-bit fixed-point multiplier if the full-precision product is applied. Based on these observations, we propose a normalization-oriented 8-bit floating-point quantization method that reduces storage and memory access with negligible accuracy loss (within 0.5%/0.3% for top-1/top-5 accuracy, respectively). We further design a hardware processor to address the hardware inefficiency caused by floating-point multipliers. Compared with a state-of-the-art accelerator, Phoenix delivers 3.32x and 7.45x better performance at the same core area for AlexNet and VGG16, respectively.
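As a rough illustration of the normalization-before-quantization idea summarized in the abstract, the sketch below normalizes a layer's weights by their maximum absolute value and then rounds them to an assumed 8-bit floating-point format (1 sign, 4 exponent, 3 mantissa bits). The bit split, the normalization scale, and all function names are illustrative assumptions, not the paper's published configuration.

```python
# Minimal sketch: per-layer normalization followed by 8-bit floating-point
# quantization. The 1/4/3 bit split is an assumption for illustration only.
import numpy as np

SIGN_BITS, EXP_BITS, MAN_BITS = 1, 4, 3
EXP_BIAS = 2 ** (EXP_BITS - 1) - 1  # exponent bias of the assumed format


def quantize_fp8(x: np.ndarray) -> np.ndarray:
    """Round each value to the nearest number representable in the assumed
    1-4-3 floating-point format (normal numbers only, for brevity)."""
    sign = np.sign(x)
    mag = np.abs(x)
    mag = np.where(mag == 0, np.finfo(np.float32).tiny, mag)
    exp = np.clip(np.floor(np.log2(mag)), -EXP_BIAS + 1, EXP_BIAS)
    # keep MAN_BITS fractional bits of the mantissa (mag / 2**exp is in [1, 2))
    man = np.round(mag / 2.0 ** exp * 2 ** MAN_BITS) / 2 ** MAN_BITS
    return sign * man * 2.0 ** exp


def normalize_then_quantize(weights: np.ndarray):
    """Normalize a layer's weights before quantization so values fall in the
    format's well-covered range; the full-precision scale is folded back at
    inference time."""
    scale = np.max(np.abs(weights))
    return quantize_fp8(weights / scale), scale


if __name__ == "__main__":
    w = np.random.randn(256, 256).astype(np.float32) * 0.05
    q, s = normalize_then_quantize(w)
    print("mean squared quantization error:", np.mean((w - q * s) ** 2))
```

The sketch only conveys the flow (normalize, quantize, keep the scale); the paper's actual method, calibration-free accuracy figures, and hardware multiplier design are described in the full text.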