Paper Title

Block-Wise Dynamic-Precision Neural Network Training Acceleration via Online Quantization Sensitivity Analytics

Paper Authors

Ruoyang Liu, Chenhan Wei, Yixiong Yang, Wenxun Wang, Huazhong Yang, Yongpan Liu

Paper Abstract

Data quantization is an effective method to accelerate neural network training and reduce power consumption. However, it is challenging to perform low-bit quantized training: conventional equal-precision quantization leads to either high accuracy loss or limited bit-width reduction, while existing mixed-precision methods offer high compression potential but fail to perform accurate and efficient bit-width assignment. In this work, we propose DYNASTY, a block-wise dynamic-precision neural network training framework. DYNASTY provides accurate data sensitivity information through fast online analytics, and maintains stable training convergence with an adaptive bit-width map generator. Network training experiments are carried out on the CIFAR-100 and ImageNet datasets; compared to an 8-bit quantization baseline, DYNASTY brings up to $5.1\times$ speedup and $4.7\times$ energy consumption reduction with no accuracy drop and negligible hardware overhead.
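To make the idea of block-wise, sensitivity-driven precision assignment concrete, below is a minimal illustrative sketch in numpy. It is not the paper's method: the function names, the MSE-based sensitivity proxy, and the greedy budget-constrained promotion scheme are all assumptions for illustration, standing in for DYNASTY's online quantization sensitivity analytics and adaptive bit-width map generator.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of x to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def blockwise_bitwidth_map(tensor, block_size=64, candidate_bits=(4, 6, 8), budget_bits=6.0):
    """Assign one bit-width per block of `tensor` (illustrative sketch only).

    Sensitivity proxy (assumption): the quantization MSE of each block at the
    lowest candidate bit-width. More sensitive blocks are promoted to higher
    precision while the average bit-width stays within `budget_bits`.
    """
    flat = tensor.ravel()
    n_blocks = int(np.ceil(flat.size / block_size))
    blocks = [flat[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

    # Sensitivity proxy: error introduced by the most aggressive candidate.
    probe_bits = min(candidate_bits)
    sensitivity = np.array([np.mean((b - quantize(b, probe_bits)) ** 2) for b in blocks])

    # Greedy assignment: start every block at the lowest bit-width, then promote
    # the most sensitive blocks first while the average respects the budget.
    bit_map = np.full(n_blocks, min(candidate_bits), dtype=int)
    order = np.argsort(-sensitivity)  # most sensitive blocks first
    for idx in order:
        for bits in sorted(candidate_bits, reverse=True):
            extra = (bits - bit_map[idx]) / n_blocks
            if bit_map.mean() + extra <= budget_bits:
                bit_map[idx] = bits
                break
    return bit_map

# Example: build a per-block bit-width map for a random weight tensor.
w = np.random.randn(256, 256).astype(np.float32)
bmap = blockwise_bitwidth_map(w, block_size=128, budget_bits=6.0)
print("average bit-width:", bmap.mean())
```

In this toy version the sensitivity is re-measured directly from quantization error, whereas the paper emphasizes that its online analytics produce such information quickly enough to update the bit-width map during training without destabilizing convergence.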
