Paper Title
BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations
Paper Authors
Paper Abstract
Binary Neural Networks (BNNs) have been garnering interest thanks to their compute cost reduction and memory savings. However, BNNs suffer from performance degradation, mainly due to the gradient mismatch caused by binarizing activations. Previous works tried to address the gradient mismatch problem by reducing the discrepancy between the activation function used in the forward pass and its differentiable approximation used in the backward pass, which is an indirect measure. In this work, we use the gradient of the smoothed loss function to better estimate the gradient mismatch in quantized neural networks. Analysis using this gradient mismatch estimator indicates that using higher precision for activations is more effective than modifying the differentiable approximation of the activation function. Based on this observation, we propose a new training scheme for binary activation networks, called BinaryDuo, in which two binary activations are coupled into a ternary activation during training. Experimental results show that BinaryDuo outperforms state-of-the-art BNNs on various benchmarks with the same number of parameters and the same computational cost.
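To make the two ideas in the abstract concrete, here is a minimal PyTorch-style sketch (the framework choice is an assumption; the abstract names none). `BinaryActSTE` shows the forward/backward discrepancy that produces gradient mismatch; `smoothed_loss_grad` is one common Monte Carlo way to estimate the gradient of a Gaussian-smoothed loss, and the paper's exact estimator may be constructed differently; `coupled_ternary` illustrates how two shifted binary activations can be coupled into a ternary activation, with illustrative thresholds and output levels.

```python
import torch

class BinaryActSTE(torch.autograd.Function):
    """Binary activation: sign() in the forward pass, a clipped straight-through
    estimator in the backward pass. The discrepancy between the two passes is
    the 'gradient mismatch' the abstract refers to."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # sign(0) = 0 edge case ignored for brevity

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where |x| <= 1 (hard-tanh surrogate).
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)


def smoothed_loss_grad(loss_fn, x, sigma=0.1, n_samples=64):
    """Monte Carlo estimate of grad_x E_u[loss_fn(x + sigma * u)], u ~ N(0, I),
    via the score-function identity. Illustrative only; the paper's gradient
    mismatch estimator may differ in its exact construction."""
    grads = []
    for _ in range(n_samples):
        u = torch.randn_like(x)
        grads.append(loss_fn(x + sigma * u) * u / sigma)
    return torch.stack(grads).mean(dim=0)


def coupled_ternary(x, threshold=0.5):
    """Couple two shifted binary activations into one ternary activation:
    sign(x - t) + sign(x + t) takes values in {-2, 0, 2}. After training,
    the two terms can be decoupled back into two binary-activation branches.
    The threshold and output levels here are illustrative choices."""
    return BinaryActSTE.apply(x - threshold) + BinaryActSTE.apply(x + threshold)
```

Comparing `smoothed_loss_grad` against the STE backward gradient (e.g., by cosine distance) is one natural way to quantify the mismatch, and the coupled ternary forward pass gives the backward surrogate a less severe quantization to approximate, which matches the intuition stated in the abstract.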