Paper Title
Data-Free Network Quantization With Adversarial Knowledge Distillation
Paper Authors
Paper Abstract
Network quantization is an essential procedure in deep learning for developing efficient fixed-point inference models on mobile or edge platforms. However, as datasets grow larger and privacy regulations become stricter, sharing data for model compression becomes more difficult and restricted. In this paper, we consider data-free network quantization with synthetic data. The synthetic data are produced by a generator, while no original data are used either to train the generator or to perform quantization. To this end, we propose data-free adversarial knowledge distillation, which minimizes the maximum distance between the outputs of the teacher and the (quantized) student over adversarial samples drawn from a generator. To generate adversarial samples that resemble the original data, we additionally propose matching the batch normalization statistics of the generated data to those of the original data stored in the teacher. Furthermore, we show the gains from producing diverse adversarial samples by using multiple generators and multiple students. Our experiments show state-of-the-art data-free model compression and quantization results for (wide) residual networks and MobileNet on the SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. The accuracy losses compared to using the original datasets are minimal.
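The abstract describes a minimax objective: the student minimizes, and the generator maximizes, a distance D(T(x), S(x)) between teacher and student outputs over generated samples x = G(z), with an auxiliary loss matching batch normalization statistics. Below is a minimal, hypothetical PyTorch sketch of one training round under that reading. The function names (train_round, cache_bn_stats, bn_statistics_loss), the choice of KL divergence as the distance, and hyperparameters such as bn_weight and z_dim are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_distance(t_logits, s_logits):
    # KL divergence from the teacher's soft outputs to the student's;
    # one common choice for the teacher-student distance D(T(x), S(x)).
    return F.kl_div(F.log_softmax(s_logits, dim=1),
                    F.softmax(t_logits, dim=1), reduction="batchmean")

def cache_bn_stats(teacher):
    # Register hooks that cache the per-batch mean/variance observed at
    # each BatchNorm input, so they can be compared against the layer's
    # stored running statistics from the original training data.
    # Assumed to be called once on the teacher before training starts.
    def hook(module, inputs):
        x = inputs[0]
        module.batch_mean = x.mean(dim=(0, 2, 3))
        module.batch_var = x.var(dim=(0, 2, 3), unbiased=False)
    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.register_forward_pre_hook(hook)

def bn_statistics_loss(teacher):
    # Penalize mismatch between the batch statistics of generated data
    # and the running mean/variance stored in the teacher's BN layers.
    loss = 0.0
    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            loss = loss + F.mse_loss(m.batch_mean, m.running_mean) \
                        + F.mse_loss(m.batch_var, m.running_var)
    return loss

def train_round(teacher, student, generator, g_opt, s_opt,
                z_dim=256, batch_size=64, bn_weight=1.0, device="cpu"):
    teacher.eval()  # freeze the teacher; in practice one would also set
                    # its parameters' requires_grad to False

    # Generator step: MAXIMIZE the teacher-student distance (note the
    # minus sign), while keeping BN statistics close to the teacher's.
    z = torch.randn(batch_size, z_dim, device=device)
    x = generator(z)
    g_loss = (-distillation_distance(teacher(x), student(x))
              + bn_weight * bn_statistics_loss(teacher))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    # Student step: MINIMIZE the same distance on fresh samples.
    z = torch.randn(batch_size, z_dim, device=device)
    x = generator(z).detach()
    with torch.no_grad():
        t_logits = teacher(x)
    s_loss = distillation_distance(t_logits, student(x))
    s_opt.zero_grad()
    s_loss.backward()
    s_opt.step()
```

The abstract also reports gains from using multiple generators and multiple students; under this sketch, that would correspond to looping the round above over several (generator, student) pairs to produce more diverse adversarial samples.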