Paper Title

Discrimination-aware Network Pruning for Deep Model Compression

Authors

Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan

Abstract

We study network pruning, which aims to remove redundant channels/kernels and hence speed up the inference of deep networks. Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones. Both strategies suffer from some limitations: the former kind is computationally expensive and difficult to converge, while the latter kind optimizes the reconstruction error but ignores the discriminative power of channels. In this paper, we propose a simple yet effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power. Note that a channel often consists of a set of kernels. Besides the redundancy in channels, some kernels in a channel may also be redundant and fail to contribute to the discriminative power of the network, resulting in kernel-level redundancy. To address this, we propose a discrimination-aware kernel pruning (DKP) method to further compress deep networks by removing redundant kernels. To prevent DCP/DKP from selecting redundant channels/kernels, we propose a new adaptive stopping condition, which helps to automatically determine the number of selected channels/kernels and often results in more compact models with better performance. Extensive experiments on both image classification and face recognition demonstrate the effectiveness of our methods. For example, on ILSVRC-12, the resultant ResNet-50 model with a 30% reduction in channels even outperforms the baseline model by 0.36% in terms of Top-1 accuracy. The pruned MobileNetV1 and MobileNetV2 achieve 1.93x and 1.42x inference acceleration on a mobile device, respectively, with negligible performance degradation. The source code and the pre-trained models are available at https://github.com/SCUT-AILab/DCP.
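To make the two pruning criteria above concrete, here is a minimal, illustrative PyTorch sketch of the core idea: score channels (DCP) or individual kernels (DKP) by the gradient magnitude of a discrimination-aware classification loss, then keep channels greedily until a loss-based tolerance stands in for the paper's adaptive stopping condition. All names here (masked_loss, select_channels, kernel_scores, tol) and the single-layer setup are hypothetical simplifications for illustration, not the API of the released SCUT-AILab/DCP code.

```python
# Minimal sketch of discrimination-aware channel/kernel scoring.
# Assumptions (not from the paper's code): a single conv layer, an auxiliary
# classifier `head` supplying the discrimination-aware loss, and a tolerance
# `tol` standing in for the paper's adaptive stopping condition.
import torch
import torch.nn as nn
import torch.nn.functional as F

def masked_loss(conv, head, x, y, keep):
    """Discrimination-aware loss when only the input channels in `keep` are active."""
    w = torch.zeros_like(conv.weight)
    w[:, keep] = conv.weight[:, keep]  # zero out all pruned input channels
    logits = head(F.conv2d(x, w, conv.bias, conv.stride, conv.padding))
    return F.cross_entropy(logits, y)

def select_channels(conv, head, x, y, tol=1e-3):
    """DCP-style greedy channel selection with an adaptive stopping condition."""
    # 1) Score each input channel by the squared gradient norm of the loss.
    conv.zero_grad()
    F.cross_entropy(head(conv(x)), y).backward()
    scores = (conv.weight.grad ** 2).sum(dim=(0, 2, 3))  # one score per input channel
    order = torch.argsort(scores, descending=True).tolist()

    # 2) Add channels best-first; stop once the loss improvement falls below tol.
    keep, prev = [], None
    for c in order:
        keep.append(c)
        with torch.no_grad():
            cur = masked_loss(conv, head, x, y, keep).item()
        if prev is not None and prev - cur < tol:
            keep.pop()  # last channel improved the loss by < tol; drop it and stop
            break
        prev = cur
    return keep

def kernel_scores(conv, head, x, y):
    """DKP analogue: score every individual kernel (one kH x kW slice) rather than
    a whole channel, by summing squared gradients over spatial dims only."""
    conv.zero_grad()
    F.cross_entropy(head(conv(x)), y).backward()
    return (conv.weight.grad ** 2).sum(dim=(2, 3))  # shape: (out_channels, in_channels)

# Toy usage: 16 input channels, 32 output channels, 10 classes.
conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))
x, y = torch.randn(8, 16, 14, 14), torch.randint(0, 10, (8,))
print("kept channels:", select_channels(conv, head, x, y))
```

The full method operates stage-wise over a deep network and jointly optimizes channel selection with fine-tuning of the kept parameters; the sketch collapses that to a single layer and a single loss so the selection-and-stopping logic stays visible.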
