Paper Title

Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Paper Authors

Wimmer, Paul, Mehnert, Jens, Condurache, Alexandru Paul

Paper Abstract

Unstructured pruning is well suited to reduce the memory footprint of convolutional neural networks (CNNs), both at training and inference time. CNNs contain parameters arranged in $K \times K$ filters. Standard unstructured pruning (SP) reduces the memory footprint of CNNs by setting filter elements to zero, thereby specifying a fixed subspace that constrains the filter. Especially if pruning is applied before or during training, this induces a strong bias. To overcome this, we introduce interspace pruning (IP), a general tool to improve existing pruning methods. It uses filters represented in a dynamic interspace by linear combinations of an underlying adaptive filter basis (FB). For IP, FB coefficients are set to zero while un-pruned coefficients and FBs are trained jointly. In this work, we provide mathematical evidence for IP's superior performance and demonstrate that IP outperforms SP on all tested state-of-the-art unstructured pruning methods. Especially in challenging situations, like pruning for ImageNet or pruning to high sparsity, IP greatly exceeds SP with equal runtime and parameter costs. Finally, we show that advances of IP are due to improved trainability and superior generalization ability.
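To make the interspace representation concrete, here is a minimal PyTorch-style sketch (not the authors' implementation; the class `FilterBasisConv2d`, the `basis_size` of 9, and the simple magnitude-based pruning criterion are illustrative assumptions). Each $K \times K$ filter is formed as a linear combination of a shared, learnable filter basis, and pruning zeros FB coefficients rather than filter elements, while the basis and the surviving coefficients remain trainable.

```python
# Illustrative sketch (not the authors' code): a conv layer whose K x K filters are
# linear combinations of a small learnable filter basis (FB). Interspace pruning (IP)
# zeros entries of `coeff` (the FB coefficients) instead of filter elements, while
# `basis` and the un-pruned coefficients keep training jointly.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FilterBasisConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, basis_size=9):
        super().__init__()
        # Shared adaptive filter basis: basis_size atoms, each a k x k filter.
        self.basis = nn.Parameter(torch.randn(basis_size, k, k))
        # One coefficient vector per (out_ch, in_ch) filter; these are what IP prunes.
        self.coeff = nn.Parameter(torch.randn(out_ch, in_ch, basis_size))
        # Binary mask over coefficients; 0 marks a pruned coefficient.
        self.register_buffer("mask", torch.ones_like(self.coeff))
        self.k = k

    def forward(self, x):
        # Rebuild filters from masked coefficients: W[o, i] = sum_b c[o, i, b] * basis[b].
        masked = self.coeff * self.mask
        weight = torch.einsum("oib,bkl->oikl", masked, self.basis)
        return F.conv2d(x, weight, padding=self.k // 2)


layer = FilterBasisConv2d(in_ch=16, out_ch=32)
# Example pruning step: drop the 90% smallest-magnitude coefficients (magnitude pruning
# is only a placeholder criterion; the paper pairs IP with various pruning methods).
thresh = layer.coeff.abs().flatten().quantile(0.9)
layer.mask.copy_((layer.coeff.abs() >= thresh).float())
out = layer(torch.randn(1, 16, 8, 8))
```

Because the basis is shared and remains trainable, the subspace spanned by the un-pruned coefficients can adapt during training, which is the key difference from standard unstructured pruning, where zeroed filter elements fix the subspace once and for all.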
