Paper Title

Split to Be Slim: An Overlooked Redundancy in Vanilla Convolution

Paper Authors

Qiulin Zhang, Zhuqing Jiang, Qishuo Lu, Jia'nan Han, Zhengxin Zeng, Shang-Hua Gao, Aidong Men

Paper Abstract

Many effective solutions have been proposed to reduce the redundancy of models for inference acceleration. Nevertheless, common approaches mostly focus on eliminating less important filters or constructing efficient operations, while ignoring the pattern redundancy in feature maps. We reveal that many feature maps within a layer share similar but not identical patterns. However, it is difficult to identify if features with similar patterns are redundant or contain essential details. Therefore, instead of directly removing uncertain redundant features, we propose a \textbf{sp}lit based \textbf{conv}olutional operation, namely SPConv, to tolerate features with similar patterns but require less computation. Specifically, we split input feature maps into the representative part and the uncertain redundant part, where intrinsic information is extracted from the representative part through relatively heavy computation while tiny hidden details in the uncertain redundant part are processed with some light-weight operation. To recalibrate and fuse these two groups of processed features, we propose a parameters-free feature fusion module. Moreover, our SPConv is formulated to replace the vanilla convolution in a plug-and-play way. Without any bells and whistles, experimental results on benchmarks demonstrate SPConv-equipped networks consistently outperform state-of-the-art baselines in both accuracy and inference time on GPU, with FLOPs and parameters dropped sharply.
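The abstract outlines a three-step pipeline: split the input channels into a representative part and an uncertain redundant part, process the former with heavy computation and the latter with a lightweight operation, then fuse the two outputs with a parameter-free module. Below is a minimal NumPy sketch of that idea. It is an illustration under stated assumptions, not the authors' implementation: the heavy path is assumed to be a 3x3 convolution, the light path a 1x1 (pointwise) convolution, the split ratio `alpha` and the GAP-plus-softmax fusion weights are hypothetical choices, and the function names (`spconv_sketch`, `conv3x3`, `conv1x1`) are ours.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 3x3 convolution with padding 1. x: (C_in, H, W), w: (C_out, C_in, 3, 3)."""
    c_out = w.shape[0]
    c_in, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for co in range(c_out):
        for ci in range(c_in):
            for i in range(3):
                for j in range(3):
                    out[co] += w[co, ci, i, j] * xp[ci, i:i + h, j:j + wd]
    return out

def conv1x1(x, w):
    """Pointwise convolution as a channel-mixing matmul. x: (C_in, H, W), w: (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))

def spconv_sketch(x, w_heavy, w_light, alpha=0.5):
    """Hypothetical SPConv-style op: split channels, heavy/light paths, parameter-free fusion."""
    c = x.shape[0]
    k = int(alpha * c)
    rep, red = x[:k], x[k:]              # representative / uncertain-redundant split
    y_rep = conv3x3(rep, w_heavy)        # heavy path extracts intrinsic information
    y_red = conv1x1(red, w_light)        # light path keeps tiny hidden details cheaply
    # Parameter-free fusion (assumed form): global average pooling per channel,
    # softmax over the two branches, then a weighted sum.
    s = np.stack([y_rep.mean(axis=(1, 2)), y_red.mean(axis=(1, 2))])  # (2, C_out)
    w = np.exp(s) / np.exp(s).sum(axis=0, keepdims=True)
    return w[0][:, None, None] * y_rep + w[1][:, None, None] * y_red

# Usage: 8 input channels split 4/4, producing 6 output channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
y = spconv_sketch(x, rng.standard_normal((6, 4, 3, 3)), rng.standard_normal((6, 4)))
```

Because the fusion step has no learnable parameters and the output shape matches a vanilla convolution's, a block like this could replace a standard conv layer in a plug-and-play fashion, which is the property the abstract emphasizes.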
