Paper Title
Equivariance versus Augmentation for Spherical Images
Paper Authors
Paper Abstract
We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation. The chosen architectures can be considered baseline references for the respective design paradigms. Our models are trained and evaluated on single or multiple items from the MNIST or FashionMNIST dataset projected onto the sphere. For the task of image classification, which is inherently rotationally invariant, we find that by considerably increasing the amount of data augmentation and the size of the networks, it is possible for the standard CNNs to reach at least the same performance as the equivariant network. In contrast, for the inherently equivariant task of semantic segmentation, the non-equivariant networks are consistently outperformed by the equivariant networks with significantly fewer parameters. We also analyze and compare the inference latency and training times of the different networks, enabling detailed tradeoff considerations between equivariant architectures and data augmentation for practical problems. The equivariant spherical networks used in the experiments are available at https://github.com/JanEGerken/sem_seg_s2cnn .
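The augmentation baseline described above amounts to exposing a standard CNN to randomly rotated copies of each spherical image during training. The following is a minimal, hypothetical sketch of that idea, not code from the linked repository: the helper name random_rotation_augment is our own, and we assume the image is represented by sample points on the unit sphere that can be rotated before projection onto the pixel grid.

# Illustrative sketch only: sample a uniformly random SO(3) rotation and apply
# it to (N, 3) unit vectors on the sphere before projecting an image.
import numpy as np
from scipy.spatial.transform import Rotation

def random_rotation_augment(sphere_points, rng=None):
    """Rotate an (N, 3) array of unit vectors by a random SO(3) element."""
    R = Rotation.random(random_state=rng).as_matrix()  # 3x3 rotation matrix
    return sphere_points @ R.T

# Example: send the north pole to a random point on the sphere.
north_pole = np.array([[0.0, 0.0, 1.0]])
print(random_rotation_augment(north_pole, rng=np.random.default_rng(0)))

In this picture, an equivariant S2CNN would not need such augmentation, since rotated inputs produce correspondingly transformed features by construction.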