FourierNet: Compact mask representation for instance segmentation using differentiable shape decoders

Paper title

FourierNet: Compact mask representation for instance segmentation using differentiable shape decoders

Paper authors

Hamd ul Moqeet Riaz, Nuri Benbarka, Andreas Zell

Abstract

We present FourierNet, a single-shot, anchor-free, fully convolutional instance segmentation method that predicts a shape vector. This shape vector is then converted into the mask's contour points using a fast numerical transform. Compared to previous methods, we introduce a new training technique in which a differentiable shape decoder manages the automatic weight balancing of the shape vector's coefficients. We use the Fourier series as a shape encoder because of its coefficient interpretability and fast implementation. FourierNet shows promising results compared to polygon representation methods, achieving 30.6 mAP on the MS COCO 2017 benchmark. At lower image resolutions, it runs at 26.6 FPS with 24.3 mAP. It reaches 23.3 mAP using just eight parameters to represent the mask (note that at least four parameters are needed for bounding box prediction alone). Qualitative analysis shows that suppressing a reasonable proportion of the higher frequencies of the Fourier series still generates meaningful masks. These results validate our understanding that lower-frequency components carry more information for the segmentation task and that a compressed representation is therefore achievable. Code is available at: github.com/cogsys-tuebingen/FourierNet.
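
To make the idea of decoding a shape vector into contour points more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a polar contour (one radius per equally spaced angle around an object center) and a shape vector that holds the lowest-frequency complex Fourier coefficients of that radial signal. The function names `encode_radii` and `decode_contour`, the coefficient layout, and the default of 36 contour points are all hypothetical choices made for illustration.

```python
import numpy as np


def encode_radii(radii, num_coeffs):
    """Keep only the num_coeffs lowest-frequency Fourier coefficients.

    Hypothetical encoder: `radii` samples r(theta) at equally spaced angles.
    """
    return np.fft.rfft(np.asarray(radii, dtype=float))[:num_coeffs]


def decode_contour(coeffs, center, num_points=36):
    """Turn low-frequency coefficients back into (x, y) contour points.

    Missing high frequencies are treated as zero, which is the
    "suppression" of higher frequencies mentioned in the abstract.
    """
    coeffs = np.asarray(coeffs, dtype=complex)
    # A real signal of length n has n // 2 + 1 one-sided coefficients.
    spectrum = np.zeros(num_points // 2 + 1, dtype=complex)
    k = min(len(coeffs), len(spectrum))
    spectrum[:k] = coeffs[:k]
    # Inverse real FFT: the fast numerical transform in this sketch.
    radii = np.fft.irfft(spectrum, n=num_points)
    angles = 2.0 * np.pi * np.arange(num_points) / num_points
    cx, cy = center
    xs = cx + radii * np.cos(angles)
    ys = cy + radii * np.sin(angles)
    return np.stack([xs, ys], axis=1)


# Usage: a smooth three-lobed shape is recovered from four coefficients.
theta = 2.0 * np.pi * np.arange(36) / 36
example_radii = 10.0 + 2.0 * np.cos(3.0 * theta)
shape_vector = encode_radii(example_radii, num_coeffs=4)
contour = decode_contour(shape_vector, center=(50.0, 40.0))
```

Because the inverse transform is a linear map, it is differentiable, so under these assumptions a loss computed on the decoded contour points can propagate gradients back to the coefficients.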
