Greenhouse Segmentation on High-Resolution Optical Satellite Imagery using Deep Learning Techniques

Paper Title

Greenhouse Segmentation on High-Resolution Optical Satellite Imagery using Deep Learning Techniques

Authors

Orkhan Baghirli, Imran Ibrahimli, Tarlan Mammadzada

Abstract

Greenhouse segmentation is pivotal for climate-smart agricultural land-use planning. Deep learning-based approaches provide state-of-the-art performance in natural image segmentation. However, semantic segmentation of high-resolution optical satellite imagery is challenging because of the complex environment. In this paper, a sound methodology is proposed for pixel-wise classification of images acquired by the Azersky (SPOT-7) optical satellite. In particular, customized variations of U-Net-like architectures are employed to identify greenhouses. Two models are proposed which uniquely incorporate dilated convolutions and skip connections, and their results are compared to those of the baseline U-Net model. The dataset consists of pan-sharpened, orthorectified Azersky images (red, green, blue, and near-infrared channels) with 1.5-meter resolution and annotation masks, collected from 15 regions in Azerbaijan where greenhouses are densely concentrated. The images cover a cumulative area of 1008 $km^2$, and the annotation masks contain 47559 polygons in total. The $F_1$, Kappa, AUC, and IoU scores are used for performance evaluation. It is observed that using deconvolutional layers alone throughout the expansive path does not yield satisfactory results; therefore, they are either replaced by or coupled with bilinear interpolation. All models benefit from the hard example mining (HEM) strategy. It is also reported that the best accuracy of $93.29\%$ ($F_1$ score) is recorded when the weighted binary cross-entropy loss is coupled with the dice loss. Experimental results show that both proposed models outperform the baseline U-Net architecture, with the best proposed model scoring $4.48\%$ higher than the baseline.
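The loss combination and two of the evaluation scores named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state how the two loss terms are weighted or which framework was used, so `pos_weight`, `dice_weight`, and `smooth` are hypothetical parameters with placeholder defaults, and all function names are illustrative.

```python
import numpy as np


def weighted_bce(probs, target, pos_weight=2.0, eps=1e-7):
    """Binary cross-entropy with the positive (greenhouse) class up-weighted.

    pos_weight is a hypothetical value; the abstract gives no class weights.
    """
    p = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    loss = -(pos_weight * target * np.log(p)
             + (1.0 - target) * np.log(1.0 - p))
    return float(loss.mean())


def dice_loss(probs, target, smooth=1.0):
    """Soft dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    intersection = float((probs * target).sum())
    denom = float(probs.sum() + target.sum())
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def combined_loss(probs, target, pos_weight=2.0, dice_weight=1.0):
    """Weighted BCE plus dice loss; a plain sum is assumed here."""
    return (weighted_bce(probs, target, pos_weight)
            + dice_weight * dice_loss(probs, target))


def f1_and_iou(pred_mask, target):
    """F1 and IoU for binary masks, computed from pixel-level counts."""
    pred = pred_mask.astype(bool)
    tgt = target.astype(bool)
    tp = int(np.sum(pred & tgt))    # greenhouse pixels found
    fp = int(np.sum(pred & ~tgt))   # background labelled as greenhouse
    fn = int(np.sum(~pred & tgt))   # greenhouse pixels missed
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return f1, iou
```

Here `probs` holds per-pixel greenhouse probabilities in [0, 1] and `target` is the binary annotation mask of the same shape. The dice term rewards overlap on the sparse positive class, which is one common motivation for pairing it with cross-entropy on imbalanced masks.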
