Paper Title

Greenhouse Segmentation on High-Resolution Optical Satellite Imagery using Deep Learning Techniques

Authors

Orkhan Baghirli, Imran Ibrahimli, Tarlan Mammadzada

Abstract

Greenhouse segmentation is pivotal for climate-smart agricultural land-use planning. Deep learning-based approaches provide state-of-the-art performance in natural image segmentation. However, semantic segmentation on high-resolution optical satellite imagery is a challenging task because of the complex environment. In this paper, a sound methodology is proposed for pixel-wise classification on images acquired by the Azersky (SPOT-7) optical satellite. In particular, customized variations of U-Net-like architectures are employed to identify greenhouses. Two models are proposed which uniquely incorporate dilated convolutions and skip connections, and their results are compared with those of the baseline U-Net model. The dataset consists of pan-sharpened, orthorectified Azersky images (red, green, blue, and near-infrared channels) with 1.5-meter resolution and annotation masks, collected from 15 regions in Azerbaijan where greenhouses are densely concentrated. The images cover a cumulative area of 1008 km$^2$, and the annotation masks contain 47559 polygons in total. The $F_1$, Kappa, AUC, and IoU scores are used for performance evaluation. It is observed that using deconvolutional layers alone throughout the expansive path does not yield satisfactory results; therefore, they are either replaced by or coupled with bilinear interpolation. All models benefit from the hard example mining (HEM) strategy. It is also reported that the best accuracy of $93.29\%$ ($F_1$ score) is recorded when the weighted binary cross-entropy loss is coupled with the dice loss. Experimental results show that both proposed models outperform the baseline U-Net architecture, with the best proposed model scoring $4.48\%$ higher than the baseline.
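As an illustration of the loss mentioned in the abstract, the following is a minimal sketch of a weighted binary cross-entropy term coupled with a dice term for a binary greenhouse mask. The abstract does not state how the two terms are weighted or combined, so the function names, the positive-class weight `pos_weight`, the mixing factor `alpha`, and the smoothing constant `smooth` are all assumptions made for this example, not the authors' settings. Masks are given as flat lists of per-pixel values.

```python
import math

def weighted_bce(y_true, y_prob, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy with greenhouse (positive) pixels up-weighted.

    pos_weight is an assumed value, not one reported in the abstract.
    """
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= pos_weight * t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(y_true)

def dice_loss(y_true, y_prob, smooth=1.0):
    """Soft dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    overlap = sum(t * p for t, p in zip(y_true, y_prob))
    dice = (2.0 * overlap + smooth) / (sum(y_true) + sum(y_prob) + smooth)
    return 1.0 - dice

def combined_loss(y_true, y_prob, pos_weight=2.0, alpha=0.5):
    """Convex combination of the two terms; alpha = 0.5 is an assumption."""
    bce = weighted_bce(y_true, y_prob, pos_weight)
    return alpha * bce + (1.0 - alpha) * dice_loss(y_true, y_prob)

# Flattened 2x2 mask: 1.0 = greenhouse pixel, 0.0 = background.
mask = [1.0, 0.0, 0.0, 1.0]
good_prediction = [0.9, 0.1, 0.2, 0.8]
poor_prediction = [0.2, 0.8, 0.9, 0.1]
print(combined_loss(mask, good_prediction))
print(combined_loss(mask, poor_prediction))
```

The cross-entropy term penalizes each pixel independently, while the dice term scores the overlap of the whole mask, which is one common motivation for pairing them when the positive class covers a small share of the image.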
