Journal: GIScience & Remote Sensing

Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data

Abstract

In recent years, the data science and remote sensing communities have started to align, driven by user-friendly programming tools, access to high-end consumer computing power, and the availability of free satellite data. In particular, publicly available data from the European Space Agency's Sentinel missions have been used in various remote sensing applications. However, few studies have used these data to assess the performance of machine learning algorithms in complex boreal landscapes. In this article, I compare the classification performance of four non-parametric algorithms: support vector machines (SVM), random forests (RF), extreme gradient boosting (XGBoost), and deep learning (DL). The study area is a complex mixed-use landscape in south-central Sweden with eight land-cover and land-use (LCLU) classes. The satellite imagery used for the classification consisted of multi-temporal Sentinel-2 scenes covering spring, summer, autumn, and winter conditions. Using stratified random sampling, each LCLU class was allocated 1477 samples, which were divided into training (70%) and evaluation (30%) subsets. Accuracy was assessed through metrics derived from an error matrix, with overall accuracy serving as the primary metric for ranking the algorithms. A two-proportion Z-test was used to compare the proportions of correctly classified pixels between algorithms, and McNemar's chi-square test was used to compare class-wise predictions. The results show that the highest overall accuracy was produced by support vector machines (0.758 +/- 0.017), closely followed by extreme gradient boosting (0.751 +/- 0.017), random forests (0.739 +/- 0.018), and finally deep learning (0.733 +/- 0.0023). The Z-test comparison of classifiers showed that a third of the algorithm pairings were significantly different. On a class-wise basis, McNemar's test showed that 62% of class-wise predictions differed significantly from one another at the 5% level or lower. Variable importance metrics show that nearly half of the top twenty Sentinel-2 bands belonged to the red edge (25%) and shortwave infrared (23%) portions of the electromagnetic spectrum, and that the top bands were dominated by scenes from spring (38%) and summer (40%). The results are discussed within the scope of recent studies involving machine learning and Sentinel-2 data, and key knowledge gaps are identified. The article concludes with recommendations for future research.
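The two significance tests named in the abstract can be illustrated with a minimal Python sketch. This is not the author's code. The function names are hypothetical, the evaluation-set size of 3544 is an assumption (8 classes x 1477 samples x 30%, rounded), the correct-pixel counts are back-calculated from the reported overall accuracies, and the discordant counts passed to McNemar's test are invented purely for illustration. The continuity correction in McNemar's test is also an assumption, since the abstract does not say whether one was applied.

```python
import math


def two_proportion_z(correct_a, correct_b, n_a, n_b):
    """Two-sided two-proportion Z-test using a pooled standard error."""
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def mcnemar_chi2(b, c, continuity=True):
    """McNemar's chi-square test (1 degree of freedom).

    b: samples classifier A got right and classifier B got wrong.
    c: samples classifier A got wrong and classifier B got right.
    """
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    n_eval = 3544  # ASSUMPTION: approximate evaluation-set size
    svm_correct = round(0.758 * n_eval)  # back-calculated from the abstract
    dl_correct = round(0.733 * n_eval)   # back-calculated from the abstract
    z, p = two_proportion_z(svm_correct, dl_correct, n_eval, n_eval)
    print(f"SVM vs DL: z = {z:.3f}, p = {p:.4f}")

    # HYPOTHETICAL discordant counts for one class, not from the article.
    stat, p = mcnemar_chi2(30, 10)
    print(f"McNemar: chi2 = {stat:.3f}, p = {p:.4f}")
```

The sketch uses only the standard library so that the arithmetic behind each statistic stays visible. For real analyses, an established statistics package would normally be preferable.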