
Application of machine learning for soil survey updates: A case study in southeastern Ohio.


Abstract

Future activities of the National Cooperative Soil Survey include SSURGO (Soil Survey Geographic Database) updates on a Major Land Resource Area (MLRA) basis. To further soil survey updating, machine learning techniques were used to build predictive soil-landscape models for two counties (Monroe and Noble) within the Central Allegheny Plateau (MLRA 126) in southeastern Ohio. Characterized by similar soil-forming factors but varied survey intensity and vintage, these two counties provided an ideal setting for building and testing the models. Twenty-five environmental correlates, including 10 m resolution raster coverages of terrain and its derivatives, climate, geology, and historic vegetation, were used as predictor variables for soil class.

Randomly sampled points, proportionate to the area of the different soil classes in the published soil survey of Monroe County (SSURGO), were used to train the soil-landscape model. Because a map unit can contain more than one component soil series, each sample point within a map unit can belong to any one of them, so the labeling of training instances with the appropriate soil series is ambiguous. A kNN-based heuristic approach was used to disambiguate the training set labels. The training sets were further preprocessed to remove outliers (to reduce noise) and to select fewer attributes (to avoid redundancy and irrelevancy) in the data. Modeling was performed using two learning algorithms, namely the J48 classification tree and Random Forest (RF). The map models were then evaluated for quality of prediction using two prediction rate measures (one based on the dominant soil series and the other based on component series probabilities) and two landscape fragmentation statistics (contiguity index and aggregation index).

Generally, Random Forest recorded a higher prediction rate and greater contiguity than J48. However, Random Forest over-predicted soils that occupy large areas, such as the Gilpin, Guernsey, Zanesville and Captina series, at the cost of prediction accuracy for soils that occurred in smaller proportions. For RF, the highest prediction rate based on the dominant soil series (>0.5) and higher values of the contiguity index (0.83) and aggregation index (84.2) were observed in the model built using the training set preprocessed for disambiguation. This suggests that disambiguating the training set labels improved the quality of the predicted maps. The model predictions were helpful in locating many individual component series in soil consociations and associations. The maps were useful in identifying areas of uncertainty, such as misplaced polygon boundaries, incorrect labeling and disparity along the county edges, which could serve as a guide for further field investigations. The predicted models also provided valuable information for rationalizing the mapping intensity for adjacent SSURGO maps.
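
The abstract does not describe the kNN-based disambiguation heuristic in detail, so the following is only a minimal sketch of one plausible reading, not the dissertation's method. It assumes that each training point carries the set of component series allowed by its map unit, and that an ambiguous point takes the allowed series receiving the most votes from its k nearest neighbours in covariate space. The function name `disambiguate_labels`, the voting rule, and the toy covariate values are all hypothetical; only the series names come from the abstract.

```python
import math
from collections import defaultdict


def disambiguate_labels(features, candidates, k=3):
    """Pick one soil series per point from its map unit's candidate set.

    features:   list of equal-length numeric tuples (environmental covariates)
    candidates: list of sets of series names allowed at each point
    Hypothetical heuristic: neighbours vote only for series the point allows.
    """
    resolved = []
    for i, (x, allowed) in enumerate(zip(features, candidates)):
        if len(allowed) == 1:
            # Single-component map unit: the label is already unambiguous.
            resolved.append(next(iter(allowed)))
            continue
        # k nearest other points by Euclidean distance in covariate space.
        neighbours = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: math.dist(x, features[j]),
        )[:k]
        votes = defaultdict(float)
        for j in neighbours:
            # Each neighbour spreads one vote evenly over its own candidates.
            for series in candidates[j]:
                if series in allowed:
                    votes[series] += 1.0 / len(candidates[j])
        # Ties and zero-vote cases fall back to alphabetical order.
        resolved.append(max(sorted(allowed), key=lambda s: votes[s]))
    return resolved


# Toy covariates (made-up values): two well-separated clusters of points.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
candidates = [{"Gilpin"}, {"Gilpin"}, {"Gilpin", "Guernsey"},
              {"Guernsey"}, {"Guernsey"}, {"Gilpin", "Guernsey"}]

resolved = disambiguate_labels(features, candidates, k=2)
print(resolved)
```

In this toy case each ambiguous point adopts the series of the unambiguous points nearest to it. A real workflow would likely standardize the covariates before computing distances, since terrain, climate and geology layers are on very different scales.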

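Bibliographic record

  • Author affiliation: The Ohio State University
  • Degree-granting institution: The Ohio State University
  • Subject: Statistics; Agriculture, Soil Science
  • Degree: Ph.D.
  • Year: 2008
  • Pagination: 135 p.
  • Total pages: 135
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Statistics; Soil Science
  • Keywords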
