...
Journal: The Science of the Total Environment

Predicting uncertainty of machine learning models for modelling nitrate pollution of groundwater using quantile regression and UNEEC methods



Abstract

Although estimating the uncertainty of models used to simulate nitrate contamination of groundwater is essential in groundwater management, it has generally been ignored. This gap motivates the present research, which explores the predictive uncertainty of machine-learning (ML) models in this field using two residual-based uncertainty methods: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Prediction-interval coverage probability (PICP), the most important of the statistical measures of uncertainty, was used to evaluate uncertainty. Three state-of-the-art ML models, support vector machine (SVM), random forest (RF), and k-nearest neighbor (kNN), were selected to spatially model groundwater nitrate concentrations. The models were calibrated with nitrate concentrations from 80 wells (70% of the data) and then validated with nitrate concentrations from 34 wells (30% of the data). Both uncertainty and predictive performance criteria should be considered when comparing and selecting the best model. The results highlight that the kNN model is the best model: not only did it have the lowest uncertainty based on the PICP statistic in both the QR (0.94) and the UNEEC (in all clusters, 0.85-0.91) methods, but its predictive performance statistics (RMSE = 10.63, R² = 0.71) were also similar to those of RF (RMSE = 10.41, R² = 0.72) and better than those of SVM (RMSE = 13.28, R² = 0.58). Determining the uncertainty of ML models used for spatially modelling groundwater-nitrate pollution enables managers to make better risk-based decisions and consequently increases the reliability and credibility of groundwater-nitrate predictions. © 2019 Elsevier B.V. All rights reserved.
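As an illustration of the PICP statistic mentioned in the abstract, the following is a minimal sketch assuming the usual definition: the fraction of observed values that fall inside their prediction intervals. The function name `picp` and all numeric values are hypothetical and chosen only for illustration; they are not data or code from the study, and the interval bounds would in practice come from a method such as QR or UNEEC.

```python
import numpy as np


def picp(y_obs, lower, upper):
    """Fraction of observations lying within [lower, upper] (bounds inclusive)."""
    y_obs, lower, upper = (
        np.asarray(a, dtype=float) for a in (y_obs, lower, upper)
    )
    inside = (y_obs >= lower) & (y_obs <= upper)
    return float(inside.mean())


# Made-up nitrate concentrations and interval bounds (illustrative only).
observed = [12.0, 30.5, 8.2, 45.0, 20.1]
lower = [10.0, 25.0, 9.0, 30.0, 15.0]
upper = [15.0, 35.0, 14.0, 50.0, 25.0]

coverage = picp(observed, lower, upper)
print(f"PICP = {coverage:.2f}")
```

A PICP value is normally read against the nominal confidence level of the intervals, so a value near that level indicates intervals that are neither too narrow nor too wide.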
