Imputation of missing sub-hourly precipitation data in a large sensor network: A machine learning approach

Chivers Benedict D.; Wallbank John; Cole Steven J.; Sebek Ondrej; Stanley Simon; Fry Matthew; Leontidis Georgios

Abstract

Precipitation data collected at sub-hourly resolution presents specific challenges for missing-data recovery because it is largely stochastic in nature and highly unbalanced in the duration of rain versus non-rain. Here we present a two-step analysis utilising current machine learning techniques for imputing precipitation data sampled at 30-minute intervals by dividing the task into (a) the classification of rain or non-rain samples, and (b) the regression of the absolute values of predicted rain samples. In an investigation of 37 weather stations in the UK, this machine learning process produces more accurate predictions for recovering precipitation data than an established surface fitting technique utilising neighbouring rain gauges. Increasing the features available for training the machine learning algorithms increases performance, with the integration of weather data at the target site with externally sourced rain gauges providing the highest performance. This method informs machine learning models by utilising information in concurrently collected environmental data to make accurate predictions of missing rain data. Capturing complex non-linear relationships from weakly correlated variables is critical for data recovery at sub-hourly resolutions. Such pipelines for data recovery can be developed and deployed for highly automated and near-instantaneous imputation of missing values in ongoing datasets at high temporal resolutions.
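
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keywords of the record point to gradient boosted trees, whereas here a one-feature threshold classifier and an ordinary least-squares line are swapped in so that the example runs with the standard library alone. The function names, the single neighbouring-gauge feature, and the sample values are all hypothetical.

```python
from statistics import mean


def fit_threshold_classifier(xs, is_rain):
    """Step (a): pick the neighbour-gauge threshold that best separates
    rain from non-rain samples (a stand-in for a real classifier)."""
    if not xs:
        return float("inf")  # no training data: never predict rain
    best_t, best_acc = None, -1.0
    for t in sorted(set(xs)):
        acc = mean(1.0 if (x >= t) == r else 0.0 for x, r in zip(xs, is_rain))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def fit_line(xs, ys):
    """Step (b): ordinary least squares y = a + b * x (a stand-in for a
    real regressor). Returns the pair (a, b)."""
    if not xs:
        return 0.0, 0.0
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def impute(neighbour, target):
    """Fill the None entries of `target` using the two-step scheme."""
    known = [(x, y) for x, y in zip(neighbour, target) if y is not None]
    threshold = fit_threshold_classifier(
        [x for x, _ in known], [y > 0 for _, y in known]
    )
    # The regressor is trained on rain samples only.
    wet = [(x, y) for x, y in known if y > 0]
    a, b = fit_line([x for x, _ in wet], [y for _, y in wet])

    filled = []
    for x, y in zip(neighbour, target):
        if y is not None:
            filled.append(y)                     # observed value, keep it
        elif x >= threshold:
            filled.append(max(0.0, a + b * x))   # classified as rain: regress amount
        else:
            filled.append(0.0)                   # classified as non-rain
    return filled


# Hypothetical 30-minute totals in mm; None marks a missing target reading.
neighbour = [0.0, 0.0, 0.4, 1.0, 0.0, 2.0, 0.0, 0.6, 1.5, 0.0]
target = [0.0, 0.0, 0.5, 1.1, 0.0, None, 0.0, None, 1.6, None]
filled = impute(neighbour, target)
```

In this sketch the regressor never sees the zero readings, so the many dry intervals cannot pull the predicted amounts towards zero. That is one plausible reading of why the abstract separates classification from regression, given the stated imbalance between rain and non-rain durations.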

Bibliographic details

Source
Journal of Hydrology | 2020, Issue 1 | 12 pages
Authors
Chivers Benedict D.; Wallbank John; Cole Steven J.; Sebek Ondrej; Stanley Simon; Fry Matthew; Leontidis Georgios
Author affiliations

Univ Lincoln Sch Comp Sci Lincoln LN6 7TS England;

UK Ctr Ecol & Hydrol Water Resources Wallingford OX10 8BB Oxon England;

UK Ctr Ecol & Hydrol Water Resources Wallingford OX10 8BB Oxon England;

UK Ctr Ecol & Hydrol Water Resources Wallingford OX10 8BB Oxon England;

UK Ctr Ecol & Hydrol Water Resources Wallingford OX10 8BB Oxon England;

UK Ctr Ecol & Hydrol Water Resources Wallingford OX10 8BB Oxon England;

Univ Lincoln Sch Comp Sci Lincoln LN6 7TS England;

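Indexing information
Original format: PDF
Text language: English (eng)
Chinese Library Classification: Hydrological science (physics of the hydrosphere)
Keywords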
Machine learning; Data imputation; Gradient boosted trees; Environmental sensor networks; Precipitation; Soil moisture;
