Comparison of regression and ARIMA models with neural network models to forecast the daily streamflow of White Clay Creek.

Abstract

Linear forecasting models have played major roles in many applications for over a century. If the error terms in a model are normally distributed, linear models are capable of producing the most accurate forecasting results. The central limit theorem (CLT) provides theoretical support for applying linear models.

During the last two decades, nonlinear models such as neural network models have gradually emerged as alternatives for modeling and forecasting real processes. In hydrology, neural networks have been applied to rainfall-runoff estimation as well as stream and peak flow forecasting. Successful nonlinear methods rely on the generalized central limit theorem (GCLT), which provides theoretical justification for applying nonlinear methods to real processes in impulsive environments.

This dissertation attempts to predict the daily streamflow of White Clay Creek through intensive comparisons of linear and nonlinear forecasting methods. Data are modeled and forecasted by seven linear and nonlinear methods: the random walk with drift method; the ordinary least squares (OLS) regression method; the time series Autoregressive Integrated Moving Average (ARIMA) method; the feed-forward neural network (FNN) method; the recurrent neural network (RNN) method; the hybrid OLS regression and feed-forward neural network (OLS-FNN) method; and the hybrid ARIMA and recurrent neural network (ARIMA-RNN) method. The first three are linear methods and the remaining four are nonlinear methods. The OLS-FNN method and the ARIMA-RNN method are two new nonlinear methods proposed in this dissertation. These two hybrid methods have three special features that distinguish them from the hybrid methods available in the literature: (1) using the OLS or ARIMA residuals as the targets of the subsequent neural networks; (2) training two neural networks in parallel for each hybrid method with two objective functions (the minimum mean square error function and the minimum mean absolute error function); and (3) using the two trained neural networks to obtain their respective forecasts and then combining those forecasts by a Bayesian Model Averaging technique. Final forecasts from the hybrid methods have linear components resulting from the regression method or the ARIMA method and nonlinear components resulting from feed-forward or recurrent neural networks.

Forecasting performance is evaluated by both root mean square error (RMSE) and mean absolute error (MAE). Forecasting results indicate that linear methods provide the lowest RMSE forecasts when data are normally distributed and data lengths are long enough, while nonlinear methods provide more consistent RMSE and MAE forecasts when data are non-normally distributed. Nonlinear neural network methods also provide lower RMSE and MAE forecasts than linear methods for data that are normally distributed but come from small samples. The hybrid methods provide the most consistent RMSE and MAE forecasts for data that are non-normally distributed.

The original flow series is differenced and log-differenced to obtain two series: the difference series and the log difference series. These two series are then decomposed, based on stochastic process decomposition theorems, to produce two, three and four variables that are used as input variables in the regression models and neural network models.

Working on an increment series (either the difference series or the log difference series) instead of the original flow series has two benefits. First, we have a clear time series model. Second, the original flow series is autocorrelated, whereas an increment series is approximately independently distributed. For an independently distributed series, parameters such as the mean and standard deviation can be calculated easily.

In practice, the length of the data used for modeling is very important. Model parameters and forecasts are estimated from 30 data samples (1 month), 90 data samples (3 months), 180 data samples (6 months), and 360 data samples (1 year).
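The increment series, the random walk with drift baseline, and the two error measures named in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the dissertation's code: the function names are hypothetical, and estimating the drift as the mean historical increment is a common choice assumed here, since the abstract does not say how the drift is estimated.

```python
import math


def difference(flow):
    """Difference series: x[t] - x[t-1]."""
    return [b - a for a, b in zip(flow, flow[1:])]


def log_difference(flow):
    """Log difference series: ln(x[t]) - ln(x[t-1]); flows must be positive."""
    return [math.log(b) - math.log(a) for a, b in zip(flow, flow[1:])]


def random_walk_with_drift_forecast(history):
    """One-step-ahead forecast: last observation plus the mean increment.

    Assumption: the drift is estimated as the average of past increments.
    """
    increments = difference(history)
    drift = sum(increments) / len(increments)
    return history[-1] + drift


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)
```

With a window of daily flows (for example the last 30, 90, 180 or 360 observations), `random_walk_with_drift_forecast(window)` gives the baseline forecast for the next day, and `rmse` and `mae` score any method's forecasts against the observed flows.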

Bibliographic record

  • Author

    Liu, Greg Qi.

  • Author affiliation

    University of Delaware.

  • Degree-granting institution: University of Delaware.
  • Subjects: Applied Mathematics; Water Resource Management; Operations Research.
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 545 p.
  • Total pages: 545
  • Original format: PDF
  • Language: English