Streamflow and soil moisture forecasting with hybrid data intelligent machine learning approaches: case studies in the Australian Murray-Darling basin

Prasad, Ramendra (2018) Streamflow and soil moisture forecasting with hybrid data intelligent machine learning approaches: case studies in the Australian Murray-Darling basin. [Thesis (PhD/Research)]


For a drought-prone agricultural nation such as Australia, hydro-meteorological imbalances and increasing demand for water resources are severely constraining terrestrial water reservoirs and regional-scale agricultural productivity. Two important components of the terrestrial water reservoir, i.e., streamflow water level (SWL) and soil moisture (SM), are imperative for both agricultural and hydrological applications. Forecasted SWL and SM can enable prudent and sustainable decision-making for agriculture and water resources management. To feasibly emulate SWL and SM, data-intelligent machine learning models are a promising tool in today’s rapidly advancing data science era. Yet the naturally chaotic characteristics of hydro-meteorological variables, which can exhibit non-linear and non-stationary behavior within the model dataset, are a key challenge for non-tuned machine learning models. Another important issue that could confound model accuracy or applicability is the selection of relevant features to emulate SWL and SM: too few inputs can provide insufficient information to construct an accurate model, while an excessive number of redundant inputs could obscure the performance of the simulation algorithm.

This research thesis focusses on the development of hybridized data-intelligent models for forecasting SWL and SM in the upper layer (surface to 0.2 m) and the lower layer (0.2–1.5 m depth) within the agricultural region of the Murray-Darling Basin, Australia. The SWL quantifies the availability of surface water resources, while the upper layer SM (or the surface SM) is important for surface runoff, evaporation, and energy exchange at the Earth-atmosphere interface. The lower layer (or the root zone) SM is essential for groundwater recharge, plant uptake and transpiration. This research study is built upon four primary objectives designed for the forecasting of SWL and SM, with subsequent robust evaluations by means of statistical metrics in tandem with diagnostic plots of the observed and modeled datasets.
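The abstract does not name the statistical metrics used for evaluation. As an illustration only, the sketch below computes three scores that are common in hydrological forecasting: root mean square error, mean absolute error and the Nash-Sutcliffe efficiency. The choice of metrics, the function names and the sample values are assumptions, not taken from the thesis.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, simulated):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical values, for illustration only (not data from the thesis).
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]

scores = {
    "rmse": rmse(observed, simulated),
    "mae": mae(observed, simulated),
    "nse": nash_sutcliffe(observed, simulated),
}
print(scores)
```

A score such as the Nash-Sutcliffe efficiency is often read alongside scatter or time-series plots of observed against modeled values, since a single number can hide systematic bias at high or low flows.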

The first objective establishes the importance of feature selection (or optimization) in the forecasting of monthly SWL at three study sites within the Murray-Darling Basin. An artificial neural network (ANN) model optimized with the iterative input selection (IIS) algorithm, named IIS-ANN, is developed, whereby the IIS algorithm achieves feature optimization. The IIS-ANN model outperforms the standalone models, and a further hybridization is performed by integrating a non-decimated and advanced maximum overlap discrete wavelet transformation (MODWT) technique. The IIS-selected inputs are transformed into wavelet subseries via MODWT to unveil the embedded features, leading to the IIS-W-ANN model. The IIS-W-ANN outperforms the comparative IIS-W-M5 Model Tree, IIS-based and standalone models.
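To make the idea of non-decimated wavelet subseries concrete, the following is a minimal sketch of a MODWT-style decomposition using the Haar filter with circular boundary handling. The abstract does not state which wavelet filter or decomposition level was used, so the Haar filter, the function name `haar_modwt` and the sample series are assumptions chosen for simplicity.

```python
def haar_modwt(series, levels):
    """Haar maximal-overlap wavelet coefficients with circular boundaries.

    Returns (details, smooth): one list of wavelet coefficients per level and
    the final list of scaling coefficients. Every subseries keeps the length
    of the input, which is what "non-decimated" means.
    """
    n = len(series)
    smooth = list(series)
    details = []
    for level in range(1, levels + 1):
        shift = 2 ** (level - 1)  # the Haar filter is stretched at each level
        previous = smooth
        smooth = [(previous[t] + previous[(t - shift) % n]) / 2.0 for t in range(n)]
        details.append(
            [(previous[t] - previous[(t - shift) % n]) / 2.0 for t in range(n)]
        )
    return details, smooth


# Hypothetical monthly series, for illustration only.
series = [3.0, 5.0, 4.0, 8.0, 7.0, 6.0, 9.0, 2.0]
details, smooth = haar_modwt(series, levels=2)

# With the Haar filter the subseries add back to the original series.
rebuilt = [sum(d[t] for d in details) + smooth[t] for t in range(len(series))]
print(rebuilt)
```

Because no downsampling takes place, each subseries stays aligned in time with the original record, so the subseries can be passed to a forecasting model as additional input columns.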

In the second objective, improved self-adaptive multi-resolution analysis (MRA) techniques, namely ensemble empirical mode decomposition (EEMD) and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), are utilized to address the non-stationarity issues in forecasting monthly upper and lower layer soil moisture at seven sites. The SM time-series are decomposed using EEMD/CEEMDAN into their respective intrinsic mode functions (IMFs) and residual components. The significant lags identified by the partial autocorrelation function are then utilized as inputs to the extreme learning machine (ELM) and random forest (RF) models. The hybrid EEMD-ELM yielded better results in comparison to the CEEMDAN-ELM, EEMD-RF, CEEMDAN-RF and the classical ELM and RF models.
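The lag-selection step can be sketched as follows. This is a minimal illustration, assuming the partial autocorrelation is estimated with the Durbin-Levinson recursion and that a lag counts as significant when it exceeds the approximate 95% band of 1.96/sqrt(n). The thesis may use a different estimator or threshold, and the synthetic AR(1) series merely stands in for an IMF.

```python
import math
import random


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def partial_autocorrelation(series, max_lag):
    """Partial autocorrelation for lags 1..max_lag (Durbin-Levinson recursion)."""
    r = autocorrelation(series, max_lag)
    phi_prev = []
    pacf = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = [phi_kk]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


def significant_lags(series, max_lag):
    """Lags whose partial autocorrelation lies outside the approximate 95% band."""
    threshold = 1.96 / math.sqrt(len(series))
    pacf = partial_autocorrelation(series, max_lag)
    return [lag for lag, value in enumerate(pacf, start=1) if abs(value) > threshold]


# Synthetic AR(1) series standing in for one IMF (assumption, not thesis data).
rng = random.Random(42)
imf = [0.0]
for _ in range(499):
    imf.append(0.8 * imf[-1] + rng.gauss(0.0, 1.0))

lags = significant_lags(imf, max_lag=10)
print(lags)
```

In a decomposition-based workflow this selection would typically be repeated for each IMF and for the residual, with the lagged values at the selected lags forming the input matrix of the model for that component.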

Since SM is contingent upon many influential meteorological, hydrological and atmospheric parameters, for the third objective sixty predictor inputs are collated for forecasting upper and lower layer soil moisture at four sites. An ANN-based ensemble committee of models (ANN-CoM) is developed, integrating a two-phase feature optimization via the Neighborhood Component Analysis based feature selection algorithm for regression (fsrnca) and a basic ELM. The ANN-CoM shows better predictive performance in comparison to the standalone second order Volterra, M5 Model Tree, RF, and ELM models.

In the fourth objective, a new multivariate sequential EEMD based modelling approach is developed. The multivariate sequential EEMD advances the classical single-input EEMD approach by allowing multiple inputs to be utilized in forecasting SM. The multivariate sequential EEMD, optimized with the cross-correlation function and the Boruta feature selection algorithm, is integrated with the ELM model to emulate weekly SM at four sites. The resulting hybrid multivariate sequential EEMD-Boruta-ELM attained better performance in comparison with the multivariate adaptive regression splines (MARS) counterpart (EEMD-Boruta-MARS) and the standalone ELM and MARS models.
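As an illustration of how a cross-correlation function can guide the choice of predictor lags, the sketch below correlates a lagged predictor with the target and picks the lag with the strongest association. The function names, the variables `rain` and `moisture`, and the synthetic two-step delay are hypothetical and do not reproduce the thesis procedure.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(predictor, target, max_lag):
    """Correlation between predictor[t - lag] and target[t] for lag = 0..max_lag."""
    n = len(target)
    return [
        pearson(predictor[: n - lag], target[lag:])
        for lag in range(max_lag + 1)
    ]


# Synthetic example: the target repeats the predictor with a two-step delay.
rng = random.Random(1)
rain = [rng.gauss(0.0, 1.0) for _ in range(200)]
moisture = [0.0, 0.0] + rain[:-2]

ccf = lagged_correlations(rain, moisture, max_lag=6)
best_lag = max(range(len(ccf)), key=lambda lag: abs(ccf[lag]))
print(best_lag)
```

Lags selected in this way only screen candidate inputs. In the workflow described above, a feature selection step such as Boruta then decides which of the lagged and decomposed inputs are retained.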

The research study ascertains the applicability of feature selection algorithms integrated with appropriate MRA for improved hydrological forecasting. Forecasting at shorter and near-real-time horizons (i.e., weekly) would help reinforce scientific tenets in designing knowledge-based systems for precision agriculture and climate change adaptation policy formulations.

Statistics for USQ ePrint 36485
Item Type: Thesis (PhD/Research)
Item Status: Live Archive
Additional Information: Doctor of Philosophy (PhD) thesis.
Faculty/School / Institute/Centre: Current - Faculty of Health, Engineering and Sciences - School of Agricultural, Computational and Environmental Sciences
Supervisors: Deo, Ravinesh C.; Li, Yan; Maraseni, Tek
Date Deposited: 22 May 2019 02:05
Last Modified: 23 May 2019 02:05
Uncontrolled Keywords: streamflow, soil moisture, forecasting, machine learning, Murray-Darling Basin, non-stationarity
Fields of Research: 04 Earth Sciences > 0401 Atmospheric Sciences > 040199 Atmospheric Sciences not elsewhere classified
04 Earth Sciences > 0401 Atmospheric Sciences > 040102 Atmospheric Dynamics
07 Agricultural and Veterinary Sciences > 0701 Agriculture, Land and Farm Management > 070105 Agricultural Systems Analysis and Modelling
08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080110 Simulation and Modelling
08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080108 Neural, Evolutionary and Fuzzy Computation
01 Mathematical Sciences > 0102 Applied Mathematics > 010299 Applied Mathematics not elsewhere classified
