Deep learning hybrid model with Boruta-Random forest optimiser algorithm for streamflow forecasting with climate mode indices, rainfall, and periodicity

Ahmed, A. A. Masrur and Deo, Ravinesh C. and Feng, Qi and Ghahramani, Afshin and Raj, Nawin and Yin, Zhenliang and Yang, Linshan (2021) Deep learning hybrid model with Boruta-Random forest optimiser algorithm for streamflow forecasting with climate mode indices, rainfall, and periodicity. Journal of Hydrology, 599:126350. pp. 1-23. ISSN 0022-1694


Long-term forecasting of hydrologic phenomena is essential for strategic environmental planning, hydrologic and other forms of structural design, agriculture, and water resources management. Climate mode indices are frequently used as predictor variables in machine learning methods to forecast a range of hydrological variables. In this study, a feature selection algorithm is combined with two deep learning models, i.e., long short-term memory (LSTM) and a gated recurrent unit (GRU), to improve the forecasting of stream water level (SWL) at six gauging stations in the Murray Darling Basin of Australia. The paper aggregates the significant antecedent lag memory of climate mode indices, rainfall, and a monthly factor based on periodicity as the predictor variables to attain accurate SWL forecasts. This method identifies an improved relationship between the SWL and climate mode indices by aggregating the significant lagged datasets, which capture historical features, to predict the future SWL. The Boruta-random forest feature selection algorithm (BRF) was applied in a two-phase process, before and after attaining the significant lagged inputs, to screen the optimum predictor variables. The merits of the forecast models were evaluated using several performance evaluation criteria. The results show that the accumulated significant lagged inputs based on climate mode indices, along with the rainfall and periodicity factors, provide improved forecasting of the SWL over the non-BRF deep learning approaches, where no prior feature selection was applied.
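The idea of screening lagged predictors against randomised "shadow" copies can be illustrated with a minimal sketch. This is not the authors' implementation: Boruta ranks features by random forest importance and applies a statistical test over many iterations, whereas the sketch below swaps in the absolute Pearson correlation as a stand-in importance score and a simple majority rule. All names (`abs_pearson`, `shadow_screen`, `index_lag2`, `month_sin`), the sine encoding of the monthly factor, and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def abs_pearson(x, y):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def shadow_screen(features, target, trials=20, seed=0):
    """Keep features that beat the best shuffled (shadow) copy in most trials."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(trials):
        # Shadow features: each column shuffled, which breaks its link to the target.
        best_shadow = 0.0
        for col in features.values():
            shadow = col[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_pearson(shadow, target))
        for name, col in features.items():
            if abs_pearson(col, target) > best_shadow:
                hits[name] += 1
    return [name for name, h in hits.items() if h > trials / 2]


# Synthetic monthly data (hypothetical): the water level depends on the
# climate index two months earlier.
rng = random.Random(42)
n, max_lag = 60, 3
index = [rng.gauss(0.0, 1.0) for _ in range(n)]
times = range(max_lag, n)
target = [2.0 + 0.5 * index[t - 2] for t in times]

# Candidate predictors: lags 1 to 3 of the index plus a periodicity term.
features = {f"index_lag{k}": [index[t - k] for t in times] for k in (1, 2, 3)}
features["month_sin"] = [math.sin(2 * math.pi * (t % 12) / 12) for t in times]

kept = shadow_screen(features, target)
```

In this toy setup the two-month lag is the true driver, so it should survive the screening. A faithful reproduction would replace the correlation score with random forest importances.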
The hybrid LSTM method (i.e., BRF-LSTM model) showed a clear advantage in SWL forecasting, with over 98% of the predictive errors lying within a band of ±0.015 m and low relative errors (RRMSE ≈ 1.30% and RMAE ≈ 0.882%), outperforming all of the benchmark models. The periodicity factor was also found to influence the accuracy of the forecast models for the four monitored study stations. This study concludes that the newly developed hybrid deep learning approaches, coupled with the BRF feature selection, provide improved forecasting performance. The hybrid approach developed in this paper can therefore be used as a strong predictive algorithm for hydrological variables that are influenced by the low-frequency variability of climate mode indices, such as streamflow water level.
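The relative error measures and the error-band statistic quoted above can be computed as in the following sketch. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: RRMSE as the root mean square error divided by the mean observation, and RMAE as the mean absolute error relative to each observation, both in percent. The function names and the sample values are hypothetical.

```python
import math


def rrmse_percent(obs, sim):
    """Relative RMSE (%): RMSE divided by the mean of the observations."""
    n = len(obs)
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / n)
    return 100.0 * rmse / (sum(obs) / n)


def rmae_percent(obs, sim):
    """Relative MAE (%): mean of |error| / observation (observations non-zero)."""
    return 100.0 * sum(abs((s - o) / o) for o, s in zip(obs, sim)) / len(obs)


def fraction_within_band(obs, sim, half_width):
    """Share of forecasts whose absolute error is at most half_width."""
    hits = sum(1 for o, s in zip(obs, sim) if abs(s - o) <= half_width)
    return hits / len(obs)


# Made-up water levels in metres, for illustration only.
observed = [2.0, 2.0, 2.0, 2.0]
forecast = [2.01, 1.99, 2.02, 2.0]

rrmse = rrmse_percent(observed, forecast)
rmae = rmae_percent(observed, forecast)
in_band = fraction_within_band(observed, forecast, half_width=0.015)
```

With these sample values, three of the four forecasts fall inside the ±0.015 m band, so `in_band` is 0.75.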

Item Type: Article (Commonwealth Reporting Category C)
Refereed: Yes
Item Status: Live Archive
Additional Information: This project was supported by a USQ-CAS Postgraduate Research Scholarship under USQ-CAS Grant held by Professor Ravinesh Deo (USQ) and Professor Qi Feng (Chinese Academy of Sciences, CAS).
Faculty/School / Institute/Centre: Historic - Faculty of Health, Engineering and Sciences - School of Sciences (6 Sep 2019 - 31 Dec 2021)
Faculty/School / Institute/Centre: Current - Institute for Life Sciences and the Environment - Centre for Sustainable Agricultural Systems (1 Aug 2018 -)
Date Deposited: 17 May 2021 03:46
Last Modified: 16 Aug 2022 03:26
Uncontrolled Keywords: Stream water level; Climate indices; Boruta-random forest hybridizer algorithm (BRF); Significant lag memory; Murray Darling Basin; long short-term memory (LSTM)
Fields of Research (2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080110 Simulation and Modelling
05 Environmental Sciences > 0502 Environmental Science and Management > 050205 Environmental Management
Fields of Research (2020): 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460207 Modelling and simulation
41 ENVIRONMENTAL SCIENCES > 4104 Environmental management > 410404 Environmental management
Socio-Economic Objectives (2008): D Environment > 96 Environment > 9699 Other Environment > 969999 Environment not elsewhere classified
Socio-Economic Objectives (2020): 18 ENVIRONMENTAL MANAGEMENT > 1899 Other environmental management > 189999 Other environmental management not elsewhere classified
