Hybridizing Machine and Deep Learning for Urban Water Demand Forecasting: An Ensemble Framework Leveraging Dam Monitoring Data


Creative Commons License

AKINER M. E.

Pure and Applied Geophysics, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2026
  • DOI: 10.1007/s00024-026-03941-0
  • Journal Name: Pure and Applied Geophysics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Geobase, INSPEC
  • Keywords: dam occupancy rates, ensemble learning, SHAP explainability analysis, time series features, urban water management, water consumption estimation
  • Affiliated with Akdeniz Üniversitesi: Yes

Abstract

Accurate forecasting of urban water demand is essential for effective resource management in cities. This work proposes a novel ensemble framework that significantly improves forecast accuracy by integrating the daily occupancy rates of ten major dams in Istanbul. Unlike single-model approaches, the proposed technique combines six machine learning algorithms (Random Forest, XGBoost, LightGBM, LSTM, SVR, and Ridge Regression), with the hyperparameters of each model tuned using the Optuna library. The study uses 4767 daily observations (2011 to 2024) enriched with temporal features such as seasonal indicators, moving averages, and lagged consumption variables. After pre-processing with StandardScaler and one-hot encoding, the data were split chronologically into training (52%), validation (18%), and testing (30%) sets to prevent temporal data leakage. SVR performed best (R² = 0.8566, RMSE = 72,815 m³/day), followed by LSTM (R² = 0.8345). The dynamically weighted ensemble model also showed strong predictive ability (R² = 0.8469, RMSE = 75,244 m³/day, MAE = 55,726 m³/day), outperforming all baseline models except SVR. SHAP analysis showed that short-term consumption trends were the most important predictors, especially the 7-day moving average and the one-day lagged consumption, which far outweighed the dam occupancy rates. The findings highlight the strength of ensemble learning methods for urban water demand forecasting and indicate that water management authorities need to develop data-driven conservation strategies.
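
The abstract names three steps that a short sketch can make concrete: building lagged and moving-average features, splitting the series chronologically (52% / 18% / 30%), and combining model outputs with weights. The Python below is a minimal illustration only, not the author's implementation. All function names are hypothetical, and the inverse-validation-error weighting is an assumed stand-in, since the abstract does not say how the "dynamic" weights are computed.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float]  # (lag_1, moving_avg, target)


def add_temporal_features(consumption: Sequence[float], window: int = 7) -> List[Row]:
    """Build one row per day from past values only, so no future data leaks in."""
    rows = []
    for t in range(window, len(consumption)):
        lag_1 = consumption[t - 1]                             # previous day's consumption
        moving_avg = sum(consumption[t - window:t]) / window   # mean of the prior `window` days
        rows.append((lag_1, moving_avg, consumption[t]))
    return rows


def chronological_split(rows: Sequence[Row], train_pct: int = 52, val_pct: int = 18):
    """Split in time order (no shuffling); the remainder becomes the test set."""
    n = len(rows)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


def inverse_error_weights(val_errors: Sequence[float]) -> List[float]:
    """Assumed weighting scheme: lower validation error gives a larger weight."""
    inverse = [1.0 / e for e in val_errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_ensemble(predictions: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model predictions, one output per time step."""
    return [sum(w * p for w, p in zip(weights, step)) for step in zip(*predictions)]


# Toy series standing in for daily consumption (synthetic, not the study's data).
series = [float(i) for i in range(107)]
rows = add_temporal_features(series)
train, val, test = chronological_split(rows)
weights = inverse_error_weights([1.0, 2.0])   # two hypothetical models
blended = weighted_ensemble([[10.0, 20.0], [40.0, 50.0]], [0.5, 0.5])
```

Because the split slices the rows in time order, every validation and test row comes strictly after the training rows, which is the property the abstract relies on to avoid temporal leakage.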