Journal article
A Comparative Study of Ensemble Machine Learning and Explainable AI for Predicting Harmful Algal Blooms
Big data and cognitive computing, Vol.9(5), 138
05/20/2025
DOI: 10.3390/bdcc9050138
Abstract
Harmful algal blooms (HABs), driven by environmental pollution, pose significant threats to water quality, public health, and aquatic ecosystems. This study enhances the prediction of HABs in Lake Erie, part of the Great Lakes system, by utilizing ensemble machine learning (ML) models coupled with explainable artificial intelligence (XAI) for interpretability. Using water quality data from 2013 to 2020, various physical, chemical, and biological parameters were analyzed to predict chlorophyll-a (Chl-a) concentrations, which are a commonly used indicator of phytoplankton biomass and a proxy for algal blooms. This study employed multiple ensemble ML models, including random forest (RF), deep forest (DF), gradient boosting (GB), and XGBoost, and compared their performance against individual models, such as support vector machine (SVM), decision tree (DT), and multi-layer perceptron (MLP). The findings revealed that the ensemble models, particularly XGBoost and deep forest (DF), achieved superior predictive accuracy, with R² values of 0.8517 and 0.8544, respectively. The application of SHapley Additive exPlanations (SHAPs) provided insights into the relative importance of the input features, identifying the particulate organic nitrogen (PON), particulate organic carbon (POC), and total phosphorus (TP) as the critical factors influencing the Chl-a concentrations. This research demonstrates the effectiveness of ensemble ML models for achieving high predictive accuracy, while the integration of XAI enhances model interpretability. The results support the development of proactive water quality management strategies and highlight the potential of advanced ML techniques for environmental monitoring.
Details
- Title
- A Comparative Study of Ensemble Machine Learning and Explainable AI for Predicting Harmful Algal Blooms
- Creators
- Omer Mermer - University of Iowa; Eddie Zhang - University of Iowa; Ibrahim Demir - Tulane University
- Resource Type
- Journal article
- Publication Details
- Big data and cognitive computing, Vol.9(5), 138
- DOI
- 10.3390/bdcc9050138
- ISSN
- 2504-2289
- eISSN
- 2504-2289
- Publisher
- MDPI
- Language
- English
- Date published
- 05/20/2025
- Academic Unit
- Electrical and Computer Engineering; Civil and Environmental Engineering; IIHR--Hydroscience and Engineering; Injury Prevention Research Center
- Record Identifier
- 9984824287702771