Development of Predictive Models for “Very Poor” Beach Water Quality Gradings Using Class-imbalance Learning

Highlights

  • This study introduces an EasyEnsemble model, an AI-based system built on class-imbalance learning, to predict “very poor” beach water quality events. The authors present it as an advance over earlier methods for predicting high bacterial levels under data-scarce conditions.
  • The EasyEnsemble model reaches an F-score (a metric that balances precision and recall) of 0.84, higher than traditional models such as Multiple Linear Regression (MLR) and Classification Tree (CT). This is a substantial improvement in predicting rare events with high bacterial concentrations.
  • The research proposes the integration of the new EasyEnsemble model with existing MLR models to form a robust hybrid beach-water-quality forecast system. This approach facilitates more adaptive management of beach water quality and enhances public health protection.

Summary

This study addresses the challenge of predicting rare but critical instances of poor beach water quality. Traditional Multiple Linear Regression (MLR) models often fail to accurately forecast the high bacterial concentrations that drive beach closure decisions. The study focuses on Hong Kong beaches, where water quality is assessed primarily by Escherichia coli (E. coli) levels.

The researchers introduce an AI-based binary classification model, EasyEnsemble, employing class-imbalance learning. This approach is designed to enhance the prediction of “very poor” water quality events, an area where traditional models like MLR and Classification Tree (CT) have shown limitations. The study uses a comprehensive 30-year dataset covering three distinct marine beaches in Hong Kong, examining different periods influenced by the Harbour Area Treatment Scheme (HATS).
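The core idea behind EasyEnsemble-style class-imbalance learning can be sketched in a few lines: draw several balanced subsets by undersampling the common class, train one learner per subset, and combine their votes. The sketch below is a simplified illustration, not the authors' implementation. It assumes a single made-up numeric predictor and swaps in a one-threshold decision stump as the base learner (the original EasyEnsemble algorithm trains a boosted ensemble on each subset). All function names and data are hypothetical.

```python
import random

def fit_stump(samples):
    """Return the threshold t with the fewest training errors for the rule 'x > t means 1'."""
    xs = sorted({x for x, _ in samples})
    # Candidates: one below the minimum, midpoints between neighbours, one above the maximum.
    candidates = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    best_thr, best_err = candidates[0], len(samples) + 1
    for thr in candidates:
        err = sum(int(x > thr) != y for x, y in samples)
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr

def easy_ensemble_fit(samples, n_subsets=25, seed=0):
    """Train one stump per balanced subset (all rare events + an equal-size random draw of common ones)."""
    rng = random.Random(seed)
    minority = [s for s in samples if s[1] == 1]
    majority = [s for s in samples if s[1] == 0]
    thresholds = []
    for _ in range(n_subsets):
        subset = rng.sample(majority, len(minority))  # undersample the common class
        thresholds.append(fit_stump(minority + subset))
    return thresholds

def easy_ensemble_predict(thresholds, x):
    """Majority vote over the stumps: 1 = rare event predicted, 0 = common event."""
    votes = sum(x > t for t in thresholds)
    return int(votes * 2 >= len(thresholds))

# Made-up, cleanly separable toy data: 200 common samples (label 0), 10 rare samples (label 1).
majority = [(i * 0.1, 0) for i in range(200)]
minority = [(30.0 + i, 1) for i in range(10)]
model = easy_ensemble_fit(majority + minority)

print(easy_ensemble_predict(model, 35.0))  # a value in the rare-event range
print(easy_ensemble_predict(model, 5.0))   # a value in the common range
```

Because every subset contains all of the rare events, no "very poor" sample is discarded, while the repeated random draws let the ensemble see most of the common samples across its members.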

The EasyEnsemble model effectively handles the imbalance between common and rare water quality events. It outperforms the MLR and CT models, achieving a significantly higher F-score (0.84).
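To see why an F-score is a more informative yardstick than plain accuracy when the event of interest is rare, consider the short sketch below. It assumes the balanced (F1) form of the score, since the summary does not specify which variant is reported, and the labels are made up purely for illustration.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for the positive class (label 1); beta=1 gives the balanced F1 score."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no rare event was correctly detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Made-up labels: 100 samples, of which 5 are rare events (label 1).
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100                  # a model that never predicts the rare event
detector = [1, 1, 1, 1, 0] + [1] + [0] * 94  # 4 hits, 1 miss, 1 false alarm

accuracy_naive = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
f_naive = f_score(y_true, always_negative)
f_detector = f_score(y_true, detector)

print(accuracy_naive, f_naive, round(f_detector, 2))
```

The model that never predicts a rare event still scores 95% accuracy on this toy data but has an F-score of zero, whereas the detector with four hits, one miss, and one false alarm has precision and recall of 0.8 each and therefore an F-score of about 0.8.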

A hybrid approach that combines MLR models with the EasyEnsemble model can create a more accurate beach-water-quality forecast system. Such a system promises improved public health protection by enabling more adaptive beach management decisions. The research highlights the potential of AI and class-imbalance learning in addressing complex environmental challenges and enhancing public health safety.

Guo and J. H. W. Lee, “Development of Predictive Models for ‘Very Poor’ Beach Water Quality Gradings Using Class-Imbalance Learning,” Environ. Sci. Technol., vol. 55, no. 21, pp. 14990–15000, Nov. 2021, doi: .
