A hybrid Machine Learning approach for air quality prediction in Morocco: combining CatBoost with metaheuristic optimization algorithms

Authors

  • Rachid ED-DAOUDI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco
  • Sokaina EL KHAMLICHI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco; Research Team in Science and Technology, Higher School of Technology of Laayoune, Ibn Zohr University, P.O. Box 3007, Laayoune, Morocco
  • Badia ETTAKI Laboratory of Research in Informatics, Data Sciences and Artificial Intelligence, School of Information Sciences, B.P. 604, Rabat-Instituts, Rabat, Morocco

DOI:

https://doi.org/10.19139/soic-2310-5070-2705

Keywords:

Air Pollution, Hybrid Machine Learning, CatBoost Algorithm, Metaheuristic Optimization, Air Quality Index

Abstract

Air pollution poses serious risks to public health and environmental sustainability, particularly in rapidly urbanizing areas of developing countries. This study investigates whether combining machine learning algorithms with metaheuristic optimization techniques can improve the accuracy and efficiency of air quality prediction in Morocco. The main objective is to compare direct classification of Air Quality Index (AQI) categories with a regression-based approach, and to evaluate the effectiveness of two optimization strategies, the Arithmetic Optimization Algorithm (AOA) and Hunger Games Search (HGS), in tuning the CatBoost model's hyperparameters. Using five months of air quality data from two monitoring stations in Ait Melloul, we modeled concentrations of PM2.5, PM10, and CO, and derived the corresponding AQI classifications. Regression-based classification improved accuracy by nearly 30 percentage points over direct classification. Moreover, HGS achieved predictive performance similar to that of AOA while being more than twice as computationally efficient. CO concentration predictions in residential areas achieved high accuracy (R² > 0.95), while particulate matter predictions revealed limitations in capturing extreme pollution events. These findings suggest that combining gradient boosting with metaheuristic optimization is a promising strategy for developing scalable and accurate air quality forecasting systems in North African urban environments, with important implications for public health protection and environmental policy implementation.
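
The regression-based approach described in the abstract can be pictured as two steps: a regressor predicts a pollutant concentration, and the prediction is then mapped to an AQI category by thresholding. The sketch below illustrates only the second step, under stated assumptions. The category bounds, the labels, and the function names are hypothetical and chosen for illustration; they are not the breakpoints or code used in the study or by any official index, and the predicted values are made-up inputs standing in for the output of a fitted regressor.

```python
from bisect import bisect_right

# Hypothetical concentration bounds (µg/m³) separating categories.
# Illustrative only: NOT the breakpoints used in the study or by any official AQI.
ILLUSTRATIVE_BOUNDS = [25.0, 50.0, 100.0]
ILLUSTRATIVE_LABELS = ["good", "moderate", "poor", "very poor"]


def concentration_to_category(value, bounds=ILLUSTRATIVE_BOUNDS,
                              labels=ILLUSTRATIVE_LABELS):
    """Map one predicted concentration to a category label.

    A value equal to a bound falls into the higher category.
    """
    if value < 0:
        raise ValueError("concentration must be non-negative")
    # bisect_right counts how many bounds are <= value, which is the label index.
    return labels[bisect_right(bounds, value)]


def classify_from_regression(predicted_concentrations):
    """Turn a sequence of regression outputs into category labels."""
    return [concentration_to_category(v) for v in predicted_concentrations]


# Made-up regression outputs, standing in for a fitted model's predictions.
predicted = [12.3, 25.0, 61.8, 140.0]
categories = classify_from_regression(predicted)
print(categories)
```

One plausible reason such a pipeline can outperform direct classification is that the regressor is trained on the continuous target and so retains information about how close a sample lies to a category boundary; the abstract reports the accuracy gain but does not state the mechanism, so this remains an interpretation.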

Published

2025-08-20

Section

Research Articles

How to Cite

A hybrid Machine Learning approach for air quality prediction in Morocco: combining CatBoost with metaheuristic optimization algorithms. (2025). Statistics, Optimization & Information Computing, 14(5), 2445-2471. https://doi.org/10.19139/soic-2310-5070-2705