The Earthquake Prediction Using Machine Learning Algorithms: A Data-Driven Framework Based on Supervised Learning Models

Ahmed Gamal; BenBella Sayed Tawfik; Bassel Hafiz

doi:10.19139/soic-2310-5070-3417

Authors

Ahmed Gamal, Faculty of Computers and Informatics
Ben Bella Tawfik, College of Computers & Informatics, Suez Canal University, Egypt
B. Hafiz, College of Computers & Informatics, Suez Canal University, Egypt

Keywords:

Earthquake Prediction, Machine Learning, Seismology, Random Forest, Support Vector Machine, Neural Networks, Class Imbalance, Seismic Risk Analysis

Abstract

Earthquakes are among the most destructive natural phenomena, occurring suddenly and often causing severe human and infrastructural losses. Accurate and timely prediction of significant seismic events remains a major challenge in seismology. This study presents a systematic machine-learning framework for earthquake prediction using historical seismic data. A three-phase methodology is adopted: (i) acquisition and preprocessing of seismic records obtained from the United States Geological Survey (USGS), (ii) class balancing and feature preparation to address the extreme class imbalance inherent in earthquake datasets, and (iii) model development and evaluation using multiple supervised learning algorithms. Binary classification is first used to distinguish significant from non-significant earthquakes, followed by multi-class classification to categorize events into Minor, Light, Moderate, and Strong classes. Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Neural Network models are evaluated. After balancing, the binary dataset has a 50–50 class distribution, and the multi-class dataset is uniformly balanced across all four classes. Experimental results show that the Random Forest model achieved the highest binary classification accuracy, 98.15%, with an Area Under the Curve (AUC) of 0.683, a value that indicates moderate rather than strong discriminative capability. The Neural Network model achieved the highest recall (0.611), making it suitable for early-warning scenarios where missed detections are critical. For multi-class classification, Random Forest achieved the highest overall accuracy, 87.78%, and was more robust and stable than the other models. These results support the use of ensemble-based learning for seismic event prediction and highlight the trade-off between accuracy and sensitivity across different algorithms. The proposed framework provides a foundation for earthquake early-warning systems in high-risk regions.
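
The abstract reports accuracy, recall, and AUC side by side after a class-balancing step. The following minimal sketch, using only the Python standard library, illustrates how these pieces relate. It is not the authors' implementation: the magnitude cut-offs in `CLASS_BOUNDS`, the use of random oversampling as the balancing technique, and all function names and toy data are assumptions made for illustration, since the abstract does not specify them.

```python
import random
from collections import Counter

# Hypothetical magnitude cut-offs (assumption): the abstract names the four
# classes but does not give their bounds.
CLASS_BOUNDS = [(4.0, "Minor"), (5.0, "Light"), (6.0, "Moderate")]


def magnitude_class(mag):
    """Map a magnitude to one of the four class names used in the abstract."""
    for upper, name in CLASS_BOUNDS:
        if mag < upper:
            return name
    return "Strong"


def oversample(rows, labels, seed=0):
    """Random oversampling (assumed technique): duplicate rows of smaller
    classes until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that were detected.
    Assumes at least one positive example is present."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy multi-class data: label by magnitude, then balance the classes.
mags = [3.2, 3.5, 3.8, 4.4, 5.1, 6.3]
labels = [magnitude_class(m) for m in mags]
balanced_mags, balanced_labels = oversample(mags, labels)
print(Counter(labels), Counter(balanced_labels))

# Toy imbalanced binary test set: 98 non-significant events, 2 significant.
# A model that always predicts "non-significant" with a constant score:
y_true = [0] * 98 + [1] * 2
y_pred = [0] * 100
scores = [0.1] * 100
print(accuracy(y_true, y_pred), recall(y_true, y_pred), roc_auc(y_true, scores))
```

On the toy binary set, the constant classifier reaches an accuracy of 0.98 while its recall is 0.0 and its AUC is 0.5. This is why accuracy on an imbalanced evaluation set is best read together with recall and AUC, as the abstract does.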

Published

2026-04-14

Issue

Online First

Section

Research Articles

License

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

How to Cite

The Earthquake Prediction Using Machine Learning Algorithms: A Data-Driven Framework Based on Supervised Learning Models. (2026). Statistics, Optimization & Information Computing. https://doi.org/10.19139/soic-2310-5070-3417
