
Ana Gabriela Banquez Maturana, Juan David Rodríguez Cerón, Ángel Manuel Benavides González, Heriberto Alexander Felizzola Jimenez

Abstract

Introduction: Student dropout is a critical global challenge with profound socioeconomic and institutional impacts. On average, one in five students drops out, which limits social mobility, deepens inequalities, and reduces the sustainability of education systems.


Objective: To develop an intelligent system that predicts university student dropout in Colombia using machine learning techniques.


Method: The study, framed within Design Science Research (DSR), used a longitudinal dataset of 104,147 records from a Colombian university. Rigorous preprocessing was applied, including reclassification of the target variable and engineering of 27 predictor features. Seven algorithms were evaluated and LightGBM was selected; its hyperparameters were then optimized, and class imbalance was addressed with SMOTE (Synthetic Minority Over-sampling Technique).
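
As an illustration of the balancing step, the following is a minimal sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the authors' implementation: the function name `smote_sketch`, the toy data, and the parameter values are hypothetical, and a real pipeline would normally rely on an established library rather than this simplified version.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbours.

    Assumes `minority` holds at least two numeric rows of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base row (excluding the row itself)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: two numeric features per minority-class record.
minority_rows = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.10], [0.30, 0.30]]
new_rows = smote_sketch(minority_rows, n_new=6, k=2)
```

Each synthetic row lies on the line segment between a real minority record and one of its nearest minority neighbours, so the minority class grows without exact duplication of existing records.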


Results: LightGBM outperformed the other six algorithms, with a weighted F1-score of 0.8125. The optimized model achieved an overall accuracy of 87% and an F1-score of 0.83 for the “Dropout” class. Calibrating the decision threshold to 0.45 raised recall to 87%, correctly identifying 1,447 of 1,654 actual dropouts. SHAP analysis confirmed that REAL_PROGRESS_PERCENTAGE was the most influential predictor, with an impact of 1.45.
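
The effect of the decision threshold on recall can be sketched as follows. The labels and scores are hypothetical toy values, the helper names `classify` and `recall` are illustrative, and the use of `>=` at the threshold is an assumption; only the figures 0.45, 1,447, and 1,654 come from the abstract.

```python
def classify(probabilities, threshold=0.45):
    """Label a record as dropout (1) when its score reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def recall(y_true, y_pred):
    """Share of actual dropouts that were predicted as dropouts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


# Hypothetical toy data: 1 = dropout, 0 = not dropout.
y_true = [1, 1, 1, 0, 0, 1]
scores = [0.90, 0.47, 0.30, 0.46, 0.10, 0.55]

recall_default = recall(y_true, classify(scores, 0.50))
recall_lowered = recall(y_true, classify(scores, 0.45))

# Recall implied by the counts reported in the abstract (about 0.875).
reported_recall = 1447 / 1654
```

In the toy data, lowering the threshold recovers one additional true dropout at the cost of one additional false alarm, which is the trade-off that threshold calibration accepts when missing a dropout is costlier than flagging a student unnecessarily.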


Conclusions: Cumulative academic performance, grade trends, and actual progress percentage are the most decisive predictors of dropout, in interaction with socioeconomic variables such as income stratum and demographic variables such as age group.

Article Details

How to Cite
Banquez Maturana, A. G., Rodríguez Cerón, J. D., Benavides González, Á. M., & Felizzola Jimenez, H. A. (2025). Development of an intelligent system to predict university dropout rates in Colombia using machine learning techniques. Inge Cuc, 21(2). https://doi.org/10.17981/ingecuc.21.2.2025.08
Section
In Press