Development of an intelligent system to predict university dropout rates in Colombia using machine learning techniques
Abstract
Introduction: Student dropout is a critical global challenge with profound socioeconomic and institutional impacts. On average, one in five students drops out, which limits social mobility, deepens inequalities, and reduces the sustainability of education systems.
Objective: To develop an intelligent system that predicts university student dropout in Colombia using machine learning techniques.
Method: The study, framed within Design Science Research (DSR), used a longitudinal dataset of 104,147 records from a Colombian university. Rigorous preprocessing was applied, including reclassification of the target variable and engineering of 27 predictor features. Seven algorithms were evaluated and LightGBM was selected; its hyperparameters were then tuned and the classes were balanced with SMOTE (Synthetic Minority Over-sampling Technique).
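As an illustration of the class-balancing step named above, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the toy points, and the parameter values are assumptions made for illustration only. This is not the study's implementation, which is not described here.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (hypothetical helper, illustration only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature minority class (assumed values, not data from the study).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

In practice, oversampling of this kind is normally applied to the training split only, so that evaluation data contain no synthetic records. Libraries such as imbalanced-learn provide a full implementation.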
Results: LightGBM proved to be the superior algorithm with a weighted F1-Score of 0.8125. The optimized model achieved an overall accuracy of 87% and an F1-Score of 0.83 for the “Dropout” class. Strategic calibration of the decision threshold to 0.45 raised the recall to 87%, correctly identifying 1,447 of 1,654 actual dropouts. SHAP analysis confirmed that REAL_PROGRESS_PERCENTAGE was the most influential predictor with an impact of 1.45.
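The effect of moving the decision threshold can be sketched with a few lines of code. The example below uses made-up labels and probabilities (assumptions, not the study's data) to show how lowering the threshold from 0.50 to 0.45 can raise recall for the positive class, and then checks the reported recall figure from the counts given above. The helper `recall_at_threshold` is a hypothetical name.

```python
def recall_at_threshold(y_true, y_prob, threshold):
    """Recall for the positive class (1 = dropout) when a record is
    flagged as positive if its predicted probability >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels and predicted dropout probabilities (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.48, 0.30, 0.47, 0.20, 0.10, 0.55]

recall_default = recall_at_threshold(y_true, y_prob, 0.50)  # 2 of 4 dropouts
recall_lowered = recall_at_threshold(y_true, y_prob, 0.45)  # 3 of 4 dropouts

# Recall implied by the counts reported in the abstract.
reported_recall = 1447 / 1654  # about 0.875
```

A lower threshold flags more students as at risk, so the gain in recall usually comes with more false positives. In the toy data above, the non-dropout with probability 0.47 is also flagged once the threshold drops to 0.45.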
Conclusions: Cumulative academic performance, grade trends, and actual progress percentage are the most decisive predictors of dropout, in interaction with socioeconomic variables such as income stratum and demographic variables such as age group.
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Published papers are the exclusive responsibility of their authors and do not necessarily reflect the opinions of the editorial committee.
INGE CUC Journal respects the moral rights of its authors, who must cede the patrimonial rights to the published material to the editorial committee. In turn, the authors declare that the current work has not been previously published.
https://orcid.org/0000-0002-8354-6396
