Application of Machine Learning Models for Predicting School Dropout in Students from a Colombian Competency-based Education Institution

Authors

  • John Jairo Castro-Maldonado, Servicio Nacional de Aprendizaje SENA, Medellín (Colombia)
  • Jennifer Andrea Londoño-Gallego, Specialist in Project Formulation and Evaluation, Servicio Nacional de Aprendizaje SENA, Medellín (Colombia)
  • Paula Andrea Rodríguez-Marín, Instituto Tecnológico Metropolitano ITM, Medellín (Colombia)
  • Juan David Martínez-Vargas, Universidad EAFIT, Medellín (Colombia)

DOI:

https://doi.org/10.5281/zenodo.19690782

Keywords:

Intention to Drop Out, Artificial Intelligence, Machine Learning, Education.

Abstract

Student dropout is a structural challenge in Colombian higher education, particularly in contexts with rigid curricular and pedagogical systems, where timely preventive strategies are difficult to implement. This study develops and validates a hybrid machine learning model, based on the CRISP-DM methodology, that integrates supervised algorithms (Random Forest, Ridge, XGBoost, KNN) and unsupervised approaches (K-Means, DECLA), supported by dimensionality reduction and segmentation techniques (PCA, MCA). Using sociodemographic variables, academic performance indicators, and a specifically designed monitoring instrument, the models achieved high accuracy in anticipating dropout risk and segmented students into profiles of high, medium, and low probability of withdrawal. Tree-based algorithms, particularly Random Forest, performed best and identified critical predictors such as number of complaints, grade reversals, socioeconomic status, gender, and marital status. The main contribution of this work lies in moving predictive analytics from an experimental exercise to an institutional support system in competency-based higher education, where academic rigidity often limits early interventions. By anticipating dropout through real-time empirical evidence, the model enables the design of differentiated action pathways (personalized tutoring, socioeconomic support, and curricular flexibility) that complement long-term educational reforms. The model is therefore relevant to higher education as an innovative, evidence-based resource for strengthening student retention.

Published

2026-04-23

How to Cite

John Jairo Castro-Maldonado, Jennifer Andrea Londoño-Gallego, Paula Andrea Rodríguez-Marín, & Juan David Martínez-Vargas. (2026). Application of Machine Learning Models for Predicting School Dropout in Students from a Colombian Competency-based Education Institution. Comunicar, 34(85), 171–202. https://doi.org/10.5281/zenodo.19690782

Section

Research Article
