Prediction and Prevention of School Dropout through A.I.: A Review to Identify Models and Relevant Factors
Abstract
School dropout is a pressing concern for educational institutions. According to statistics from the Ministry of Education of Colombia, 473,786 children and young students discontinued their studies between November 2022 and May 2023. The issue is especially prominent in Science, Technology, Engineering, and Mathematics (STEM) academic programs. Addressing it requires Information Technology (IT) tools that give academic control departments effective and timely monitoring. This literature review explores the variables related to academic dropout and the predictive models previously used in the field, in order to identify models suitable for processing the data. To this end, the literature was searched using academic search platforms such as Lens.org and Google Scholar. The review identifies variables relevant to the national context, including academic performance, age, gender, family status, and psychological aspects, among others, which are considered crucial for accurate prediction. The C4.5 decision tree model was chosen for its strong performance in the reviewed research, its widespread use in the field, and its low computational cost.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.