Enhancing Predictive Accuracy in Educational Assessment: A Comparative Analysis of Machine Learning Models for Predicting Student Performance
Abstract
This study presents a comprehensive evaluation of multiple machine learning models for predicting student performance within a smart learning environment. The analysis draws on a dataset from the Smart Learning Project, comprising 14 English PISA-like quizzes, 27 competencies, 8 schools, and 181 students, and involves data preprocessing, feature selection, model training, and evaluation. The models assessed include Random Forest, Support Vector Regression (SVR), AdaBoost, Bayesian Ridge, K-Nearest Neighbors (KNN), ElasticNet, XGBoost, Gradient Boosting, and a Stacking Ensemble. Performance is measured with Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared (R²). The results indicate that ensemble methods, particularly XGBoost and the Stacking Ensemble, provide superior predictive accuracy and capture complex relationships within the data. The study also highlights the importance of feature selection and data preprocessing in enhancing model performance. These findings underscore the potential of advanced machine learning techniques in educational analytics, offering valuable insights for personalized learning strategies and early intervention.
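
To make the comparative setup concrete, the following is a minimal sketch of how such a model comparison could be assembled with scikit-learn and XGBoost. It is illustrative only: the dataset loading, feature shapes, hyperparameters, and the choice of Bayesian Ridge as the stacking meta-learner are placeholder assumptions, not the authors' actual pipeline.

# Illustrative sketch (not the paper's code): train the nine regressors named
# in the abstract and report MSE, RMSE, MAE, and R^2 for each.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import (RandomForestRegressor, AdaBoostRegressor,
                              GradientBoostingRegressor, StackingRegressor)
from sklearn.svm import SVR
from sklearn.linear_model import BayesianRidge, ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from xgboost import XGBRegressor

def evaluate(model, X_train, X_test, y_train, y_test):
    # Fit one model and compute the four evaluation metrics used in the study.
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    return {"MSE": mse, "RMSE": np.sqrt(mse),
            "MAE": mean_absolute_error(y_test, pred),
            "R2": r2_score(y_test, pred)}

# Placeholder data with shapes echoing the paper's 181 students and 27
# competencies; the real features and target scores come from preprocessing
# and feature selection on the Smart Learning Project dataset.
X, y = np.random.rand(181, 27), np.random.rand(181)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

base_models = {
    "RandomForest": RandomForestRegressor(random_state=42),
    "SVR": make_pipeline(StandardScaler(), SVR()),
    "AdaBoost": AdaBoostRegressor(random_state=42),
    "BayesianRidge": BayesianRidge(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor()),
    "ElasticNet": make_pipeline(StandardScaler(), ElasticNet()),
    "XGBoost": XGBRegressor(random_state=42),
    "GradientBoosting": GradientBoostingRegressor(random_state=42),
}
models = dict(base_models)
models["StackingEnsemble"] = StackingRegressor(
    estimators=list(base_models.items()),
    final_estimator=BayesianRidge())  # assumed meta-learner for illustration

for name, model in models.items():
    print(name, evaluate(model, X_train, X_test, y_train, y_test))

In this sketch the stacking ensemble simply reuses the eight base regressors with a Bayesian Ridge meta-learner; the paper's actual base-learner selection, meta-learner, preprocessing, and feature-selection steps may differ.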