Comparison Random Forest and Logistic Regression in Predicting Motivation and Learning Outcomes of Junior High School Students

  • Palma Juanta Universitas Prima Indonesia
  • Valencia Pavithra Universitas Prima Indonesia
  • Nurija Sri Paska Hutabarat Universitas Prima Indonesia
  • Yehuda M. P. Simatupang Universitas Prima Indonesia
Keywords: Random Forest, Logistic Regression, Learning Motivation, Learning Outcomes, Prediction

Abstract

Student learning motivation and learning outcomes are important factors that influence educational success, especially at the junior high school level. Unlike previous studies that primarily emphasize academic achievement prediction alone, this study simultaneously evaluates student motivation and learning outcomes as dual prediction targets. Moreover, while earlier research often applied only a single algorithm or focused on higher education datasets, this research conducts a head-to-head comparison between Random Forest and Logistic Regression using junior high school data, thereby addressing a gap in predictive analytics for secondary education. The two algorithms are compared on their ability to predict student motivation and learning outcomes from data on learning habits, mental condition, attendance, sleep hours, family support, and academic grades. The study process included data pre-processing, normalization, splitting the data into training and testing sets, model training, and evaluation using accuracy, sensitivity, specificity, and AUC. Random Forest achieved an accuracy of 0.91, sensitivity of 0.91, specificity of 0.94, and AUC of 0.94, while Logistic Regression obtained an accuracy of 0.84, sensitivity of 0.84, specificity of 0.90, and AUC of 0.95. These findings indicate that Random Forest is superior in accuracy, sensitivity, and specificity, whereas Logistic Regression attains a marginally higher AUC and remains relevant due to its interpretability. This study aims to support the development of data-driven decision support systems in education that help schools identify students who require early intervention.
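The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a binary target (1 = positive class, 0 = negative class), a hypothetical decision threshold of 0.5, and made-up labels and scores chosen only to show the arithmetic. The function names `binary_metrics` and `auc_score` are illustrative, and the study itself may have used multi-class or averaged variants of these measures.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from binary labels (assumes both classes occur)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data for illustration only (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(binary_metrics(y_true, y_pred))
print(auc_score(y_true, scores))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases. This is one way a model can score lower on the first three measures yet slightly higher on AUC, as reported here for Logistic Regression.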

Published
2026-03-25
How to Cite
Juanta, P., Pavithra, V., Hutabarat, N. S. P., & Simatupang, Y. M. P. (2026). Comparison Random Forest and Logistic Regression in Predicting Motivation and Learning Outcomes of Junior High School Students. Jurnal Informatika Dan Rekayasa Perangkat Lunak, 7(1), 50-62. https://doi.org/10.33365/jatika.v7i1.1510