AccScience Publishing | IJOSI, Volume 10, Issue 2 | DOI: 10.6977/IJoSI.202604_10(2).0001
ARTICLE

Principal component analysis-enhanced ensemble learning models for proactive failure prediction in cloud-based systems

Velicheti Anantha Lakshmi 1, Vundavalli BalaSankar 2, Vemuri Sailaja 1, Janardhanarao Addanki 1*, Anantham Srujana Jyothi 1
1 Department of Computer Science and Engineering (Artificial Intelligence and Machine Learning), Pragati Engineering College (Autonomous), Andhra Pradesh, India
2 Department of Computer Science and Engineering (Artificial Intelligence and Machine Learning), Godavari Global University, Rajamahendravaram, Andhra Pradesh, India
Received: 25 October 2025 | Revised: 6 December 2025 | Accepted: 7 February 2026 | Published online: 30 April 2026
© 2026 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/).
Abstract

Cloud computing environments require high availability and scalability, making proactive failure management essential for system reliability, security, and consistent performance. Effective failure prediction reduces downtime, improves disaster recovery, and helps maintain uninterrupted service delivery. This paper presents an optimized machine learning framework for predicting failures in cloud infrastructures that integrates principal component analysis (PCA) with ensemble learning models. The study employs three models, random forest (RF), categorical boosting (CatBoost), and light gradient boosting machine (LightGBM), each combined with PCA to improve feature representation and predictive accuracy. Key operational metrics, including scheduling class, memory usage, central processing unit (CPU) utilization, event instances, and task priority, are used as features. The Google 2019 cluster dataset is used, and preprocessing involves handling missing data, scaling numerical attributes, and encoding categorical variables to ensure data quality. Experimental results show that PCA-enhanced RF, CatBoost, and LightGBM achieve accuracies of 94.31%, 97.17%, and 98.36%, respectively, outperforming their standard counterparts. These results indicate the effectiveness of PCA-integrated ensemble learning and its potential for real-time cloud failure prediction and automated fault monitoring in large-scale distributed environments.
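
The workflow summarized in the abstract (imputation, scaling, categorical encoding, PCA, then an ensemble classifier) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes scikit-learn and NumPy are available, uses a small synthetic table in place of the Google 2019 cluster dataset, invents a label rule purely for demonstration, and shows only the random forest variant. The 95% explained-variance setting for PCA and all hyperparameters are assumptions, so the accuracy it prints has no bearing on the figures reported above.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for cluster telemetry (NOT the Google 2019 trace).
rng = np.random.default_rng(0)
n = 2000
cpu = rng.uniform(0.0, 1.0, n)            # CPU utilization
mem = rng.uniform(0.0, 1.0, n)            # memory usage
priority = rng.integers(0, 10, n)         # task priority
sched_class = rng.integers(0, 4, n)       # scheduling class (categorical code)

# Hypothetical label rule: joint CPU and memory pressure leads to failure.
failed = ((cpu + mem + rng.normal(0.0, 0.05, n)) > 1.3).astype(int)

# Drop about 5% of memory readings so the imputation step has work to do.
mem[rng.random(n) < 0.05] = np.nan

X = np.column_stack([cpu, mem, priority, sched_class])

# Numeric columns: fill gaps with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Column 3 is one-hot encoded; sparse_threshold=0.0 forces dense output for PCA.
preprocess = ColumnTransformer(
    [
        ("num", numeric, [0, 1, 2]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [3]),
    ],
    sparse_threshold=0.0,
)

# PCA keeps enough components to explain 95% of the variance (assumed setting).
model = Pipeline([
    ("prep", preprocess),
    ("pca", PCA(n_components=0.95, svd_solver="full")),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, failed, test_size=0.25, random_state=0, stratify=failed
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print("PCA components kept:", model.named_steps["pca"].n_components_)
print("Held-out accuracy on synthetic data:", round(accuracy, 3))
```

Placing PCA inside the pipeline means the projection is fitted on the training split only, which avoids leaking test-set statistics into the features. A CatBoost or LightGBM classifier could presumably be substituted for the final step, since both libraries provide scikit-learn-compatible estimators, but that substitution is not shown here.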

Keywords
Cloud-based systems
Failure prediction
Random forest
CatBoost
Light gradient boosting machine
Principal component analysis
Likelihood of failure
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
International Journal of Systematic Innovation, Electronic ISSN: 2077-8767 Print ISSN: 2077-7973, Published by AccScience Publishing