PREDICTIVE MODELING OF DIABETES RISK USING CLINICAL AND LIFESTYLE INDICATORS: A POPULATION-BASED DATA ANALYSIS

Authors

  • Dr. Khalid Al-Rashid

DOI:

https://doi.org/10.69980/x9jzzh17

Keywords:

Diabetes prediction, Machine Learning, Risk Assessment

Abstract

Diabetes mellitus is a growing global health problem that calls for effective early detection and prevention. This study develops and evaluates predictive models of diabetes risk using clinical and lifestyle predictors drawn from a large population-based dataset of 100,000 records. The variables in the dataset include age, gender, body mass index (BMI), hypertension, heart disease, smoking history, HbA1c level, and blood glucose level; diabetes status is the outcome variable. The analysis followed a cross-sectional design comprising data preprocessing, feature selection, and supervised machine learning. Several models, including logistic regression, decision tree, random forest, and gradient boosting, were built and evaluated on accuracy, precision, recall, F1-score, and ROC-AUC. HbA1c and blood glucose levels were the strongest predictors of diabetes, followed by BMI and age. Ensemble models, especially gradient boosting, showed better predictive accuracy than the traditional methods. The results indicate that combining clinical and lifestyle data in predictive modeling is useful for identifying diabetes risk early. This approach has substantial implications for preventive healthcare, enabling timely intervention and better clinical decision-making.
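As an illustration of the performance measures named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1-score, and ROC-AUC can be computed for a binary diabetes classifier. The labels, scores, and the 0.5 decision threshold are made-up values for demonstration only, and the function names are hypothetical; nothing here is taken from the study's data or code.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = diabetes)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos: List[float] = [s for t, s in zip(y_true, scores) if t == 1]
    neg: List[float] = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: true diabetes status and predicted risk scores for ten people.
y_true = [0, 0, 0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.15, 0.90]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

The first four measures depend on the chosen threshold, while ROC-AUC is computed from the raw scores and summarizes ranking quality across all thresholds. In a dataset where diabetes cases are a minority, precision, recall, and F1 are usually more informative than accuracy alone.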

Published

2025-12-29