Enhanced Diabetes Risk Prediction through Machine Learning Techniques
Keywords:
Class imbalance, Diabetes prediction, Django framework, Explainable AI, Feature selection, Machine learning
Abstract
This study addresses diabetes prediction by analyzing a dataset of diabetic patients with a combination of machine learning techniques. It focuses on the practical application of mutual information feature selection and semi-supervised learning, coupled with high gradient boosting, to improve the predictive accuracy of diabetes diagnosis. The study evaluates several machine learning algorithms, namely Decision Tree, Logistic Regression, Support Vector Machine (SVM), and Random Forest, along with a novel combined model that integrates Random Forest and SVM. Among these, the combined Random Forest-SVM model performed best, achieving the highest accuracy of 76%, which highlights its potential to improve diabetes risk assessment. The results underscore the value of sophisticated machine learning approaches for refining the prediction models used in healthcare. To make the findings usable in practice, a web application was built with the Django framework that provides instant diabetes predictions. This web-based tool emphasizes user accessibility and supports proactive healthcare interventions, making it a useful resource for patients and healthcare providers. The findings point to the role machine learning can play in advancing personalized healthcare and improving outcomes for individuals at risk of diabetes.
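To illustrate the idea behind mutual information feature selection mentioned above, the following is a minimal sketch and not the study's actual pipeline. It assumes features have already been discretized. The helper names (`mutual_information`, `rank_features`), the feature names, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information I(X; Y), in bits, between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log2(count * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy data (not from the study): 1 = diabetic, 0 = non-diabetic.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "glucose_high": [1, 1, 1, 1, 0, 0, 0, 0],  # identical to the label
    "bmi_high": [1, 1, 1, 0, 1, 0, 0, 0],      # partially related to the label
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],         # independent of the label
}

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

A selection step would then keep only the top-ranked features before training a classifier. Continuous measurements would first need binning, or a dedicated estimator for continuous variables, which this sketch does not cover.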