Optimizing Diabetes Classification: BOA-Enhanced ML with EDA and SMOTE
Abstract
Diabetes Mellitus, a chronic metabolic disorder stemming from fluctuations in blood
glucose and insulin levels, exerts profound impacts on every organ, significantly
compromising overall health. While a permanent cure remains elusive, proactive
management can control the disease's extent, so early detection is pivotal. This
research employs Exploratory Data Analysis (EDA), coupled with the Synthetic Minority
Over-sampling Technique (SMOTE), to uncover patterns, correlations, characteristics,
and data structures.
For diabetes classification, Support Vector Machine (SVM), Extreme Gradient
Boosting (XGBoost), Random Forest (RF), Logistic Regression (LR), and Decision
Tree (DT) classifiers, optimized by the Bees Optimization Algorithm (BOA), were
employed. Model performance is evaluated using accuracy, precision, recall, F1 score,
and the ROC curve. To determine the parameters that support classification, the
models were tested on the PIMA Indian dataset and real-time datasets. With BOA, the
SVM model reached 98.86% accuracy on the real-time dataset and 96% on the PIMA
dataset. These results indicate that, compared with state-of-the-art techniques,
combining EDA with SMOTE and ML with BOA produces better outcomes.
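
The abstract names SMOTE as the class-balancing step but does not describe its mechanics. The sketch below illustrates the core idea only: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is a minimal illustration, not the study's implementation. The function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature values are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the article.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority-class rows (made-up values for two features), for illustration only.
minority = [(148.0, 33.6), (183.0, 23.3), (137.0, 43.1), (197.0, 30.5)]
synthetic = smote_oversample(minority, n_new=6)
balanced_minority = minority + synthetic
print(len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, it always stays within the range the minority class already spans. In practice the oversampling would be applied to the training split only, so that synthetic samples do not leak into the evaluation data.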