Diabetes Prediction

22nd June 2020


Group Members (Group C):
Robin Ahmed (17221070)
LEE ZHEN LEK (17219514)
AVINNAASH SURESH (17219903)
TAN HIAP LI (17219269)

Introduction

About Diabetes


Diabetes is a chronic condition in which the body either does not produce enough insulin or cannot use it effectively. Insulin is a hormone that helps cells absorb glucose from the blood. Diabetes affects many people worldwide and is normally divided into Type 1 and Type 2, which have different characteristics.

What We Want to Achieve in This Project


In this project, we analyzed the PIMA Indian Diabetes dataset and looked for a high-accuracy model to predict whether a particular observation is at risk of developing diabetes, given the independent factors.

To let users examine the prediction outcome, we developed an interactive Shiny app in R that shows the predicted result for the feature values you set.

Dataset considered: https://www.kaggle.com/uciml/pima-indians-diabetes-database

Data Pre-processing, Visualization & Model Implementation

Data Pre-processing & Visualization

To prepare the data for analysis and model implementation, we carried out the following tasks:

  • Data Cleaning
    • Detect missing values.
    • Ensure the font case for continuous variables is standardized.
    • Check that the decimal places of numeric variables are consistent.
  • Exploratory Data Analysis
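
The cleaning checks above can be sketched in base R. This is only an illustration under stated assumptions: the data frame holds made-up values rather than rows from the real dataset, the column names are assumed to follow the Kaggle file, and treating a zero in a physiological measurement as "not recorded" is an assumption, not something stated in this project.

```r
# Tiny illustrative data frame (made-up values, not rows from the real dataset)
df <- data.frame(
  Glucose = c(148, 85, 0, 89),
  BMI     = c(33.612, 26.6, 23.3, NA),
  Outcome = c(1, 0, 1, 0)
)

# Count explicit missing values (NA) per column
na_counts <- colSums(is.na(df))

# Assumption: a zero in these measurements means the value was not recorded
zero_to_na <- function(x) {
  x[!is.na(x) & x == 0] <- NA
  x
}
measure_cols <- c("Glucose", "BMI")
df[measure_cols] <- lapply(df[measure_cols], zero_to_na)

# Missing values per column after recoding zeros
missing_after <- colSums(is.na(df))

# Keep a consistent number of decimal places for a numeric variable
df$BMI <- round(df$BMI, 1)

print(missing_after)
```

How the missing values are then handled (dropping rows, imputing a median, and so on) is a separate choice that this sketch leaves open.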

Once data preparation and EDA were complete, we took the cleaned output and implemented the machine learning models below in R to check their accuracy. We then selected the model with the highest accuracy and incorporated it into the Shiny app.

Model Implementation

  • Implemented Models and their Accuracies
    • Logistic Regression [75.32%]
    • K Nearest Neighbors (KNN) [74.03%]
    • Support Vector Machine (SVM) [74.68%]
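
As a rough sketch of how an accuracy figure like these can be obtained, the code below fits a logistic regression with base R's `glm` and scores it on a held-out split. It runs on synthetic data, so the accuracy it prints is unrelated to the figures above. The 80/20 split, the 0.5 threshold and the two predictors are assumptions made for illustration, not necessarily what the project used.

```r
set.seed(42)  # reproducible synthetic data

# Synthetic stand-in for the dataset: two predictors and a binary outcome
n <- 400
glucose <- rnorm(n, mean = 120, sd = 30)
bmi     <- rnorm(n, mean = 32, sd = 7)
p       <- plogis(-8 + 0.05 * glucose + 0.06 * bmi)
dat <- data.frame(Glucose = glucose, BMI = bmi, Outcome = rbinom(n, 1, p))

# 80/20 train/test split (assumed ratio)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Logistic regression as a binomial GLM
fit <- glm(Outcome ~ Glucose + BMI, data = train, family = binomial)

# Predicted probabilities on the test set, thresholded at 0.5
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)

# Accuracy = share of test observations classified correctly
accuracy <- mean(pred == test$Outcome)
cat(sprintf("Test accuracy: %.2f%%\n", 100 * accuracy))
```

The same split and accuracy calculation would apply to KNN and SVM. Those models usually come from add-on packages such as `class` and `e1071`, though which packages the project used is not stated here.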

Diabetes Shiny App Details

What this Shiny app is all about:
  • The Overview tab gives a broad idea of what the app does
  • In the HeatMap tab, you can visualize cases & severity in the Western Pacific region
  • Prediction is the main page, where you can enter different parameters to see how likely you are to be affected by diabetes
  • Some EDA output is included under the Comparison & Exploratory Data Analysis tab
  • Finally, the app description is covered in the About tab
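
To give an idea of how a Prediction page of this kind can be wired up, here is a minimal Shiny sketch. It is not the project's app: the stand-in model is fitted on synthetic data, and the input ids (`glucose`, `bmi`), the output id (`risk`) and the two-predictor layout are all hypothetical.

```r
library(shiny)

# Stand-in model fitted on synthetic data (a real app would load its own model)
set.seed(1)
toy <- data.frame(Glucose = rnorm(200, 120, 30), BMI = rnorm(200, 32, 7))
toy$Outcome <- rbinom(200, 1, plogis(-8 + 0.05 * toy$Glucose + 0.06 * toy$BMI))
model <- glm(Outcome ~ Glucose + BMI, data = toy, family = binomial)

# UI: two numeric inputs and one text output (hypothetical ids)
ui <- fluidPage(
  titlePanel("Diabetes Prediction (sketch)"),
  numericInput("glucose", "Glucose", value = 120, min = 0),
  numericInput("bmi", "BMI", value = 30, min = 0),
  textOutput("risk")
)

# Server: recompute the predicted probability whenever an input changes
server <- function(input, output) {
  output$risk <- renderText({
    new_obs <- data.frame(Glucose = input$glucose, BMI = input$bmi)
    prob <- predict(model, newdata = new_obs, type = "response")
    sprintf("Estimated probability of diabetes: %.1f%%", 100 * prob)
  })
}

# Build the app object; launching is left to the caller
app <- shinyApp(ui = ui, server = server)
# runApp(app)  # uncomment to launch in a browser
```

Because `renderText` is reactive, the displayed probability updates as soon as either input changes, without a separate submit button.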

Experience & Conclusion

Experience Summary


  • Completing the whole assignment has been a great learning experience.
  • Since all of us are new to these tools, Shiny, R packages, slidify and R-presenter were difficult to use at first, but references & guidance made the task easier.

Conclusion

With the number of diabetes cases growing rapidly every year, we believe a prediction mechanism like the one in this project, together with an application, can help slow that growth by creating awareness and supporting diagnosis at an early stage, and thereby help keep the situation under control.

  • Please feel free to visit our GitHub repository for the source code:

Thank You & Enjoy the App