LA Assignment Presentation

Team-AURA

Introduction

  • This project analyzes a diabetes dataset
  • Aim: Compare health factors between two groups using a slope chart
  • Visualization makes the differences between the groups easier to see

Dataset

  • Dataset used: Diabetes dataset
  • Key variables:
    • Glucose
    • BMI
    • Age
    • Insulin
    • Blood Pressure
  • Outcome variable:
    • 0 → Non-Diabetic
    • 1 → Diabetic

Objective

  • Compare two groups:
    • Non-Diabetic
    • Diabetic
  • Identify variation in health parameters

Load Libraries

  • Libraries are packages that provide functions
  • We use:
    • ggplot2 → for plotting graphs
    • dplyr → for data manipulation
    • tidyr → for reshaping data
    • ggrepel → to avoid overlapping labels
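
A minimal sketch of loading the four packages named above. It assumes they are already installed; the commented install.packages() line is only a reminder and is not part of the original slides.

```r
# One-time setup if a package is missing (left commented out):
# install.packages(c("ggplot2", "dplyr", "tidyr", "ggrepel"))

library(ggplot2)  # plotting graphs
library(dplyr)    # data manipulation (group_by, summarise)
library(tidyr)    # reshaping data (pivot_longer)
library(ggrepel)  # labels that avoid overlapping (geom_text_repel)
```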

Load Dataset

  • The dataset is loaded using read.csv()

  • It reads data from a file into R
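
The loading step might look like the sketch below. The slides do not give the file name or exact column names, so the temporary file, the column names (Glucose, BloodPressure, Insulin, BMI, Age, Outcome) and the four rows are made up for illustration. In practice csv_path would point to the real dataset file.

```r
# Assumption: write a tiny made-up CSV to a temporary file so the example runs on its own
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Glucose,BloodPressure,Insulin,BMI,Age,Outcome",
  "90,70,80,25.0,30,0",
  "100,72,95,27.5,28,0",
  "150,80,160,33.0,45,1",
  "170,78,200,35.5,52,1"
), csv_path)

# read.csv() reads the file into a data frame
diabetes <- read.csv(csv_path)

str(diabetes)   # column names and types
head(diabetes)  # first few rows
```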

Data Transformation

  • Raw data cannot be used directly for a slope chart

  • We calculate average values for each group

Explanation:

  • group_by() → groups rows by the outcome variable

  • summarise() → calculates the mean of each variable within each group

  • na.rm = TRUE → ignores missing (NA) values when calculating the mean
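
The grouping and averaging step described above could be sketched as follows. The sample rows, the column names and the object name group_means are assumptions for illustration and are not taken from the original dataset.

```r
library(dplyr)

# Made-up sample rows (assumed column names); NA marks a missing value
diabetes <- data.frame(
  Glucose       = c(85, 89, NA, 148, 183, 137),
  BloodPressure = c(66, 66, 74, 72, 64, 40),
  Insulin       = c(NA, 94, 88, 120, 168, 168),
  BMI           = c(26.6, 28.1, 25.6, 33.6, 23.3, 43.1),
  Age           = c(31, 21, 30, 50, 32, 33),
  Outcome       = c(0, 0, 0, 1, 1, 1)
)

# One row per Outcome group, one mean per health variable
group_means <- diabetes %>%
  group_by(Outcome) %>%
  summarise(
    Glucose       = mean(Glucose, na.rm = TRUE),
    BMI           = mean(BMI, na.rm = TRUE),
    Age           = mean(Age, na.rm = TRUE),
    Insulin       = mean(Insulin, na.rm = TRUE),
    BloodPressure = mean(BloodPressure, na.rm = TRUE)
  )

print(group_means)
```

Without na.rm = TRUE, a single missing value in a group would make that group's mean NA.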

Convert Data Format

  • Data is converted from wide format to long format

  • Required for a ggplot2 slope chart

Explanation:

  • pivot_longer() → reshapes data from wide format to long format

  • Long format makes it easier to plot multiple variables in one chart
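
A minimal sketch of the wide-to-long conversion. The group means are made-up numbers, and the new column names Variable and Mean are hypothetical choices.

```r
library(tidyr)

# Made-up group means in wide format: one row per group, one column per variable
group_means <- data.frame(
  Outcome = c(0, 1),
  Glucose = c(110, 140),
  BMI     = c(30, 35),
  Age     = c(31, 37)
)

# Long format: one row per (group, variable) pair
long_means <- pivot_longer(
  group_means,
  cols      = -Outcome,    # reshape every column except Outcome
  names_to  = "Variable",  # old column names go here
  values_to = "Mean"       # old cell values go here
)

print(long_means)
```

Two rows with three variable columns become six rows, so one column can be mapped to the y-axis and another to the line grouping.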

Convert Outcome Labels

  • Replace the numeric outcome values (0 and 1) with meaningful labels (Non-Diabetic and Diabetic)
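
One possible way to do the relabeling is with base R's factor(). The slides do not show which function was used, so this is an assumption, and the data values are made up.

```r
# Made-up long-format data with a numeric Outcome column
long_means <- data.frame(
  Outcome  = c(0, 0, 1, 1),
  Variable = c("Glucose", "BMI", "Glucose", "BMI"),
  Mean     = c(110, 30, 140, 35)
)

# Map 0 -> "Non-Diabetic" and 1 -> "Diabetic"; the level order fixes the x-axis order
long_means$Outcome <- factor(
  long_means$Outcome,
  levels = c(0, 1),
  labels = c("Non-Diabetic", "Diabetic")
)

print(long_means)
```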

Slope Chart Visualization

  • A slope chart shows the change between two categories

Explanation:

  • geom_line() → connects each variable's values across the two groups

  • geom_point() → marks data points

  • geom_text_repel() → avoids overlapping labels

  • labs() → adds title and axis labels
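
The four functions above can be combined as in the sketch below. The data values, object names, title text and the theme_minimal() call are assumptions for illustration, not the original plot code.

```r
library(ggplot2)
library(ggrepel)

# Made-up group means in long format
long_means <- data.frame(
  Outcome  = factor(rep(c("Non-Diabetic", "Diabetic"), each = 3),
                    levels = c("Non-Diabetic", "Diabetic")),
  Variable = rep(c("Glucose", "BMI", "Age"), times = 2),
  Mean     = c(110, 30, 31, 140, 35, 37)
)

slope_chart <- ggplot(
  long_means,
  aes(x = Outcome, y = Mean, group = Variable, colour = Variable)
) +
  geom_line() +                              # one line per variable
  geom_point(size = 3) +                     # marks each group mean
  geom_text_repel(aes(label = Mean),         # value labels that avoid overlapping
                  show.legend = FALSE) +
  labs(
    title = "Mean health factors by diabetes outcome",
    x = "Group",
    y = "Mean value"
  ) +
  theme_minimal()

print(slope_chart)
```

Setting group = Variable is what makes geom_line() draw a separate line for each health factor instead of one line through all points.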

Insights

  • Glucose is higher in diabetic patients

  • BMI shows an increasing trend from the non-diabetic to the diabetic group

  • Age is slightly higher in diabetic group

  • Insulin differs noticeably between the two groups

  • Blood Pressure shows minor variation

Conclusion

  • A slope chart effectively compares two groups

  • Diabetic patients show higher values for health risk indicators

  • Visualization improves understanding of patterns