-In this presentation, I will use a diabetes dataset which examines the differences in traits of diabetics versus healthy people.
-This will demonstrate knowledge of R statistical analysis techniques and graphing techniques.
2025-03-25
-Number of Pregnancies
-Non-Diabetic or Diabetic
-Blood Pressure
-Skin Thickness
-BMI
-Diabetes Pedigree Function
-Age
-Although gestational diabetes typically resolves after birth, it gives the women who develop it a higher risk of type 2 diabetes later in life.
-I will test whether number of pregnancies and BMI are significantly correlated, since the two together may contribute to diabetes.
LRPreg = lm(BMI ~ Pregnancies, data = diabetes_data)
(summary(LRPreg))$coefficients["Pregnancies", "Pr(>|t|)"]
## [1] 0.5507416
The p-value for this linear regression was 0.551, suggesting no significant relationship between number of pregnancies and BMI in this sample.
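Because the stated goal is a correlation, a Pearson correlation test is a natural cross-check: with a single predictor, `cor.test()` and the slope test in `lm()` should give the same p-value. The sketch below illustrates this on simulated data. The `sim` data frame and its values are made up for illustration and are not the diabetes dataset.

```r
# Simulated stand-in for the real data; values are illustrative only.
set.seed(1)
sim <- data.frame(Pregnancies = rpois(200, lambda = 3))
sim$BMI <- 32 + 0.1 * sim$Pregnancies + rnorm(200, sd = 7)

# p-value for the slope in a simple linear regression
fit <- lm(BMI ~ Pregnancies, data = sim)
p_lm <- summary(fit)$coefficients["Pregnancies", "Pr(>|t|)"]

# p-value from a Pearson correlation test on the same two variables
p_cor <- cor.test(sim$BMI, sim$Pregnancies)$p.value

# Compare the two p-values (equal up to numerical tolerance)
all.equal(p_lm, p_cor)
```

Running the same two calls on `diabetes_data` would be one way to confirm the regression result reported above.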
-Although the regression was not significant, the plot suggests that diabetics tend to have a higher BMI.
-There also appears to be a trend in which people who have had more pregnancies are diabetic at a lower BMI.
-Diabetes can be diagnosed at any age, but generally it is diagnosed in middle-aged people.
-People with diabetes are at risk for high blood pressure.
-Thickening of the skin is a common symptom of diabetes.
-Skin thickness doesn’t appear to contribute much to diabetic status in this dataset.
-In this dataset, many older people with higher blood pressure have diabetes.
-Blood pressure and skin thickness also vary widely among younger people, so these measures may reflect overall health more than age.
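One way to put numbers behind these visual impressions is a two-sample t-test per variable, comparing diabetics with non-diabetics. This is a minimal sketch on simulated data: the column names `SkinThickness`, `BloodPressure`, and `Outcome` are assumptions about how the dataset is labelled, and the simulated values are not real measurements.

```r
# Simulated stand-in data; column names and values are assumptions.
set.seed(4)
sim <- data.frame(
  Outcome = factor(rep(c("Non-Diabetic", "Diabetic"), times = c(200, 100)))
)
sim$SkinThickness <- rnorm(300, mean = 21, sd = 15)
sim$BloodPressure <- rnorm(
  300,
  mean = ifelse(sim$Outcome == "Diabetic", 75, 69),
  sd = 12
)

# Welch two-sample t-tests: does the group mean differ by diabetic status?
t_skin <- t.test(SkinThickness ~ Outcome, data = sim)
t_bp <- t.test(BloodPressure ~ Outcome, data = sim)

# Collect the two p-values side by side
p_values <- c(skin = t_skin$p.value, bp = t_bp$p.value)
p_values
```

A large p-value for skin thickness on the real data would be consistent with the observation that it contributes little to diabetic status.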
Since most diabetes diagnoses are said to occur in middle-aged people, I wanted to assess whether our dataset supports this.
This sample suggests that people are most likely to be diabetic between about 20 and 50 years old. Outside that range, diabetes appears less common.
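The age pattern can also be checked numerically by grouping ages into bands and computing the share of diabetics in each. The sketch below uses simulated data: `Age` and `Outcome` (coded 1 for diabetic) are assumed column names, and the 10-year band width is an arbitrary choice for illustration.

```r
# Simulated stand-in data; column names and values are assumptions.
set.seed(2)
sim <- data.frame(Age = sample(21:79, 300, replace = TRUE))
sim$Outcome <- rbinom(300, size = 1, prob = 0.35)

# Group ages into 10-year bands: [20,30), [30,40), ..., [70,80)
sim$AgeBand <- cut(sim$Age, breaks = seq(20, 80, by = 10), right = FALSE)

# Share of diabetics and sample size within each band
prop_by_band <- tapply(sim$Outcome, sim$AgeBand, mean)
n_by_band <- table(sim$AgeBand)

round(prop_by_band, 2)
n_by_band
```

Reporting the band sizes alongside the proportions matters here, because a band with only a few people can make diabetes look artificially rare or common.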
The Diabetes Pedigree Function value uses a person's family history to score their risk of developing diabetes.
-While diabetics tend to have a greater Diabetes Pedigree Function value, the number of outliers in the non-diabetic box plot suggests that this function is not always an accurate predictor of whether a person will develop diabetes.
-However, this is only a snapshot in time, so current non-diabetics may later be diagnosed with diabetes.
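The outliers discussed above can be counted rather than judged by eye. This is a minimal sketch using `boxplot.stats()`, which flags points by the 1.5 × IQR rule that `boxplot()` uses by default. The data are simulated, and the column names `DiabetesPedigreeFunction` and `Outcome` are assumptions.

```r
# Simulated stand-in data; column names and values are assumptions.
set.seed(3)
sim <- data.frame(Outcome = rep(c(0, 1), times = c(200, 100)))
sim$DiabetesPedigreeFunction <- rlnorm(
  300,
  meanlog = -1 + 0.2 * sim$Outcome,
  sdlog = 0.6
)

# Count the points flagged as outliers within each group
outliers_by_group <- tapply(
  sim$DiabetesPedigreeFunction,
  sim$Outcome,
  function(x) length(boxplot.stats(x)$out)
)
outliers_by_group
```

Because the groups usually differ in size, dividing each count by its group size gives a fairer comparison than the raw counts.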
-These results show that many of the variables we explored, such as age and blood pressure, are often a cause or consequence of diabetes.
-While symptoms vary from person to person, and no trait is exclusive to diabetics or non-diabetics, understanding these traits can help us recognize and prevent negative health outcomes.