2025-03-25

Goal of this presentation

-In this presentation, I use a diabetes dataset to examine how traits differ between diabetics and non-diabetics.

-This will demonstrate R techniques for statistical analysis and graphing.

Data used in this presentation:

-Number of Pregnancies

-Non-Diabetic or Diabetic

-Blood Pressure

-Skin Thickness

-BMI

-Diabetes Pedigree Function

-Age

Pregnant women have a chance of developing gestational diabetes

-Although this typically resolves after birth, it gives these women a higher risk of type 2 diabetes later in life.

-I will test whether number of pregnancies and BMI are significantly correlated, since the two might jointly contribute to diabetes risk.

Linear Regression of Pregnancies and BMI

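# Fit a linear regression of BMI on number of pregnancies
LRPreg <- lm(BMI ~ Pregnancies, data = diabetes_data)

# Extract the p-value for the Pregnancies slope
summary(LRPreg)$coefficients["Pregnancies", "Pr(>|t|)"]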
## [1] 0.5507416

The p-value for this linear regression was 0.551, suggesting no significant relationship between number of pregnancies and BMI in this sample.
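As a cross-check, the slope test in a one-predictor linear regression is equivalent to a Pearson correlation test, so `cor.test()` should give the same p-value as the regression. The sketch below runs on a small simulated data frame (`toy_data`, invented here only so the code runs on its own; it is not the real dataset). With `diabetes_data`, the same two calls would apply.

```r
# Simulated stand-in for diabetes_data (hypothetical values, not real patients)
set.seed(1)
toy_data <- data.frame(
  Pregnancies = rpois(50, lambda = 3),
  BMI         = rnorm(50, mean = 32, sd = 6)
)

# p-value for the Pregnancies slope from the linear regression
fit  <- lm(BMI ~ Pregnancies, data = toy_data)
p_lm <- summary(fit)$coefficients["Pregnancies", "Pr(>|t|)"]

# p-value from a Pearson correlation test on the same two columns
p_cor <- cor.test(toy_data$BMI, toy_data$Pregnancies)$p.value

# Compare the two p-values (they should match up to floating-point error)
abs(p_lm - p_cor) < 1e-8
```

Reporting the correlation coefficient (`cor.test(...)$estimate`) alongside the p-value would also show how weak or strong the relationship is, not only whether it is significant.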

Plotting BMI vs Pregnancies colorized by Diabetic Status

-Although the regression was not statistically significant, the plot suggests that diabetics tend to have a higher BMI.

-There appears to be a trend in which people with more pregnancies develop diabetes at a lower BMI, though this is a visual impression rather than a tested result.
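A plot like the one described above could be drawn in base R roughly as follows. This is a minimal sketch, not the code behind the original figure: the `Outcome` column name (0 = non-diabetic, 1 = diabetic) is an assumption, and `toy_data` is simulated purely so the example is self-contained.

```r
# Simulated stand-in for diabetes_data; Outcome is an assumed column name
set.seed(2)
toy_data <- data.frame(
  Pregnancies = rpois(60, lambda = 3),
  BMI         = rnorm(60, mean = 32, sd = 6),
  Outcome     = rbinom(60, size = 1, prob = 0.35)
)

# Turn the 0/1 flag into a labelled factor so it can drive colors and the legend
status <- factor(toy_data$Outcome, levels = c(0, 1),
                 labels = c("Non-diabetic", "Diabetic"))
status_cols <- c("steelblue", "firebrick")

# Scatter plot of BMI against pregnancies, one color per diabetic status
plot(toy_data$Pregnancies, toy_data$BMI,
     col = status_cols[as.integer(status)], pch = 19,
     xlab = "Number of pregnancies", ylab = "BMI",
     main = "BMI vs Pregnancies by diabetic status")
legend("topright", legend = levels(status), col = status_cols, pch = 19)
```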

Age, Blood Pressure, and Skin Thickness in Diabetes

-Diabetes can be diagnosed at any age, but it is most often diagnosed in middle age.

-People with diabetes are at risk for high blood pressure.

-Thickening of the skin is a common symptom of diabetes.

3D Plot of Age, Blood Pressure and Skin Thickness

-Skin thickness doesn’t appear to contribute much to diabetic status in this dataset.

-In this dataset, many older people with higher blood pressure have diabetes.

-Blood pressure and skin thickness also vary widely among younger people, so the pattern may reflect overall health more than age itself.
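One way to put numbers behind these visual impressions is to compare group means for each variable. The sketch below assumes column names `Age`, `BloodPressure`, `SkinThickness`, and `Outcome` (none of which are confirmed by the slides) and uses a tiny hand-typed placeholder table rather than the real data.

```r
# Placeholder rows (made-up values) standing in for diabetes_data
toy_data <- data.frame(
  Outcome       = c(0, 0, 0, 1, 1, 1),   # assumed coding: 1 = diabetic
  Age           = c(22, 30, 41, 35, 48, 55),
  BloodPressure = c(64, 70, 76, 72, 80, 88),
  SkinThickness = c(20, 30, 25, 22, 31, 25)
)

# Mean of each variable within the non-diabetic and diabetic groups
group_means <- aggregate(cbind(Age, BloodPressure, SkinThickness) ~ Outcome,
                         data = toy_data, FUN = mean)
group_means
```

If the group means for a variable are close together, as one might expect for skin thickness given the plot, that variable likely does little to separate the two groups.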

Number of Diabetics Diagnosed per Age Group

Since most diabetes diagnoses are said to be in middle-aged people, I wanted to assess if this was supported by our dataset.

Number of Diabetics Diagnosed per Age Group

This sample suggests that diabetes is most common between roughly 20 and 50 years of age. Outside this range, fewer diabetics appear in the data.
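Counts per age group can be produced with `cut()` and `table()`. The following is a sketch under assumptions: the `Age` and `Outcome` column names and the ten-year bins are illustrative choices, and `toy_data` contains made-up rows so the example runs on its own.

```r
# Placeholder rows (made-up values) standing in for diabetes_data
toy_data <- data.frame(
  Age     = c(21, 25, 33, 38, 44, 47, 52, 63, 29, 36),
  Outcome = c(0,  1,  1,  0,  1,  1,  0,  1,  0,  1)   # assumed: 1 = diabetic
)

# Bin ages into ten-year groups: [20,30), [30,40), ..., [80,90)
age_group <- cut(toy_data$Age, breaks = seq(20, 90, by = 10), right = FALSE)

# Count only the rows flagged as diabetic in each age group
diabetic_counts <- table(age_group[toy_data$Outcome == 1])

# Bar chart of diabetics per age group
barplot(diabetic_counts, xlab = "Age group", ylab = "Number of diabetics")
```

Raw counts depend on how many people of each age were sampled, so the share of diabetics within each group, for example `tapply(toy_data$Outcome, age_group, mean)`, may be a fairer basis for comparing age groups.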

Diabetes Pedigree Function

The Diabetes Pedigree Function value uses a person's family history to score their risk of developing diabetes.

Diabetes Pedigree Function Values in Nondiabetics vs Diabetics

-While diabetics tend to have a higher Diabetes Pedigree Function value, the number of outliers in the non-diabetic box plot suggests that this function is not always an accurate predictor of whether a person will develop diabetes.

-However, this is only a snapshot in time, so current non-diabetics may later be diagnosed with diabetes.
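The outliers mentioned above can be counted directly, since `boxplot()` returns the points it draws beyond the whiskers. This is a minimal sketch: the column names `DiabetesPedigreeFunction` and `Outcome` are assumptions, and the values in `toy_data` are made up for illustration.

```r
# Placeholder rows (made-up values) standing in for diabetes_data
toy_data <- data.frame(
  DiabetesPedigreeFunction = c(0.20, 0.25, 0.30, 0.35, 0.40, 1.80,
                               0.40, 0.50, 0.60, 0.70, 0.80, 0.90),
  Outcome = rep(c(0, 1), each = 6)   # assumed coding: 1 = diabetic
)
toy_data$Status <- factor(toy_data$Outcome, levels = c(0, 1),
                          labels = c("Non-diabetic", "Diabetic"))

# Draw the box plot and keep its statistics (whiskers use the default 1.5 * IQR rule)
bp <- boxplot(DiabetesPedigreeFunction ~ Status, data = toy_data,
              ylab = "Diabetes Pedigree Function")

# bp$out holds the outlier values; bp$group says which box each one belongs to
outliers_per_group <- table(factor(bp$group,
                                   levels = seq_along(bp$names),
                                   labels = bp$names))
outliers_per_group
```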

Conclusion

-This analysis suggests that many of the variables we explored, such as age and blood pressure, are associated with diabetes, whether as a possible cause or as a consequence.

-While symptoms vary from person to person, and no trait is exclusive to diabetics or non-diabetics, understanding these traits can help us recognize and prevent negative health outcomes.