Using the Maternal Health Risk Data dataset from Kaggle, we examine health risks for pregnant women.

Since we’re looking at how age and blood pressure relate to maternal risk during pregnancy, we first convert the RiskLevel column into a factor. This tells R to treat it as a categorical variable rather than a character string. Supplying the levels explicitly also fixes their order as “low risk,” “mid risk,” “high risk,” instead of the alphabetical default, which would put “high risk” first. Plots, tables, and model output then list the categories in order of increasing severity.

# Convert RiskLevel to a factor with an explicit level order
# (dplyr provides %>% and mutate())
library(dplyr)

health_data <- Maternal_risk_data %>%
  mutate(
    RiskLevel = factor(RiskLevel, levels = c("low risk", "mid risk", "high risk"))
  )

# Check the data types
str(health_data)
## 'data.frame':    1014 obs. of  7 variables:
##  $ Age        : int  25 35 29 30 35 23 23 35 32 42 ...
##  $ SystolicBP : int  130 140 90 140 120 140 130 85 120 130 ...
##  $ DiastolicBP: int  80 90 70 85 60 80 70 60 90 80 ...
##  $ BS         : num  15 13 8 7 6.1 7.01 7.01 11 6.9 18 ...
##  $ BodyTemp   : num  98 98 100 98 98 98 98 102 98 98 ...
##  $ HeartRate  : int  86 70 80 70 76 70 78 86 70 70 ...
##  $ RiskLevel  : Factor w/ 3 levels "low risk","mid risk",..: 3 3 3 3 1 3 2 3 2 3 ...
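
To make the effect of the explicit levels concrete, here is a minimal sketch comparing the default and explicit level orders. The vector `risk_chr` is made up for illustration and is not taken from the Kaggle data; the variable names are assumptions as well.

```r
# Toy values for illustration only; not taken from the Kaggle data
risk_chr <- c("high risk", "low risk", "mid risk", "low risk")

# Default factor(): levels are sorted alphabetically
default_levels <- levels(factor(risk_chr))

# Explicit levels: order follows increasing severity
risk_fct <- factor(risk_chr, levels = c("low risk", "mid risk", "high risk"))
explicit_levels <- levels(risk_fct)

# Values not listed in `levels` silently become NA, so count them
n_unmatched <- sum(is.na(risk_fct))

# table() reports counts in level order
risk_counts <- table(risk_fct)
print(risk_counts)
```

Note that this sets the display order but does not create an ordered factor. If comparisons such as `RiskLevel >= "mid risk"` are needed, passing `ordered = TRUE` to `factor()` is one option.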

The visualization shows that a larger share of younger women fall into the low-risk category than women over 40. Looking at diastolic and systolic blood pressure together, women with higher readings on both are more often classified as high risk.
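
The age comparison can also be checked numerically. The sketch below is one possible approach, not necessarily how the original plot was built: `risk_share_by_age` is a hypothetical helper, the cutoff at 40 mirrors the text, and the `toy` data frame is made up purely so the example runs on its own. In practice `health_data` would be passed instead.

```r
# Hypothetical helper: share of each risk level within age bands.
# By default, women aged 40 or younger fall in the first band.
risk_share_by_age <- function(data,
                              breaks = c(-Inf, 40, Inf),
                              labels = c("40 and under", "over 40")) {
  age_group <- cut(data$Age, breaks = breaks, labels = labels)
  # margin = 1 makes each row (age band) sum to 1
  prop.table(table(age_group, data$RiskLevel), margin = 1)
}

# Made-up rows for demonstration only; substitute health_data in practice
toy <- data.frame(
  Age = c(22, 25, 31, 38, 45, 52),
  RiskLevel = factor(
    c("low risk", "low risk", "mid risk", "high risk", "high risk", "mid risk"),
    levels = c("low risk", "mid risk", "high risk")
  )
)

shares <- risk_share_by_age(toy)
print(round(shares, 2))
```

Row-wise proportions are used rather than raw counts so that age bands of different sizes can be compared directly.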