Introduction

We investigate how healthy and sick heart attack patients are distributed by biological sex, and we compare cholesterol levels between males and females within each health status. The analysis is intended to shed light on possible sex-related differences among heart attack patients and their health outcomes.

Data Source: https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat

# Load some R packages: 
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(extrafont)
## Registering fonts with R
library(ggtext)
library(Cairo)

# Get the data: 
url<-"https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat"

# The file is space-delimited and has no header row
heart <- read.csv(url, sep = " ", header = FALSE)
# Assign descriptive column names to the data frame
names(heart) <- c("age", "sex", "cp", "restbp",
                  "chol", "fbs", "restecg", "maxach", "exang", "oldpeak", "slope", "num",
                  "thal", "disease")

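# Load the prepared data set; this file is assumed to provide the `heart.dat` object used in the plots below
load("heart_data.Rdata")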
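
The contents of `heart_data.Rdata` are not shown here, so the following is only a sketch of how raw numeric codes might be turned into labelled factors before plotting. It assumes the coding documented for the Statlog file (sex: 0 = female, 1 = male; disease: 1 = absence, 2 = presence). The `heart_toy` data frame and its values are made up for illustration.

```r
# Hypothetical toy rows standing in for the raw file (values are made up).
heart_toy <- data.frame(
  sex     = c(0, 1, 1, 0, 1, 0),
  chol    = c(210, 250, 230, 300, 190, 260),
  disease = c(1, 2, 1, 2, 2, 1)
)

# Assumed coding: sex 0 = female, 1 = male; disease 1 = healthy, 2 = sick.
heart_toy$sex <- factor(heart_toy$sex, levels = c(0, 1),
                        labels = c("Female", "Male"))
heart_toy$disease <- factor(heart_toy$disease, levels = c(1, 2),
                            labels = c("Healthy", "Sick"))

# Inspect the recoded columns.
str(heart_toy)
```

With discrete factors, `fill = disease` and `color = disease` produce one legend entry per health status rather than a continuous colour scale.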

ggplot(data = heart.dat, mapping = aes(x = sex, fill = disease)) +
  geom_bar() +
  labs(title = "Distribution of Disease with Relation to Sex",
       x = "Sex",
       y = "Number of People",
       fill = "State of Health (Sick vs Healthy)",
       caption = "Distribution of sick vs healthy individuals within the data set, organized by biological sex")

# Save the plot: 
ggsave("Heart_Disease.png",
       width = 21,
       height = 30,
       units = "cm",
       dpi = 500,
       type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments

Distribution of male and female healthy and sick patients

# Pie chart
ggplot(data = heart.dat, mapping = aes(x = "", fill = disease)) +
  geom_bar(width = 1) +  # One full-width stacked bar, so the pie has no hole in the center
  coord_polar(theta = "y") +  # Polar coordinates turn the stacked bar into a pie chart
  labs(title = "Distribution of Disease with Relation to Sex",
       fill = "State of Health (Sick vs Healthy)") +
  theme_void()  # Remove axis and grid lines

# Save the plot: 
ggsave("Heart_Disease_pie.png",
       width = 21,
       height = 30,
       units = "cm",
       dpi = 500,
       type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments
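
The pie chart above pools both sexes into a single pie. One way to show the split by sex is to draw one pie per sex with facets. The sketch below assumes `sex` and `disease` are factors, and it uses a made-up `heart_toy` data frame rather than the real data.

```r
library(ggplot2)

# Hypothetical toy data (made-up values); `heart_toy` is not part of the analysis.
heart_toy <- data.frame(
  sex     = factor(c("Female", "Male", "Male", "Female", "Male", "Female")),
  disease = factor(c("Healthy", "Sick", "Healthy", "Sick", "Sick", "Healthy"))
)

# position = "fill" rescales each bar to proportions, so every facet
# becomes a complete pie even when the sexes have different counts.
p <- ggplot(data = heart_toy, mapping = aes(x = "", fill = disease)) +
  geom_bar(width = 1, position = "fill") +
  coord_polar(theta = "y") +
  facet_wrap(~ sex) +
  labs(title = "Share of Healthy vs Sick Patients by Sex",
       fill = "State of Health (Sick vs Healthy)") +
  theme_void()

print(p)
```

Because each pie is rescaled to proportions, this view compares shares within each sex and hides the difference in group sizes. The stacked bar chart remains the better choice for raw counts.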

A Comparative Analysis of Health Status and Cholesterol Levels

Males and females often have different values for clinical and physiological variables. We can make a jittered scatter plot that compares cholesterol in males and females who are healthy and sick.

ggplot(data = heart.dat, mapping = aes(x = sex, y = chol, color = disease)) +
  geom_jitter() +
  labs(title = "Cholesterol Levels of Healthy vs Sick Individuals",
       x = "Sex",
       y = "Cholesterol Levels")

# Save the plot: 
ggsave("Heart_Disease_scatter.png",
       width = 21,
       height = 25,
       units = "cm",
       dpi = 500,
       type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments

Discussion

By comparing the proportions of male and female patients in each health status category, we can see whether the distribution of heart attack cases differs notably between genders. Next, by examining the cholesterol levels of healthy and sick patients separately for each gender, we can identify any notable differences or patterns in cholesterol levels among heart attack patients.
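
The comparisons described here can also be put into numbers. The following is a minimal base R sketch, assuming a data frame with factor columns `sex` and `disease` and a numeric `chol` column. The `heart_toy` data frame is hypothetical and its values are made up, so the printed figures say nothing about the real data.

```r
# Hypothetical toy data (made-up values) with the same column names as the plots.
heart_toy <- data.frame(
  sex     = factor(c("Female", "Male", "Male", "Female", "Male", "Female")),
  chol    = c(210, 250, 230, 300, 190, 260),
  disease = factor(c("Healthy", "Sick", "Healthy", "Sick", "Sick", "Healthy"))
)

# Counts of healthy and sick patients by sex.
counts <- table(heart_toy$sex, heart_toy$disease)

# Row-wise proportions: share of healthy vs sick within each sex.
props <- prop.table(counts, margin = 1)
print(round(props, 2))

# Mean cholesterol for each sex-by-status group.
chol_summary <- aggregate(chol ~ sex + disease, data = heart_toy, FUN = mean)
print(chol_summary)
```

Using `margin = 1` normalizes within each sex, which keeps the comparison fair when one sex is more common in the data set than the other.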