We investigate how healthy and sick patients in a heart disease data set are distributed by biological sex, and we compare cholesterol levels between males and females within each health status. This analysis sheds light on potential sex-related disparities in heart disease and its health outcomes.
Data Source: “https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat”
# Load some R packages:
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(extrafont)
## Registering fonts with R
library(ggtext)
library(Cairo)
# Get the data:
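url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/heart.dat"
# The file is space-separated and has no header row
heart <- read.csv(url, sep = " ", header = FALSE)
# Assign descriptive column names (read.csv already returns a data frame)
colnames(heart) <- c("age", "sex", "cp", "restbp",
                     "chol", "fbs", "restecg", "maxach", "exang", "oldpeak", "slope", "num",
                     "thal", "disease")
# Load the prepared data; this file is expected to provide the `heart.dat` object used below
load("heart_data.Rdata")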
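The plots that follow treat `sex` and `disease` as grouping variables, which suggests they are stored as labelled factors in `heart.dat`. The sketch below shows one way such an object could be derived from the raw file. The helper name `prepare_heart` is hypothetical, and the coding (sex 0 = female, 1 = male; disease 1 = healthy, 2 = sick) is an assumption that should be checked against the data set documentation. The two rows are made up purely to show the mechanics.

```r
library(dplyr)

# Column names used in this analysis
col_names <- c("age", "sex", "cp", "restbp", "chol", "fbs", "restecg",
               "maxach", "exang", "oldpeak", "slope", "num", "thal", "disease")

# Hypothetical helper: name the columns and recode sex and disease as factors.
# Assumed coding: sex 0 = female, 1 = male; disease 1 = healthy, 2 = sick.
prepare_heart <- function(raw) {
  stopifnot(ncol(raw) == length(col_names))
  names(raw) <- col_names
  raw %>%
    mutate(
      sex = factor(sex, levels = c(0, 1), labels = c("Female", "Male")),
      disease = factor(disease, levels = c(1, 2), labels = c("Healthy", "Sick"))
    )
}

# Two invented rows for illustration only (not real patient records)
toy_raw <- as.data.frame(matrix(
  c(60, 1, 4, 130, 250, 0, 2, 140, 0, 1.5, 2, 1, 7, 2,
    55, 0, 2, 120, 220, 0, 0, 160, 0, 0.0, 1, 0, 3, 1),
  nrow = 2, byrow = TRUE
))
toy <- prepare_heart(toy_raw)
```

With the real data, the same helper could be applied to the data frame returned by `read.csv()`.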
ggplot(data = heart.dat, mapping = aes(x = sex, fill = disease)) +
  geom_bar() +
  labs(title = "Distribution of Disease with Relation to Sex",
       x = "Sex",
       y = "Number of People",
       # The legend belongs to the fill aesthetic, so it is labelled with fill =
       fill = "State of Health (Sick vs Healthy)",
       caption = "Distribution of sick vs healthy individuals within the data set, organized by biological sex")
# Save the plot:
ggsave("Heart_Disease.png",
width = 21,
height = 30,
units = "cm",
dpi = 500,
type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments
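The bar chart shows raw counts, which can be hard to compare if the two sexes are not equally represented. A minimal sketch of computing the share of each health status within each sex is given below. The rows are invented for illustration; in the real analysis `toy` would be replaced by `heart.dat`, assuming it has `sex` and `disease` columns.

```r
library(dplyr)

# Made-up rows for illustration only; the real analysis would use heart.dat
toy <- data.frame(
  sex = c("Female", "Female", "Female", "Female",
          "Male", "Male", "Male", "Male", "Male", "Male"),
  disease = c("Healthy", "Healthy", "Healthy", "Sick",
              "Healthy", "Healthy", "Healthy", "Sick", "Sick", "Sick")
)

disease_by_sex <- toy %>%
  count(sex, disease) %>%        # number of people per sex and health status
  group_by(sex) %>%
  mutate(prop = n / sum(n)) %>%  # share of each health status within a sex
  ungroup()

print(disease_by_sex)
```

The same comparison can be drawn directly by using `geom_bar(position = "fill")` in the bar chart, which rescales each bar to proportions.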
# Pie chart
ggplot(data = heart.dat, mapping = aes(x = "", fill = disease)) +
  geom_bar(width = 1) +         # A single full-width stacked bar
  coord_polar(theta = "y") +    # Convert the stacked bar into a pie chart
  labs(title = "Overall Distribution of Disease",
       fill = "State of Health (Sick vs Healthy)") +
  theme_void()                  # Remove axes and grid lines
# Save the plot:
ggsave("Heart_Disease_pie.png",
width = 21,
height = 30,
units = "cm",
dpi = 500,
type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments
Males and females often have different values for clinical and physiological variables. We can make a jittered scatter plot that compares cholesterol levels in healthy and sick males and females.
ggplot(data = heart.dat, mapping = aes(x = sex, y = chol, color = disease)) +
  geom_jitter() +
  labs(title = "Cholesterol Levels of Healthy vs Sick Individuals",
       x = "Sex",
       y = "Cholesterol Levels")
# Save the plot:
ggsave("Heart_Disease_scatter.png",
width = 21,
height = 25,
units = "cm",
dpi = 500,
type = "cairo-png")
## Warning: Using ragg device as default. Ignoring `type` and `antialias`
## arguments
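A jittered plot gives a visual impression, but group summaries make the comparison concrete. The sketch below computes the group size, mean, and median cholesterol for each combination of sex and health status. The values are made up for illustration; the real analysis would use `heart.dat`, assuming it has `sex`, `disease`, and `chol` columns.

```r
library(dplyr)

# Invented values for illustration only; replace `toy` with heart.dat in practice
toy <- data.frame(
  sex = c("Female", "Female", "Male", "Male", "Male", "Male"),
  disease = c("Healthy", "Sick", "Healthy", "Healthy", "Sick", "Sick"),
  chol = c(200, 260, 210, 230, 240, 280)
)

chol_summary <- toy %>%
  group_by(sex, disease) %>%
  summarise(
    n = n(),                      # group size
    mean_chol = mean(chol),       # average cholesterol in the group
    median_chol = median(chol),   # less sensitive to outliers than the mean
    .groups = "drop"
  )

print(chol_summary)
```

Reporting the group size alongside the averages helps show when a difference rests on only a few patients.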
By comparing the proportions of male and female patients in each health status category, we can determine whether the distribution of heart disease cases differs notably between the sexes. Next, by examining the cholesterol levels of healthy and sick patients separately for each sex, we can identify any notable differences or patterns in cholesterol levels among patients with heart disease.