To access the dataset, you can visit the Heart Disease site.
Age: age of the patient (years)
Sex: sex of the patient (M: Male, F: Female)
ChestPainType: chest pain type (TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic)
RestingBP: resting blood pressure (mm Hg)
Cholesterol: serum cholesterol (mg/dl)
FastingBS: fasting blood sugar (1: if FastingBS > 120 mg/dl, 0: otherwise)
RestingECG: resting electrocardiogram results (Normal: Normal, ST: ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: probable or definite left ventricular hypertrophy by Estes' criteria)
MaxHR: maximum heart rate achieved (numeric value between 60 and 202)
ExerciseAngina: exercise-induced angina (Y: Yes, N: No)
Oldpeak: oldpeak = ST (numeric value measured in depression)
ST_Slope: slope of the peak exercise ST segment (Up: upsloping, Flat: flat, Down: downsloping)
HeartDisease: output class (1: heart disease, 0: Normal)
library(readr)
library(tidyverse)
library(tidyr)
library(dendextend)
library(knitr)
library(gridExtra)
library(GGally)
library(ggplot2)
library(VIM)
library(corrplot)
library(car)
library(ResourceSelection)
library(glmulti)
library(tree)
library(randomForest)
library(ISLR)
library(class)
library(pROC)
library(gtools)
library(superml)
library(caret)
library(Boruta)
library("stringr")
library("here")
library("skimr")
library("janitor")
library("lubridate")
library("scales")
library("ggcorrplot")
library("ggrepel")
library("forcats")
library("corrgram")
library(tidymodels)
library(baguette)
library(discrim)
library(bonsai)
library(kableExtra)
library(broom)
library(dplyr)
library("Hmisc")
library(psych)
library(factoextra)
library("DescTools")
library(haven)
library(effectsize)
library(rstatix)
library(ggpubr)
library(biotools)
library(PerformanceAnalytics)
library(heplots)
library(gplots)
df <- read.csv('/home/ilke/Downloads/heart (2).csv')
kable(head(df), format = "html") %>%
  kable_styling()
| Age | Sex | ChestPainType | RestingBP | Cholesterol | FastingBS | RestingECG | MaxHR | ExerciseAngina | Oldpeak | ST_Slope | HeartDisease |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 40 | M | ATA | 140 | 289 | 0 | Normal | 172 | N | 0.0 | Up | 0 |
| 49 | F | NAP | 160 | 180 | 0 | Normal | 156 | N | 1.0 | Flat | 1 |
| 37 | M | ATA | 130 | 283 | 0 | ST | 98 | N | 0.0 | Up | 0 |
| 48 | F | ASY | 138 | 214 | 0 | Normal | 108 | Y | 1.5 | Flat | 1 |
| 54 | M | NAP | 150 | 195 | 0 | Normal | 122 | N | 0.0 | Up | 0 |
| 39 | M | NAP | 120 | 339 | 0 | Normal | 170 | N | 0.0 | Up | 0 |
colnames(df)
## [1] "Age" "Sex" "ChestPainType" "RestingBP"
## [5] "Cholesterol" "FastingBS" "RestingECG" "MaxHR"
## [9] "ExerciseAngina" "Oldpeak" "ST_Slope" "HeartDisease"
str(df)
## 'data.frame': 918 obs. of 12 variables:
## $ Age : int 40 49 37 48 54 39 45 54 37 48 ...
## $ Sex : chr "M" "F" "M" "F" ...
## $ ChestPainType : chr "ATA" "NAP" "ATA" "ASY" ...
## $ RestingBP : int 140 160 130 138 150 120 130 110 140 120 ...
## $ Cholesterol : int 289 180 283 214 195 339 237 208 207 284 ...
## $ FastingBS : int 0 0 0 0 0 0 0 0 0 0 ...
## $ RestingECG : chr "Normal" "Normal" "ST" "Normal" ...
## $ MaxHR : int 172 156 98 108 122 170 170 142 130 120 ...
## $ ExerciseAngina: chr "N" "N" "N" "Y" ...
## $ Oldpeak : num 0 1 0 1.5 0 0 0 0 1.5 0 ...
## $ ST_Slope : chr "Up" "Flat" "Up" "Flat" ...
## $ HeartDisease : int 0 1 0 1 0 0 0 0 1 0 ...
df <- df %>% mutate(
  Sex = factor(Sex),
  ChestPainType = factor(ChestPainType),
  FastingBS = factor(FastingBS),
  RestingECG = factor(RestingECG),
  ExerciseAngina = factor(ExerciseAngina),
  ST_Slope = factor(ST_Slope),
  HeartDisease = factor(HeartDisease)
)
str(df)
## 'data.frame': 918 obs. of 12 variables:
## $ Age : int 40 49 37 48 54 39 45 54 37 48 ...
## $ Sex : Factor w/ 2 levels "F","M": 2 1 2 1 2 2 1 2 2 1 ...
## $ ChestPainType : Factor w/ 4 levels "ASY","ATA","NAP",..: 2 3 2 1 3 3 2 2 1 2 ...
## $ RestingBP : int 140 160 130 138 150 120 130 110 140 120 ...
## $ Cholesterol : int 289 180 283 214 195 339 237 208 207 284 ...
## $ FastingBS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ RestingECG : Factor w/ 3 levels "LVH","Normal",..: 2 2 3 2 2 2 2 2 2 2 ...
## $ MaxHR : int 172 156 98 108 122 170 170 142 130 120 ...
## $ ExerciseAngina: Factor w/ 2 levels "N","Y": 1 1 1 2 1 1 1 1 2 1 ...
## $ Oldpeak : num 0 1 0 1.5 0 0 0 0 1.5 0 ...
## $ ST_Slope : Factor w/ 3 levels "Down","Flat",..: 3 2 3 2 3 3 3 3 2 3 ...
## $ HeartDisease : Factor w/ 2 levels "0","1": 1 2 1 2 1 1 1 1 2 1 ...
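The factor levels above keep the dataset's short codes. For plots and tables it can help to attach the descriptive names from the variable list. The sketch below does this on a small made-up vector; `codes` and `chest_pain` are illustrative names and the values are not taken from the data.

```r
# Hypothetical sample of ChestPainType codes (not taken from the dataset)
codes <- c("ATA", "NAP", "ASY", "TA", "ASY")

# Map each short code to its descriptive label; `levels` and `labels`
# are matched position by position.
chest_pain <- factor(
  codes,
  levels = c("TA", "ATA", "NAP", "ASY"),
  labels = c("Typical Angina", "Atypical Angina",
             "Non-Anginal Pain", "Asymptomatic")
)

# Frequency of each labelled level
table(chest_pain)
```

The same `levels`/`labels` pair could be passed inside the `mutate()` call if labelled factors are preferred over the raw codes.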
The data contain no missing values (no NA entries). To check for outliers, box plots of the numeric variables were examined.
sum(is.na(df))
## [1] 0
colSums(is.na(df))
## Age Sex ChestPainType RestingBP Cholesterol
## 0 0 0 0 0
## FastingBS RestingECG MaxHR ExerciseAngina Oldpeak
## 0 0 0 0 0
## ST_Slope HeartDisease
## 0 0
library(Hmisc)
Hmisc::describe(df)
## df
##
## 12 Variables 918 Observations
## --------------------------------------------------------------------------------
## Age
## n missing distinct Info Mean Gmd .05 .10
## 918 0 50 0.999 53.51 10.71 37 40
## .25 .50 .75 .90 .95
## 47 54 60 65 68
##
## lowest : 28 29 30 31 32, highest: 73 74 75 76 77
## --------------------------------------------------------------------------------
## Sex
## n missing distinct
## 918 0 2
##
## Value F M
## Frequency 193 725
## Proportion 0.21 0.79
## --------------------------------------------------------------------------------
## ChestPainType
## n missing distinct
## 918 0 4
##
## Value ASY ATA NAP TA
## Frequency 496 173 203 46
## Proportion 0.540 0.188 0.221 0.050
## --------------------------------------------------------------------------------
## RestingBP
## n missing distinct Info Mean Gmd .05 .10
## 918 0 67 0.993 132.4 20.09 106 110
## .25 .50 .75 .90 .95
## 120 130 140 160 160
##
## lowest : 0 80 92 94 95, highest: 180 185 190 192 200
## --------------------------------------------------------------------------------
## Cholesterol
## n missing distinct Info Mean Gmd .05 .10
## 918 0 222 0.993 198.8 116 0.0 0.0
## .25 .50 .75 .90 .95
## 173.2 223.0 267.0 305.0 331.3
##
## lowest : 0 85 100 110 113, highest: 491 518 529 564 603
## --------------------------------------------------------------------------------
## FastingBS
## n missing distinct
## 918 0 2
##
## Value 0 1
## Frequency 704 214
## Proportion 0.767 0.233
## --------------------------------------------------------------------------------
## RestingECG
## n missing distinct
## 918 0 3
##
## Value LVH Normal ST
## Frequency 188 552 178
## Proportion 0.205 0.601 0.194
## --------------------------------------------------------------------------------
## MaxHR
## n missing distinct Info Mean Gmd .05 .10
## 918 0 119 1 136.8 29.03 96 103
## .25 .50 .75 .90 .95
## 120 138 156 170 178
##
## lowest : 60 63 67 69 70, highest: 190 192 194 195 202
## --------------------------------------------------------------------------------
## ExerciseAngina
## n missing distinct
## 918 0 2
##
## Value N Y
## Frequency 547 371
## Proportion 0.596 0.404
## --------------------------------------------------------------------------------
## Oldpeak
## n missing distinct Info Mean Gmd .05 .10
## 918 0 53 0.934 0.8874 1.126 0.0 0.0
## .25 .50 .75 .90 .95
## 0.0 0.6 1.5 2.3 3.0
##
## lowest : -2.6 -2 -1.5 -1.1 -1 , highest: 4.2 4.4 5 5.6 6.2
## --------------------------------------------------------------------------------
## ST_Slope
## n missing distinct
## 918 0 3
##
## Value Down Flat Up
## Frequency 63 460 395
## Proportion 0.069 0.501 0.430
## --------------------------------------------------------------------------------
## HeartDisease
## n missing distinct
## 918 0 2
##
## Value 0 1
## Frequency 410 508
## Proportion 0.447 0.553
## --------------------------------------------------------------------------------
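The `describe()` output lists 0 as the lowest value of both RestingBP and Cholesterol, and the 5% and 10% quantiles of Cholesterol are also 0. Such zeros are probably placeholders for unrecorded measurements rather than real readings, and `is.na()` cannot detect them. A minimal sketch of counting them and recoding them as NA, on a made-up data frame (`toy` and its values are assumptions for illustration):

```r
# Made-up data in which 0 stands in for an unrecorded measurement
toy <- data.frame(
  RestingBP   = c(140, 0, 130, 120),
  Cholesterol = c(289, 0, 0, 214)
)

# Count the zeros in each column
zero_counts <- colSums(toy == 0)
zero_counts

# Recode the zeros as NA on a copy, so the original stays untouched
toy_na <- toy
toy_na[toy_na == 0] <- NA

# The former zeros are now visible to is.na()
colSums(is.na(toy_na))
```

Whether to recode, impute or drop such rows is a modelling decision; the sketch only makes the hidden gaps visible.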
par(mfrow=c(2, 3))
# Age Boxplot
boxplot(df$Age, main="Age", ylab="Age", col="skyblue")
# RestingBP Boxplot
boxplot(df$RestingBP, main="RestingBP", ylab="RestingBP", col="lightgreen")
# Cholesterol Boxplot
boxplot(df$Cholesterol, main="Cholesterol", ylab="Cholesterol", col="lightcoral")
# MaxHR Boxplot
boxplot(df$MaxHR, main="MaxHR", ylab="MaxHR", col="lightgoldenrodyellow")
# Oldpeak Boxplot
boxplot(df$Oldpeak, main="Oldpeak", ylab="Oldpeak", col="lightsteelblue")
df$Age %>% summary()
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 28.00 47.00 54.00 53.51 60.00 77.00
ggplot(data = df, aes(x = Age)) +
  geom_histogram(color = "darkblue", fill = "lightblue") +
  labs(title = "Age Histogram Plot", x = "Age", y = "Count") +
  theme_minimal()
df$Cholesterol %>% summary()
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 173.2 223.0 198.8 267.0 603.0
ggplot(data = df, aes(x = Cholesterol)) +
  geom_histogram(color = "darkblue", fill = "lightblue") +
  labs(title = "Serum Cholesterol Histogram Plot", x = "Serum Cholesterol", y = "Count") +
  theme_minimal()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
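The message above appears because `geom_histogram()` falls back to 30 bins when neither `bins` nor `binwidth` is given. One common way to choose a width is the Freedman-Diaconis rule, 2 * IQR / n^(1/3). The sketch below applies it to simulated values (the vector `x` is made up and is not the Cholesterol column) and passes the result to `geom_histogram()` when ggplot2 is installed.

```r
set.seed(1)
x <- rnorm(500, mean = 240, sd = 50)  # simulated, illustrative values

# Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
bw <- 2 * IQR(x) / length(x)^(1/3)
bw

# Draw the histogram with an explicit bin width (skipped if ggplot2 is absent)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(x = x), aes(x = x)) +
    geom_histogram(binwidth = bw, color = "darkblue", fill = "lightblue") +
    theme_minimal()
  print(p)
}
```

Any fixed, documented width silences the message; the rule is just one defensible default.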
Using Q1 - 1.5 * IQR as the lower bound and Q3 + 1.5 * IQR as the upper bound, the outlier ratio was computed for each variable. Since no variable has more than 20% outliers (Cholesterol comes closest at about 19.9%), the outliers were removed directly.
# Share of observations outside the Q1 - 1.5 * IQR / Q3 + 1.5 * IQR bounds
outlier_ratio <- function(data, variable) {
  Q1 <- quantile(data[[variable]], 0.25)
  Q3 <- quantile(data[[variable]], 0.75)
  IQR <- Q3 - Q1
  alt_sinir <- Q1 - 1.5 * IQR  # lower bound
  ust_sinir <- Q3 + 1.5 * IQR  # upper bound
  aykiri <- sum(data[[variable]] < alt_sinir | data[[variable]] > ust_sinir)  # outlier count
  oran <- aykiri / length(data[[variable]])  # outlier ratio
  return(oran)
}
degiskenler <- c("Age", "RestingBP", "Cholesterol", "MaxHR", "Oldpeak")
aykiri_oranlar <- sapply(degiskenler, function(x) outlier_ratio(df, x))
aykiri_oranlar
## Age RestingBP Cholesterol MaxHR Oldpeak
## 0.000000000 0.030501089 0.199346405 0.002178649 0.017429194
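The Cholesterol ratio of about 0.199 can be traced by hand from the quartiles printed by `summary()` earlier (Q1 = 173.2, Q3 = 267.0). The sketch below recomputes the bounds from those rounded values, so the results are approximate.

```r
# Quartiles of Cholesterol as printed (rounded) by summary()
q1 <- 173.2
q3 <- 267.0

iqr   <- q3 - q1          # about 93.8
lower <- q1 - 1.5 * iqr   # about 32.5
upper <- q3 + 1.5 * iqr   # about 407.7

c(lower = lower, upper = upper)
```

Every zero Cholesterol value lies below the lower bound, and the largest values listed by `describe()` (491 to 603) lie above the upper bound, so both groups count as outliers.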
veri <- data.frame(degiskenler, aykiri_oranlar)
grafik <- ggplot(data = veri, aes(x = degiskenler, y = aykiri_oranlar)) +
  geom_bar(stat = "identity", fill = "skyblue", width = 0.5) +
  labs(title = "Outlier Ratios", x = "Variables", y = "Outlier Ratio") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(grafik)
# Logical vector: TRUE where the value lies outside the IQR bounds
outlier_detection <- function(data, variable) {
  Q1 <- quantile(data[[variable]], 0.25)
  Q3 <- quantile(data[[variable]], 0.75)
  IQR <- Q3 - Q1
  alt_sinir <- Q1 - 1.5 * IQR  # lower bound
  ust_sinir <- Q3 + 1.5 * IQR  # upper bound
  aykiri <- data[[variable]] < alt_sinir | data[[variable]] > ust_sinir
  return(aykiri)
}
age_outliers <- outlier_detection(df, "Age")
RestingBP_outliers <- outlier_detection(df, "RestingBP")
Cholesterol_outliers <- outlier_detection(df, "Cholesterol")
MaxHR_outliers <- outlier_detection(df, "MaxHR")
Oldpeak_outliers <- outlier_detection(df, "Oldpeak")
clean_df <- df[!age_outliers & !RestingBP_outliers & !Cholesterol_outliers & !MaxHR_outliers & !Oldpeak_outliers, ]
cat("Dimensions of the data set without outliers:", dim(clean_df))
## Dimensions of the data set without outliers: 702 12
write.csv(clean_df, file = "/home/ilke/Downloads/clean_heart.csv", row.names = FALSE) # save the cleaned data
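The five per-variable masks can also be built in one pass. This is a sketch of an alternative, not the code used above: `is_outlier()`, `toy` and `clean_toy` are hypothetical names and the data are made up. A row is dropped if any listed column falls outside its own IQR bounds.

```r
# TRUE where a value lies outside Q1 - 1.5 * IQR or Q3 + 1.5 * IQR
is_outlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Made-up data: one extreme value in each column
toy <- data.frame(
  a = c(10, 11, 12, 13, 14, 100),
  b = c(5, 6, 7, 8, -50, 9)
)
vars <- c("a", "b")

# One logical column per variable, then keep rows with no flag at all
flags <- sapply(toy[vars], is_outlier)
keep <- rowSums(flags) == 0
clean_toy <- toy[keep, ]

dim(clean_toy)
```

With the real data, `vars` would hold the five numeric column names, and the result should match the row filter built from the individual masks.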
Box plots of the numeric variables after removing the outliers
par(mfrow=c(2, 3))
# Age Boxplot
boxplot(clean_df$Age, main="Age", ylab="Age", col="skyblue")
# RestingBP Boxplot
boxplot(clean_df$RestingBP, main="RestingBP", ylab="RestingBP", col="lightgreen")
# Cholesterol Boxplot
boxplot(clean_df$Cholesterol, main="Cholesterol", ylab="Cholesterol", col="lightcoral")
# MaxHR Boxplot
boxplot(clean_df$MaxHR, main="MaxHR", ylab="MaxHR", col="lightgoldenrodyellow")
# Oldpeak Boxplot
boxplot(clean_df$Oldpeak, main="Oldpeak", ylab="Oldpeak", col="lightsteelblue")
head(df)
## Age Sex ChestPainType RestingBP Cholesterol FastingBS RestingECG MaxHR
## 1 40 M ATA 140 289 0 Normal 172
## 2 49 F NAP 160 180 0 Normal 156
## 3 37 M ATA 130 283 0 ST 98
## 4 48 F ASY 138 214 0 Normal 108
## 5 54 M NAP 150 195 0 Normal 122
## 6 39 M NAP 120 339 0 Normal 170
## ExerciseAngina Oldpeak ST_Slope HeartDisease
## 1 N 0.0 Up 0
## 2 N 1.0 Flat 1
## 3 N 0.0 Up 0
## 4 Y 1.5 Flat 1
## 5 N 0.0 Up 0
## 6 N 0.0 Up 0
kable(df %>%
        group_by(HeartDisease) %>%
        summarise(N = n())) %>%
  kable_styling(full_width = F)
| HeartDisease | N |
|---|---|
| 0 | 410 |
| 1 | 508 |
factor_columns <- sapply(df, is.factor)
factor_levels <- lapply(df[factor_columns], table)
factor_summary_combined <- do.call(rbind, Map(as.data.frame, factor_levels))
kable(factor_summary_combined, caption = "Level counts of the categorical variables") %>%
  kable_styling(full_width = F)
| | Var1 | Freq |
|---|---|---|
| Sex.1 | F | 193 |
| Sex.2 | M | 725 |
| ChestPainType.1 | ASY | 496 |
| ChestPainType.2 | ATA | 173 |
| ChestPainType.3 | NAP | 203 |
| ChestPainType.4 | TA | 46 |
| FastingBS.1 | 0 | 704 |
| FastingBS.2 | 1 | 214 |
| RestingECG.1 | LVH | 188 |
| RestingECG.2 | Normal | 552 |
| RestingECG.3 | ST | 178 |
| ExerciseAngina.1 | N | 547 |
| ExerciseAngina.2 | Y | 371 |
| ST_Slope.1 | Down | 63 |
| ST_Slope.2 | Flat | 460 |
| ST_Slope.3 | Up | 395 |
| HeartDisease.1 | 0 | 410 |
| HeartDisease.2 | 1 | 508 |
barplot_var1 <- ggplot(clean_df, aes(x = as.factor(Sex), fill = as.factor(Sex))) +
  geom_bar(position = "dodge") +
  labs(title = "Sex Distribution", x = "Sex", y = "Frequency", fill = "Sex") +
  theme_minimal()
barplot_var2 <- ggplot(clean_df, aes(x = as.factor(ChestPainType), fill = as.factor(ChestPainType))) +
  geom_bar(position = "dodge") +
  labs(title = "Chest Pain Type Distribution", x = "Chest Pain Type", y = "Frequency", fill = "Chest Pain Type") +
  theme_minimal()
barplot_var3 <- ggplot(clean_df, aes(x = as.factor(FastingBS), fill = as.factor(FastingBS))) +
  geom_bar(position = "dodge") +
  labs(title = "Fasting Blood Sugar Distribution", x = "Fasting BS", y = "Frequency", fill = "Fasting BS") +
  theme_minimal()
barplot_var4 <- ggplot(clean_df, aes(x = as.factor(RestingECG), fill = as.factor(RestingECG))) +
  geom_bar(position = "dodge") +
  labs(title = "Resting ECG Distribution", x = "Resting ECG", y = "Frequency", fill = "Resting ECG") +
  theme_minimal()
library(gridExtra)
multi_barplot <- grid.arrange(barplot_var1, barplot_var2, barplot_var3, barplot_var4, ncol = 2)
print(multi_barplot)
## TableGrob (2 x 2) "arrange": 4 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]
## 3 3 (2-2,1-1) arrange gtable[layout]
## 4 4 (2-2,2-2) arrange gtable[layout]
barplot_st_slope <- ggplot(clean_df, aes(x = as.factor(ST_Slope), fill = as.factor(ST_Slope))) +
  geom_bar(position = "dodge") +
  labs(title = "ST Slope Distribution", x = "ST Slope", y = "Frequency", fill = "ST Slope") +
  theme_minimal()
barplot_heart_disease <- ggplot(clean_df, aes(x = as.factor(HeartDisease), fill = as.factor(HeartDisease))) +
  geom_bar(position = "dodge") +
  labs(title = "Heart Disease Distribution", x = "Heart Disease", y = "Frequency", fill = "Heart Disease") +
  theme_minimal()
barplot_exercise_angina <- ggplot(clean_df, aes(x = as.factor(ExerciseAngina), fill = as.factor(ExerciseAngina))) +
  geom_bar(position = "dodge") +
  labs(title = "Exercise Angina Distribution", x = "Exercise Angina", y = "Frequency", fill = "Exercise Angina") +
  theme_minimal()
multi_barplot <- grid.arrange(barplot_st_slope, barplot_heart_disease, barplot_exercise_angina, ncol = 3)
print(multi_barplot)
## TableGrob (1 x 3) "arrange": 3 grobs
## z cells name grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]
## 3 3 (1-1,3-3) arrange gtable[layout]
colnames(df)
## [1] "Age" "Sex" "ChestPainType" "RestingBP"
## [5] "Cholesterol" "FastingBS" "RestingECG" "MaxHR"
## [9] "ExerciseAngina" "Oldpeak" "ST_Slope" "HeartDisease"
a <- df %>%
  group_by(HeartDisease) %>%
  summarise(across(-c(Sex, ChestPainType, FastingBS, RestingECG, ExerciseAngina, ST_Slope), list(mean = mean, sd = sd)))
kable(a, caption = "Means and standard deviations by target variable", format = "html") %>%
  kable_styling()
| HeartDisease | Age_mean | Age_sd | RestingBP_mean | RestingBP_sd | Cholesterol_mean | Cholesterol_sd | MaxHR_mean | MaxHR_sd | Oldpeak_mean | Oldpeak_sd |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 50.55122 | 9.444915 | 130.1805 | 16.49958 | 227.1220 | 74.63466 | 148.1512 | 23.28807 | 0.4080488 | 0.6997091 |
| 1 | 55.89961 | 8.727056 | 134.1850 | 19.82868 | 175.9409 | 126.39140 | 127.6555 | 23.38692 | 1.2742126 | 1.1518720 |
kable(summary(df), caption = "Data Summary") %>%
  kable_styling(full_width = F)
| Age | Sex | ChestPainType | RestingBP | Cholesterol | FastingBS | RestingECG | MaxHR | ExerciseAngina | Oldpeak | ST_Slope | HeartDisease |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Min. :28.00 | F:193 | ASY:496 | Min. : 0.0 | Min. : 0.0 | 0:704 | LVH :188 | Min. : 60.0 | N:547 | Min. :-2.6000 | Down: 63 | 0:410 |
| 1st Qu.:47.00 | M:725 | ATA:173 | 1st Qu.:120.0 | 1st Qu.:173.2 | 1:214 | Normal:552 | 1st Qu.:120.0 | Y:371 | 1st Qu.: 0.0000 | Flat:460 | 1:508 |
| Median :54.00 | NA | NAP:203 | Median :130.0 | Median :223.0 | NA | ST :178 | Median :138.0 | NA | Median : 0.6000 | Up :395 | NA |
| Mean :53.51 | NA | TA : 46 | Mean :132.4 | Mean :198.8 | NA | NA | Mean :136.8 | NA | Mean : 0.8874 | NA | NA |
| 3rd Qu.:60.00 | NA | NA | 3rd Qu.:140.0 | 3rd Qu.:267.0 | NA | NA | 3rd Qu.:156.0 | NA | 3rd Qu.: 1.5000 | NA | NA |
| Max. :77.00 | NA | NA | Max. :200.0 | Max. :603.0 | NA | NA | Max. :202.0 | NA | Max. : 6.2000 | NA | NA |
clean_df$Cholesterol %>% summary()
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 85.0 206.0 235.0 239.7 272.0 404.0
ggplot(data = clean_df, aes(x = Cholesterol)) +
  geom_histogram(color = "darkblue", fill = "lightblue") +
  labs(title = "Serum Cholesterol Histogram Plot", x = "Serum Cholesterol", y = "Count") +
  theme_minimal()
ggplot(clean_df, aes(x = RestingBP, y = Cholesterol, color = as.factor(HeartDisease))) +
  geom_point(alpha = 0.7) +
  labs(title = "Scatterplot of Resting BP and Cholesterol by Heart Disease Status", x = "Resting BP", y = "Cholesterol", color = "Heart Disease") +
  theme_minimal()
ggplot(clean_df, aes(x = Age, y = RestingBP, color = as.factor(HeartDisease), size = MaxHR)) +
  geom_point(alpha = 0.7) +
  labs(title = "Three-variable Scatterplot", x = "Age", y = "Resting BP", color = "Heart Disease", size = "Max HR") +
  theme_minimal()
ggpairs(clean_df[, c("Age", "RestingBP", "Cholesterol", "MaxHR", "HeartDisease")],
        aes(color = as.factor(HeartDisease)),
        lower = list(continuous = "points"),
        upper = list(continuous = "blank"),
        diag = list(continuous = "barDiag"))
ggpairs(df[, c("Age", "RestingBP", "Cholesterol", "MaxHR", "HeartDisease", "Sex")],
        aes(color = Sex),
        lower = list(continuous = "points"),
        upper = list(continuous = "blank"),
        diag = list(continuous = "barDiag"))