Objective
The objective of the visualisation is to show how Age and Trestbps (resting blood pressure) are distributed across the Disease and No Disease categories. The target audience is the general public: the article was published on Medium, an online publishing platform, for readers who want an introduction to data exploration.
The visualisation chosen had the following three main issues:
Reference
* Basic Medical Data Exploration / Visualization — Heart Diseases, by Jae Duk Seo
* Link: https://towardsdatascience.com/basic-medical-data-exploration-visualization-heart-diseases-6ab12bc0a8b7
The following code was used to fix the issues identified in the original.
library(plotly)
library(ggplot2)
library(readr)
library(dplyr)
library(Hmisc)
library(GGally)
library(magrittr)
library(grid)
library(colorblindr)
library(gridExtra)
# The Cleveland file has no header row, so the column names are assigned manually
heart <- read.csv("processed.cleveland.csv", header = FALSE)
colnames(heart) <- c('Age', 'Gender', 'ChestPain', 'Trestbps', 'Cholestoral',
                     'FBS', 'Restecg', 'MaximumHeartRate', 'Exang', 'OldPeak',
                     'Slope', 'Ca', 'Thal', 'Target')
# Collapse the 0-4 diagnosis score to a binary label: 0 = no disease, 1-4 = disease
heart$Target <- factor(ifelse(heart$Target > 0, 1, 0),
                       levels = c(0, 1), labels = c("No Disease", "Disease"))
# Gender is coded 1 = male, 0 = female
heart$Gender <- factor(heart$Gender, levels = c(1, 0), labels = c("Male", "Female"))
# Base plots: one per variable, sharing the same data frame
p9 <- ggplot(heart, aes(x = Age))
p10 <- ggplot(heart, aes(x = Trestbps))
# Resting blood pressure: overlaid density-scaled histograms and density curves by disease status
# after_stat() and linewidth need ggplot2 >= 3.4.0
Trestbps <- p10 +
  geom_histogram(aes(y = after_stat(density), color = Target, fill = Target),
                 position = "identity", alpha = 0.2, bins = 30) +
  ylim(0, 0.06) +
  geom_density(aes(color = Target), linewidth = 1, alpha = 0.2) +
  scale_color_manual(values = c("#0073C2FF", "#FC4E07")) +
  scale_fill_manual(values = c("#0073C2FF", "#FC4E07")) +
  ggtitle("Data Distribution of Trestbps") +
  theme(
    plot.title = element_text(color = "Black", size = 10, face = "bold.italic"),
    axis.title.x = element_text(color = "Black", size = 7, face = "bold"),
    axis.title.y = element_text(color = "Black", size = 7, face = "bold")
  )
# Age: overlaid density-scaled histograms and density curves by disease status
# after_stat() and linewidth need ggplot2 >= 3.4.0
Age <- p9 +
  geom_histogram(aes(y = after_stat(density), color = Target, fill = Target),
                 position = "identity", alpha = 0.2, bins = 30) +
  ylim(0, 0.08) +
  geom_density(aes(color = Target), linewidth = 1, alpha = 0.2) +
  scale_color_manual(values = c("#0073C2FF", "#FC4E07")) +
  scale_fill_manual(values = c("#0073C2FF", "#FC4E07")) +
  ggtitle("Data Distribution of Age") +
  theme(
    plot.title = element_text(color = "Black", size = 10, face = "bold.italic", margin = margin(10, 0, 10, 0)),
    axis.title.x = element_text(color = "Black", size = 7, face = "bold"),
    axis.title.y = element_text(color = "Black", size = 7, face = "bold")
  )
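The code above builds the two plot objects but does not draw them. Since gridExtra is loaded, one plausible way to show them side by side is sketched below. This is an assumption about the intended layout, not something the original confirms. To keep the sketch runnable on its own it uses a randomly generated stand-in data frame (`demo`) and a hypothetical helper (`dist_panel`); neither comes from the original analysis.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical stand-in data so the sketch runs on its own; these values are
# randomly generated and are NOT the Cleveland data.
set.seed(42)
demo <- data.frame(
  Age      = round(rnorm(200, mean = 55, sd = 9)),
  Trestbps = round(rnorm(200, mean = 130, sd = 17)),
  Target   = factor(rep(c("No Disease", "Disease"), each = 100),
                    levels = c("No Disease", "Disease"))
)

# Hypothetical helper: one overlaid histogram + density panel for a column.
dist_panel <- function(data, column, title) {
  ggplot(data, aes(x = .data[[column]])) +
    geom_histogram(aes(y = after_stat(density), fill = Target),
                   position = "identity", alpha = 0.2, bins = 30) +
    geom_density(aes(color = Target)) +
    labs(title = title, x = column, y = "Density")
}

age_panel      <- dist_panel(demo, "Age", "Data Distribution of Age")
trestbps_panel <- dist_panel(demo, "Trestbps", "Data Distribution of Trestbps")

# arrangeGrob() lays the two panels out in one row; grid.draw() renders it.
combined <- arrangeGrob(age_panel, trestbps_panel, ncol = 2)
grid::grid.draw(combined)
```

With the real objects from the code above, the equivalent call would be `grid.arrange(Age, Trestbps, ncol = 2)`.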
Data Reference
We reconstruct the graph by: