Original


Source: BMJ Journal Public Health Research (2014).


Objective

The objective of the graph is to show the trend in years of potential life lost (YPLL) in India by sex and by the most frequent causes of death. The main audience of this data visualisation is people working in the health sector, medical research students, and members of the general public whose knowledge of the data and of visualisations may be limited.

The visualisation chosen had the following three main issues:

Reference

Dubey, M., & Mohanty, S. K. (2014). Age and sex patterns of premature mortality in India: Figure 1. BMJ Open, 4(8). https://bmjopen.bmj.com/content/4/8/e005386

Code

The following code was used to fix the issues identified in the original.

library(readxl)
library(dplyr)
library(ggplot2)
library(reshape2)

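# Read the spreadsheet (local path as used by the author)
vis_data <- read_excel("C:/Users/intel/Desktop/vis data.xlsx")

# Keep the first three columns, then drop the first row, which precedes the header row
new <- vis_data %>% select(1:3)
new2 <- new[-1, ]

# Promote the row holding the column names to the header, then drop it from the data
names(new2) <- as.character(unlist(new2[1, ]))
df <- new2[-1, ]
head(df)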
## # A tibble: 6 x 3
##   Causes_of_death     Y.Male             Y.Female          
##   <chr>               <chr>              <chr>             
## 1 Tuberculosis        16.7               13                
## 2 Diarrhoeal          21.4               17.600000000000001
## 3 Malaria             7.9                7.8               
## 4 Malignant,neoplasms 11.9               14.6              
## 5 Cardiovascular      38.200000000000003 33.299999999999997
## 6 Respiratory         15.5               16
# The rate columns were read as character; convert them to numeric
df$Y.Male <- as.numeric(df$Y.Male)
df$Y.Female <- as.numeric(df$Y.Female)
head(df)
## # A tibble: 6 x 3
##   Causes_of_death     Y.Male Y.Female
##   <chr>                <dbl>    <dbl>
## 1 Tuberculosis          16.7     13  
## 2 Diarrhoeal            21.4     17.6
## 3 Malaria                7.9      7.8
## 4 Malignant,neoplasms   11.9     14.6
## 5 Cardiovascular        38.2     33.3
## 6 Respiratory           15.5     16
Rate_of_deaths <- melt(df, id.vars = "Causes_of_death", measure.vars = c("Y.Male","Y.Female"))
head(Rate_of_deaths)
##       Causes_of_death variable value
## 1        Tuberculosis   Y.Male  16.7
## 2          Diarrhoeal   Y.Male  21.4
## 3             Malaria   Y.Male   7.9
## 4 Malignant,neoplasms   Y.Male  11.9
## 5      Cardiovascular   Y.Male  38.2
## 6         Respiratory   Y.Male  15.5
# Grouped bar chart: one pair of bars (male, female) per cause of death
p <- ggplot(data = Rate_of_deaths, aes(x = Causes_of_death, y = value, fill = variable)) +
  geom_bar(stat = "identity", color = "Black", position = position_dodge()) +
  scale_x_discrete(guide = guide_axis(n.dodge = 2)) +  # stagger long x-axis labels over two rows
  theme_classic()

p1 <- p + scale_fill_manual(values = c("Cyan4", "Pink")) +
  labs(title = "Rate Of YPLL deaths for most frequent types of deaths in India",
       subtitle = "With respect to gender",
       caption = "YPLL = Years of Potential Life Lost",
       x = "Causes of death", y = "Rate of deaths per 1000 people")

# Pass the plot explicitly rather than relying on last_plot()
ggsave("Rate of YPLL deaths in people of India.png", plot = p1, width = 25)
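
The code above depends on a local Excel file, so it cannot be run as written on another machine. The sketch below is one possible way to reproduce the wide-to-long reshaping step without that file. It assumes only the six rows printed by head(df) above (the full spreadsheet may contain more causes of death), and it swaps in base R's stack() for reshape2::melt(), so it is an illustration of the same idea rather than the code used for the original plot. The name long_df is hypothetical.

```r
# Assumption: only the six rows shown in the head(df) output are used here;
# the full spreadsheet may contain additional causes of death.
df <- data.frame(
  Causes_of_death = c("Tuberculosis", "Diarrhoeal", "Malaria",
                      "Malignant,neoplasms", "Cardiovascular", "Respiratory"),
  Y.Male   = c(16.7, 21.4, 7.9, 11.9, 38.2, 15.5),
  Y.Female = c(13, 17.6, 7.8, 14.6, 33.3, 16),
  stringsAsFactors = FALSE
)

# Wide to long with base R (a stand-in for reshape2::melt):
# stack() concatenates the two rate columns into `values` and records
# the source column name in the factor `ind`.
stacked <- stack(df[, c("Y.Male", "Y.Female")])

# Rebuild the three-column layout that melt() produces:
# one row per (cause of death, sex) pair.
long_df <- data.frame(
  Causes_of_death = rep(df$Causes_of_death, times = 2),
  variable = stacked$ind,
  value = stacked$values
)

head(long_df)
```

The resulting long_df has the same three columns as Rate_of_deaths (Causes_of_death, variable, value), so it could be passed to the ggplot() call in place of Rate_of_deaths.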

Data Reference

Dubey, M., & Mohanty, S. K. (2014). Age and sex patterns of premature mortality in India: Table 1. BMJ Open, 4(8). https://bmjopen.bmj.com/content/4/8/e005386

Reconstruction

The following plot fixes the main issues in the original.