Objective
The original visualization is a horizontal stacked bar chart showing the number of people belonging to each religion in the 20 most populous countries. The religious groups are Christian, Muslim, Hindu, Buddhist, Jewish, folk religions, other religions, and the unaffiliated. The countries are arranged by total population: China is the most populous and DR Congo is the least populous of the twenty. The chart shows each country's total population and its religious sub-populations. Its main objective is to show how the religious composition of the population differs from country to country.
Issues Identified
Some aspects of the graph can be changed to make the information easier to perceive and more accessible to readers with color blindness. Because the two categorical variables produce many combinations, it is very difficult to design the graph so that it works for every type of color blindness, so we concentrated on the most common form (red-green).
Two improvements are suggested: the representation, described here, and the colors, handled in the color adjustment step of the code.
1. Representation: The original graph compares religion and population across countries, but it conveys limited information because it is hard to compare one country's religious communities with another's. We therefore built a mosaic chart with clear tile boundaries and with axes that show each country's share of the total population and each religion's share within the country.
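To make the geometry of a mosaic chart concrete, here is a minimal base-R sketch. The data frame `toy` and its numbers are hypothetical and are not taken from the dataset. Each country's column width is its share of the overall population, and each tile's height is the religion's share within that country.

```r
# Hypothetical toy data (populations in millions); not taken from the dataset
toy <- data.frame(
  country    = c("A", "A", "B", "B"),
  religion   = c("X", "Y", "X", "Y"),
  population = c(30, 10, 15, 45),
  stringsAsFactors = FALSE
)

# Width of each country's column: its share of the overall population
country_total <- tapply(toy$population, toy$country, sum)
width <- country_total / sum(country_total)

# Height of each tile: the religion's share within its own country
toy$height <- toy$population / ave(toy$population, toy$country, FUN = sum)

# Area of each tile: width times height, i.e. the cell's share of the overall population
toy$area <- unname(width[as.character(toy$country)]) * toy$height

print(width)
print(toy)
```

In this sketch the tile areas add up to 1, which is why a mosaic chart lets the reader compare both country size and religious composition in one figure.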
Reference
https://medium.com/@currankelleher/data-visualization-online-wpi-2018-f662bf32908d
The following code was used to fix the issues identified in the original.
# Install the CRAN packages (only needed once)
install.packages(c("ggplot2", "colourpicker", "devtools", "colorspace"), repos = "http://cran.us.r-project.org")
# colorblindr and the cowplot version it depends on are installed from GitHub,
# so no separate install.packages("colorblindr") call is needed
devtools::install_github("wilkelab/cowplot")
devtools::install_github("clauswilke/colorblindr")
library(readr)
library(dplyr)
library(tidyr)
library(ggplot2)
library(reshape2)
library(colourpicker)
library(ggmosaic)
library(data.table)
library(devtools)
library(colorblindr)
library(usethis)
Data Reference
https://medium.com/@currankelleher/data-visualization-online-wpi-2018-f662bf32908d
## # A tibble: 6 x 3
## country religion population
## <chr> <chr> <dbl>
## 1 China Christian 68410000
## 2 China Muslim 24690000
## 3 China Unaffiliated 700680000
## 4 China Hindu 20000
## 5 China Buddhist 244130000
## 6 China Folk Religions 294320000
## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 160 obs. of 3 variables:
## $ country : chr "China" "China" "China" "China" ...
## $ religion : chr "Christian" "Muslim" "Unaffiliated" "Hindu" ...
## $ population: num 6.84e+07 2.47e+07 7.01e+08 2.00e+04 2.44e+08 ...
## - attr(*, "spec")=
## .. cols(
## .. country = col_character(),
## .. religion = col_character(),
## .. population = col_double()
## .. )
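The chunk that loads `religionByCountryTop20` is not shown in this report. As a minimal sketch for experimenting with the later steps, the six rows printed above can be rebuilt by hand. The name `sample_rows` is hypothetical, and the object holds only these six China rows, not the full 160-row dataset.

```r
# Rebuild the six rows printed above (China only); sample_rows is a hypothetical name
sample_rows <- data.frame(
  country    = rep("China", 6),
  religion   = c("Christian", "Muslim", "Unaffiliated", "Hindu",
                 "Buddhist", "Folk Religions"),
  population = c(68410000, 24690000, 700680000, 20000, 244130000, 294320000),
  stringsAsFactors = FALSE
)

# Same three columns as the real data: country, religion, population
str(sample_rows)
```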
# total population per country (raw counts), with country as a factor ordered by that total
ord <- religionByCountryTop20 %>% group_by(country) %>% summarise(sum = sum(population))
ord$country <- ord$country %>%
  factor(levels = ord$country[order(ord$sum)], ordered = TRUE)
str(ord)
## Classes 'tbl_df', 'tbl' and 'data.frame': 20 obs. of 2 variables:
## $ country: Ord.factor w/ 20 levels "DR Congo"<"Thailand"<..: 13 16 20 1 5 7 6 19 17 4 ...
## $ sum : num 1.49e+08 1.95e+08 1.34e+09 6.60e+07 8.11e+07 ...
## - attr(*, "spec")=
## .. cols(
## .. country = col_character(),
## .. religion = col_character(),
## .. population = col_double()
## .. )
# convert population to millions
religionByCountryTop20$population <- religionByCountryTop20$population/1e6
# convert country and religion to factors, as they are categorical variables
religionByCountryTop20$country <- as.factor(religionByCountryTop20$country)
religionByCountryTop20$religion <- as.factor(religionByCountryTop20$religion)
str(religionByCountryTop20)
## Classes 'spec_tbl_df', 'tbl_df', 'tbl' and 'data.frame': 160 obs. of 3 variables:
## $ country : Factor w/ 20 levels "Bangladesh","Brazil",..: 3 3 3 3 3 3 3 3 8 8 ...
## $ religion : Factor w/ 8 levels "Buddhist","Christian",..: 2 6 8 4 1 3 7 5 2 6 ...
## $ population: num 68.41 24.69 700.68 0.02 244.13 ...
## - attr(*, "spec")=
## .. cols(
## .. country = col_character(),
## .. religion = col_character(),
## .. population = col_double()
## .. )
# to arrange countries in increasing order of their population
order <- religionByCountryTop20 %>% group_by(country) %>%summarise(sum = sum(population))
order$country<- order$country %>%
factor(levels = order$country[order(order$sum)],ordered=TRUE)
str(order)
## Classes 'tbl_df', 'tbl' and 'data.frame': 20 obs. of 2 variables:
## $ country: Ord.factor w/ 20 levels "DR Congo"<"Thailand"<..: 13 16 20 1 5 7 6 19 17 4 ...
## $ sum : num 148.7 195 1341.3 66 81.1 ...
## - attr(*, "spec")=
## .. cols(
## .. country = col_character(),
## .. religion = col_character(),
## .. population = col_double()
## .. )
# sort country in terms of population
religionByCountryTop20$country <- religionByCountryTop20$country %>%
factor(levels = sort(order$country) ,ordered=TRUE)
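The ordering step can be seen in isolation in the following sketch, which uses four of the country totals (in millions) reported in this document. The object names `totals` and `country` are illustrative only. The factor levels are taken from the country names sorted by total, so the smallest country becomes the first level.

```r
# Four country totals in millions, as reported in the row marginals of this report
totals <- c(China = 1341.33, Brazil = 194.95, "DR Congo" = 65.97, India = 1224.60)

# order() gives the positions that sort the totals in increasing order;
# using the names in that order as levels makes the factor follow population size
country <- factor(names(totals),
                  levels = names(totals)[order(totals)],
                  ordered = TRUE)

print(levels(country))
print(sort(country))
```

Because the factor is ordered, plotting functions that respect factor levels place the countries by population instead of alphabetically.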
# creating the original visualization in R
p1 <- ggplot(data = religionByCountryTop20, aes(x = reorder(country,-population), y = population, fill = religion))
p2 <- p1 + geom_bar(stat = "identity",colour="grey")+ coord_flip()
# create a table of country vs religion population values
table <- acast(religionByCountryTop20, country~religion, value.var='population' )
attributes(table)
## $dim
## [1] 20 8
##
## $dimnames
## $dimnames[[1]]
## [1] "DR Congo" "Thailand" "Turkey" "Iran"
## [5] "Egypt" "Germany" "Ethiopia" "Vietnam"
## [9] "Philippines" "Mexico" "Japan" "Russia"
## [13] "Bangladesh" "Nigeria" "Pakistan" "Brazil"
## [17] "Indonesia" "United States" "India" "China"
##
## $dimnames[[2]]
## [1] "Buddhist" "Christian" "Folk Religions" "Hindu"
## [5] "Jewish" "Muslim" "Other Religions" "Unaffiliated"
margin.table(table,1) #Row marginals
## DR Congo Thailand Turkey Iran Egypt
## 65.97 69.11 72.74 73.96 81.11
## Germany Ethiopia Vietnam Philippines Mexico
## 82.31 82.93 87.85 93.25 113.40
## Japan Russia Bangladesh Nigeria Pakistan
## 126.54 142.96 148.69 158.42 173.58
## Brazil Indonesia United States India China
## 194.95 239.88 310.39 1224.60 1341.33
margin.table(table,2) #Column marginals
## Buddhist Christian Folk Religions Hindu
## 384.79 1105.84 354.67 996.72
## Jewish Muslim Other Religions Unaffiliated
## 6.36 1070.93 46.21 918.45
tab <- prop.table(table, 1)
# to get proportion and population in the same data frame
t1 <-as.data.frame(tab)
col_names <-t1 %>% select (Buddhist:Unaffiliated) %>% colnames()
setDT(t1, keep.rownames = TRUE)[]
## rn Buddhist Christian Folk Religions Hindu
## 1: DR Congo 0.000000e+00 0.958162801 0.0074276186 4.547522e-04
## 2: Thailand 9.321372e-01 0.008681812 0.0008681812 1.012878e-03
## 3: Turkey 5.499038e-04 0.004399230 0.0002749519 0.000000e+00
## 4: Iran 0.000000e+00 0.001487290 0.0000000000 2.704164e-04
## 5: Egypt 0.000000e+00 0.050795216 0.0000000000 0.000000e+00
## 6: Germany 2.551330e-03 0.686915320 0.0004859677 9.719354e-04
## 7: Ethiopia 0.000000e+00 0.627878934 0.0256843121 0.000000e+00
## 8: Vietnam 1.636881e-01 0.081616392 0.4524758110 0.000000e+00
## 9: Philippines 8.579088e-04 0.926219839 0.0153351206 0.000000e+00
## 10: Mexico 0.000000e+00 0.951587302 0.0006172840 0.000000e+00
## 11: Japan 3.620989e-01 0.016042358 0.0035561878 2.370792e-04
## 12: Russia 1.189144e-03 0.732722440 0.0021684387 2.098489e-04
## 13: Bangladesh 4.842289e-03 0.001883113 0.0034972090 9.092743e-02
## 14: Nigeria 6.312334e-05 0.492677692 0.0144552455 0.000000e+00
## 15: Pakistan 1.152206e-04 0.015842839 0.0001728310 1.918424e-02
## 16: Brazil 1.282380e-03 0.888945884 0.0284175430 0.000000e+00
## 17: Indonesia 7.170252e-03 0.098632650 0.0031265633 1.688344e-02
## 18: United States 1.150166e-02 0.783079352 0.0020297046 5.766938e-03
## 19: India 7.553487e-03 0.025420545 0.0047689041 7.951576e-01
## 20: China 1.820059e-01 0.051001618 0.2194240045 1.491057e-05
## Jewish Muslim Other Religions Unaffiliated
## 1: 0.000000e+00 0.0147036532 0.0015158405 0.0177353342
## 2: 0.000000e+00 0.0545507162 0.0000000000 0.0027492403
## 3: 2.749519e-04 0.9806158922 0.0020621391 0.0118229310
## 4: 0.000000e+00 0.9947268794 0.0020281233 0.0014872904
## 5: 0.000000e+00 0.9492047836 0.0000000000 0.0000000000
## 6: 2.794314e-03 0.0578301543 0.0012149192 0.2472360588
## 7: 0.000000e+00 0.3458338358 0.0000000000 0.0006029181
## 8: 0.000000e+00 0.0018212863 0.0039840637 0.2964143426
## 9: 0.000000e+00 0.0552278820 0.0013941019 0.0009651475
## 10: 6.172840e-04 0.0000000000 0.0001763668 0.0470017637
## 11: 0.000000e+00 0.0015805279 0.0465465465 0.5699383594
## 12: 1.608842e-03 0.0999580302 0.0000000000 0.1621432569
## 13: 0.000000e+00 0.8981101621 0.0002017621 0.0005380321
## 14: 0.000000e+00 0.4879434415 0.0005681101 0.0042923873
## 15: 0.000000e+00 0.9644544302 0.0001152206 0.0001152206
## 16: 5.642472e-04 0.0002051808 0.0015388561 0.0790459092
## 17: 0.000000e+00 0.8717692179 0.0014173754 0.0010005003
## 18: 1.833178e-02 0.0089242566 0.0061213312 0.1642449821
## 19: 8.165932e-06 0.1438755512 0.0225053079 0.0007104361
## 20: 0.000000e+00 0.0184071034 0.0067694005 0.5223770437
ta <- t1 %>% gather(col_names, key = "religion", value = "proportion")
colnames(ta) <- c("country","religion","Proportion")
religionByCountryTop20Final <- merge(ta,religionByCountryTop20,by = c("country","religion"))
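As an alternative sketch (not the approach used above), the within-country proportion can also be computed directly on long-format data with base R's `ave()`, which avoids the reshape-and-merge round trip. The data frame `long` and its values are hypothetical.

```r
# Hypothetical long-format data: one row per country/religion pair
long <- data.frame(
  country    = c("A", "A", "A", "B", "B"),
  religion   = c("X", "Y", "Z", "X", "Y"),
  population = c(50, 30, 20, 10, 30),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the summed population of that row's country
long$Proportion <- long$population / ave(long$population, long$country, FUN = sum)

print(long)

# Sanity check: the proportions add up to 1 within each country
print(tapply(long$Proportion, long$country, sum))
```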
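# rounded proportions for reference; tab already holds row proportions
# (named prop_labels to avoid masking ggplot2::labs)
prop_labels <- round(tab, 2)
prop_labels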
## Buddhist Christian Folk Religions Hindu Jewish Muslim
## DR Congo 0.00 0.96 0.01 0.00 0.00 0.01
## Thailand 0.93 0.01 0.00 0.00 0.00 0.05
## Turkey 0.00 0.00 0.00 0.00 0.00 0.98
## Iran 0.00 0.00 0.00 0.00 0.00 0.99
## Egypt 0.00 0.05 0.00 0.00 0.00 0.95
## Germany 0.00 0.69 0.00 0.00 0.00 0.06
## Ethiopia 0.00 0.63 0.03 0.00 0.00 0.35
## Vietnam 0.16 0.08 0.45 0.00 0.00 0.00
## Philippines 0.00 0.93 0.02 0.00 0.00 0.06
## Mexico 0.00 0.95 0.00 0.00 0.00 0.00
## Japan 0.36 0.02 0.00 0.00 0.00 0.00
## Russia 0.00 0.73 0.00 0.00 0.00 0.10
## Bangladesh 0.00 0.00 0.00 0.09 0.00 0.90
## Nigeria 0.00 0.49 0.01 0.00 0.00 0.49
## Pakistan 0.00 0.02 0.00 0.02 0.00 0.96
## Brazil 0.00 0.89 0.03 0.00 0.00 0.00
## Indonesia 0.01 0.10 0.00 0.02 0.00 0.87
## United States 0.01 0.78 0.00 0.01 0.02 0.01
## India 0.01 0.03 0.00 0.80 0.00 0.14
## China 0.18 0.05 0.22 0.00 0.00 0.02
## Other Religions Unaffiliated
## DR Congo 0.00 0.02
## Thailand 0.00 0.00
## Turkey 0.00 0.01
## Iran 0.00 0.00
## Egypt 0.00 0.00
## Germany 0.00 0.25
## Ethiopia 0.00 0.00
## Vietnam 0.00 0.30
## Philippines 0.00 0.00
## Mexico 0.00 0.05
## Japan 0.05 0.57
## Russia 0.00 0.16
## Bangladesh 0.00 0.00
## Nigeria 0.00 0.00
## Pakistan 0.00 0.00
## Brazil 0.00 0.08
## Indonesia 0.00 0.00
## United States 0.01 0.16
## India 0.02 0.00
## China 0.01 0.52
religionByCountryTop20Final$country <- religionByCountryTop20Final$country %>%
factor(levels = sort(order$country) ,ordered=TRUE)
ord$prop <- ord$sum / sum(ord$sum)
religionByCountryTop20Final <- merge(ord,religionByCountryTop20Final,by = c("country"))
The following plot addresses the major issues in the original plot.
Here we use ggplot2 to visualize religion by country for the 20 most populous countries, using a different chart style (a mosaic plot).
# create mosaic plot on population proportion
p4 <- ggplot(religionByCountryTop20Final)
p5 <- p4 + geom_mosaic(aes(x = product(country), weight = population, fill = religion))
# hide the default x-axis text; the proportion labels are drawn manually below
# (the closing parenthesis of theme() was misplaced in the original, which dropped the label at 1)
p6 <- p5 + theme(axis.text.x = element_blank())
# Adding proportion labels (0.1 to 1) for the religion proportion
ticks <- seq(0.1, 1, by = 0.1)
p8 <- p6 + annotate("text", x = 0, y = ticks, label = round(ticks, 1))
# Adding proportion labels for the country population (label = 1 - position)
pos <- seq(0.1, 0.9, by = 0.1)
p11 <- p8 + annotate("text", x = pos, y = 1, label = round(1 - pos, 1))
# (II) Adding Proper Title and Naming the Axes
p12 <- p11 +labs(title = "Religions of Largest 20 Countries Population Proportion",y = "Religion",x = "Country")
p12 <- p12+coord_flip()
p12 <- p12+theme(axis.text.x=element_blank(),axis.ticks.x=element_blank())
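# (III) Color adjustment
p13 <- p12 + scale_fill_manual(values = c("#CDAD00", "#1874CD", "#EE4000", "#FFC125", "#FFAAAA", "#FF8247", "#5E5E5E", "#FF1493"))
# The palette was explored with colourpicker::plotHelper(); the call is left out here
# because it opens an interactive gadget and does not reliably assign p13
p13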
Color Saturation
We adjusted the hue and saturation of the palette so that the categories are easy to distinguish even for people with protanomaly, a form of red-green color blindness, which is the most common type of color blindness. With the original colors, some categories would be difficult for color-blind readers to tell apart, which makes the intended information hard to perceive. The grid below simulates the adjusted plot under several forms of color-vision deficiency, including tritanomaly, where it remains very clear.
p14<-cvd_grid(p13)
p14
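`cvd_grid()` simulates the whole plot. As a complementary sketch, the palette itself can be inspected with the simulation functions `deutan()`, `protan()` and `tritan()` from the colorspace package installed earlier. The names `pal` and `sim` are illustrative, and the functions are called with their default severity, which is an assumption of this sketch and not a setting taken from the report.

```r
library(colorspace)

# Palette passed to scale_fill_manual() in this report
pal <- c("#CDAD00", "#1874CD", "#EE4000", "#FFC125",
         "#FFAAAA", "#FF8247", "#5E5E5E", "#FF1493")

# Simulated appearance of each color under three kinds of color-vision deficiency
sim <- data.frame(
  original = pal,
  deutan   = deutan(pal),
  protan   = protan(pal),
  tritan   = tritan(pal),
  stringsAsFactors = FALSE
)

print(sim)
```

Colors that end up almost identical within one column are the ones most likely to be confused by readers with that deficiency.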