Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The objective of the original visualisation is to highlight how America ranks with respect to the rest of world in terms of freedoms granted to its citizens and institutes using five different freedom indices. This is done showing the percentage difference (both positive and negative) in the press freedom, democracy, economic freedom, human freedom, and moral free index values between America and 25 other countries. Overall, as the title “is USA the most free and democratic country in the world? the short answer is no..” heavily implies, this visualisation intends to emphasize the lack of freedoms and therefore democracy present in America to contrast the widely held belief that America is a free nation. The target audience of this visualisation is the general public, but specifically is likely aimed at more nationalistic Americana audience to dissuade the viewer from holding a chauvinistic view towards the USA and the rights provided to its citizens. Ultimately this visualisation is designed to shock the viewer into altering their opinion on America’s democracy with respect to its standing on the global stage.
The visualisation chosen had the following three main issues:
Scale/use of radial bar charts: no value or percentage amounts are given, and the viewer simply must rely on visual comparison of area and angle to determine the percentage difference in values, in addition to the relatively small size of the radial bar charts, this makes visual comparison highly inaccurate.
Over faceted/visually confusing: having five quantitative variables all displayed as the same colour across 25 different radial bar charts forces the view to constantly check the legend as the location of each index is not intuitive.The use of the fill parameter to denote both percentage above and below the US value also adds an addition level of confusion.
Missing and biased data: Only 25 countries out of an average 172 listed in the indices are provided all of which on average rank higher than the USA across the five variables, meaning the countries shown display a heavy sampling bias as they do not represent the overall population well. Additionally , as the Human Freedom index(2020) accounts for both economic freedom and moral freedom i belive having two additional metrics to measure these is not only redundant but also skews the data to heavily favor countries with specific economic systems and ethical values as both metrics are relatively subjective and doesn’t completely encompass the full spectrum of what makes a country more democratic.
NOTE: All indices will allways contain their own bias as their is no universally accepted metric for democracy , however all are peer reviewed extensively to reduce this effect.
Reference
The following code was used to fix the issues identified in the original.
#Import packages:
library(readxl)
library(readr)
library(dplyr)
library(rapportools)
library(ggplot2)
library(tidyr)
library(knitr)
library(sysfonts)
library(showtext)
# Import Fonts needed
font_add(family = "Cons", regular = "constan.ttf",italic = "constani.ttf",bold ="constanb.ttf",bolditalic = "constanz.ttf" )
font_add(family = "TNR", regular = "times.ttf",italic = "timesi.ttf",bold ="timesbd.ttf",bolditalic = "timesbi.ttf" )
showtext_auto()
#---------------------- DATA ---------------------------------------------------
# Import Data sets:
# I decided to not use the economic freedom index and moral freedom index as 5
#metrics would cluster the visualization too much, additionally i thought both
# indices are summarized sufficiently in the human rights index calculation,
# allowing me to use unionization rates which acts as a less subjective metric for
#democracy , as economic and moral freedom values are dependent on what the designer
#of said index defines as amoral legislation and what economic freedoms are
#necessary for a complete democracy.
Democracy_indx2020 <- read_excel("Democracy_indx2020.xlsx")
HFI2020 <- read_excel("Humanfreedomindx2020.xlsx")
Press_fredom_index2020 <- read_excel("Press Freedom index.xlsx")
OECDunionisationrates <- read_excel("OECDunionisationrates.xlsx")
#Subset Data:
# Countries used: I decided to use 46 countries including America to get a better sample size whilst trying not to make the visualization too clustered. I chose the 38 OECD members to act as a good comparison against America as all countries have ideological similarities. I also included 9 countries such as China and Russia that is commonly associated in America as having a lack of freedom to see how these countries fare in comparison to America itself.
C_List <- c('Luxembourg','Ireland','Switzerland','Norway','United States','Iceland','Netherlands'
,'Austria','Denmark','Australia','Germany','Sweden','Belgium','Finland','Canada','United Kingdom'
,'France','Japan','Italy','New Zealand','South Korea','Israel','Czech Republic','Spain','Slovenia'
, 'Estonia','Slovakia','Portugal','Poland','Hungary','Greece','Turkey','Chile','Mexico','China'
, 'Russia','Saudi Arabia','Costa Rica','Afghanistan','Iraq','Kenya','Colombia','Latvia'
,'Iran','Lithuania')
Unionisation <- OECDunionisationrates %>% select(Country,`2018`)
Unionisation$`2018` <- as.numeric(Unionisation$`2018`)
Unionisation<- na.omit(Unionisation)
DemI <- Democracy_indx2020 %>% select(Country,`Overall score`)
HFI <- HFI2020 %>% select(`Country/Territory`,`Human Freedom`)
PressFI <- Press_fredom_index2020 %>% select(EN_country,`Score 2020...8`)
i<- 1
i1<- 1
i2<- 1
i3<- 1
u <- as.data.frame(matrix(nrow =29,ncol = 2))
colnames(u) <- c('Country','Unionisation rates (2018)')
d<- as.data.frame(matrix(nrow =44,ncol = 2))
colnames(d) <- c('Country','Democracy index (2020)')
h<- as.data.frame(matrix(nrow =44,ncol = 2))
colnames(h) <- c('Country','Human Freedom index(2020)')
p<- as.data.frame(matrix(nrow =45,ncol = 2))
colnames(p) <- c('Country','*Press Freedom index (2020)')
for(var in C_List){
a<- Unionisation[Unionisation$Country == var,]
b<- DemI[DemI$Country == var,]
c <- HFI[HFI$`Country/Territory` == var,]
e <- PressFI[PressFI$EN_country == var,]
if (is.empty(a) == FALSE){
u[i,] <- a
i <- i+1
}
if (is.empty(b) == FALSE){
d[i1,] <- b
i1<- i1+1
}
if (is.empty(c) == FALSE){
h[i2,] <- c
i2<- i2+1
}
if (is.empty(e) == FALSE){
p[i3,] <- e
i3<- i3+1
}
}
# Join all data:
df <- p %>% full_join(.,u,by = "Country")
df1<- df %>% left_join(h,by = "Country")
Data_full<- df1 %>% left_join(d,by = "Country")
#Convert to Long data format:
Data_fullL <- Data_full %>% pivot_longer(names_to = "Index" , values_to = "Value" , cols = 2:5)
#add rank , sum and rank/sum :
Data_fullL1 <- transform(subset(Data_fullL,Index != '*Press Freedom index (2020)'), Rank = ave(Value, Index , FUN = function(x) rank(-x,ties.method = "first")))
Data_fullL2 <- transform(subset(Data_fullL,Index == '*Press Freedom index (2020)'), Rank = ave(Value, Index , FUN = function(x) rank(x,ties.method = "last")))
Data_fullL <- bind_rows(Data_fullL1,Data_fullL2)
Data_fullL$Sum <- ifelse(Data_fullL$Index == 'Unionisation rates (2018)',length(u$`Unionisation rates (2018)`),
ifelse(Data_fullL$Index == 'Democracy index (2020)',length(d$`Democracy index (2020)`),
ifelse(Data_fullL$Index == 'Human Freedom index(2020)',length(h$`Human Freedom index(2020)`),
ifelse(Data_fullL$Index == '*Press Freedom index (2020)',length(p$`*Press Freedom index (2020)`),'NA'))))
Data_fullL <- Data_fullL %>% mutate(Rankoutof = paste(Rank,Sum,sep = "/"))
Data_fullL[Data_fullL$Country == "United States" & Data_fullL$Index == "Democracy index (2020)",6] <- paste(Data_fullL[Data_fullL$Country == "United States" & Data_fullL$Index == "Democracy index (2020)",6],'**',sep = "")
# Add mean:
Avg_u <-mean(u$`Unionisation rates (2018)`)
Avg_dem <- mean(d$`Democracy index (2020)`)
Avg_h <- mean(h$`Human Freedom index(2020)`)
Avg_P <- mean(p$`*Press Freedom index (2020)`)
#-------------------- PLOT ----------------------------------------------------
is_usa <- Data_fullL$Country == 'United States'
is_topperformers <- Data_fullL$Rank == 1
is_u <- Data_fullL$Index == 'Unionisation rates (2018)'
is_p <- Data_fullL$Index == '*Press Freedom index (2020)'
is_h <- Data_fullL$Index == 'Human Freedom index(2020)'
usa_data <- subset(Data_fullL,is_usa)
col <- ifelse(is_usa,'0',ifelse(is_topperformers,'1','2'))
Line_col <- "#EC6632"
# data:
plot <- ggplot(data = Data_fullL , aes(x = Value , y=reorder(Country,Value) , colour = col)) +
geom_point(stat = "identity", size =2) + facet_wrap(.~Index,scales = "free_x" , ncol = 4)+
geom_segment(aes(x=0,xend = Value, y =reorder(Country,Value),yend = reorder(Country,Value)),lwd = 1.5,show.legend = FALSE) +
theme(panel.spacing.x = unit(0.2,"lines"))
#mean lines
plot <- plot + geom_vline(data = filter(Data_fullL,Index == 'Unionisation rates (2018)') , aes(xintercept = Avg_u ),size = 1,linetype = "dashed" , color =Line_col ) +
geom_vline(data = filter(Data_fullL,Index == 'Democracy index (2020)') , aes(xintercept = Avg_dem),size = 1,linetype = "dashed" , color = Line_col ) +
geom_vline(data = filter(Data_fullL,Index == 'Human Freedom index(2020)') , aes(xintercept = Avg_h),size = 1,linetype = "dashed" , color =Line_col ) +
geom_vline(data = filter(Data_fullL,Index == '*Press Freedom index (2020)') , aes(xintercept = Avg_P),size = 1,linetype = "dashed" , color = Line_col ) + geom_text(aes(x=ifelse(Data_fullL$Index == 'Unionisation rates (2018)',Avg_u+34,0),y='Slovakia',label = ifelse(Data_fullL$Index == 'Unionisation rates (2018)',"*Mean Line",'')), size = 8, color = Line_col)
# Text
font <- "Cons"
plot <- plot + labs(title = "How Democratic is American Democracy?",subtitle = "America performs below the OCED average in metrics that account for democracy within the workplace (unionisation) and freedom afforded to the press \nSwitzerland,Sweeden,New Zealand,Luxembourg,Ireland,Iceland,Finland,Denmark,Australia,Canada and Germany all perform better than the US in ALL metrics",caption = "*The lower the value in the Press Freedom index the greater the degree of freedom is afforded to journalist \n **The Democracy index defines full democracies as values within 8-10 , flawed democracies within 6-8 , hybrid regimes within 4-6 and authoritarian\n regimes within 0-4. As the United States scores a value of 7.92 it is classified as a 'flawed democracy' under this metric\n Note: Data for every country is not present across all variables and a rank label for america out of the total data points present is provided ")
#TEXT aesthetics
plot <- plot + theme(text = element_text(family=font),plot.title = element_text(face = "bold",size = 35 , lineheight = 0.5),axis.text.y = element_text(size = 20,lineheight = 0.5,family = "TNR"),legend.key.size = unit(1,"cm"),legend.title = element_blank(),legend.text = element_text(size = 20),strip.text=element_text(size = 21),axis.title.x = element_blank(),axis.title.y = element_blank(),plot.caption = element_text(size =20,face = "italic",lineheight = 0.5,hjust = 0),plot.subtitle = element_text(size = 20,family = "TNR",lineheight = 0.5),axis.text.x = element_text(size = 25))+ geom_text(label = ifelse(is_usa,usa_data$Rankoutof,'') , hjust = 1 , nudge_x = ifelse(is_u,45,ifelse(is_p,30,ifelse(is_h,3.5,6))),size = 8,show.legend = FALSE,family ="TNR")
#color:
Back <- "#FFFAEB"
Back2 <- "#FFF5D6"
usa <- "#D83131"
other <- "#00A19D"
top <- "#E08700"
plot <- plot + theme(plot.background = element_rect(fill = Back),panel.background = element_rect(fill = Back),legend.background = element_rect(fill = Back),strip.background = element_rect(fill = Back2),panel.grid = element_line(colour = "lightgrey"),legend.key = element_rect(fill = Back))+ scale_color_manual(values = c('0' =usa,'2'= other,'1'=top),labels = c("United States","Other Countries","Top Performing Country"))
Data Reference
The following plot fixes the main issues in the original.