
Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


Source: Jeff (2019).


Objective

The original data visualisation aims to show how the size of the world’s ten largest economies changes between 2017 and 2030. The target audience is the general public, who can use the chart to understand the economic outlook of their own country or region. The data in the graph are not detailed enough for professionals such as economists and investors.

The visualisation chosen had the following three main issues:

  • Too many circles in the same graph create visual confusion and make it difficult for readers to understand the relationships in the data.
  • The country/region names and figures are not prominent enough, which makes them difficult to read.
  • Cluttered information and graphics, such as the No. 1 medals and the unnecessary breakdown of India’s internal GDP, distract from the purpose of the chart.

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(ggrepel)

# Prepare the first five countries (GDP in trillions)
df_1 <- data.frame("Country" = c("China", "India", "United States", "Indonesia", "Brazil"),
                   "2017" = c(23.2, 9.5, 19.4, 3.2, 3.2),
                   "2030" = c(64.2, 46.3, 31.0, 10.1, 8.6))
# data.frame() renames "2017"/"2030" to X2017/X2030, so restore the year names
colnames(df_1) <- c("Country", "2017", "2030")
left_label <- paste(df_1$Country, df_1$`2017`, sep = " -> ")
right_label <- paste(df_1$Country, df_1$`2030`, sep = " -> ")
# Red if GDP grows by less than 100% (less than doubles), otherwise green
df_1$color <- ifelse((df_1$`2030` - df_1$`2017`) / df_1$`2017` < 1, "red", "green")

# Build the slope chart: one segment per country from its 2017 value to its 2030 value
p_1 <- ggplot(df_1) +
  geom_segment(aes(x = 1, xend = 2, y = `2017`, yend = `2030`, col = color), show.legend = FALSE) +
  scale_color_manual(values = c("green" = "green3", "red" = "red")) +
  geom_vline(data = data.frame(xintercept = c(1, 2)), aes(xintercept = xintercept)) +
  xlab("interval") +
  ylab("GDP (trillion)") +
  xlim(0.5, 2.5) +
  ylim(0, (1.2 * max(df_1$`2030`))) +
  # The colour rule is based on relative growth, not on the slope of the line
  ggtitle("Green (growth >= 100%) Red (growth < 100%)")

# Add the country labels and the year headings
p_1 <- p_1 + geom_text_repel(label=left_label,x=rep(1, NROW(df_1)),y=df_1$`2017`, hjust=1.1, size=4)
p_1 <- p_1 + geom_text_repel(label=right_label,x=rep(2, NROW(df_1)),y=df_1$`2030`,hjust=-0.1, size=4)
p_1 <- p_1 + geom_text(label="Year 2017", x=1, y=1.2*(max(df_1$`2030`)), hjust=1.3, size=6)
p_1 <- p_1 + geom_text(label="Year 2030", x=2, y=1.2*(max(df_1$`2030`)), hjust=-0.3, size=6)

# The 10 countries are split across two charts because plotting them all together makes the chart too dense to read

# Prepare the last five countries (GDP in trillions)
df_2 <- data.frame("Country" = c("Egypt", "Russia", "Japan", "Germany", "Turkey"),
                   "2017" = c(1.2, 4.0, 5.4, 4.2, 2.2),
                   "2030" = c(8.2, 7.9, 7.2, 6.9, 9.1))
# data.frame() renames "2017"/"2030" to X2017/X2030, so restore the year names
colnames(df_2) <- c("Country", "2017", "2030")
left_label <- paste(df_2$Country, df_2$`2017`, sep = " -> ")
right_label <- paste(df_2$Country, df_2$`2030`, sep = " -> ")
# Red if GDP grows by less than 100% (less than doubles), otherwise green
df_2$color <- ifelse((df_2$`2030` - df_2$`2017`) / df_2$`2017` < 1, "red", "green")

# Build the slope chart: one segment per country from its 2017 value to its 2030 value
p_2 <- ggplot(df_2) +
  geom_segment(aes(x = 1, xend = 2, y = `2017`, yend = `2030`, col = color), show.legend = FALSE) +
  scale_color_manual(values = c("green" = "green3", "red" = "red")) +
  geom_vline(data = data.frame(xintercept = c(1, 2)), aes(xintercept = xintercept)) +
  xlab("interval") +
  ylab("GDP (trillion)") +
  xlim(0.5, 2.5) +
  ylim(0, (1.2 * max(df_2$`2030`))) +
  # The colour rule is based on relative growth, not on the slope of the line
  ggtitle("Green (growth >= 100%) Red (growth < 100%)")

# Add the country labels and the year headings
p_2 <- p_2 + geom_text_repel(label=left_label,x=rep(1, NROW(df_2)),y=df_2$`2017`, hjust=1.1, size=4)
p_2 <- p_2 + geom_text_repel(label=right_label,x=rep(2, NROW(df_2)),y=df_2$`2030`,hjust=-0.1, size=4)
p_2 <- p_2 + geom_text(label="Year 2017", x=1, y=1.2*(max(df_2$`2030`)), hjust=1.3, size=6)
p_2 <- p_2 + geom_text(label="Year 2030", x=2, y=1.2*(max(df_2$`2030`)), hjust=-0.3, size=6)
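
The colour assigned to each line depends on relative growth, (2030 value - 2017 value) / 2017 value, not on the raw slope of the segment. The sketch below works through that rule for the first five countries in base R. The helper name growth_colour and the column names gdp_2017, gdp_2030 and growth are assumptions introduced here for illustration; they do not appear in the code above.

```r
# Hypothetical helper (not part of the original code): applies the same
# rule as the ifelse() calls above to a pair of GDP values.
growth_colour <- function(gdp_2017, gdp_2030) {
  # Relative growth of 1 means GDP exactly doubles between the two years
  relative_growth <- (gdp_2030 - gdp_2017) / gdp_2017
  ifelse(relative_growth < 1, "red", "green")
}

# Same figures as df_1, with syntactic column names (an assumption made
# here so that no backticks are needed)
gdp <- data.frame(
  Country  = c("China", "India", "United States", "Indonesia", "Brazil"),
  gdp_2017 = c(23.2, 9.5, 19.4, 3.2, 3.2),
  gdp_2030 = c(64.2, 46.3, 31.0, 10.1, 8.6)
)

# Show the relative growth next to the colour it produces
gdp$growth <- round((gdp$gdp_2030 - gdp$gdp_2017) / gdp$gdp_2017, 2)
gdp$color  <- growth_colour(gdp$gdp_2017, gdp$gdp_2030)
print(gdp)
```

Under this rule a country is drawn in red only when its GDP less than doubles, so a steep line for a large economy can still be red while a shallow line for a small economy is green.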

Data Reference

Reconstruction

The following plot fixes the main issues in the original.