Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original


United Nations, Department of Economic and Social Affairs, Population Division (2019).


Objective

The original data visualisation was meant to give a high-level view of the relationship between population growth and median age by continent, and to highlight some of the countries that contribute to these trends. This is useful for governments, which can use the median age from their census data to project the need for age- or family-related services.

The visualisation chosen had the following three main issues:

  • The first issue was the lack of minor gridlines, which made it impossible to estimate the value of a point lying between the major gridlines.
  • The second issue was the crowding of points on the scatterplot, which made it impossible to tell the continents apart.
  • The last issue was the colors used. They were not colorblind friendly and, when overlaid, caused the continents to blend into each other.
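As an aside on the third issue, base R ships a qualitative palette designed to remain distinguishable under common forms of color blindness. The sketch below is only an illustration and is not the palette used in the code on this page; the variable name `continent_colours` is a hypothetical choice, and `palette.colors()` requires R 4.0.0 or later.

```r
# Okabe-Ito is a widely used colorblind-safe qualitative palette that is
# available in base R through grDevices::palette.colors() (R >= 4.0.0).
okabe_ito <- grDevices::palette.colors(n = 8, palette = "Okabe-Ito")

# Drop black (the first colour) so it stays free for text labels,
# leaving seven colours, enough for one per continent.
continent_colours <- unname(okabe_ito[-1])
print(continent_colours)
```

A vector like this could be passed to `scale_color_manual(values = continent_colours)` in place of hand-picked hex codes.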

Reference

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)

# Load the data (the path is specific to the author's machine).
Population_Dataset <- read.csv("D:/Masters Files/Visualizations/Assignment2/Population_Dataset.csv")
Population_Dataset$name <- row.names(Population_Dataset)

# ggplot() is used in place of qplot(), which is deprecated since ggplot2 3.4.0.
p1 <- ggplot(Population_Dataset,
             aes(x = Median_Age, y = Population_Growth, colour = Continent)) +
  # A single point layer; the original drew the points twice
  # (once via qplot() and once via geom_point()).
  geom_point(size = 0.3) +
  #geom_smooth(method = "lm") +
  labs(
    title = "Population growth rate vs Median age, 2020",
    subtitle = paste(
      "Median age is the age that divides the population in two parts of equal size, that is, there are as many persons with",
      "ages above the median as there are with ages below the median. In this metric of population growth changes due",
      "to migration are excluded and only births and deaths are taken into account.",
      sep = "\n"),
    caption = "Age Structure by Hannah Ritchie and Max Roser First published in September 2019",
    x = "Median Age (years)",
    y = "Annual Rate of Natural Population Increase (percent)") +
  scale_color_manual(values = c("#488f31", "#8aac49", "#c6c96a", "#ffe792", "#f8b267", "#eb7a52", "#d43d51")) +
  # Minor breaks add gridlines between the major ones (first issue).
  scale_y_continuous(minor_breaks = seq(-1, 10, 0.5), limits = c(-0.8, 4)) +
  scale_x_continuous(breaks = seq(15, 55, 5), minor_breaks = seq(15, 55, 2.5), limits = c(15, 55)) +
  # One panel per continent reduces crowding (second issue).
  facet_wrap(~ Continent, ncol = 4) +
  theme(
    aspect.ratio = 2,
    plot.title = element_text(color = "black", size = 14, face = "bold", hjust = 0.5),
    plot.subtitle = element_text(color = "black", size = 8, hjust = 0.5),
    plot.caption = element_text(color = "black", size = 6, face = "italic", hjust = 1),
    axis.title = element_text(color = "black", size = 7),
    axis.text.x = element_text(size = 5, angle = 45, face = "bold")) +
  # Label countries by name. The original subset(..., Entity > 1) compared
  # country names with a number and appears to filter nothing, so the full
  # data is used. family is a constant, so it is set outside aes().
  geom_text(aes(label = Entity),
            family = "serif",
            size = 1.8,
            check_overlap = TRUE,
            hjust = 0,
            nudge_x = 0.2,
            nudge_y = 0.05,
            lineheight = 1,
            colour = "black",
            fontface = "bold")

# Render the plot.
p1
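Because the CSV above is read from a local path, the code cannot be run as is on another machine. The following is a minimal, self-contained sketch of the same three ingredients (minor breaks, one facet per continent, and a manual palette) on a small made-up data frame. The column names mirror those used above, but every value in `demo` is invented for illustration, and `demo` and `p_demo` are hypothetical names.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration only.
set.seed(1)
demo <- data.frame(
  Entity            = paste("Country", 1:12),
  Continent         = rep(c("Africa", "Asia", "Europe"), each = 4),
  Median_Age        = c(runif(4, 15, 22), runif(4, 25, 35), runif(4, 38, 47)),
  Population_Growth = c(runif(4, 2, 3.5), runif(4, 0.5, 1.5), runif(4, -0.5, 0.3))
)

p_demo <- ggplot(demo, aes(x = Median_Age, y = Population_Growth, colour = Continent)) +
  geom_point(size = 0.8) +
  # Three colours taken from the seven-colour palette used above.
  scale_colour_manual(values = c("#488f31", "#ffe792", "#d43d51")) +
  # Major gridlines every 5 years, minor gridlines every 2.5 years.
  scale_x_continuous(breaks = seq(15, 55, 5),
                     minor_breaks = seq(15, 55, 2.5),
                     limits = c(15, 55)) +
  # Minor gridlines every 0.5 percentage points.
  scale_y_continuous(minor_breaks = seq(-1, 10, 0.5), limits = c(-0.8, 4)) +
  # One panel per continent so the groups no longer overlap.
  facet_wrap(~ Continent, ncol = 3)

print(p_demo)
```

Sharing the same axis limits across panels keeps the continents directly comparable even though each is drawn in its own facet.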

Data Reference

Reconstruction

The following plot fixes the main issues in the original.