Objective
The main objective of the original data visualisation is to depict the average learning outcome and GDP per capita of various countries, based on standardised achievement tests taken by students across the world. It shows Gross Domestic Product (GDP) on the y-axis and average learning outcome on the x-axis. The test results were pooled across mathematics, science and reading at the primary and secondary levels of education. The GDP figures were adjusted for price differences over time. (Note: only the year 2015 is considered here.) Three major issues were obvious from looking at the visualisation:
Visual Perception and Colour Issue: The graph does not convey a clear message, as a good data visualisation should. The audience cannot recognise any important pattern by looking at it. Each bubble in the plot represents a country. The colour scheme does not consider colour-blind readers, who will find it difficult to tell green from red. For example, the big green circle for India contains tiny red circles representing countries such as Nicaragua, Honduras and Guatemala. The same problem recurs across the graph for other countries.
Deceptive method - Area and Size as quantity: The size of each bubble represents the country's population. Only three countries with very large populations, namely China, India and the US, can be seen clearly in the graph. First, population was not needed to answer the practical question. Second, all other countries are reduced to dots, which do not reveal their population accurately. This creates a deceptive impression for the audience. Moreover, because almost all countries appear as overlapping dots, it is hard to hover over a particular country or to tell countries apart. Using bubble size therefore prevents the graph from conveying a clear message.
Visual Bombardment: Kirk's third guiding principle is “Creating accessibility through intuitive design”, and the many colours used in the above plot work against effective visual communication. Grouping countries by continent makes the visualisation more complex and harder to understand. Information is lost in the visual bombardment of the large dataset used to plot the graph. There are far too many countries, and displaying all of them by over-plotting bubbles is inappropriate.
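On the colour-blindness issue raised above, the following is a minimal sketch of one possible remedy, assuming continent colouring were kept rather than dropped. It swaps the red/green scheme for the Okabe-Ito palette available through `palette.colors()` in base R (4.0.0 or later). The data frame `demo` and the function `continent_scatter` are names introduced here for illustration only; the values are copied from the 2015 rows printed in the code output of this report.

```r
library(ggplot2)

# Illustrative subset: 2015 values copied from the top-10 output in this report
demo <- data.frame(
  Country = c("Ireland", "Qatar", "Singapore", "United States"),
  Avg_learning_outcome = c(535.42, 438.90, 619.17, 529.09),
  GDP = c(54278, 156029, 65660, 52591),
  Continent = c("Europe", "Asia", "Asia", "North America")
)

# Okabe-Ito colours are designed to stay distinguishable under common
# forms of colour-vision deficiency; drop black and keep the next three
cb_palette <- unname(palette.colors(n = 4, palette = "Okabe-Ito"))[-1]

# Hypothetical helper: scatter plot with a fixed point size, so that
# population is not encoded as area
continent_scatter <- function(df, colours) {
  ggplot(df, aes(x = Avg_learning_outcome, y = GDP, colour = Continent)) +
    geom_point(size = 3) +
    scale_colour_manual(values = colours) +
    labs(x = "Average Learning Outcome", y = "Gross Domestic Product (GDP)") +
    theme_bw()
}

p <- continent_scatter(demo, cb_palette)
print(p)
```

A fixed point size also sidesteps the area-as-quantity problem, at the cost of no longer showing population at all.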
Reference
The following code was used to fix the issues identified in the original.
library(dplyr)    # filter(), top_n()
library(ggplot2)  # plotting
# Read the data from the CSV file
data <- read.csv('learning-outcomes-vs-gdp-per-capita.csv')
data1 <- filter(data, Year == 2015)        # keep only the year 2015
data1 <- subset(data1, select = -c(6, 7))  # drop unwanted columns 6 and 7
# Rename columns 1, 4 and 5
names(data1)[1] <- "Country"
names(data1)[4] <- "Avg_learning_outcome"
names(data1)[5] <- "GDP"
#View(data1)
# Select the top 10 countries by GDP (top_n() keeps the original row order)
top_10 <- top_n(data1, 10, GDP)
top_10
##                 Country Code Year Avg_learning_outcome    GDP     Continent
## 1               Ireland  IRL 2015               535.42  54278        Europe
## 2                Kuwait  KWT 2015               372.30  71354          Asia
## 3            Luxembourg  LUX 2015               508.88  55972        Europe
## 4                Norway  NOR 2015               530.48  82713        Europe
## 5                 Qatar  QAT 2015               438.90 156029          Asia
## 6          Saudi Arabia  SAU 2015               375.60  51681          Asia
## 7             Singapore  SGP 2015               619.17  65660          Asia
## 8           Switzerland  CHE 2015               543.83  59307        Europe
## 9  United Arab Emirates  ARE 2015               460.49  74746          Asia
## 10        United States  USA 2015               529.09  52591 North America
# Top 10 GDP countries
plot1 <- ggplot(top_10, aes(x = Country, y = GDP)) +
  geom_bar(stat = "identity", color = "#1A237E", fill = "#0277BD") +
  geom_text(aes(label = GDP), vjust = -0.2, size = 3.0) +
  labs(title = "GDP of Top 10 Countries in Year 2015",
       x = "Countries",
       y = "Gross Domestic Product (GDP)") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1, size = 11),
        axis.text.y = element_text(size = 12),
        axis.title = element_text(face = "bold"))
# Learning outcome of top 10 GDP countries
plot2 <- ggplot(top_10, aes(x = Country, y = Avg_learning_outcome)) +
  geom_bar(stat = "identity", color = "#1A237E", fill = "#0277BD") +
  geom_text(aes(label = Avg_learning_outcome), vjust = -0.2, size = 3.0) +
  labs(title = "Learning Outcome of Top 10 GDP Countries in Year 2015",
       x = "Countries",
       y = "Average Learning Outcome") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1, size = 11),
        axis.text.y = element_text(size = 12),
        axis.title = element_text(face = "bold"))
Data Reference
The graphs given below resolve all the issues that were stated for the original plot. The top 10 countries by GDP are selected and their GDP is shown in a bar plot. To make the average learning outcome of these highest-GDP countries easy to read, a second bar graph was plotted. To accommodate colour-blind readers, a single colour was chosen, so the learning outcomes can be compared without eye strain.
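The two single-colour bar charts described above could also be drawn as one faceted figure, so that GDP and learning outcome can be compared country by country. This is a sketch under assumptions, not the code used for this report. The data frame is rebuilt from the printed top-10 output, bars are ordered by GDP instead of alphabetically, and `long`, `Measure`, `Value` and `plot3` are names introduced here for illustration.

```r
library(ggplot2)
library(tidyr)

# Top-10 rows for 2015, copied from the printed output in this report
top_10 <- data.frame(
  Country = c("Ireland", "Kuwait", "Luxembourg", "Norway", "Qatar",
              "Saudi Arabia", "Singapore", "Switzerland",
              "United Arab Emirates", "United States"),
  Avg_learning_outcome = c(535.42, 372.30, 508.88, 530.48, 438.90,
                           375.60, 619.17, 543.83, 460.49, 529.09),
  GDP = c(54278, 71354, 55972, 82713, 156029,
          51681, 65660, 59307, 74746, 52591)
)

# Order countries by GDP (highest first) so both panels share one ordering
top_10$Country <- factor(top_10$Country,
                         levels = top_10$Country[order(-top_10$GDP)])

# Reshape to long form: one row per country and measure
long <- pivot_longer(top_10,
                     cols = c(GDP, Avg_learning_outcome),
                     names_to = "Measure",
                     values_to = "Value")

# One panel per measure, each with its own y-axis scale
plot3 <- ggplot(long, aes(x = Country, y = Value)) +
  geom_col(colour = "#1A237E", fill = "#0277BD") +  # single colour, as in the report
  facet_wrap(~ Measure, ncol = 1, scales = "free_y") +
  labs(title = "GDP and Learning Outcome of Top 10 GDP Countries in Year 2015",
       x = "Countries", y = NULL) +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1))

print(plot3)
```

Free y-axis scales are used because GDP and test scores differ by orders of magnitude; a shared axis would flatten the learning-outcome bars.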