Assignment 2

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original

Source: Essem Educational (2020).

Objective

The European Union consisted of 27 countries in 2007. The objective of this data visualisation was to show the 6 most populated countries of the European Union in that year, and the size of their populations relative to the rest of the Union. A pie chart with 7 slices was the visualisation of choice. The 6 most populated countries took up 6 slices of the pie, and the last slice, the remaining 29.7% of the pie, was labelled “All other countries (21)”.

The visualisation chosen had the following three main issues:

Colour issues

Pie charts rely on different coloured slices for the audience to visualise the different segments. First of all, there are two shades of red: a light red for Germany and a darker red for Poland. It is difficult to distinguish which slice of the pie belongs to which country.

There are also two shades of yellow: a light yellow for the United Kingdom and a darker yellow/orange for Spain. It is very difficult to tell the difference between the two.

In addition to the two shades of red, there is a green slice for Italy. This would make it difficult for anyone with red-green colour blindness (the most common form of colour blindness) to distinguish between Italy and the two red slices (Germany and Poland).
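As a rough illustration of the colour-blindness point, the sketch below assigns the seven slices colours from the Okabe-Ito palette, which was designed to stay distinguishable under common forms of colour blindness. This is not how the original chart was produced. The variable names and the choice of palette are assumptions made for illustration, and `palette.colors()` needs R 4.0.0 or later.

```r
# Slice labels as described in this report (six countries plus the "other" slice).
slice_labels <- c("Germany", "France", "United Kingdom", "Italy",
                  "Spain", "Poland", "All other countries (21)")

# Okabe-Ito is a colour-blind-friendly palette that ships with base R
# (grDevices, R >= 4.0.0). It offers up to 9 colours; 7 are needed here.
slice_colours <- palette.colors(n = length(slice_labels), palette = "Okabe-Ito")
names(slice_colours) <- slice_labels

# Show which hex colour each slice would receive.
print(slice_colours)
```

Each slice gets a distinct hue, so no two slices rely on light and dark shades of the same colour.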

Insufficient data

This pie chart shows only the 6 most populated countries of the European Union in 2007 individually. There were 27 countries in the European Union in 2007, meaning that 21 countries are not shown individually in this diagram. The problem with using a pie chart for this data is that pie charts can represent only a limited number of categories effectively, and they perform poorly when proportions are similar. Small slices are hard to see and label, and there isn’t enough room left in the pie to show how the smaller countries compare to each other and to the larger countries. The 21 smaller countries make up 78% of the countries in the European Union (21 of 27), yet they are merged into a single slice and none of them appears individually.
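The two proportions quoted in this report (29.7% for the last slice and 78% of the member countries) can be checked with a few lines of R. The sketch below assumes the Eurostat (2008) population figures, in thousands, that are used in the Code section of this report. The variable names are illustrative only.

```r
# 2007 populations in thousands (Eurostat 2008), as used in the Code section.
population <- c(Belgium = 10584, Bulgaria = 7679, "Czech R" = 10287,
                Denmark = 5447, Germany = 82314, Estonia = 1342,
                Ireland = 4312, Greece = 11171, Spain = 44474,
                France = 63392, Italy = 59131, Cyprus = 778,
                Latvia = 2281, Lithuania = 3384, Luxembourg = 476,
                Hungary = 10066, Malta = 407, Netherlands = 16358,
                Austria = 8298, Poland = 38125, Portugal = 10599,
                Romania = 21565, Slovenia = 2010, Slovakia = 5393,
                Finland = 5277, Sweden = 9113, UK = 60816)

# The six most populated countries, largest first.
top6 <- sort(population, decreasing = TRUE)[1:6]

# Share of the EU population held by the other 21 countries.
rest_share <- 100 * (sum(population) - sum(top6)) / sum(population)

# Share of member countries that do not get their own slice.
country_share <- 100 * (length(population) - 6) / length(population)

round(rest_share, 1)   # 29.7
round(country_share)   # 78
```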

Perceptual issues

From a perceptual standpoint, the pie chart was an odd choice. It somewhat explains the relationship between the top 6 countries in the European Union, but it does not explain the relationship between those top 6 and the smaller countries. How much more populated are the top 6 than any of the bottom 21? Which country is the least populated? It would have been nice to be able to compare some of the smaller countries with one another.

Without the data of the bottom 21 countries, the visualisation misleads us into thinking that countries like Poland, Spain and Italy aren’t very populated. This is because they are the least populated of the 6 countries included in the chart. The problem is that they are being compared to the three most populated countries in the Union, and to none of the bottom 21. If there were a way to compare the populations of Poland, Spain and Italy with any or all of the bottom 21 countries, we would get a better idea of how heavily populated these three countries actually are.

Reference

Essem Educational (2020) Describing Pie Charts, Essay Builder website, accessed 25 March 2023. http://www.essaybuilder.net/PieCharts.html

Code

The following code was used to fix the issues identified in the original.

# Packages needed for the pipe, mutate/arrange and the plots
library(dplyr)
library(ggplot2)

# 2007 population of each EU country, in thousands (Eurostat 2008)
df <- data.frame(Country = c("Belgium", "Bulgaria", "Czech R", "Denmark", "Germany", "Estonia", "Ireland", "Greece", "Spain", "France", "Italy", "Cyprus", "Latvia", "Lithuania", "Luxembourg", "Hungary", "Malta", "Netherlands", "Austria", "Poland", "Portugal", "Romania", "Slovenia", "Slovakia", "Finland", "Sweden", "UK"),
                Population = c(10584, 7679, 10287, 5447, 82314, 1342, 4312, 11171, 44474, 63392, 59131, 778, 2281, 3384, 476, 10066, 407, 16358, 8298, 38125, 10599, 21565, 2010, 5393, 5277, 9113, 60816))

# Each country's share of the total EU population, as a percentage, largest first
dfA <- df %>%
  mutate(Percentage = 100 * Population / sum(Population)) %>%
  arrange(desc(Percentage))


#First bar chart

df1 <- dfA[1:9, ]

p1 <- ggplot(data = df1, aes(x=reorder(Country, -Percentage), y = Percentage)) +
  geom_bar(stat="identity", width=0.5, fill= "red") +
  theme(axis.text.x = element_text(angle = 90, size = 10)) +
  labs(x = "Country") +
  labs(y = "Percentage of European Union") +
  labs(title = "Population of Countries of the European Union in 2007 by Percentage 1/3")


#Second bar chart

df2 <- dfA[10:18, ]

p2 <- ggplot(data = df2, aes(x=reorder(Country, -Percentage), y = Percentage)) +
  geom_bar(stat="identity", width=0.5, fill="blue") +
  labs(x = "Country") +
  labs(y = "Percentage of European Union") +
  labs(title = "Population of Countries of the European Union in 2007 by Percentage 2/3")


#Third bar chart

df3 <- dfA[19:27, ]

p3 <- ggplot(data = df3, aes(x=reorder(Country, -Percentage), y = Percentage)) +
  geom_bar(stat="identity", width=0.5, fill="green") +
  labs(x = "Country") +
  labs(y = "Percentage of European Union") +
  labs(title = "Population of Countries of the European Union in 2007 by Percentage 3/3")

Data Reference

Eurostat (2008) Population in Europe 2007: first results, Eurostat Website, accessed 25 March 2023. https://ec.europa.eu/eurostat/documents/3433488/5583236/KS-SF-08-081-EN.PDF.pdf/ff7fa28e-6f67-4d50-8a43-05f90e209f93?t=1414693674000

Reconstruction

The following plot fixes the main issues in the original. I split the countries across three bar charts, ordered by population. Each chart has its own y-axis scale, and I coloured the charts differently in an attempt to make it more obvious that the population sizes get lower with each visual. The population descends continuously across the 3 visualisations, so the most populated country of visual 2 (Portugal) is the next most populated country after the least populated country of visual 1 (Greece). Admittedly, it would have been nice to have all 27 countries on the one visual, but I wasn’t sure of the best way to do this. On the positive side, the audience can now compare countries of similar size with one another, which wasn’t possible from the pie chart because the 21 least populated countries weren’t shown individually. There should no longer be any colour or perceptual issues, and the data from all 27 countries is included.
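One possible way to fit all 27 countries on a single visual is a horizontal bar chart, where the country names sit on the vertical axis and have room to be read. The sketch below is only a suggestion and is not the approach used in this report. It assumes the same Eurostat (2008) figures as the Code section, and the names `eu` and `p_all`, the fill colour and the title are illustrative choices.

```r
library(ggplot2)

# 2007 populations in thousands (Eurostat 2008), as in the Code section.
eu <- data.frame(
  Country = c("Belgium", "Bulgaria", "Czech R", "Denmark", "Germany",
              "Estonia", "Ireland", "Greece", "Spain", "France", "Italy",
              "Cyprus", "Latvia", "Lithuania", "Luxembourg", "Hungary",
              "Malta", "Netherlands", "Austria", "Poland", "Portugal",
              "Romania", "Slovenia", "Slovakia", "Finland", "Sweden", "UK"),
  Population = c(10584, 7679, 10287, 5447, 82314, 1342, 4312, 11171,
                 44474, 63392, 59131, 778, 2281, 3384, 476, 10066, 407,
                 16358, 8298, 38125, 10599, 21565, 2010, 5393, 5277,
                 9113, 60816)
)

# Each country's share of the total EU population, as a percentage.
eu$Percentage <- 100 * eu$Population / sum(eu$Population)

# Ordering the factor ascending puts the largest country at the top
# once the axes are flipped.
p_all <- ggplot(eu, aes(x = reorder(Country, Percentage), y = Percentage)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(x = "Country",
       y = "Percentage of European Union population",
       title = "Population of Countries of the European Union in 2007")

print(p_all)
```

A single colour is used because the bar length already carries the comparison. The trade-off is that the smallest countries have very short bars next to Germany, which is the problem the three separate charts were meant to avoid.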


Deconstruct, Reconstruct Web Report

Patrick Campbell - 3990646
