Original


Source: https://i.redd.it/559j07gnzyia1.png.


Objective

Explain the objective of the original data visualisation and the targeted audience.

Data visualisation is the process of representing data in a visual format, such as charts, graphs, and maps, to facilitate understanding and analysis. The data chosen for this assessment covers annual healthcare expenditures due to insufficient physical activity. To serve its purpose, the original visualisation should let viewers read those figures correctly and understand why the data and the chart were produced. It does not, so the chart was reworked to give a new output.

The visualisation chosen had the following three main issues:

  • Issue No:1

The chart

This chart is difficult to understand because box charts provide less detail than other charts, such as histograms or density plots. They do not show the exact shape of the distribution or the frequency of individual values, which can be misleading. Box charts also make it hard to compare groups of data with different scales or distributions, and they may not accurately represent distributions that are highly skewed or have long tails. In such cases, other chart types or transformations may be more appropriate, which is the approach applied here.

  • Issue No:2

The colour

Colour is crucial in data visualisation because it significantly affects how people interpret and understand data. The issue found in this chart is that the boxes differ in size and value, yet they all share the same colour.

Colour can highlight patterns, trends, and relationships in the data. For example, using different colours to represent different categories or groups makes it easier to identify patterns and trends within each group. It can also make a visualisation more appealing and engaging, which helps to draw viewers' attention and increase their interest in the data.

Colour choice also determines how accessible a visualisation is to people with colour vision deficiencies. Using a palette with high contrast and avoiding certain colour combinations, such as red and green, makes a visualisation accessible to a broader range of viewers.

Colour can also convey meaning or emphasise important information. For example, using red to highlight an exceptionally high or low value draws attention to the significance of that value. Colours can carry emotions or associations relevant to the data as well: warm colours, such as red or orange, may give a sense of urgency or danger, while cool colours, such as blue or green, may convey calmness or serenity.

Careful and appropriate use of colour enhances comprehension, engagement, accessibility, meaning, and emotion in data visualisation; that is why we used one colour for one finding.
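As one possible illustration of the emphasis and accessibility points above, the sketch below gives the highest value an accent colour. It is not the approach used in the reconstruction, which keeps a single fill colour. The names `spend`, `highlight`, `accent_colours`, and `p_colour` are hypothetical and introduced only for this example; the values are the first five countries from the data used in this report, and the two hex codes are assumed to be the vermillion and blue of the Okabe-Ito colour-blind-friendly palette.

```r
library(ggplot2)

# First five countries, values copied from the data used in this report
spend <- data.frame(
  country = c("Germany", "Italy", "France", "Netherlands", "Portugal"),
  expenditures = c(2062, 1327, 1094, 309, 229)
)

# Flag the largest value so that it can take the accent colour
spend$highlight <- ifelse(spend$expenditures == max(spend$expenditures),
                          "Highest", "Other")

# Assumed Okabe-Ito vermillion and blue, chosen to avoid a red/green contrast
accent_colours <- c(Highest = "#D55E00", Other = "#0072B2")

p_colour <- ggplot(spend, aes(x = reorder(country, -expenditures),
                              y = expenditures, fill = highlight)) +
  geom_col() +
  scale_fill_manual(values = accent_colours) +
  labs(x = "Country", y = "Expenditures", fill = NULL)
p_colour
```

Only one bar carries the accent colour, so the emphasis stays tied to a single finding.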

  • Issue No:3

Size of the box

The boxes are the same size in some areas despite representing different values, which can confuse viewers. Using the same size makes it difficult to emphasise significant differences or patterns in the data, so viewers may struggle to identify which groups have higher or lower values, outliers, or extreme values. It also limits the information the chart conveys: viewers may miss important details that could be highlighted through different dimensions. It is essential to consider the purpose and context of the data visualisation before deciding on the appropriate chart design. That is why we have used a bar chart, in which bar length encodes each value, to present the corrected version.
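The point about size can be sketched as a sorted horizontal bar chart with each value printed beside its bar, so that both length and label carry the number. This is a variation for illustration under stated assumptions, not the exact chart in the reconstruction. The names `spend_all` and `p_size` are hypothetical; the values are copied from the data used in this report, with the spelling "Sweden" assumed for the sixth country.

```r
library(ggplot2)

# Values copied from the data used in this report
spend_all <- data.frame(
  country = c("Germany", "Italy", "France", "Netherlands", "Portugal", "Sweden",
              "Austria", "Spain", "Belgium", "Poland", "Ireland", "Greece",
              "Finland", "Romania", "Denmark", "Rest"),
  expenditures = c(2062, 1327, 1094, 309, 229, 295, 191, 822,
                   337, 169, 136, 104, 95, 119, 116, 189)
)

# Sort ascending and fix the factor order so the largest bar is drawn at the top
spend_all <- spend_all[order(spend_all$expenditures), ]
spend_all$country <- factor(spend_all$country, levels = spend_all$country)

p_size <- ggplot(spend_all, aes(x = expenditures, y = country)) +
  geom_col(fill = "blue") +
  # Print each value just past the end of its bar
  geom_text(aes(label = expenditures), hjust = -0.1, size = 3) +
  # Leave room on the right for the longest bar's label
  expand_limits(x = max(spend_all$expenditures) * 1.1) +
  labs(x = "Expenditures", y = "Country")
p_size
```

Horizontal bars keep the sixteen country names readable without rotating the axis text.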

Reference

Home (no date) Step Up! Tackling the Burden of Insufficient Physical Activity in Europe | OECD iLibrary. Available at: https://www.oecd-ilibrary.org/sites/8a333796-en/index.html?itemId=%2Fcontent%2Fcomponent%2F8a333796-en (Accessed: March 31, 2023).

OECD SPHEP models (no date). Available at: https://www.oecd.org/health/OECD-SPHEP-Models-Brochure-2020.pdf (Accessed: March 31, 2023).

R/dataisugly - population map - annual healthcare expenditures due to insufficient physical activity (2022) (no date) reddit. Available at: https://www.reddit.com/r/dataisugly/comments/115gi3i/population_map_annual_healthcare_expenditures_due/?utm_source=share&utm_medium=web2x&context=3 (Accessed: March 31, 2023).

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(dplyr)

# annual healthcare expenditures due to insufficient physical activity, by country
healthcare <- data.frame(
  country = c("Germany", "Italy", "France", "Netherlands", "Portugal", "Sweden",
              "Austria", "Spain", "Belgium", "Poland", "Ireland", "Greece",
              "Finland", "Romania", "Denmark", "Rest"),
  expenditures = c(2062, 1327, 1094, 309, 229, 295, 191, 822,
                   337, 169, 136, 104, 95, 119, 116, 189)
)

# reorder countries based on expenditure in descending order
healthcare$country <- reorder(healthcare$country, healthcare$expenditures, FUN = function(x) -x)

# bar chart: bar height encodes expenditure, one fill colour for the single measure
ggplot(data = healthcare, aes(x = country, y = expenditures)) +
  geom_col(fill = "blue") +
  labs(title = "Healthcare Expenditures by Country", x = "Country", y = "Expenditures") +
  # rotate the country names so that all 16 labels stay readable
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Data Reference

Home (no date) Step Up! Tackling the Burden of Insufficient Physical Activity in Europe | OECD iLibrary. Available at: https://www.oecd-ilibrary.org/sites/8a333796-en/index.html?itemId=%2Fcontent%2Fcomponent%2F8a333796-en (Accessed: March 31, 2023).

Reconstruction

The following plot fixes the main issues in the original.

ggplot(data = healthcare, aes(x = country, y = expenditures)) +
  geom_col(fill = "blue") +
  labs(title = "Healthcare Expenditures by Country", x = "Country", y = "Expenditures") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))