Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.
Objective
The visualizations above were adapted from CNN Health and aim to show that childhood obesity is a growing problem in many countries worldwide. They target a global audience: the general public in countries with high childhood obesity prevalence, policymakers, parents and families with children, and possibly education providers and childcare centers. According to the World Health Organization (2021), over 340 million children and adolescents aged 5-19 were overweight or obese in 2016. The World Health Organization has also found that childhood obesity not only contributes to poor health but is also associated with a higher chance of premature death and disability in adulthood, and may lead to chronic conditions such as cardiovascular disease, stroke, diabetes, mental illness, musculoskeletal disorders and certain types of cancer. The major causes of childhood obesity are a global shift in diet towards foods that are high in fat and sugar, low levels of physical activity, and other socioeconomic factors such as increased consumption of processed food, marketing of energy-dense foods and urbanization (World Health Organization, 2020). Because childhood obesity carries through into adulthood, the purpose of the visualizations is to raise awareness, both globally and among families, of how severe childhood obesity is and how important it is to prevent it. By understanding the seriousness of the issue and its root causes, parents and teachers can help children adopt a healthier lifestyle by limiting sugar and fat intake, choosing healthier foods and increasing daily physical activity; policymakers such as governments can implement health sector reforms to improve health services and respond better to the growing childhood obesity epidemic.
The visualizations were plotted from World Health Organization data. The data came from a pooled analysis of 2416 population-based studies with height and weight measurements on 128.9 million participants aged 5 and above, of whom 31.5 million were children and adolescents aged 5-19 (Lancet, 2017). The prevalence of obesity among children and adolescents aged 5-19 is defined as the percentage of that population with a Body Mass Index (BMI) more than two standard deviations above the WHO Growth Reference median (World Health Organization, 2021).
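As a rough illustration of this definition, the sketch below computes a prevalence figure from BMI-for-age z-scores. The z-scores are made-up values and `obesity_prevalence` is a hypothetical helper, not part of any WHO tooling; in practice the z-scores would be derived from the WHO Growth Reference for each child's age and sex.

```r
# Hypothetical BMI-for-age z-scores for ten children (illustrative values only).
bmi_z <- c(-0.4, 0.3, 2.5, 1.1, 2.1, -1.2, 0.8, 3.0, 1.9, 0.0)

# Prevalence of obesity: percentage of the population whose BMI is more than
# two standard deviations above the reference median (z-score > 2).
obesity_prevalence <- function(z, cutoff = 2) {
  100 * mean(z > cutoff, na.rm = TRUE)
}

prevalence <- obesity_prevalence(bmi_z)
print(prevalence)
```

Here three of the ten invented z-scores exceed +2, so the sketch reports a prevalence of 30%.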
Issues
The visualizations chosen had the following three main issues:
Poor choice of data visualization for showing obesity prevalence in younger children and adolescents around the world. The designer constructed two static choropleth maps, one for each age group. Because the maps are static rather than interactive, the designer could not include basic information for every country, such as its name and obesity prevalence, or additional information such as gender, year, region and age group. A better static choice would have been a bar graph, which can facet both age groups within a single visualization for side-by-side comparison. Owing to the limitations of static choropleth maps, the designer labelled only 12 of 195 countries with their obesity prevalence and gave no reasoning for choosing those 12. For example, Chile is labelled but not Argentina, which has a higher obesity prevalence; Nauru, which has the highest obesity prevalence, is labelled but not Burkina Faso, which has the lowest. It is unclear why the designer chose to draw the audience's attention to those 12 countries. The labels also lack pointers connecting them to the countries they describe, and several are poorly positioned. For example, the label for Saudi Arabia sits between Somalia and India, the label for France sits beside Spain, and the label beside the United States does not cover Alaska. For readers who are not familiar with geography, this can cause confusion. For example, they could easily mistake Saudi Arabia for a country on the light green end of the color scale and conclude that it has a low childhood obesity prevalence if they do not check the percentage on the label against the color scale legend.
Furthermore, readers who are not familiar with the world map cannot deduce any information from these visualizations without the help of the labels. They can only see which geographical areas have the lightest and darkest colors, without knowing which countries lie in those areas. If the audience must rely heavily on the labels instead of reading the message from the choropleth map itself, the map defeats its own purpose.
According to Kirk (2014) and Pandey et al. (2015), a data visualization is deceptive if the message it depicts differs from the actual message, regardless of whether the designer intended to deceive, when the designer has not taken responsibility and precautions to minimize the possibility of deceiving the audience. In this case, both visualizations are visually deceptive if the audience relies solely on the sequential color scale to draw conclusions. Looking at the first visualization, for children aged 5-9, without reading the labels, we would deduce that the United States of America, Argentina and Egypt have high childhood obesity prevalence compared with other countries; in the second visualization, for children aged 10-19, the United States of America appears to have the highest obesity prevalence, followed by Egypt and Saudi Arabia. The real data tell a different story. Based on the original World Health Organization data, the 10 countries with the highest childhood obesity prevalence are all Pacific Island countries: Nauru, Cook Islands, Palau, Niue, Marshall Islands, Tuvalu, Tonga, Kiribati, Micronesia and Samoa. However, these countries are small and are not visible on either world map, except for Nauru, which has been specifically labelled. Visually, Nauru is only a small dot that would go unnoticed without its label. So even an audience familiar with geography cannot draw the correct conclusion, because the key message is not present in the visualization in the first place. In addition, countries with no data available appear as grey patches on the world map, but the designer did not address them within the visualizations.
Both visualizations were tested using the Coblis color blindness simulator and showed no issues for audiences with color blindness.
Both visualizations could have been built for a better purpose. Based on the plot title, the designer wanted the audience to look at obesity prevalence in children and adolescents across all countries. However, this is uninformative: because the aim is to raise global awareness that childhood obesity is an emerging public health concern in many countries, the visualizations are useful only if they tell the audience which countries face this concern and urgently require attention. In other words, they should highlight the countries with the highest childhood obesity prevalence, so that interventions can be carried out at all levels in those countries to prevent childhood obesity and improve the current situation. Given that the data was plotted on a choropleth map, the audience would naturally want to know where childhood obesity is most and least prevalent. In the visualizations, obesity prevalence is split into 6 ranges on a sequential color scale from 0% to 30%. Apart from the 12 countries labelled with their obesity prevalence, the audience cannot compare obesity prevalence between two or more countries, nor see which countries have the highest and lowest prevalence, because choropleth maps rely on color, which has low visual comparison accuracy. The most they can conclude is that a group of countries has similar childhood obesity prevalence within a certain range, and again, this is possible only if the audience is familiar with the countries on the world map. It would be more useful if the audience could see the global ranking of the countries with the highest and lowest childhood obesity prevalence for both age groups.
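To see why a binned color scale hides differences between countries, the sketch below assigns a few prevalence values to six equal-width bins from 0% to 30%. The bin edges are an assumption (the original legend may use different breaks), and the country labels and values are invented for illustration.

```r
# Hypothetical prevalence values (%), invented for illustration.
prevalence <- c(A = 10.2, B = 14.8, C = 15.1, D = 29.9, E = 33.5)

# Assumed legend: six equal-width bins covering 0% to 30%.
breaks <- seq(0, 30, by = 5)
bins <- cut(prevalence, breaks = breaks, include.lowest = TRUE)

print(data.frame(country = names(prevalence), prevalence, bin = bins))
```

In this toy case A and B differ by 4.6 percentage points yet share a bin (and so a color), B and C differ by only 0.3 points yet land in different bins, and E lies above the top edge of the assumed scale and gets no bin at all. A bar chart that encodes the value as length avoids all three problems.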
Reference
Baglin, J. (2020). Chapter 4 Avoiding Deception [Course Material]. Canvas @ RMIT University. https://dark-star-161610.appspot.com/secured/_book/avoiding-deception.html
Baglin, J. (2021). Chapter 6: Multivariate Strategies [Lecture slides]. Canvas @ RMIT University. https://dark-star-161610.appspot.com/secured/_book/demos/DataVis-Week-07-Demo.html#/
Colblindor. (n.d.). Coblis — Color Blindness Simulator. https://www.color-blindness.com/coblis-color-blindness-simulator/
Howard, J. (2019, February 13). Why these Pacific Island nations have world’s highest childhood obesity prevalences. CNN Health. https://edition.cnn.com/2019/02/13/health/child-obesity-parenting-without-borders-intl/index.html
Kirk, A. (2014). The Fine Line Between Confusion and Deception. https://www.visualisingdata.com/2014/04/the-fine-line-between-confusion-and-deception/
Lancet. (2017). Worldwide trends in body-mass index, underweight, overweight, and obesity from 1975 to 2016: a pooled analysis of 2416 population-based measurement studies in 128·9 million children, adolescents, and adults. Lancet, 390(10113), 2627–2642. https://doi.org/10.1016/S0140-6736(17)32129-3
Pandey, A. V., Rall, K., Satterthwaite, M. L., Nov, O., & Bertini, E. (2015). How Deceptive are Deceptive Visualizations?: An Empirical Analysis of Common Distortion Techniques. Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15), 1469–1478. https://doi.org/10.1145/2702123.2702608
World Health Organization. (2020). Noncommunicable diseases: Childhood overweight and obesity. https://www.who.int/news-room/q-a-detail/noncommunicable-diseases-childhood-overweight-and-obesity
World Health Organization. (2021). Obesity and overweight. https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight
The following code was used to fix the issues identified in the original.
library(ggplot2)
library(readr)
library(tidyverse)
library(magrittr)
library(forcats)
library(tidytext)
library(ggrepel)
library(colourpicker)
library(RColorBrewer)
library(ggthemes)
library(ggtext)
library(cowplot)
library(countrycode)
library(ggpubr)
setwd("C:/Users/laite/Desktop/Data Visualization/Assignment 2")
#1st data set (Young Children (Age 5-9))
young <- read_csv("data_gender (5-9).csv", skip=3) %>%
separate(`Both sexes`, into = c("Young Children (Age 5-9)", "Range"), sep = " ") %>%
select(`Country`, `Young Children (Age 5-9)`)
young$`Young Children (Age 5-9)` %<>% as.numeric()
young$Country %<>% as.factor()
#Drop rows with missing values (the piped data frame is already the first argument)
young %<>% na.omit()
#2nd data set (Adolescents (Age 10-19))
adol <- read_csv("data_gender (10-19).csv", skip=3) %>%
separate(`Both sexes`, into = c("Adolescents (Age 10-19)", "Range"), sep = " ") %>%
select(`Country`, `Adolescents (Age 10-19)`)
adol$`Adolescents (Age 10-19)` %<>% as.numeric()
adol$Country %<>% as.factor()
#Drop rows with missing values (the piped data frame is already the first argument)
adol %<>% na.omit()
#Join both data sets together under complete_join
complete_join <- young %>% full_join(adol, by = "Country") %>%
pivot_longer(names_to = "Children and Adolescents", values_to = "Prevalence of Obesity (%)", cols = 2:3) %>%
mutate(Continent = countrycode(Country, 'country.name', 'continent'))
#Factorise and level the two age groups
complete_join$`Children and Adolescents` %<>% factor(levels = c("Young Children (Age 5-9)", "Adolescents (Age 10-19)"))
#Create a rank variable and keep the 20 countries with the highest and the 20 with the lowest obesity prevalence in each age group
top_join <- complete_join %>%
mutate(Country = reorder_within(Country, `Prevalence of Obesity (%)`,`Children and Adolescents`)) %>%
group_by(`Children and Adolescents`) %>%
mutate(`Global Rank` = rank(-`Prevalence of Obesity (%)`, ties.method = "min")) %>% top_n(-20, `Global Rank`) #rank 1 = highest prevalence; keep ranks 1-20 (ties included)
bottom_join <- complete_join %>%
mutate(Country = reorder_within(Country, -`Prevalence of Obesity (%)`,`Children and Adolescents`)) %>%
group_by(`Children and Adolescents`) %>%
mutate(`Global Rank` = rank(`Prevalence of Obesity (%)`, ties.method = "min")) %>% top_n(-20, `Global Rank`) #rank 1 = lowest prevalence; keep ranks 1-20 (ties included)
#Colorblind friendly color palette from ggthemes() and RColorBrewer
pal <- c('Africa' = "#0072B2", 'Americas' = "#FDBF6F", 'Asia' = "#CAB2D6", 'Europe' = "#A6CEE3", 'Oceania' = "#E31A1C")
#Plot top 20 highest
plot_top_join <- ggplot(data = top_join,
aes(x=Country, y=`Prevalence of Obesity (%)`, fill = `Continent`)) +
geom_bar(stat = "identity") +
coord_flip() +
scale_x_reordered() +
scale_y_continuous(breaks = seq(0, 40, by = 10)) +
facet_wrap(~`Children and Adolescents`, scales="free_y") +
geom_text(aes(label = `Global Rank`, y=0), hjust = "right", size = 17, color = "gray10", nudge_y = -0.15) +
geom_text(aes(label = `Prevalence of Obesity (%)`), hjust = 0, size = 17, color = "black", nudge_y = 0.15) +
labs(
title = "Top 20 Countries with the <span style = 'color: red;'>Highest</span> Obesity Prevalence* in Young Children and Adolescents",
subtitle = "Countries of the <span style = 'color: blueviolet;'>Pacific Islands</span> in the <span style = 'color: #E31A1C;'><b>Oceania</b></span> continent have the Highest Prevalence of Childhood Obesity") +
theme_bw() + scale_fill_manual(values = pal) +
theme(axis.title.x = element_text(size = 73, face = "bold", margin = margin(30,0,30,0)),
axis.text.y = element_text(size = 68, colour = "black"),
axis.text.x = element_text(size = 68, colour = "black"),
plot.title = element_markdown(size = 85, face = "bold", hjust = 0, color = "black", margin = margin(0,0,50,0)),
plot.subtitle = element_markdown(size = 80, hjust = 0, color = "navy", margin = margin(10,0,60,0)),
strip.text.x = element_text(size = 68, color = "black", face = "bold", margin = margin(20,0,20,0)),
strip.background = element_rect(color= "gray20", fill="ivory", size=1, linetype="solid"),
axis.title.y = element_blank(),
legend.title = element_text(size = 73, face = "bold"),
legend.text = element_text(size = 68, margin = margin(0,30,0,0)),
legend.key.size = unit(5,"line"),
panel.grid.minor = element_blank())
#Plot top 20 lowest
plot_bottom_join <- ggplot(data = bottom_join,
aes(x=Country, y=`Prevalence of Obesity (%)`, fill = `Continent`)) +
geom_bar(stat = "identity") +
coord_flip() +
scale_x_reordered() +
scale_y_continuous(breaks = seq(0, 10, by = 1)) +
facet_wrap(~`Children and Adolescents`, scales="free_y") +
geom_text(aes(label = `Global Rank`, y=0), hjust = "right", size = 17, color = "gray10", nudge_y = -0.02) +
geom_text(aes(label = `Prevalence of Obesity (%)`), hjust = 0, size = 17, color = "black", nudge_y = 0.02) +
labs(
title = "Top 20 Countries with the <span style = 'color: green;'>Lowest</span> Obesity Prevalence* in Young Children and Adolescents",
subtitle = "Countries of <span style = 'color: #0072B2;'><b>Africa</b></span> have the Lowest Prevalence of Childhood Obesity",
caption = "Data Source: World Health Organization, 2017a; World Health Organization, 2017b.\n *Prevalence of Obesity among Children and Adolescents: BMI > +2 standard deviations above the median.\n There is no data available for Monaco (Europe), San Marino (Europe), Sudan (Africa) and South Sudan (Africa).") +
theme_bw() + scale_fill_manual(values = pal) +
theme(axis.title.x = element_text(size = 73, face = "bold", margin = margin(30,0,40,0)),
axis.text.y = element_text(size = 68, colour = "black", margin= margin(0,0,0,30)),
axis.text.x = element_text(size = 68, colour = "black"),
plot.title = element_markdown(size = 85, face = "bold", hjust = 0, color = "black", margin = margin(0,0,50,0)),
plot.subtitle = element_markdown(size = 80, hjust = 0, color = "navy", margin = margin(10,0,60,0)),
plot.caption = element_text(size = 60, hjust = 1, margin = margin(20,0,30,0)),
strip.text.x = element_text(size = 68, color = "black", face = "bold", margin = margin(20,0,20,0)),
strip.background = element_rect(color = "gray20", fill="ivory", size=1, linetype="solid"),
axis.title.y = element_blank(),
legend.title = element_text(size = 73, face = "bold"),
legend.text=element_text(size = 68, margin = margin(0,30,0,0)),
legend.key.size = unit(5,"line"),
panel.grid.minor = element_blank())
#Stack both plots vertically with a single shared legend
plot_combined <- ggarrange(plot_top_join, plot_bottom_join, common.legend = TRUE, legend = "right", ncol = 1, align = "hv")
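The ranking step in the script relies on `rank()` with `ties.method = "min"`, which gives tied countries the same rank and then skips the ranks they would otherwise have occupied. A minimal base-R sketch, using invented values and placeholder country labels rather than the WHO data, shows the behaviour within each age group:

```r
# Invented prevalence values; country labels are placeholders.
toy <- data.frame(
  group = c("5-9", "5-9", "5-9", "10-19", "10-19", "10-19"),
  country = c("A", "B", "C", "A", "B", "C"),
  prevalence = c(12.0, 30.5, 30.5, 8.0, 9.5, 25.0)
)

# Rank within each age group; negating puts the highest prevalence at rank 1.
toy$global_rank <- ave(
  toy$prevalence, toy$group,
  FUN = function(x) rank(-x, ties.method = "min")
)

print(toy[order(toy$group, toy$global_rank), ])
```

In the "5-9" group, B and C tie at rank 1 and A drops to rank 3. This is why filtering on the rank can return more than 20 rows per group when countries tie at the cutoff.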
Data Reference
World Health Organization. (2017a). Prevalence of obesity among children and adolescents, BMI>+2 standard deviation above the median, crude estimates by country, among children aged 5-9 years. https://apps.who.int/gho/data/view.main.BMIPLUS2C05-09v?lang=en
World Health Organization. (2017b). Prevalence of obesity among children and adolescents, BMI>+2 standard deviation above the median, crude estimates by country, among children aged 10-19 years. https://apps.who.int/gho/data/view.main.BMIPLUS2C10-19v?lang=en
The following plot fixes the main issues in the original.