Setup

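# Note: ggthemr may not be available from CRAN, in which case p_load() cannot
# install it automatically and it has to be installed from GitHub first.
library(pacman); p_load(ggplot2, ggthemr, ggrepel, scales)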

data

Rationale

A recent graph depicts the relationship between countries' GDP per capita (current US$) and their electricity consumption per capita in kilowatt-hours, with the caption “No such thing as a low-energy rich country”. The graph is very misleading because it appears to show a large quadrant of missing low-energy rich countries, which are actually concealed by the log scale on the y-axis. When you log one variable and not the other, you can easily make an asymptote or an extreme relationship appear.

For this reason, I downloaded the World Bank’s GDP per capita (current US$) data for 2017 and 2019 from https://data.worldbank.org/indicator/NY.GDP.PCAP.CD, and I scraped the IEA’s electricity consumption per capita data (in MWh) from http://energyatlas.iea.org/#!/tellmap/-1118783123/1. These are the last years for which the IEA had good data, since the quality of the data fell precipitously in 2020. I used 2017 and 2019 rather than 2018 because 2018 was not different enough from 2019 to be worth showing separately.
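To make the point about logging only one variable concrete, here is a minimal sketch using made-up data rather than the World Bank and IEA figures. It assumes a noisy power-law relationship (the constants `0.0005` and `0.9` and the variable names `gdp` and `mwh` are purely illustrative) and compares how well a straight line describes the data when only the y-variable is logged versus when both are.

```r
# Synthetic illustration only: these values are invented, not the real data.
set.seed(1)
gdp <- exp(runif(200, log(500), log(100000)))        # hypothetical GDP per capita (US$)
mwh <- 0.0005 * gdp^0.9 * exp(rnorm(200, sd = 0.3))  # assumed power law with noise

# Straight-line fit with only y logged versus with both variables logged
fit_semilog <- lm(log(mwh) ~ gdp)
fit_loglog  <- lm(log(mwh) ~ log(gdp))

r2_semilog   <- summary(fit_semilog)$r.squared
r2_loglog    <- summary(fit_loglog)$r.squared
slope_loglog <- unname(coef(fit_loglog)[2])

# Optional visual check: the semi-log plot bends sharply, the log-log plot does not
if (interactive()) {
  par(mfrow = c(1, 2))
  plot(gdp, log(mwh), main = "Only y logged")
  plot(log(gdp), log(mwh), main = "Both logged")
}
```

Under this assumed power law, the semi-log view bends sharply near the origin and then flattens, which is what reads as an asymptote, while the log-log view is close to a straight line whose slope approximates the assumed exponent.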

Analysis

First 2017

ggthemr("dust")

ggplot(data, aes(x = Year17, y = MWhPC17)) +
  geom_point(size = 2) +
  labs(
    x = "GDP Per Capita (Current US$)",
    y = "Electricity Consumption per Capita (MWh)",
    title = "Electricity and Income in 2017") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = dollar_format()) + 
  scale_y_continuous(labels = number_format())

ggplot(data, aes(x = log(Year17), y = MWhPC17)) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Electricity Consumption per Capita (MWh)",
    title = "Electricity and Log Income in 2017") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = number_format())

ggplot(data, aes(x = Year17, y = log(MWhPC17))) +
  geom_point(size = 2) +
  labs(
    x = "GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Income in 2017") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = dollar_format()) + 
  scale_y_continuous(labels = trans_format("exp", number_format()))

ggplot(data, aes(x = log(Year17), y = log(MWhPC17))) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Log Income in 2017") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = trans_format("exp", number_format()))

Now 2019

ggplot(data, aes(x = Year19, y = MWhPC19)) +
  geom_point(size = 2) +
  labs(
    x = "GDP Per Capita (Current US$)",
    y = "Electricity Consumption per Capita (MWh)",
    title = "Electricity and Income in 2019") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = dollar_format()) + 
  scale_y_continuous(labels = number_format())

ggplot(data, aes(x = log(Year19), y = MWhPC19)) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Electricity Consumption per Capita (MWh)",
    title = "Electricity and Log Income in 2019") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = number_format())

ggplot(data, aes(x = Year19, y = log(MWhPC19))) +
  geom_point(size = 2) +
  labs(
    x = "GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Income in 2019") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = dollar_format()) + 
  scale_y_continuous(labels = trans_format("exp", number_format()))

ggplot(data, aes(x = log(Year19), y = log(MWhPC19))) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Log Income in 2019") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = trans_format("exp", number_format()))
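The eight plots above differ only in the year and in which axes are logged, so the repetition could be folded into one function. The sketch below is one possible way to do that and is not the code used to produce the plots: `plot_energy_income()` and its arguments are hypothetical names, and it uses `scale_x_log10()` and `scale_y_log10()` so that ggplot2 transforms the axes itself instead of plotting `log()` values and relabelling them with `exp`. The axes are then in base 10 rather than natural logs, which changes the tick positions but not the shape of the point cloud.

```r
library(ggplot2)
library(scales)

# Hypothetical helper, not part of the original analysis.
# gdp_col and mwh_col are column names given as strings.
plot_energy_income <- function(df, gdp_col, mwh_col, year,
                               log_x = TRUE, log_y = TRUE) {
  # Pick a log10 or a linear scale for each axis; labels stay in original units
  x_scale <- if (log_x) scale_x_log10(labels = label_dollar()) else
    scale_x_continuous(labels = label_dollar())
  y_scale <- if (log_y) scale_y_log10(labels = label_number()) else
    scale_y_continuous(labels = label_number())

  ggplot(df, aes(x = .data[[gdp_col]], y = .data[[mwh_col]])) +
    geom_point(size = 2) +
    labs(
      x = "GDP Per Capita (Current US$)",
      y = "Electricity Consumption per Capita (MWh)",
      title = paste("Electricity and Income in", year)) +
    theme(plot.title = element_text(hjust = .5)) +
    x_scale +
    y_scale
}
```

With the column names used here, a call such as `plot_energy_income(data, "Year17", "MWhPC17", 2017, log_x = FALSE)` would give the version with only the y-axis logged.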

Discussion

The persuasiveness of the viral electricity-income graph derived substantially from its logged y-axis. When neither axis is logged, the plot looks like noise. When only the x-axis is logged, the same conclusion can be reached, but from a picture that is reflected across the diagonal. When both axes are logged, we get the most powerful conclusion, because the plot neither overstates the relationship nor becomes unusable due to the immense differences in scale on either axis. Instead, the relationship looks linear, as it actually seems to be, and it becomes apparent that the original conclusion, that abundance is necessary, is substantially correct, without giving the appearance of an asymptote or of a conclusion forced by scaling.
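One way to put a number on "the relationship becomes linear" is to fit a straight line to the logged variables: the slope is then the approximate percentage change in electricity use associated with a one percent change in income. The helper below is a hypothetical sketch (`loglog_slope()` is not part of the analysis above), and no results from the real data are reported here.

```r
# Hypothetical helper, not part of the original analysis.
# Fits log(y) ~ log(x), dropping rows with missing or non-positive values.
loglog_slope <- function(df, x, y) {
  ok <- is.finite(df[[x]]) & is.finite(df[[y]]) & df[[x]] > 0 & df[[y]] > 0
  lx <- log(df[[x]][ok])
  ly <- log(df[[y]][ok])
  fit <- lm(ly ~ lx)
  c(slope     = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared,
    n         = sum(ok))
}

# Made-up toy data, only to show the call; the NA row is dropped.
toy <- data.frame(gdp = c(1000, 5000, 20000, 80000, NA),
                  mwh = c(0.5, 2, 9, 30, 4))
toy_fit <- loglog_slope(toy, "gdp", "mwh")
```

For the data used here, the equivalent calls would be `loglog_slope(data, "Year17", "MWhPC17")` and `loglog_slope(data, "Year19", "MWhPC19")`.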

Since this gives us the most useful plot, here are both years with country labels.

ggplot(data, aes(x = log(Year17), y = log(MWhPC17))) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Log Income in 2017") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  # Breaks can also be set manually on the x-axis, e.g. breaks = c(6, 7, 8, 9, 10, 11)
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = trans_format("exp", number_format())) +
  # Manual breaks and labels are also possible on the y-axis:
  # scale_y_continuous(breaks = c(-2, 0, 2, 4),
  #                    labels = round(exp(c(-2, 0, 2, 4)), 3)) + 
  geom_text_repel(aes(label = Country))

ggplot(data, aes(x = log(Year19), y = log(MWhPC19))) +
  geom_point(size = 2) +
  labs(
    x = "Log GDP Per Capita (Current US$)",
    y = "Log Electricity Consumption per Capita (MWh)",
    title = "Log Electricity and Log Income in 2019") +
  theme(
    plot.title = element_text(hjust = .5),
    plot.caption = element_text(hjust = .5)) +
  scale_x_continuous(labels = trans_format("exp", dollar_format())) + 
  scale_y_continuous(labels = trans_format("exp", number_format())) + 
  geom_text_repel(aes(label = Country))