Updated 7/8/2019
In this article, I will share with you one way to remake an interesting Economist plot, “Cherry Bomb: the trend in early blossoming date of Sakura trees in Kyoto”, using the ‘ggplot2’ package in R. The raw data for this graph come from a phenological dataset collected by Dr. Yasuyuki Aono of Osaka Prefecture University [1, 2].
Below are the steps to remake the Economist Plot.
These are the basic packages that I will use. Along the way, I will load a few more packages right before the steps that need them, which makes it easier to understand and remember what each one is for.
library(dplyr) # to simplify data transformation
library(tidyr) # to tidy data
library(lubridate) # to deal with date
library(ggplot2) # to create plot
The first step is to load the dataset into R. In my case, I have already downloaded the dataset and saved it in my working directory.
sakura <- read.csv("sakura.csv")
sakura
These are the descriptions of each column:
The next step is data wrangling: tidying and transforming the raw dataset into a format that is easier to plot and contains only the columns we need.
# selecting column for plotting
sakura <- sakura %>%
select(AD, full.flowering.date)
sakura
# remove rows with a missing value (NA) in 'full.flowering.date'; rename AD to Year
sakura.used <- sakura %>%
  filter(!is.na(full.flowering.date)) %>%
  rename(Year = AD)
sakura.used
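A missing value in R is not the string "NA", so testing for it with `is.na()` states the intent more directly than a string comparison. Here is a minimal sketch on a made-up three-row data frame (the values are invented for illustration, not taken from the real dataset):

```r
library(dplyr)

# Made-up rows with the same column names as the selected data
toy <- data.frame(
  AD = c(812, 813, 815),
  full.flowering.date = c(401, NA, 415)
)

# Comparing a missing value with a string gives NA, not TRUE or FALSE
NA != "NA"

# Keep only the rows with a recorded date, then rename AD to Year
kept <- toy %>%
  filter(!is.na(full.flowering.date)) %>%
  rename(Year = AD)
kept
```

`filter()` drops rows where the condition is `NA`, so both spellings remove the same rows here, but `!is.na()` does not depend on that side effect.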
# separate full.flowering.date into month and day columns
# (sep = 1 splits after the first character, e.g. "401" becomes "4" and "01")
sakura.plot <- sakura.used %>%
  separate(full.flowering.date, into = c("month", "day"), sep = 1)
sakura.plot
# removing leading zero in day
library(numform)
sakura.plot$day <- sakura.plot$day %>%
f_num(zero = NULL, digits = 0)
sakura.plot
# make new column containing Date
sakura.plot <- sakura.plot %>%
transform(date = as.Date(paste(Year, month, day, sep = "-")))
sakura.plot
# replace the month column with the full month name taken from the date
sakura.plot <- sakura.plot %>%
  mutate(month = month(date, label = TRUE, abbr = FALSE),
         day = day(date))
sakura.plot
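For comparison, the same split can be done with integer arithmetic in base R, without any extra package. This is only a sketch: it assumes the date code is a number made of a one-digit month followed by a two-digit day (for example 401 for April 1), and the rows below are made up for illustration.

```r
# Made-up rows: year plus a month-day code (assumed format: MDD)
toy <- data.frame(
  Year = c(812L, 1500L, 2016L),
  full.flowering.date = c(401L, 415L, 330L)
)

# Integer division gives the month, the remainder gives the day
toy$month <- toy$full.flowering.date %/% 100L
toy$day   <- toy$full.flowering.date %% 100L

# Build a zero-padded ISO string and parse it with an explicit format
toy$date <- as.Date(
  sprintf("%04d-%02d-%02d", toy$Year, toy$month, toy$day),
  format = "%Y-%m-%d"
)
toy
```

Because `day` is already an integer here, there is no leading zero to remove afterwards.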
unique(sakura.plot$month)
## [1] April March May
## 12 Levels: January < February < March < April < May < June < ... < December
If we look at the levels of our month column, they run from January to December, so the facets would be drawn with March at the top and May at the bottom. I want the opposite order, so I will reverse the levels.
sakura.plot$month <- factor(sakura.plot$month, levels=rev(levels(sakura.plot$month)))
unique(sakura.plot$month)
## [1] April March May
## 12 Levels: December < November < October < September < August < ... < January
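Reversing the levels changes only their order, not the values stored in the column. A small base R sketch with a hand-made factor (not the real data) shows the effect on the three months that actually occur:

```r
# Hand-made factor using the built-in month names as levels
months_seen <- factor(c("April", "March", "May"), levels = month.name)

# Same values, levels reversed
reversed <- factor(months_seen, levels = rev(levels(months_seen)))

as.character(reversed)        # the values are unchanged
levels(droplevels(reversed))  # level order among the months present
```

The facets are laid out in level order, so after the reversal the latest month comes first.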
canvas <- ggplot(sakura.plot, aes(x = Year, y = day)) +
facet_grid(month~., scales = "free", space = "free", switch = "both")
canvas
geom1 <- canvas + geom_point(shape = 42, size = 6, color = "#af427e")
geom1
geom2 <- geom1 +
geom_smooth(aes(fill = "Trend"), span = 0.1, se = FALSE, color = "#644128") +
geom_smooth(aes(color = "Confidence Interval"), span = 0.1, fill = "#a56c56", linetype = 0)
geom2
geom3 <- geom2 +
scale_x_continuous(
limits = c(800, 2020), # start at 800 so the first break is inside the limits and gets drawn
breaks = c(800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2016),
labels = c("800", "", "1000", "", "1200", "", "1400", "", "1600", "", "1800", "", "", "2016")) +
scale_y_continuous(
breaks = c(1, 10, 20),
labels = c("1st", "10th", "20th"))
geom3
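The x-axis breaks and labels can also be generated instead of typed by hand. This sketch rebuilds the same two vectors in base R: a break every 100 years plus 2016, with a label on every second century and on 2016, and a blank at 2000 so it does not collide with 2016.

```r
# Breaks: every century from 800 to 2000, plus the final year 2016
year_breaks <- c(seq(800, 2000, by = 100), 2016)

# Labels: every 200 years before 2000, plus 2016; everything else is blank
show_label  <- (year_breaks %% 200 == 0 & year_breaks < 2000) | year_breaks == 2016
year_labels <- ifelse(show_label, as.character(year_breaks), "")

data.frame(year_breaks, year_labels)
```

These can be passed as `breaks = year_breaks, labels = year_labels` in `scale_x_continuous()`.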
geom4 <- geom3 +
labs(title = "Cherry Bomb",
subtitle = "Date of cherry-blossom peak-bloom in Kyoto, Japan, 800AD - 2016",
x = expression(italic("Year")),
y = expression(italic("Date of cherry-blossom peak-bloom")),
caption = "Source: Yasuyuki Aono, Osaka Prefecture University")
geom4
cherrybomb <- geom4 + theme(
panel.background = element_rect(fill = "#ffffff"),
panel.grid.major.y = element_line(color = "#a56c56",
linetype = "solid"),
panel.grid.major.x = element_blank(),
panel.grid.minor = element_blank(),
panel.spacing = unit(0.1, "cm"),
axis.line.x = element_line(colour = "black"),
axis.line.y = element_blank(),
axis.text = element_text(size = 8),
axis.title = element_text(size = 8),
legend.position = c(0.81, 1),
legend.key = element_blank(),
legend.box = "horizontal",
legend.box.spacing = unit(1, "mm"),
legend.title = element_blank(),
legend.text = element_text(size = 8),
legend.background = element_blank(),
plot.title = element_text(hjust = 0,
face = "bold",
size = 12),
plot.subtitle = element_text(hjust = 0,
face = "plain",
size = 10),
plot.caption = element_text(size = 8,
colour = "#B3B1B1",
hjust = 0),
strip.placement = "outside",
strip.background = element_rect(fill = "#e8dbd6"),
strip.switch.pad.grid = unit(0.2, "cm"),
strip.text = element_text(size = 8))
Here is our final Cherry Bomb Plot!
cherrybomb
This graph was both fun and challenging to make. Unfortunately, there are some features that I wasn’t able to remake:
Incomplete labels for the tick marks on the y-axis.
I find it difficult to reproduce the different y-axis tick marks and labels for each facet in the plot. Unfortunately, facet_grid does not yet let us set different limits and breaks for each facet, and I haven’t found another way to assign them. Thankfully, the community is working on it right now. For more info, you can click here.
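One possible partial workaround, which I have not checked against the full dataset, is to give `scale_y_continuous()` a function instead of a fixed vector of breaks. As far as I know, with free scales the function is called for each panel with that panel's limits. The helper `day_breaks` and the toy data below are hypothetical and only there for illustration.

```r
library(ggplot2)

# Hypothetical helper: keep the preferred breaks that fall inside the panel,
# and fall back to a single tick in the middle when none of them do
day_breaks <- function(limits) {
  preferred <- c(1, 10, 20)
  inside <- preferred[preferred >= limits[1] & preferred <= limits[2]]
  if (length(inside) > 0) inside else round(mean(limits))
}

# Made-up data: one panel with early days, one with late days only
toy <- data.frame(
  Year  = c(900, 1300, 1700, 900, 1300, 1700),
  day   = c(2, 12, 22, 27, 29, 31),
  month = factor(rep(c("April", "March"), each = 3), levels = c("April", "March"))
)

facet_plot <- ggplot(toy, aes(x = Year, y = day)) +
  geom_point() +
  facet_grid(month ~ ., scales = "free", space = "free") +
  scale_y_continuous(breaks = day_breaks)
facet_plot
```

This does not give full control over each facet, but it avoids a panel ending up with no tick marks at all.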
Real Sakura.
You may have noticed that the scatter points in the original Economist plot are actual sakura flowers. How cute is that! Unfortunately, on my graph I only used the ASCII asterisk (shape = 42), which I hope resembles a sakura flower. I haven’t found a way to use a picture as the point in geom_point, although there are packages that allow us to add .svg and .png pictures to a plot. For more info, you can explore the grImport2, grConvert, and png packages.
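A lighter alternative to a picture is to draw a flower-shaped Unicode character with `geom_text()`. This is only a sketch on made-up points: it uses U+273F (black florette), and whether the glyph shows up depends on the font and graphics device, so treat it as something to try rather than a guaranteed fix.

```r
library(ggplot2)

# Made-up points, only to show the glyph
toy <- data.frame(
  Year = c(900, 1300, 1700, 2000),
  day  = c(15, 12, 10, 5)
)

# Draw a florette character at each point instead of a plotting symbol
flower_plot <- ggplot(toy, aes(x = Year, y = day)) +
  geom_text(label = "\u273F", size = 6, colour = "#af427e")
flower_plot
```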
Thank you for reading!