Welcome to your last ggplot2 live workshop!
In this exercise you will be using a subset of the
gapminder dataframe. Create a subset called
gap_small, which only contains data from 1952 and 2007.

library(tidyverse)
library(gapminder)

gap_small <- gapminder %>%
  filter(year == 1952 | year == 2007)

First, plot a simple histogram showing the distribution of life expectancy (lifeExp) in your dataframe.

ggplot(data = gap_small,
       mapping = aes(x = lifeExp)) +
  geom_histogram()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Set binwidth = 5 so that each bar covers a 5-year range (this also
silences the `stat_bin()` message about the default number of bins).

ggplot(data = gap_small,
       mapping = aes(x = lifeExp)) +
  geom_histogram(binwidth = 5)

Use the fill argument to create a stacked histogram with
two fill colors, one for each year.

ggplot(data = gap_small,
       mapping = aes(x = lifeExp, fill = as.factor(year))) +
  geom_histogram(binwidth = 5)

Hint: year is treated as numerical, so ggplot() will try
to map it as a continuous variable. To get two distinct colors, you will
need to tell R to treat it as a factor.
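
To see why the conversion in the hint matters, it can help to inspect the column directly. The following is a minimal sketch, assuming the gapminder and dplyr packages are installed; the name year_factor is only an illustration and is not part of the exercise.

```r
library(gapminder)
library(dplyr)

# Same subset as in the exercise: only the years 1952 and 2007.
gap_small <- gapminder %>%
  filter(year == 1952 | year == 2007)

# year is stored as a number, so ggplot2 would map it to a continuous scale.
is.numeric(gap_small$year)

# Converting to a factor turns the two years into discrete levels,
# which is what gives two distinct fill colors.
year_factor <- as.factor(gap_small$year)
levels(year_factor)
```

Wrapping year in as.factor() inside aes(), as the solution does, has the same effect without modifying the dataframe.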
Change the position argument to overlap the two distributions. Then, add a degree of transparency so that you can see where the bars overlap.

ggplot(data = gap_small,
       mapping = aes(x = lifeExp, fill = as.factor(year))) +
  geom_histogram(binwidth = 5, alpha = 0.6,
                 position = "identity")

Create small multiples of the plot above using
facet_wrap(), with one panel for each continent.

ggplot(data = gap_small,
       mapping = aes(x = lifeExp, fill = as.factor(year))) +
  geom_histogram(binwidth = 5, alpha = 0.6,
                 position = "identity") +
  facet_wrap(~continent)

Next, use facet_grid() to further subdivide your plots
by year. You should have one column for each continent, and
one row for each year.

ggplot(data = gap_small,
       mapping = aes(x = lifeExp, fill = as.factor(year))) +
  geom_histogram(binwidth = 5, alpha = 0.6,
                 position = "identity") +
  facet_grid(year ~ continent,
             labeller = label_both)

There are only two countries in the continent “Oceania”. Remove these
from your dataframe (or, if you want an extra challenge, use
mutate() and case_when() to change the
continent name from “Oceania” to “Asia”).
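
The simpler of the two options, dropping the Oceania rows, could be sketched as follows. This is one possible approach, assuming the gapminder and dplyr packages; the name gap_no_oceania is a placeholder chosen for this example, and droplevels() is added so that the empty "Oceania" level does not linger in the factor.

```r
library(gapminder)
library(dplyr)

# Same subset as in the exercise: only the years 1952 and 2007.
gap_small <- gapminder %>%
  filter(year == 1952 | year == 2007)

# Keep every row whose continent is not Oceania, then drop the
# now-unused factor level.
gap_no_oceania <- gap_small %>%
  filter(continent != "Oceania") %>%
  mutate(continent = droplevels(continent))

# Two countries times two years means four rows are removed.
nrow(gap_small) - nrow(gap_no_oceania)
```

The mutate() and case_when() route keeps those four rows and relabels them instead.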

gap_smaller <- gap_small %>%
  mutate(continent = case_when(continent == "Oceania" ~ "Asia",
                               TRUE ~ as.character(continent)))

Recreate your previous plot with the new dataframe. This time, map continent to fill color.

ggplot(data = gap_smaller,
       mapping = aes(x = lifeExp, fill = as.factor(continent))) +
  geom_histogram(binwidth = 5, alpha = 0.6,
                 position = "identity") +
  facet_grid(year ~ continent,
             labeller = label_both)

Lastly, improve the labels of the axes and color legend on this plot.

ggplot(data = gap_smaller,
       mapping = aes(x = lifeExp, fill = as.factor(continent))) +
  geom_histogram(binwidth = 5, alpha = 0.6,
                 position = "identity") +
  facet_grid(year ~ continent,
             labeller = label_both) +
  xlab("Life expectancy in years") +
  ylab("Count") +
  ggtitle("Life expectancy in 1952 and 2007 by continent") +
  theme(legend.title = element_blank())

1 Submission: Upload Rmd and HTML
The final due date for this exercise is Wednesday, December 14th at 23:59 UTC+2.
Once you have finished the tasks above, knit this Rmd into an HTML file and upload both files to the assignment page in a single ZIP archive.
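
If you prefer to knit and package from the console rather than with the Knit button, something along these lines may work. It is only a sketch: knit_and_zip() is a hypothetical helper written for this example, the file names in the sample call are placeholders, and utils::zip() relies on an external zip program being available on your system.

```r
# Hypothetical helper: knit an Rmd to HTML, then bundle both files in a ZIP.
knit_and_zip <- function(rmd_path, zip_path) {
  # rmarkdown::render() returns the path of the HTML file it produced.
  html_path <- rmarkdown::render(rmd_path, output_format = "html_document")

  # utils::zip() calls the system's zip program to create the archive.
  utils::zip(zipfile = zip_path, files = c(rmd_path, html_path))

  invisible(zip_path)
}

# Example call with placeholder file names:
# knit_and_zip("workshop.Rmd", "workshop_submission.zip")
```

Check the contents of the archive before uploading, since the stored file paths depend on how the zip program handles the paths it is given.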