This assignment explores how to use ggplot2 to visualize
data, using data available in R from the gapminder dataset.
I encourage you to first try this assignment using
gapminder and then explore your own data using
ggplot2.
Prerequisite: Before attempting this assignment, be sure to complete DataCamp’s Introduction to Data Visualization using ggplot2.
If you enjoy visualization, I encourage you to continue on in DataCamp’s skill track, Data Visualization with R.
We’ve already loaded several relevant packages above, including
ggplot2 and dplyr, to facilitate data
visualization. Next, let’s install and then load the
gapminder dataset. Run the following chunk of code to
install and learn about gapminder:
Learn more about the gapminder dataset here:
https://cran.r-project.org/web/packages/gapminder/readme/README.html
# Load required packages
install.packages("gapminder", repos = "http://cran.us.r-project.org")
library(gapminder)
Explore the gapminder dataset with the goal of
understanding each of its six variables. Here are three ways to do so.
Think about when you might use each of these three ways.
head(gapminder)
str(gapminder)
## tibble [1,704 × 6] (S3: tbl_df/tbl/data.frame)
## $ country : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ year : int [1:1704] 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
## $ lifeExp : num [1:1704] 28.8 30.3 32 34 36.1 ...
## $ pop : int [1:1704] 8425333 9240934 10267083 11537966 13079460 14880372 12881816 13867957 16317921 22227415 ...
## $ gdpPercap: num [1:1704] 779 821 853 836 740 ...
gapminder
Write one sentence explaining, in your own words, what each of the three code snippets above accomplishes:
head(): type your response here, leaving the
asterisks intact to keep your response in bold (don’t include spaces
after your response).
str(): response 2
Run the following chunk of code to summarize median
gdpPercap by continent in the
year 1952. Then, adjust the code to re-run for 1997.
# Summarize the median gdpPercap by continent in 1952
by_continent <- gapminder %>%
  filter(year == 1952) %>%
  group_by(continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))
# Create a bar plot showing medianGdpPercap by continent
ggplot(by_continent, aes(x = continent, y = medianGdpPercap)) +
  geom_col()
Run the following chunk of code to summarize median
lifeExp by country, limited to those countries
in Oceania, in 2007. Then, adjust the code to re-run for the Americas
(“Americas”) in 1987. Save this new dplyr
filter as by_country_americas instead of
by_country_oceania. Be sure to call the correct dataframe
when plotting below.
# Summarize the median lifeExp by country, limited to those countries in Oceania, in 2007
by_country_oceania <- gapminder %>%
  filter(year == 2007, continent == "Oceania") %>%
  group_by(country) %>%
  summarize(medianlifeExp = median(lifeExp))
# Create a bar plot showing medianlifeExp by country
ggplot(by_country_oceania, aes(x = country, y = medianlifeExp)) +
  geom_col()
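Before changing the year or continent in a filter, it can help to check which values the dataset actually contains. The following is a minimal sketch, assuming dplyr and gapminder are installed; the object names available_years, available_continents, and rows_per_country are illustrative and not part of the assignment.

```r
library(dplyr)
library(gapminder)

# Which years and continent labels can be passed to filter()?
available_years <- sort(unique(gapminder$year))
available_continents <- levels(gapminder$continent)

available_years
available_continents

# How many rows does each country contribute for one year and continent?
rows_per_country <- gapminder %>%
  filter(year == 2007, continent == "Oceania") %>%
  count(country)

rows_per_country
```

If each country contributes a single row for the chosen year, the median of lifeExp within that country is simply that one observed value.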
Replicate the following scatter plots as accurately as possible,
using the gapminder dataset. For full credit, all elements
must match, including the capitalization of “Continent”.
For assistance, consult the ggplot2 reference page “Modify axis, legend, and plot labels”.
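As a rough illustration of how labels are set, the sketch below uses labs() on the built-in mtcars dataset rather than gapminder, so it is not a solution to either graph. The label text and the object name p are assumptions chosen for the example.

```r
library(ggplot2)

# mtcars ships with R; it stands in for gapminder here
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    title = "Fuel economy by weight",
    x = "Weight (1,000 lbs)",
    y = "Miles per gallon",
    color = "Cylinders"  # legend title, named after the mapped aesthetic
  )

p
```

The legend title is controlled by the argument named after the aesthetic that produces the legend (here, color), which is one way to control its capitalization.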
Open the image at the following URL: https://www.public.asu.edu/~jbronowi/Gapminder1.png
# Recreate Graph #1 here
Open the image at the following URL: https://www.public.asu.edu/~jbronowi/Gapminder2.png
Hints for recreating Graph #2:
From ggthemes, use theme_fivethirtyeight()
but do not use its color scale, scale_color_fivethirtyeight().
Additional resources:
# Install required ggthemes package and learn more using the ?ggthemes code.
install.packages('ggthemes', repos = "http://cran.us.r-project.org")
library(ggthemes)
?ggthemes
# Recreate Graph #2 here
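To show what applying a ggthemes theme without its matching color scale might look like, here is a minimal sketch on the built-in mtcars dataset. It assumes ggthemes is installed; the object name p538 is illustrative, and the plot is not meant to match Graph #2.

```r
library(ggplot2)
library(ggthemes)

# Add only the theme; no scale_color_fivethirtyeight() call is made,
# so ggplot2's default discrete colors remain in place
p538 <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  theme_fivethirtyeight()

p538
```

Themes and color scales are separate layers in ggplot2, so a theme can be added on its own without changing how the color aesthetic is mapped.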
Use the following instructions to submit your assignment, which may
vary depending on your course’s platform. Thank you to Dr. Jesse Lecy
for these instructions on submitting using RMarkdown.
When you have completed your assignment, click the “Knit” button to
render your .RMD file into a .HTML report.
Perform the following depending on your course’s platform:
Upload your .RMD and .HTML files to the appropriate link.
Or, place your .RMD and .HTML files in a .ZIP file and upload to the appropriate link.
.HTML files are preferred but not allowed by all platforms.
Remember to ensure the following before submitting your assignment.
head()
See Google’s R Style Guide for examples of common conventions.
.RMD files are knit into .HTML and other
formats procedurally, or line-by-line.
install.packages() or
setwd() are bound to cause errors in knitting
library() in a previous chunk
If All Else Fails: If you cannot determine and fix
the errors in a code chunk that’s preventing you from knitting your
document, add eval = FALSE inside the brackets of
{r} at the beginning of a chunk to ensure that R does not
attempt to evaluate it, that is: {r eval = FALSE}. This
will prevent an erroneous chunk of code from halting the knitting
process.
Learn more about ggplot2 with the following
resources:
Resource I ggplot2: Elegant Graphics for Data Analysis
Resource II R Graphics Cookbook
Resource III The Grammar of Graphics
This assignment references and cites the following sources:
David Robinson, DataCamp. Source I. Introduction to the Tidyverse: Types of Visualizations
Gapminder Dataset. Source II. Download Dataset
Gapminder. Source III. Gapminder
Thank you to Dr. Giovanni Circo and Dr. Jesse Lecy for inspiring this
assignment as part of ASU’s program in Program Evaluation and Data Analytics.