Introduction

This assignment explores how to use ggplot2 to visualize data, using data available in R from the gapminder dataset. I encourage you to first try this assignment using gapminder and then explore your own data using ggplot2.

Prerequisite: Before attempting this assignment, be sure to complete DataCamp’s Introduction to Data Visualization using ggplot2.

If you enjoy visualization, I encourage you to continue on in DataCamp’s skill track, Data Visualization with R.


Preparation

We’ve already loaded several relevant packages above, including ggplot2 and dplyr, to facilitate data visualization. Next, let’s install and then load the gapminder package, which provides the gapminder dataset. Run the following chunk of code to do so:

Learn more about the gapminder dataset here:
https://cran.r-project.org/web/packages/gapminder/readme/README.html

# Install and load the gapminder package
install.packages("gapminder", repos = "https://cran.us.r-project.org")
library(gapminder)


Methods of exploring your data

Explore the gapminder dataset with the goal of understanding each of its six variables. Here are three ways to do so. Think about when you might use each of these three ways.

head(gapminder)
str(gapminder)
## tibble [1,704 × 6] (S3: tbl_df/tbl/data.frame)
##  $ country  : Factor w/ 142 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ continent: Factor w/ 5 levels "Africa","Americas",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ year     : int [1:1704] 1952 1957 1962 1967 1972 1977 1982 1987 1992 1997 ...
##  $ lifeExp  : num [1:1704] 28.8 30.3 32 34 36.1 ...
##  $ pop      : int [1:1704] 8425333 9240934 10267083 11537966 13079460 14880372 12881816 13867957 16317921 22227415 ...
##  $ gdpPercap: num [1:1704] 779 821 853 836 740 ...
gapminder


Write one sentence explaining, in your own words, what each of the three commands above accomplishes:

  • head(): type your response here, leaving the asterisks intact to keep your response in bold (don’t include spaces after your response).
  • str(): response 2
  • Running the name of the dataset: response 3
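Beyond the three approaches above, two other common inspection functions are summary() from base R and glimpse() from dplyr. The sketch below is not part of the original assignment. It uses the built-in mtcars data so that it runs without gapminder, and it assumes dplyr is installed.

```r
library(dplyr)

# summary() reports per-column statistics (min, quartiles, mean, max)
col_summary <- summary(mtcars)
print(col_summary)

# glimpse() prints one row per column: name, type, and the first few values
glimpse(mtcars)
```

Either call can be pointed at gapminder once the package is loaded.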


Summarizing median gdpPercap by continent

Run the following chunk of code to summarize median gdpPercap by continent in the year 1952. Then, adjust the code to re-run for 1997.

# Summarize the median gdpPercap by continent in 1952
by_continent <- gapminder %>%
  filter(year == 1952) %>%
  group_by(continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))

# Create a bar plot showing medianGdpPercap by continent
ggplot(by_continent, aes(x = continent, y = medianGdpPercap)) +
  geom_col()
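The filter, group, summarize, then plot pattern used above works on any data frame. As a minimal sketch on different data, the example below applies it to the built-in mtcars dataset. The names by_cyl and medianMpg are made up for illustration, and the choice of am == 1 (manual transmission) as the filter is arbitrary.

```r
library(dplyr)
library(ggplot2)

# Keep only manual-transmission cars (am == 1), then take the median mpg per cylinder count
by_cyl <- mtcars %>%
  filter(am == 1) %>%
  group_by(cyl) %>%
  summarize(medianMpg = median(mpg))

# One bar per group; factor() turns the numeric cyl column into a discrete x axis
p <- ggplot(by_cyl, aes(x = factor(cyl), y = medianMpg)) +
  geom_col()
print(p)
```

Changing the condition inside filter() is the only step needed to summarize a different subset.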


Summarizing median lifeExp by country

Run the following chunk of code to summarize median lifeExp by country, limited to those countries in Oceania, in 2007. Then, adjust the code to re-run for the Americas (“Americas”) in 1987. Save the new result as by_country_americas instead of by_country_oceania. Be sure to pass the correct data frame to ggplot() when plotting below.

# Summarize the median lifeExp by country, limited to those countries in Oceania, in 2007
by_country_oceania <- gapminder %>%
  filter(year == 2007, continent == "Oceania") %>%
  group_by(country) %>%
  summarize(medianlifeExp = median(lifeExp))

# Create a bar plot showing medianlifeExp by country
ggplot(by_country_oceania, aes(x = country, y = medianlifeExp)) +
  geom_col()


Recreate graphs using ggplot2

Replicate the following scatter plots as accurately as possible, using the gapminder dataset. For full credit, all elements must match, including the capitalization of “Continent”.

For assistance, consult the ggplot2 reference page “Modify axis, legend, and plot labels”, which documents the labs() function.
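As a minimal sketch of how labs() sets titles, the example below labels a scatter plot of the built-in mtcars data. The label text is invented for illustration and does not match either target graph. Note that the legend title is set through the name of the aesthetic it describes (here, color).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    title = "Fuel efficiency by weight",  # plot title
    x = "Weight (1000 lbs)",              # x-axis title
    y = "Miles per gallon",               # y-axis title
    color = "Cylinders"                   # legend title for the color aesthetic
  )
print(p)
```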


Graph #1

Open the image at the following URL: https://www.public.asu.edu/~jbronowi/Gapminder1.png

# Recreate Graph #1 here


Graph #2

Open the image at the following URL: https://www.public.asu.edu/~jbronowi/Gapminder2.png


Hints for recreating Graph #2:

  • With ggthemes, use theme_fivethirtyeight() but do not use its color scale
  • Challenge: Explain why you can’t use scale_color_fivethirtyeight()
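The sketch below shows one way to combine theme_fivethirtyeight() with a color scale from ggplot2 rather than from ggthemes. It uses the built-in mtcars data, and the Dark2 palette is an arbitrary choice, not necessarily the one in the target graph. It assumes ggthemes is already installed. To my knowledge, theme_fivethirtyeight() blanks axis titles by default, so the sketch restores them explicitly.

```r
library(ggplot2)
library(ggthemes)

p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  theme_fivethirtyeight() +
  # A color scale from ggplot2 itself, used in place of the theme's companion scale
  scale_color_brewer(palette = "Dark2") +
  # theme_fivethirtyeight() blanks axis titles by default; this restores them
  theme(axis.title = element_text()) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")
print(p)
```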

Additional resources:

# Install and load the ggthemes package, then open its help page with ?ggthemes
install.packages("ggthemes", repos = "https://cran.us.r-project.org")
library(ggthemes)
?ggthemes

# Recreate Graph #2 here



How to Submit

Use the following instructions to submit your assignment, which may vary depending on your course’s platform. Thank you to Dr. Jesse Lecy for these instructions on submitting using RMarkdown.


Knitting to HTML

When you have completed your assignment, click the “Knit” button to render your .RMD file into an .HTML report.


Special Instructions

Perform the following depending on your course’s platform:

  • Canvas: Upload both your .RMD and .HTML files to the appropriate link
  • Blackboard or iCollege: Compress your .RMD and .HTML files in a .ZIP file and upload to the appropriate link

.HTML files are preferred but not allowed by all platforms.


Before You Submit

Remember to ensure the following before submitting your assignment.

  1. Name your files using this format, filling in your last name: OMT548-ggplot2-LastName.rmd and OMT548-ggplot2-LastName.html
  2. Show your code and write out your answers in the body text
  3. Do not show excessive output; truncate your output, e.g. with the head() function
  4. Follow appropriate styling conventions, e.g. spaces after commas
  5. Above all, ensure that your conventions are consistent

See Google’s R Style Guide for examples of common conventions.
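As a small illustration of items 3 and 4, the sketch below truncates printed output with head() and follows the spacing conventions mentioned above. It uses the built-in mtcars data, and the name preview is made up for this example.

```r
# Print only the first three rows instead of the whole data frame
preview <- head(mtcars, n = 3)
print(preview)

# Consistent spacing: a space after each comma and around operators
heavy_cars <- mtcars[mtcars$wt > 4, c("mpg", "wt")]
print(head(heavy_cars, n = 3))
```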



Common Knitting Issues

.RMD files are knit into .HTML and other formats procedurally, that is, line by line.

  • An error in code when knitting will halt the process; error messages will tell you the specific line with the error
  • Certain functions like install.packages() or setwd() often cause errors in knitting
  • Altering a dataset or variable in one chunk will affect its use in all later chunks
  • If an object is “not found”, make sure it was created or loaded with library() in a previous chunk
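One common way to reduce install-related knitting problems is to install a package only when it is missing. The helper below, ensure_package(), is a hypothetical function written for this sketch; it is not part of any package. The repos URL is an assumed CRAN mirror. An explicit repos value matters because knitting runs non-interactively, where R cannot prompt you to choose a mirror.

```r
# Hypothetical helper: install a package only when it is missing, then report availability
ensure_package <- function(pkg, repos = "https://cran.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call never triggers an install
ensure_package("stats")
```

After a call such as ensure_package("gapminder"), library(gapminder) still needs to run in the document so that later chunks can find the dataset.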

If All Else Fails: If you cannot determine and fix the errors in a code chunk that are preventing you from knitting your document, add eval = FALSE inside the brackets of {r} at the beginning of the chunk so that R does not attempt to evaluate it, that is: {r eval = FALSE}. This will prevent an erroneous chunk of code from halting the knitting process.



Further Resources

Learn more about ggplot2 with the following resources:




Acknowledgements

Thank you to Dr. Giovanni Circo and Dr. Jesse Lecy for inspiring this assignment as part of ASU’s program in Program Evaluation and Data Analytics.