Introduction

In this tutorial, we walk through a short R script that reads in data, cleans it, and generates a plot. Specifically, the script imports a CSV file containing data on weekly death trends in the United States, converts the date column to R's Date type, and draws a line graph of the number of weekly deaths.

Setting Up the Environment

Before we can run the code, we need to make sure that the necessary packages are installed. We will be using the following packages in this tutorial: readr, tidyverse, lubridate, and ggthemes.

To install these packages, we can use the following code:

install.packages("readr")
install.packages("tidyverse")
install.packages("lubridate")
install.packages("ggthemes")
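
Running install.packages() every time re-downloads packages that are already present. As a minimal sketch, the check below installs only what is missing. The helper name missing_packages is hypothetical (it is not part of any package), and the interactive() guard is an assumption added so that unattended runs never trigger downloads.

```r
# Packages named in this tutorial.
needed <- c("readr", "tidyverse", "lubridate", "ggthemes")

# Hypothetical helper (not from any package): returns the entries of
# `pkgs` that are not currently installed. system.file() returns ""
# when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

to_install <- missing_packages(needed)

# Assumption: only install when run interactively.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```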

Once we have installed these packages, we can load them into our environment using the library() function:

library(readr)
library(tidyverse)
library(lubridate)
library(ggthemes)

Reading in and Manipulating the Data

The first part of our code reads in a CSV file containing data on weekly death trends in the United States. This is done using the read_csv() function from the readr package:

data_table_for_weekly_death_trends_the_united_states <- read_csv("~/Downloads/data_table_for_weekly_death_trends__the_united_states.csv", skip = 2)

Here, we specify the path to our file using ~/Downloads/, followed by the name of our file, data_table_for_weekly_death_trends__the_united_states.csv. The skip = 2 argument tells read_csv() to skip the first two lines of the file, so that the third line is read as the column names.
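
To see what skip = 2 does without the real download, the sketch below writes a small stand-in file and reads it back. The two title lines and all data values are made up for illustration; only the column names Date and Weekly Deaths come from the tutorial's code. The column types are given explicitly so that Date is read as text.

```r
library(readr)

# Hypothetical file contents: two title lines, a header row, then data.
# The values are invented and are NOT real death counts.
sample_lines <- c(
  "Title line (skipped)",
  "Second title line (skipped)",
  "Date,Weekly Deaths",
  "01/04/2020,10",
  "01/11/2020,12"
)
tmp <- tempfile(fileext = ".csv")
writeLines(sample_lines, tmp)

# skip = 2 drops the two title lines, so the third line becomes the header.
sample_data <- read_csv(
  tmp,
  skip = 2,
  col_types = cols(
    Date = col_character(),
    `Weekly Deaths` = col_double()
  )
)

names(sample_data)
```

Without skip = 2, the first title line would be treated as the header instead.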

We then copy the resulting table to a variable with a shorter name, data:

data <- data_table_for_weekly_death_trends_the_united_states

Next, we use the mdy() function from the lubridate package to parse the Date column, whose values are written in month-day-year order, into R Date values:

mdy(data$Date) -> data$Date

Here, -> is R's rightward assignment operator, not the pipe operator (%>%). mdy() is called on the Date column, and the resulting vector of dates is assigned back to the Date column of our table. The line is equivalent to data$Date <- mdy(data$Date).
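
The same conversion can be written in several equivalent ways. The sketch below compares three of them on a toy data frame. The date strings are made up, so the real file's format may differ, and the variable names are illustrative only. The pipe-based form assumes dplyr, which is loaded as part of the tidyverse.

```r
library(lubridate)
library(dplyr)

# Made-up example values; the real file's date strings may look different.
toy <- data.frame(
  Date = c("01/04/2020", "01/11/2020"),
  stringsAsFactors = FALSE
)

# Rightward assignment, as in the tutorial code.
by_arrow <- toy
mdy(by_arrow$Date) -> by_arrow$Date

# The more common leftward assignment.
by_left <- toy
by_left$Date <- mdy(by_left$Date)

# A pipe-based form using dplyr::mutate().
by_pipe <- toy %>% mutate(Date = mdy(Date))

class(by_pipe$Date)
```

All three leave a Date column that ggplot2 can place on a continuous time axis.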

Creating a Plot

Finally, we use the ggplot() function from the ggplot2 package (part of the tidyverse) to create a line graph of the number of weekly deaths. We use the aes() function to put Date on the x-axis and Weekly Deaths on the y-axis; the backticks are needed because the column name contains a space:

ggplot(data, aes(Date, `Weekly Deaths`)) + geom_line() + geom_smooth() + theme_economist()

The geom_line() layer draws the weekly counts as a line, and geom_smooth() overlays a smoothed trend line. The last layer, theme_economist() from the ggthemes package, applies a theme to the plot.
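
As a sketch of how the layers combine, the example below builds the same plot on a toy data frame and stores it in a variable. The values are invented and are not real death counts, and the labels passed to labs() are an addition that is not part of the original script.

```r
library(ggplot2)
library(ggthemes)

# Made-up values for illustration only; these are NOT real death counts.
toy <- data.frame(
  Date = seq(as.Date("2020-01-04"), by = "week", length.out = 20),
  `Weekly Deaths` = 100 + ((1:20) %% 5) * 3,
  check.names = FALSE  # keep the space in the column name
)

# Each `+` adds one layer or setting to the plot object.
p <- ggplot(toy, aes(Date, `Weekly Deaths`)) +
  geom_line() +        # the weekly values
  geom_smooth() +      # a smoothed trend line
  theme_economist() +  # theme from ggthemes
  labs(title = "Weekly deaths (toy data)", x = "Week", y = "Deaths")

# print(p) draws the plot; ggsave("weekly_deaths.png", p) would save it.
```

Storing the plot in a variable, rather than printing it immediately, makes it easy to add layers later or save it to a file.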

Conclusion

This tutorial has covered a simple yet useful example of reading, cleaning, and plotting data in R using a few popular packages. With practice, these steps will become familiar and intuitive.