These exercises accompany the Reshaping Data tutorial: http://rpubs.com/NateByers/Reshaping. The exercises use data frames from the region5air
package. Run the following code to clear your global environment and load the data you need:
rm(list = ls())
library(tidyr)
library(dplyr)
library(region5air)
data(airdata)
data(chicago_air)

1. The chicago_air data frame is in a wide format. Use gather() to make a long data frame named chicago_air_long.
2. The airdata data frame is in a long format. Use the filter() function to create a data frame called site22. Filter down to site “840180890022” and a poc of 1 (remember to use ==). Use the select() function to select only the “datetime”, “parameter”, and “value” columns. Use spread() on site22 to make a wide data frame called site22_wide with separate columns for each parameter. Hint: you want to spread the “parameter” column, so identify that column as the key in the spread() function. The “value” column should be identified as the value in the function.
3. Use the filter() function on airdata to create a data frame called pm25. Filter down to parameter “88101”. Use the select() function to select only the “datetime”, “site”, and “value” columns. Use spread() on pm25 to make a wide data frame called pm25_wide with separate columns for each site. Hint: you want to spread the “site” column, so identify that column as the key in the spread() function.
4. Use ggplot2 to plot the chicago_air_long data frame that was created in exercise 1. First make sure to convert the “date” column to a Date class using as.Date(). Use facet_grid() in the plot to make separate facets for each parameter, and be sure to set the scales to “free”.

# Exercise 1: stack the ozone through solar columns into parameter/value pairs
chicago_air_long <- gather(chicago_air, key = "parameter", value = "value",
                           ozone:solar)
head(chicago_air_long)
## date month weekday parameter value
## 1 2013-01-01 1 3 ozone 0.032
## 2 2013-01-02 1 4 ozone 0.020
## 3 2013-01-03 1 5 ozone 0.021
## 4 2013-01-04 1 6 ozone 0.028
## 5 2013-01-05 1 7 ozone 0.025
## 6 2013-01-06 1 1 ozone 0.026
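
To see what gather() does without the region5air data, here is a minimal sketch on a small data frame. The toy_wide data frame and its values are made up for illustration; only the column names mirror chicago_air. The second call shows the same reshape with pivot_longer(), which supersedes gather() in tidyr 1.0.0 and later.

```r
library(tidyr)

# Hypothetical toy data standing in for chicago_air; the values are made up.
toy_wide <- data.frame(
  date  = c("2013-01-01", "2013-01-02"),
  ozone = c(0.032, 0.020),
  temp  = c(30, 28),
  stringsAsFactors = FALSE
)

# gather() stacks the selected columns: their names go into the key column
# ("parameter") and their contents go into the value column ("value").
toy_long <- gather(toy_wide, key = "parameter", value = "value", ozone:temp)

# The same reshape with pivot_longer() (requires tidyr >= 1.0.0).
# Here the rows are grouped by original row instead of by parameter.
toy_long2 <- pivot_longer(toy_wide, cols = ozone:temp,
                          names_to = "parameter", values_to = "value")
```

Both results have four rows, one per date and parameter combination; only the row order differs.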
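# Exercise 2: keep one site and one poc, then drop all other columns
site22 <- filter(airdata, site == "840180890022", poc == 1)
site22 <- select(site22, datetime, parameter, value)
# One column per parameter code; datetime identifies the rows
site22_wide <- spread(site22, key = "parameter", value = "value")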
head(site22_wide)
## datetime 44201 62101
## 1 20130101T0100-0600 NA 24
## 2 20130101T0200-0600 NA 24
## 3 20130101T0300-0600 NA 24
## 4 20130101T0400-0600 NA 24
## 5 20130101T0500-0600 NA 23
## 6 20130101T0600-0600 NA 22
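
The NA values in the 44201 column appear because spread() creates a cell for every combination of row identifier and key, and fills the combinations that have no row in the long data with NA. A small sketch of that behavior follows; the toy_long data frame and its values are made up for illustration.

```r
library(tidyr)

# Hypothetical long data: parameter "44201" has no reading for the first hour.
toy_long <- data.frame(
  datetime  = c("T0100", "T0200", "T0200"),
  parameter = c("62101", "62101", "44201"),
  value     = c(24, 23, 0.03),
  stringsAsFactors = FALSE
)

# Each distinct parameter becomes a column; the missing combination
# (T0100, 44201) is filled with NA.
toy_wide <- spread(toy_long, key = parameter, value = value)

# The new column names start with a digit, so they are not syntactic R names.
# Use [[ ]] (or backticks) to access them.
first_hour_44201 <- toy_wide[["44201"]][1]
```

Here first_hour_44201 is NA, matching the pattern in the site22_wide output above.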
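# Exercise 3: keep only parameter 88101, then drop all other columns
pm25 <- filter(airdata, parameter == "88101")
pm25 <- select(pm25, datetime, site, value)
# One column per site; datetime identifies the rows
pm25_wide <- spread(pm25, key = "site", value = "value")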
head(pm25_wide)
## datetime 840180890022 840180892004 840181270024
## 1 20130101T0000-0600 16.2 18.4 10.6
## 2 20130101T0100-0600 15.7 15.2 11.2
## 3 20130101T0200-0600 22.5 15.1 11.0
## 4 20130101T0300-0600 16.5 13.1 10.0
## 5 20130101T0400-0600 22.9 10.6 7.6
## 6 20130101T0500-0600 28.8 9.5 11.4
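
The filter, select, and spread steps can also be chained with the %>% pipe that dplyr provides. The sketch below does this on a made-up data frame; toy_air, its site names, and its values are hypothetical and only imitate the layout of airdata.

```r
library(dplyr)
library(tidyr)

# Hypothetical long data: 2 sites x 2 parameters x 2 hours (values made up).
toy_air <- data.frame(
  site      = rep(c("site_a", "site_b"), each = 4),
  parameter = rep(rep(c("88101", "44201"), each = 2), times = 2),
  datetime  = rep(c("T0000", "T0100"), times = 4),
  value     = c(16.2, 15.7, 0.03, 0.02, 18.4, 15.2, 0.04, 0.05),
  stringsAsFactors = FALSE
)

toy_pm25_wide <- toy_air %>%
  filter(parameter == "88101") %>%      # keep one parameter
  select(datetime, site, value) %>%     # keep only id, key, and value columns
  spread(key = site, value = value)     # one column per site
```

The select() step matters because spread() treats every column other than the key and value as a row identifier. Leaving extra columns in place can split what should be one row into several rows padded with NA.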
# Exercise 4: convert date to Date class, then plot one facet row per parameter
library(ggplot2)
chicago_air_long$date <- as.Date(chicago_air_long$date)
ggplot(chicago_air_long, aes(date, value)) +
  geom_point() +
  facet_grid(parameter ~ ., scales = "free")
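
For a version of the plot that runs without region5air, here is a minimal sketch on made-up data. The toy_long data frame is hypothetical. It uses scales = "free_y" as an assumption about what matters here: with a single column of facets the x axis is shared across panels anyway, so freeing only the y axis should give the same layout as "free".

```r
library(ggplot2)

# Hypothetical long data with two parameters on very different scales.
toy_long <- data.frame(
  date      = c("2013-01-01", "2013-01-02", "2013-01-01", "2013-01-02"),
  parameter = c("ozone", "ozone", "temp", "temp"),
  value     = c(0.032, 0.020, 30, 28),
  stringsAsFactors = FALSE
)

# Convert the character dates so the x axis is treated as a date scale.
toy_long$date <- as.Date(toy_long$date)

# One facet row per parameter, each with its own y axis range.
p <- ggplot(toy_long, aes(date, value)) +
  geom_point() +
  facet_grid(parameter ~ ., scales = "free_y")
```

Call print(p) to draw the plot. Without free y scales, the ozone points would be squashed against zero by the much larger temp values.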