Air Quality Tutorial and Homework

Author

Annet Isa (partly)

library(tidyverse)

Contents:

Part 1 - Copies of four plots from the Airquality Tutorial for DATA 110

Part 2 - A fifth plot of my design

Part 3 - Write-up

PART 1: First Four Plots

# Recode the numeric month codes (5-9) as month names
airquality$Month[airquality$Month == 5] <- "May"
airquality$Month[airquality$Month == 6] <- "June"
airquality$Month[airquality$Month == 7] <- "July"
airquality$Month[airquality$Month == 8] <- "August"
airquality$Month[airquality$Month == 9] <- "September"
# Convert to a factor so the months plot in calendar order, not alphabetically
airquality$Month <- factor(airquality$Month, levels = c("May", "June", "July", "August", "September"))
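
The recoding above can also be written as a single call to factor(), which maps the numeric codes to labels in one step. The following is only a sketch of that alternative, not the tutorial's method; the name aq is a hypothetical working copy, used so the result does not depend on whether the lines above have already been run.

```r
# Hypothetical working copy of the built-in dataset (Month is still numeric, 5-9)
aq <- datasets::airquality

# Map the codes 5..9 to month names and fix the level order in one step
aq$Month <- factor(aq$Month,
                   levels = 5:9,
                   labels = c("May", "June", "July", "August", "September"))

# Inspect the number of daily observations per month
table(aq$Month)
```

Because the levels are set explicitly, ggplot2 draws the months in calendar order rather than alphabetically.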

Plot 1 - Histogram Categorized By Month

p1 <- airquality |>
  ggplot(aes(x=Temp, fill=Month)) +
  geom_histogram(position="identity")+
  scale_fill_discrete(name = "Month", 
                      labels = c("May", "June","July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")  #provide the data source
p1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Plot 2 - Histogram Improvement with ggplot

p2 <- airquality |>
  ggplot(aes(x=Temp, fill=Month)) +
  geom_histogram(position="identity", alpha=0.5, binwidth = 5, color = "white")+
  scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")
p2

Plot 3 - Boxplots Categorized By Month

p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  labs(x = "Months from May through September", y = "Temperatures", 
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September"))
p3 

Plot 4 - Boxplots in Grey Scale

p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  labs(x = "Months from May through September", y = "Temperatures", 
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot()+
  scale_fill_grey(name = "Month", labels = c("May", "June","July", "August", "September"))
p4

PART 2: A fifth plot of my design

Plot 5 - Scatterplot of Wind Speed vs. Temperature by Month

p5 <- airquality |>
  ggplot(aes(Temp, Wind, color = Month)) +
  labs(x = "Temperature (F)", y = "Wind Speed (mph)",
       title = "Exploration of Wind Speed and Temperature",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_point() +
  # The points are mapped to color, not fill, so the color scale is the one to label
  scale_color_hue(name = "Month", labels = c("May", "June", "July", "August", "September"))
p5

PART 3: Write-up

My scatterplot shows the interplay between wind speed, temperature, and month for May through September 1973 in New York.

The scatterplot shows a negative association between the two variables: higher wind speeds tend to coincide with lower temperatures. Apart from September, most months show a fairly narrow range of temperatures. The plot suggests August was the hottest month.
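
These visual impressions can be checked numerically. The sketch below is one possible way to do so and was not part of the original assignment; the names aq, wind_temp_cor, and temp_by_month are hypothetical. It reads a fresh copy of the dataset, so months appear as their numeric codes (5 = May through 9 = September).

```r
# Hypothetical working copy, independent of the recoding done in Part 1
aq <- datasets::airquality

# Pearson correlation between wind speed and temperature;
# a negative value supports the downward trend seen in the scatterplot
wind_temp_cor <- cor(aq$Wind, aq$Temp)
wind_temp_cor

# Mean, standard deviation, and range of temperature within each month,
# to compare how hot and how variable the months were
temp_by_month <- aggregate(
  Temp ~ Month,
  data = aq,
  FUN = function(x) c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
)
temp_by_month
```

Comparing the mean and sd columns across months gives a firmer basis for the statements about the hottest month and the spread of temperatures than reading them off the plot.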

I combined the code from Plot 3 with sample code from Chapter 5 of “An Introduction to R” (Douglas et al.). I changed geom_boxplot to geom_point to switch from a boxplot to a scatterplot. I changed fill = Month to color = Month to differentiate between points of different months. Finally, I updated the captions and labels where needed.