Airquality Assignment

Author

Joseph Flores

Load in the library

library(tidyverse)

Load the dataset into your global environment

data("airquality")
head(airquality)
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
mean(airquality$Temp)
[1] 77.88235
median(airquality$Temp)
[1] 79
sd(airquality$Wind)
[1] 3.523001
var(airquality$Wind)
[1] 12.41154
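
The calls above return numbers because Temp and Wind have no missing values. Ozone and Solar.R do contain NA (rows 5 and 6 of the head() output), so the same functions would return NA for those columns unless na.rm = TRUE is set. A minimal sketch on a fresh copy of the data, where aq is just an illustrative name:

```r
# Fresh copy of the built-in dataset; `aq` is an illustrative name
aq <- datasets::airquality

# Variance is the square of the standard deviation
all.equal(sd(aq$Wind)^2, var(aq$Wind))

# Ozone has missing values, so the plain mean is NA
mean(aq$Ozone)

# Drop the missing values before averaging
mean(aq$Ozone, na.rm = TRUE)
```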

Rename months from numbers to names

airquality$Month[airquality$Month == 5] <- "May"
airquality$Month[airquality$Month == 6] <- "June"
airquality$Month[airquality$Month == 7] <- "July"
airquality$Month[airquality$Month == 8] <- "August"
airquality$Month[airquality$Month == 9] <- "September"
summary(airquality$Month)
   Length     Class      Mode 
      153 character character 
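
As the summary() output shows, the replacements above turn Month into a character vector, which ggplot would otherwise order alphabetically. One alternative is a single factor() call, since factor() accepts both levels (the original codes) and labels (the names to display). A sketch on a fresh copy of the data, where aq is an illustrative name and month.name is R's built-in vector of month names:

```r
# Fresh copy of the built-in dataset; `aq` is an illustrative name
aq <- datasets::airquality

# Map the codes 5..9 to month names and fix their order in one step
aq$Month <- factor(aq$Month, levels = 5:9, labels = month.name[5:9])

levels(aq$Month)
table(aq$Month)
```

Because the factor levels already carry the names and their order, a plot built from such a column would not need the labels argument in scale_fill_discrete().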
airquality$Month <- factor(airquality$Month,
                           levels = c("May", "June", "July", "August",
                                      "September"))
p1 <- airquality |>
  ggplot(aes(x=Temp, fill=Month)) +
  geom_histogram(position="identity")+
  scale_fill_discrete(name = "Month", 
                      labels = c("May", "June","July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")  #provide the data source

p1
`stat_bin()` using `bins = 30`. Pick better value `binwidth`.

p2 <- airquality |>
  ggplot(aes(x=Temp, fill=Month)) +
  geom_histogram(position="identity", alpha=0.5, binwidth = 5, color = "white")+
  scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")
p2

p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  labs(x = "Months from May through September", y = "Temperatures", 
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September"))

p3
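
The points drawn beyond the whiskers in a boxplot like p3 are, by default in geom_boxplot(), values more than 1.5 times the interquartile range away from the quartiles. The sketch below applies that rule by hand to list the days that would be flagged. The names aq and is_outlier are illustrative, not part of the assignment, and the month is left as a number here so the block runs on its own:

```r
# Fresh copy of the built-in dataset; `aq` is an illustrative name
aq <- datasets::airquality

# Hypothetical helper: TRUE for values more than 1.5 * IQR beyond the quartiles
is_outlier <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Apply the rule within each month (ave() returns 0/1, so compare with 1)
aq$outlier <- ave(aq$Temp, aq$Month, FUN = is_outlier) == 1

# Days whose temperature would be drawn as separate points
aq[aq$outlier, c("Month", "Day", "Temp")]
```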

p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  labs(x = "Monthly Temperatures", y = "Temperatures", 
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot()+
  scale_fill_grey(name = "Month", labels = c("May", "June","July", "August", "September"))

p4

p5 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  labs(x = "Marginal Changes: May-September", y = "Marginal % change", 
       title = "Side-by-Side Boxplot of Marginal changes in Temperature",
       caption = "Montgomery College's Weekly Weather Club Report") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("April-May", "May-June","June-July", "July-August", "August-September"))
p5

Essay

I created a box plot intended to represent the marginal month-to-month changes in temperature, expressed as percentages. The plot is meant to visualize the average day-to-day change in temperature for each corresponding month. The outliers are changes substantial enough to fall outside the box and whiskers, so they are drawn as separate points. I did not use any special code to create this box plot; everything I used came from RPubs or the lessons.
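
The p5 code above passes the raw Temp column to geom_boxplot(), so a percent-change column would have to be computed first if that is what the plot should show. One possible way to do this is sketched below, assuming "marginal change" means the percent change in Temp from the previous day. The names aq, pct_change, and p_change are illustrative, and the block starts from a fresh copy of the data so it runs on its own:

```r
library(dplyr)
library(ggplot2)

# Fresh copy of the built-in dataset; rows are already in date order
aq <- datasets::airquality |>
  mutate(
    Month = factor(Month, levels = 5:9, labels = month.name[5:9]),
    # Percent change in Temp from the previous day (assumed definition);
    # the first row has no previous day, so it is NA
    pct_change = 100 * (Temp - dplyr::lag(Temp)) / dplyr::lag(Temp)
  )

# Boxplot of the day-to-day percent change, grouped by month
p_change <- aq |>
  filter(!is.na(pct_change)) |>
  ggplot(aes(Month, pct_change, fill = Month)) +
  geom_boxplot() +
  labs(x = "Month",
       y = "Day-to-day change in temperature (%)",
       title = "Day-to-Day Percent Change in Temperature by Month")

p_change
```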