Airquality Plots

Author

Latifah Traore

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data("airquality")
head(airquality)
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
mean(airquality$Temp)
[1] 77.88235
mean(airquality[, 4])  # same result: Temp is the fourth column
[1] 77.88235
median(airquality$Temp)
[1] 79
sd(airquality$Wind)
[1] 3.523001
var(airquality$Wind)
[1] 12.41154
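
The summary statistics above can be cross-checked with a short sketch. It assumes a fresh copy of the built-in dataset, stored under the hypothetical name `aq` so the report's own `airquality` object is left alone. It confirms that the variance of Wind is the square of its standard deviation, and shows why columns with missing values such as Ozone need `na.rm = TRUE`.

```r
# Hypothetical working copy; the report's `airquality` object is not modified.
aq <- datasets::airquality

# The variance is the square of the standard deviation.
wind_sd  <- sd(aq$Wind)
wind_var <- var(aq$Wind)
isTRUE(all.equal(wind_sd^2, wind_var))

# Ozone has missing readings, so the plain mean is NA.
n_missing_ozone <- sum(is.na(aq$Ozone))
mean(aq$Ozone)  # NA

# Dropping the missing readings gives a usable mean.
ozone_mean <- mean(aq$Ozone, na.rm = TRUE)
ozone_mean
```

Temp and Wind have no missing values in this dataset, which is why the calls above work without `na.rm`.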
airquality$Month[airquality$Month == 5] <- "May"
airquality$Month[airquality$Month == 6] <- "June"
airquality$Month[airquality$Month == 7] <- "July"
airquality$Month[airquality$Month == 8] <- "August"
airquality$Month[airquality$Month == 9] <- "September"
summary(airquality$Month)
   Length     Class      Mode 
      153 character character 
airquality$Month <- factor(airquality$Month,
                           levels = c("May", "June", "July", "August",
                                      "September"))
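
As an alternative sketch, the recoding and the factor conversion can be done in one step. This is not the approach used in this report. It assumes a fresh, still-numeric copy of the data under the hypothetical name `aq`, and uses base R's `month.name` constant to supply the labels.

```r
# Hypothetical working copy with Month still stored as the integers 5 to 9.
aq <- datasets::airquality

# Map the integer codes straight to ordered month labels.
aq$Month <- factor(aq$Month, levels = 5:9, labels = month.name[5:9])

levels(aq$Month)  # month order follows the calendar, not the alphabet
table(aq$Month)   # number of daily observations per month
```

Setting the levels explicitly matters for the plots: without it, ggplot2 would order a character column alphabetically and put August first.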
p1 <- ggplot(airquality, aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", binwidth = 5) +  # Set binwidth for clarity
  scale_fill_discrete(name = "Month", 
                      labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")

print(p1)

# Plot 2: same histogram as Plot 1, with transparency and white bar outlines
p2 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept", 
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")

print(p2)

p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) + 
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Months from May through September", y = "Temperatures", 
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service")

print(p3)

p4 <- airquality |>
  ggplot(aes(x = factor(Month), y = Temp, fill = factor(Month))) + 
  geom_boxplot() +
  scale_fill_grey(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Month", y = "Temperature (Fahrenheit)", 
       title = "Side-by-Side Boxplot of Monthly Temperatures in Grey Scale",
       caption = "New York State Department of Conservation and the National Weather Service") +
  theme_minimal()

print(p4)

# Plot 5 Code
# after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.4.0)
p5 <- airquality |>
  ggplot(aes(x = Ozone, fill = after_stat(count))) +
  geom_histogram(binwidth = 10, color = "black", alpha = 0.7) +
  scale_fill_viridis_c(name = "Frequency") +
  labs(x = "Ozone Concentration (ppb)", 
       y = "Frequency", 
       title = "Histogram of Ozone Levels",
       caption = "New York State Department of Conservation and the National Weather Service") +
  theme_minimal()

print(p5)
Warning: Removed 37 rows containing non-finite outside the scale range
(`stat_bin()`).

In Plot 5, I created a histogram to show how ozone levels are distributed in the dataset. The plot shows how often each range of ozone concentration occurs. I used a bin width of 10 ppb and mapped the fill color to the bin count, so the color of each bar also reflects how frequent that range is. The key lines of code are:

geom_histogram(binwidth = 10, color = "black", alpha = 0.7)

scale_fill_viridis_c(name = "Frequency")
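
The histogram leaves out the 37 days with no ozone reading, which is what the "Removed 37 rows" warning refers to. The following is a minimal sketch of one way to make that explicit: count the missing readings, drop them before plotting, and map the fill with `after_stat(count)`, the current ggplot2 spelling for a computed bin count. The names `aq`, `ozone_complete`, and `p5_clean` are hypothetical and not part of the original analysis.

```r
library(ggplot2)

# Hypothetical working copy of the built-in dataset.
aq <- datasets::airquality

# Count the missing ozone readings, then keep only the complete rows.
n_missing <- sum(is.na(aq$Ozone))
ozone_complete <- aq[!is.na(aq$Ozone), ]

# Same histogram as Plot 5, built from the complete rows only.
p5_clean <- ggplot(ozone_complete, aes(x = Ozone, fill = after_stat(count))) +
  geom_histogram(binwidth = 10, color = "black", alpha = 0.7) +
  scale_fill_viridis_c(name = "Frequency") +
  labs(x = "Ozone Concentration (ppb)",
       y = "Frequency",
       title = "Histogram of Ozone Levels") +
  theme_minimal()

print(p5_clean)
```

Filtering first does not change the bars, since ggplot2 drops the same rows on its own. It only moves that step into the code, where a reader can see it.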