Airquality HW
Load the library
library(tidyverse)
Load the dataset into your global environment
data("airquality")
Set Up Data
airquality$Month[airquality$Month == 5]<- "May"
airquality$Month[airquality$Month == 6]<- "June"
airquality$Month[airquality$Month == 7]<- "July"
airquality$Month[airquality$Month == 8]<- "August"
airquality$Month[airquality$Month == 9]<- "September"
airquality$Month<-factor(airquality$Month,
levels=c("May", "June","July", "August", "September"))
Plot 1
p1 <- airquality |>
ggplot(aes(x=Temp, fill=Month)) +
geom_histogram(position="identity")+
scale_fill_discrete(name = "Month",
labels = c("May", "June","July", "August", "September")) +
labs(x = "Monthly Temperatures from May - Sept",
y = "Frequency of Temps",
title = "Histogram of Monthly Temperatures from May - Sept, 1973",
caption = "New York State Department of Conservation and the National Weather Service") #provide the data source
p1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
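The `stat_bin()` message is a reminder that the default of 30 bins is arbitrary. As a minimal sketch in base R, one way to judge a candidate bin width is to see how many bins it yields across the recorded temperature range. The 5-degree width and the names `temp_range`, `bin_width`, and `n_bins` are assumptions for illustration.

```r
# Use the packaged copy of the data so the recoded Month column is left alone
temps <- datasets::airquality$Temp

# Lowest and highest recorded temperatures (degrees Fahrenheit)
temp_range <- range(temps)

# Candidate bin width (an illustrative assumption, not a recommendation)
bin_width <- 5

# Approximate number of bins this width produces across the range
n_bins <- ceiling(diff(temp_range) / bin_width)
n_bins
```

A width that yields far fewer than 30 bins gives wider, smoother bars, which usually suits a dataset of this size.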
Plot 2
p2 <- airquality |>
ggplot(aes(x=Temp, fill=Month)) +
geom_histogram(position="identity", alpha=0.5, binwidth = 5, color = "white")+
scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September")) +
labs(x = "Monthly Temperatures from May - Sept",
y = "Frequency of Temps",
title = "Histogram of Monthly Temperatures from May - Sept, 1973",
caption = "New York State Department of Conservation and the National Weather Service")
p2
Plot 3
p3 <- airquality |>
ggplot(aes(Month, Temp, fill = Month)) +
labs(x = "Months from May through September", y = "Temperatures",
title = "Side-by-Side Boxplot of Monthly Temperatures",
caption = "New York State Department of Conservation and the National Weather Service") +
geom_boxplot() +
scale_fill_discrete(name = "Month", labels = c("May", "June","July", "August", "September"))
p3
Plot 4
p4 <- airquality |>
ggplot(aes(Month, Temp, fill = Month)) +
labs(x = "Monthly Temperatures", y = "Temperatures",
title = "Side-by-Side Boxplot of Monthly Temperatures",
caption = "New York State Department of Conservation and the National Weather Service") +
geom_boxplot()+
scale_fill_grey(name = "Month", labels = c("May", "June","July", "August", "September"))
p4
Plot 5
p5 <- airquality |>
ggplot(aes(x=Temp, y=Solar.R, color=Month)) +
labs(x = "Temperature", y = "Solar Radiation",
title = "Temperature in Comparison to Solar Radiation in May - September, 1973",
caption = "New York State Department of Conservation and the National Weather Service") +
geom_path() +
geom_point(size = 1) +
scale_color_discrete(name = "Month", labels = c("May", "June","July", "August", "September")) # Month is mapped to color, not fill
p5
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_point()`).
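The warning comes from missing values in `Solar.R`. As a quick check in base R, the sketch below counts the missing readings and shows one way to drop them before plotting. The name `aq_complete` is hypothetical.

```r
# Work on the packaged copy so the recoded Month column above is untouched
aq <- datasets::airquality

# Count the days with no solar radiation reading
n_missing <- sum(is.na(aq$Solar.R))
n_missing

# Keep only the rows where Solar.R was recorded
aq_complete <- aq[!is.na(aq$Solar.R), ]
nrow(aq_complete)
```

Plotting from a filtered data frame like this avoids the warning, at the cost of hiding that some days have no reading.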
My plot shows the relationship between temperature and solar radiation from May through September 1973. The x-axis is temperature in degrees Fahrenheit, the y-axis is the solar radiation level, and color distinguishes the five months in the data set. I used the geom_path() function to draw a separate line for each month, connecting the observations in day order, so that each month's path shows how its solar radiation levels vary with temperature. With this illustration, one can see each month's “cluster” and where it sits on the temperature axis, along with how the solar radiation levels change day after day within that month. I also used geom_point() to mark each day's observation, which makes the day-to-day changes within a month easier to see.
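To put a number on the relationship the plot suggests, the sketch below computes the Pearson correlation between temperature and solar radiation within each month, using base R only. The names `aq_complete` and `monthly_cor` are hypothetical, and dropping days with a missing `Solar.R` reading is an assumption about how to handle them.

```r
# Work on the packaged copy, where Month is still coded 5 through 9
aq <- datasets::airquality

# Drop days that lack either measurement
aq_complete <- aq[complete.cases(aq[, c("Temp", "Solar.R")]), ]

# Pearson correlation of Temp and Solar.R, computed separately per month
monthly_cor <- sapply(
  split(aq_complete, aq_complete$Month),
  function(d) cor(d$Temp, d$Solar.R)
)
round(monthly_cor, 2)
```

Values near zero would indicate that a month's path wanders without a clear trend, while values closer to 1 or -1 would indicate that solar radiation rises or falls consistently with temperature.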