library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
airquality <- airquality
head(airquality)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
mean(airquality$Temp)
## [1] 77.88235
mean(airquality[,4])
## [1] 77.88235
median(airquality$Temp)
## [1] 79
sd(airquality$Wind)
## [1] 3.523001
var(airquality$Wind)
## [1] 12.41154
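As a quick check on the summary statistics above, the variance is the square of the standard deviation. The sketch below also shows `na.rm = TRUE`, which is needed for columns such as Ozone that contain missing values (visible as NA in the `head()` output). The names `wind` and `ozone_mean` are illustrative and not part of the original analysis.

```r
# Work on the built-in dataset directly so this runs on its own.
wind <- datasets::airquality$Wind

# The variance is the square of the standard deviation.
all.equal(var(wind), sd(wind)^2)

# Ozone contains NA values, so mean() returns NA unless they are dropped.
mean(datasets::airquality$Ozone)
ozone_mean <- mean(datasets::airquality$Ozone, na.rm = TRUE)
ozone_mean
```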
airquality$Month[airquality$Month == 5]<- "May"
airquality$Month[airquality$Month == 6]<- "June"
airquality$Month[airquality$Month == 7]<- "July"
airquality$Month[airquality$Month == 8]<- "August"
airquality$Month[airquality$Month == 9]<- "September"
summary(airquality$Month)
##    Length     Class      Mode
##       153 character character
# Reorder the months so they follow calendar order instead of the default alphabetical order
airquality$Month <- factor(airquality$Month, levels = c("May", "June", "July", "August", "September"))
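The recoding and reordering steps above can also be sketched as a single `factor()` call that maps the month numbers 5 to 9 straight to ordered labels. This is an alternative under the assumption that one starts from the unmodified built-in dataset; the copy `aq` is an illustrative name.

```r
# Start from a fresh copy of the built-in data, where Month is still numeric.
aq <- datasets::airquality

# levels = the values present in the data, labels = what to call them.
# The order given here becomes the factor's level order.
aq$Month <- factor(
  aq$Month,
  levels = 5:9,
  labels = c("May", "June", "July", "August", "September")
)

levels(aq$Month)
table(aq$Month)
```

Because the labels are attached in one step, the column never passes through an intermediate character type.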
p1 <- qplot(data = airquality,Temp,fill = Month,geom = "histogram", bins = 20)
## Warning: `qplot()` was deprecated in ggplot2 3.4.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
p1
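Since `qplot()` is deprecated, the same stacked histogram can be written with `ggplot()`. This is a minimal sketch of an equivalent call, not a change to the plot above; `aq` and `p1_alt` are illustrative names.

```r
library(ggplot2)

# Fresh copy with Month as an ordered factor, so the block runs on its own.
aq <- datasets::airquality
aq$Month <- factor(
  aq$Month,
  levels = 5:9,
  labels = c("May", "June", "July", "August", "September")
)

# Same mapping as the qplot() call: Temp on x, Month as fill, 20 bins.
p1_alt <- ggplot(aq, aes(x = Temp, fill = Month)) +
  geom_histogram(bins = 20)
p1_alt
```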
p2 <- airquality %>%
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  xlab("Monthly Temperatures") +
  ylab("Frequency") +
  ggtitle("Histogram of Monthly Temperatures")
p2
p3 <- airquality %>%
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(x = "Month", y = "Temperature",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September"))
p3
p4 <- airquality %>%
  ggplot(aes(Month, Temp, fill = Month)) +
  # labs() takes `title`, not `titles`; with `titles` the plot title is not shown
  labs(x = "Month", y = "Temperature",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_grey(name = "Month", labels = c("May", "June", "July", "August", "September"))
p4
p5 <-
  ggplot(airquality, aes(x = Temp, y = Wind, color = Ozone)) +
  # labs() takes `title`, not `titles`; with `titles` the plot title is not shown
  labs(x = "Temperature", y = "Wind",
       title = "Scatterplot - How Temperature affects Wind and the Ozone",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_point(size = 4, alpha = 0.8)
p5
This scatterplot shows how wind speed and ozone levels vary with temperature. Wind speed tends to be higher at lower temperatures and lower at higher temperatures. An unexpected finding came when I compared temperature and wind against ozone: ozone levels tend to rise with temperature. During my investigation, I also learned that large temperature differences tend to increase the wind. This makes sense in New York, where the wind picks up when Arctic air swings down from Canada. The relationship between temperature, wind, and ozone also made sense once I looked into it. The Ozone variable in this dataset is a ground-level concentration measured in parts per billion, not the stratospheric ozone layer, so it is a different quantity from the UV index that tells you when to wear sunscreen. As I continued my investigation, I learned that the relationship is complex: high temperatures combined with low wind leave the air stagnant, which allows ground-level ozone to build up further.
For the code, I used the ggplot2 package. I mapped three aesthetics to make the graph come alive: temperature on the x-axis, wind on the y-axis, and ozone as the color. The x and y axes are labeled, and the title is "Scatterplot - How Temperature affects Wind and the Ozone". The color tracks the ozone level, from dark blue for low values to light blue for high values. The points have a size of 4 and an alpha of 0.8, which controls their opacity.
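The dark-blue-to-light-blue gradient described above is ggplot2's default for a continuous color aesthetic. A minimal sketch that spells the scale out explicitly, which makes it easy to swap in other colors, is shown here. The hex values are ggplot2's documented defaults for `scale_color_gradient()`, and `p5_alt` is an illustrative name.

```r
library(ggplot2)

p5_alt <- ggplot(datasets::airquality, aes(x = Temp, y = Wind, color = Ozone)) +
  geom_point(size = 4, alpha = 0.8) +
  # Low ozone maps to dark blue, high ozone to light blue;
  # rows with missing Ozone are drawn in the na.value color.
  scale_color_gradient(low = "#132B43", high = "#56B1F7", na.value = "grey50") +
  labs(x = "Temperature", y = "Wind", color = "Ozone")
p5_alt
```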
There is a negative correlation between temperature and wind speed: as temperature increases, wind speed tends to decrease.
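The direction of these relationships can be checked numerically with `cor()`. The sketch below assumes Pearson correlation (the default) and drops rows with missing Ozone through `use = "complete.obs"`; the variable names are illustrative.

```r
aq <- datasets::airquality

# Temp and Wind have no missing values, so cor() works directly.
cor_temp_wind <- cor(aq$Temp, aq$Wind)

# Ozone has NAs, so restrict each calculation to complete pairs.
cor_temp_ozone <- cor(aq$Temp, aq$Ozone, use = "complete.obs")
cor_wind_ozone <- cor(aq$Wind, aq$Ozone, use = "complete.obs")

round(c(temp_wind = cor_temp_wind,
        temp_ozone = cor_temp_ozone,
        wind_ozone = cor_wind_ozone), 2)
```

A negative value for the temperature and wind pair and a positive value for temperature and ozone would match the pattern described in the text.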
p6 <-
  ggplot(airquality, aes(x = Temp, y = Wind, color = Ozone)) +
  # labs() takes `title`, not `titles`; with `titles` the plot title is not shown
  labs(x = "Temperature", y = "Wind",
       title = "Scatterplot - How Temperature affects Wind and the Ozone",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_point(size = 4, alpha = 0.8) +
  geom_smooth(method = "lm", se = FALSE)
p6
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: colour
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
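The warning above appears because the continuous `color = Ozone` mapping is set globally, so `geom_smooth()` inherits it and then drops it when fitting a single line. One way to avoid this, sketched here as a suggestion and not as the original code, is to map color only inside `geom_point()`. The name `p6_alt` is illustrative.

```r
library(ggplot2)

p6_alt <- ggplot(datasets::airquality, aes(x = Temp, y = Wind)) +
  # Color applies to the points only, so the smoother never sees it.
  geom_point(aes(color = Ozone), size = 4, alpha = 0.8) +
  # One linear fit across all observations, without a confidence band.
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Temperature", y = "Wind",
       title = "Scatterplot - How Temperature affects Wind and the Ozone",
       caption = "New York State Department of Conservation and the National Weather Service")
p6_alt
```

The fitted line slopes downward, consistent with the negative relationship between temperature and wind speed.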