── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
The data come from the New York State Department of Conservation and the National Weather Service. They are daily measurements recorded over five months, May through September 1973.
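The plots below map `Month` to a discrete fill scale with month-name labels, which only works if `Month` is a factor (or character) rather than the integer codes 5 to 9 found in the built-in `airquality` data. The following is a minimal sketch of one way to do that conversion, assuming it has not already been done earlier in the analysis. The name `aq` is a hypothetical working copy, not a name used elsewhere in this document.

```r
# Hypothetical working copy of the built-in dataset (the name `aq` is an assumption).
aq <- datasets::airquality

# Month ships as integers 5..9; convert it to a factor with calendar-ordered labels.
aq$Month <- factor(
  aq$Month,
  levels = 5:9,
  labels = c("May", "June", "July", "August", "September")
)

# Number of daily records per month.
month_counts <- table(aq$Month)
print(month_counts)
```

With a factor in place, `scale_fill_discrete(labels = ...)` can relabel the legend. With the raw integers, ggplot2 would treat `Month` as a continuous variable instead.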
# Overlapping histograms of daily temperature, one fill color per month
p1 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Monthly Temperatures from May - Sept",
    y = "Frequency of Temps",
    title = "Histogram of Monthly Temperatures from May - Sept, 1973",
    caption = "New York State Department of Conservation and the National Weather Service"
  )
p1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Plot 2: Improve the histogram of Average Temperature by Month
# Same histogram with semi-transparent bars, 5-degree bins, and white bar outlines
p2 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Monthly Temperatures from May - Sept",
    y = "Frequency of Temps",
    title = "Histogram of Monthly Temperatures from May - Sept, 1973",
    caption = "New York State Department of Conservation and the National Weather Service"
  )
p2
Plot 3: Create side-by-side boxplots categorized by Month
# One boxplot of daily temperature per month
p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Months from May through September",
    y = "Temperatures",
    title = "Side-by-Side Boxplot of Monthly Temperatures",
    caption = "New York State Department of Conservation and the National Weather Service"
  )
p3
Plot 4: Side by Side Boxplots in Gray Scale
# Same boxplots with a grayscale fill; the x-axis shows months, so it is labeled accordingly
p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  geom_boxplot() +
  scale_fill_grey(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Months from May through September",
    y = "Temperatures",
    title = "Side-by-Side Boxplot of Monthly Temperatures",
    caption = "New York State Department of Conservation and the National Weather Service"
  )
p4
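To read the boxplots numerically, it can help to tabulate the summary values behind each box. The sketch below uses base R on the built-in copy of the data (where `Month` is still the integer 5 to 9); the object names are illustrative. Note that `fivenum()` returns Tukey's hinges, while ggplot2 draws the box at the 25th and 75th percentiles, so the two can differ slightly.

```r
# Split daily temperatures by month, using the built-in dataset (Month coded 5..9).
temp_by_month <- split(datasets::airquality$Temp, datasets::airquality$Month)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum for each month.
five_num <- sapply(temp_by_month, fivenum)
rownames(five_num) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five_num)
```

Each column corresponds to one box in the plots above, so the medians can be compared across months directly.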
Plot 5
library(tidyverse)
# Histogram of daily ozone readings in bins of width 10
ggplot(airquality, aes(x = Ozone)) +
  geom_histogram(binwidth = 10, fill = "purple", color = "white") +
  labs(title = "Histogram of Ozone Levels", x = "Ozone Levels", y = "Frequency")
Warning: Removed 37 rows containing non-finite outside the scale range
(`stat_bin()`).
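The warning above comes from days on which no ozone reading was recorded. A quick check with base R, shown here as a sketch against the built-in copy of the data, confirms how many rows are affected and how many remain in the histogram.

```r
# Ozone readings from the built-in dataset; missing days are stored as NA.
ozone <- datasets::airquality$Ozone

n_missing <- sum(is.na(ozone))   # rows dropped by stat_bin()
n_plotted <- sum(!is.na(ozone))  # rows that actually appear in the histogram
cat("missing:", n_missing, " plotted:", n_plotted, "\n")
# prints: missing: 37  plotted: 116
```

If the warning is not wanted, passing `na.rm = TRUE` to `geom_histogram()` removes the missing values silently.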
Brief Essay
The histogram shows that most ozone levels fall within the range of 0-50, with the highest frequency observed between 0 and 25. This suggests that lower ozone levels were common during this period. As the ozone level increases, the frequency declines, with only a few occurrences in the 75-100 range and very rare cases above 100. There is only one instance where the ozone level exceeds 150, reinforcing the idea that extreme ozone concentrations were infrequent during the recorded months. The distribution is skewed to the right: most ozone levels fall in the lower range, while a long tail of progressively rarer high values extends to the right.
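The observations in the essay can be checked against the data. The following sketch, using base R and illustrative object names, compares the mean and median (a mean above the median is consistent with right skew), counts readings above 150, and tabulates readings in coarse ranges.

```r
# Non-missing ozone readings from the built-in dataset.
ozone <- datasets::airquality$Ozone
ozone <- ozone[!is.na(ozone)]

oz_mean   <- mean(ozone)
oz_median <- median(ozone)
cat("mean:", round(oz_mean, 2), " median:", oz_median, "\n")

# Number of days with ozone above 150.
n_above_150 <- sum(ozone > 150)
cat("readings above 150:", n_above_150, "\n")

# Counts in the ranges (0,50], (50,100], (100,150], (150,200].
range_counts <- table(cut(ozone, breaks = c(0, 50, 100, 150, 200)))
print(range_counts)
```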
Code Elements
I used fill = "purple" to color the bars and make the visualization clearer, and color = "white" to set the border color of the bars. binwidth = 10 groups the ozone levels into bins 10 units wide. Inside labs(), title = "Histogram of Ozone Levels" sets the main title, x = "Ozone Levels" labels the x-axis, and y labels the y-axis as the frequency.