p1 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Monthly Temperatures from May - Sept",
    y = "Frequency of Temps",
    title = "Histogram of Monthly Temperatures from May - Sept, 1973",
    caption = "New York State Department of Conservation and the National Weather Service" # provide the data source
  )
p1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Plot 2: Improve histogram of Average Temperature by Month
p2 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Monthly Temperatures from May - Sept",
    y = "Frequency of Temps",
    title = "Histogram of Monthly Temperatures from May - Sept, 1973",
    caption = "New York State Department of Conservation and the National Weather Service"
  )
p2
Plot 3: Create side-by-side boxplots categorized by Month
p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(
    x = "Months from May through September",
    y = "Temperatures",
    title = "Side-by-Side Boxplot of Monthly Temperatures",
    caption = "New York State Department of Conservation and the National Weather Service"
  ) +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September"))
p3
Plot 4: Side-by-side boxplots in grayscale
p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(
    x = "Months from May through September", # the x-axis shows months, not temperatures
    y = "Temperatures",
    title = "Side-by-Side Boxplot of Monthly Temperatures",
    caption = "New York State Department of Conservation and the National Weather Service"
  ) +
  geom_boxplot() +
  scale_fill_grey(name = "Month", labels = c("May", "June", "July", "August", "September"))
p4
Plot 5: Density Plot of Wind Speed by Month
p5 <- airquality |>
  ggplot(aes(x = Wind, fill = factor(Month))) +
  geom_density(alpha = 0.4) +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(
    x = "Wind Speed (mph)",
    y = "Density",
    title = "Density Plot of Wind Speed by Month",
    caption = "New York State Department of Conservation and the National Weather Service"
  ) +
  theme_minimal()
p5
Plot 5 Essay
Description
For plot 5, I chose a density plot to visualize the distribution of wind speed across months. A density plot gives a smoothed estimate of how the observations are distributed over a continuous range, so it shows where values concentrate without depending on bin edges. Here I plotted the wind speeds recorded from May through September, using fill color to represent each month.
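To make the idea of a smoothed estimate concrete, here is a minimal base-R sketch of the kind of curve geom_density() draws for a single month. It assumes the default Gaussian kernel and default bandwidth, and it reads the untouched `datasets::airquality` copy so it does not depend on any earlier recoding of Month. The names `may_wind`, `d`, `area`, and `peak_mph` are illustrative only.

```r
# Wind speeds for May only (Month is coded 5 in the original dataset)
may_wind <- datasets::airquality$Wind[datasets::airquality$Month == 5]

# Kernel density estimate with base R defaults (Gaussian kernel)
d <- density(may_wind)

# Trapezoid-rule area under the curve; a density should integrate to about 1
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

# Wind speed at which the estimated density is highest
peak_mph <- d$x[which.max(d$y)]

area
peak_mph
```

The area check is a quick way to confirm that the y-axis is a density rather than a count, which is why the curves for months with different numbers of days are directly comparable.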
Insights
The density plot shows that wind speeds differ noticeably by month. May and June have the highest wind speeds, with peaks around 10-12 mph, while July and August are lower, with peaks closer to 8 mph. September's wind speeds are spread more evenly across the range. The smooth curves make these seasonal shifts easy to compare because they emphasize the overall shape of each distribution rather than its noise.
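Readings taken from a density plot are approximate, so it can help to compare them with simple per-month summaries. The following is a small sketch using base R only. It again uses `datasets::airquality` so that Month is still the original numeric code 5 to 9, and `wind_summary` is a name introduced here for illustration.

```r
aq <- datasets::airquality

# Mean, median, and standard deviation of wind speed (mph) for each month
wind_summary <- cbind(
  mean   = tapply(aq$Wind, aq$Month, mean),
  median = tapply(aq$Wind, aq$Month, median),
  sd     = tapply(aq$Wind, aq$Month, sd)
)

# Rows are labeled by month code (5 = May, ..., 9 = September)
round(wind_summary, 1)
```

Note that a mean or median will not match the peak of a density curve exactly when a distribution is skewed, so small differences between this table and the plot are expected.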
Special Code
To create this plot, I used geom_density() to draw a smooth curve of the wind speed distribution for each month from May to September. Within geom_density(), I set alpha to 0.4 so the fills are semi-transparent and overlapping distributions stay visible. Wrapping Month in factor() makes ggplot2 treat it as a discrete variable, and scale_fill_discrete() then assigns one color per month and relabels the legend. theme_minimal() gives the plot a minimalist look that keeps the focus on the data.