p1 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept",
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")  # provide the data source
print(p1)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
The histogram is useful for analyzing monthly temperature values because it shows the distribution of temperatures for each month in a different color, which helps compare how temperature ranges vary from May to September. Because position = "identity" draws the months' bars on top of one another, adjusting the bin width or adding transparency would improve clarity.
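One common starting point for choosing a bin width is the Freedman-Diaconis rule, which scales the interquartile range by the sample size. The sketch below applies it to Temp with base R only. It assumes the built-in datasets::airquality data, and the variable names are illustrative rather than part of the original analysis.

```r
# Freedman-Diaconis rule: binwidth = 2 * IQR / n^(1/3)
temps <- datasets::airquality$Temp   # assumption: built-in dataset; Temp has no missing values
n     <- length(temps)
fd_binwidth <- 2 * IQR(temps) / n^(1/3)

# Round to a whole number of degrees for use in geom_histogram(binwidth = ...)
suggested_binwidth <- round(fd_binwidth)
print(c(raw = fd_binwidth, rounded = suggested_binwidth))
```

The rounded value can be passed as binwidth in geom_histogram(). Treat it as a starting point and adjust by eye, since the rule ignores the grouping by month.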
PLOT 2: Improving the histogram of Average Temperature by Month
p2 <- airquality |>
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept",
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")
print(p2)
Side-by-side boxplots improve readability further by making it easy to compare temperature distributions across months. Each month's temperatures are visually distinct, and August stands out as having the highest temperatures. The boxplots also flag outliers in June and July, pointing to unusual temperature values. Distinct colors and clear labels make the monthly temperature patterns easier to read and compare.
PLOT 3: Side-by-side boxplots categorized by Month
p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(x = "Months from May through September",
       y = "Temperatures",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month", labels = c("May", "June", "July", "August", "September"))
print(p3)
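The medians and outliers that a boxplot draws can also be checked numerically. The sketch below is a base R approximation of the usual 1.5 × IQR rule. It assumes the built-in datasets::airquality data, where Month is stored as the numbers 5 to 9 rather than month names, and the helper name find_outliers is hypothetical.

```r
aq <- datasets::airquality   # assumption: Month is numeric (5-9) here, not month names

# Hypothetical helper: values more than 1.5 * IQR beyond the quartiles
find_outliers <- function(x) {
  q   <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x[x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr]
}

temps_by_month    <- split(aq$Temp, aq$Month)
median_by_month   <- sapply(temps_by_month, median)
outliers_by_month <- lapply(temps_by_month, find_outliers)

print(median_by_month)
print(outliers_by_month)
```

Months with a non-empty entry in outliers_by_month are the ones where the boxplot should show individual points beyond the whiskers.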
Plot 4: Side by Side Boxplots in Gray Scale
p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(x = "Months from May through September",  # x-axis shows Month, not temperature
       y = "Temperatures",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_grey(name = "Month", labels = c("May", "June", "July", "August", "September"))
print(p4)
Plot 5: Scatterplot of Solar Radiation vs. Temperature
p5 <- airquality |>
  ggplot(aes(x = Solar.R, y = Temp)) +
  geom_point(aes(color = Month), alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  labs(x = "Solar Radiation (Langley)",
       y = "Temperature (°F)",
       title = "Scatterplot of Solar Radiation vs. Temperature",
       caption = "New York State Department of Conservation and the National Weather Service") +
  scale_color_discrete(name = "Month", labels = c("May", "June", "July", "August", "September"))
print(p5)
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 7 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 7 rows containing missing values or values outside the scale range
(`geom_point()`).
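The two warnings come from rows with missing Solar.R values, which ggplot2 drops before drawing. A minimal sketch of counting and removing those rows ahead of time follows. It assumes the built-in datasets::airquality data, and aq_complete is an illustrative name.

```r
aq <- datasets::airquality

# Count rows where Solar.R is missing
n_missing <- sum(is.na(aq$Solar.R))
print(n_missing)

# Keep only rows with an observed Solar.R value
aq_complete <- aq[!is.na(aq$Solar.R), ]
print(nrow(aq_complete))
```

Passing a filtered data frame like this to ggplot() should draw the same points without the warnings, and it makes the number of dropped observations explicit.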
Brief Essay
Plot Type: The plot is a scatterplot displaying the relationship between Solar Radiation (Solar.R) and Temperature (Temp).
Insights: This scatterplot reveals how solar radiation levels are associated with temperature variations. The plot shows a general trend where higher solar radiation tends to be associated with higher temperatures. The added regression line (black line) provides a visual indication of this relationship, suggesting a positive correlation between the two variables. Different colors represent the months, allowing us to see if this relationship varies by month.
Special Code: I used geom_point() to create the scatterplot and geom_smooth() with method = "lm" to add a linear regression line, which helps to identify the overall trend. The alpha = 0.7 parameter in geom_point() adds transparency to the points, making overlapping points easier to distinguish. Mapping color = Month inside aes() colors the points by month, and scale_color_discrete() sets the legend title and month labels. Together they add a layer of information for comparing how the relationship between solar radiation and temperature might differ across months.
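The trend that the regression line shows can be quantified directly. The sketch below fits the same kind of linear model with lm() and computes the correlation, both overall and per month. It assumes the built-in datasets::airquality data (with numeric Month), and the object names are illustrative.

```r
aq <- datasets::airquality
aq <- aq[!is.na(aq$Solar.R), ]   # drop rows with missing solar radiation

# Same model form that geom_smooth(method = "lm") fits: Temp ~ Solar.R
fit   <- lm(Temp ~ Solar.R, data = aq)
slope <- unname(coef(fit)["Solar.R"])

# Overall Pearson correlation, then one correlation per month
overall_cor  <- cor(aq$Solar.R, aq$Temp)
cor_by_month <- sapply(split(aq, aq$Month), function(d) cor(d$Solar.R, d$Temp))

print(slope)
print(overall_cor)
print(cor_by_month)
```

A positive slope and correlation support the reading above. The per-month values give a rough check on whether the association holds within each month or mostly reflects differences between months.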