# The default order is alphabetical, so after changing the month names, reordering them is important.
airquality$Month <- factor(airquality$Month,
                           levels = c("May", "June", "July", "August", "September"))
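To see why the reordering matters, here is a minimal sketch using a made-up vector of month names (the `months` vector is hypothetical and not part of the assignment data). It compares the levels `factor()` picks by default with the levels supplied explicitly.

```r
# Hypothetical toy data: month names in no particular order
months <- c("July", "May", "September", "June", "August", "May")

# Without `levels`, factor() sorts the unique values alphabetically
default_order <- factor(months)
levels(default_order)

# Supplying `levels` fixes the order that axes and legends will follow
calendar_order <- factor(months,
                         levels = c("May", "June", "July", "August", "September"))
levels(calendar_order)
```

The first call to `levels()` should list the months alphabetically (August first), while the second keeps calendar order, which is the order ggplot2 then uses for the legend and the boxplot axis.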
Plot 1: Histogram categorized by months
p1 <- airquality %>%
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity") +
  scale_fill_discrete(name = "Month", # create legend
                      labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept",
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service") # important to provide the data source
p1
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Plot 2, more cohesive
p2 <- airquality %>%
  ggplot(aes(x = Temp, fill = Month)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 5, color = "white") + # alpha defines transparency, binwidth defines width and color defines outlines
  scale_fill_discrete(name = "Month",
                      labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Temperatures from May - Sept",
       y = "Frequency of Temps",
       title = "Histogram of Monthly Temperatures from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")
p2
Here July stands out for having a high frequency of temperatures around 85 degrees. The dark purple color shows where months overlap, which is an effect of the transparency.
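Because overlapping transparent bars are hard to read precisely, the counts behind the histogram can also be tabulated directly. The sketch below uses base R only and the built-in `datasets::airquality` copy, where `Month` is coded 5 to 9. It assumes the 5-degree bins are centred on multiples of 5 (for example 82.5 to 87.5), which I believe matches ggplot2's default alignment for `binwidth = 5`, but treat that as an assumption. The names `aq`, `width`, `breaks`, and `counts` are my own.

```r
# Built-in data; Month is numeric (5..9) here, so label it for readability
aq <- datasets::airquality
aq$Month <- factor(aq$Month, levels = 5:9,
                   labels = c("May", "June", "July", "August", "September"))

# Assumption: bins of width 5 centred on multiples of 5 (..., 82.5-87.5, ...)
width <- 5
lo <- floor((min(aq$Temp) + width / 2) / width) * width - width / 2
hi <- ceiling((max(aq$Temp) - width / 2) / width) * width + width / 2
breaks <- seq(lo, hi, by = width)

# Assign every temperature reading to its bin
aq$TempBin <- cut(aq$Temp, breaks = breaks)

# Rows are months, columns are temperature bins
counts <- table(aq$Month, aq$TempBin)
counts

# Counts per month for the bin that contains 85 degrees
bin_85 <- as.character(cut(85, breaks = breaks))
counts[, bin_85]
```

The last line gives one count per month for the bin around 85 degrees, so the reading of the plot can be compared against the numbers.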
Plot 3, side-by-side boxplots categorized by month
p3 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(x = "Months from May through September",
       y = "Temperatures",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_discrete(name = "Month",
                      labels = c("May", "June", "July", "August", "September"))
p3
Plot 4: Side-by-side greyscale boxplots
p4 <- airquality |>
  ggplot(aes(Month, Temp, fill = Month)) +
  labs(x = "Monthly Temperatures",
       y = "Temperatures",
       title = "Side-by-Side Boxplot of Monthly Temperatures",
       caption = "New York State Department of Conservation and the National Weather Service") +
  geom_boxplot() +
  scale_fill_grey(name = "Month",
                  labels = c("May", "June", "July", "August", "September"))
p4
My own plot
??position
??scale_fill
p5 <- airquality %>%
  ggplot(aes(x = Wind, fill = Month)) +
  geom_histogram(position = "stack", alpha = 0.5, binwidth = 3, color = "white") +
  scale_fill_brewer(palette = "PuBuGn", name = "Month",
                    labels = c("May", "June", "July", "August", "September")) +
  labs(x = "Monthly Wind Speed from May - Sept",
       y = "Frequency of Wind Speed",
       title = "Histogram of Monthly Wind Speed from May - Sept, 1973",
       caption = "New York State Department of Conservation and the National Weather Service")
p5
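As an alternative to a Brewer palette, hand-picked colors can be supplied with ggplot2's `scale_fill_manual()`. The following is only a sketch: the hex codes are arbitrary choices of mine, and `aq`, `month_colours`, and `p_manual` are hypothetical names. It starts from the built-in `datasets::airquality` so it runs on its own.

```r
library(ggplot2)

# Built-in data; Month is numeric (5..9) here, so label it first
aq <- datasets::airquality
aq$Month <- factor(aq$Month, levels = 5:9,
                   labels = c("May", "June", "July", "August", "September"))

# Hypothetical hand-picked colours, named so each one maps to a specific month
month_colours <- c(May = "#1b9e77", June = "#d95f02", July = "#7570b3",
                   August = "#e7298a", September = "#66a61e")

p_manual <- ggplot(aq, aes(x = Wind, fill = Month)) +
  geom_histogram(position = "stack", binwidth = 3, color = "white") +
  scale_fill_manual(name = "Month", values = month_colours) + # one colour per level
  labs(x = "Wind speed (mph)", y = "Frequency of Wind Speed")
p_manual
```

Naming the elements of `values` ties each color to a month regardless of the order in which the colors are listed. The help page for a specific function opens with a single question mark, for example `?scale_fill_manual`.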
For this plot, I was mostly exploring the different elements and commands involved in the plots that Professor Saidi used earlier in this document. There are a lot of commands I don’t recognize. I am trying to figure out what they do, when they should be used, and how to use them effectively. I used ??position and ??scale_fill to search the documentation for these two commands.

My main goal was to learn how to change the colors of the month variable in the plot. I couldn’t figure out how to make it use colors I chose myself, and ??scale_fill didn’t help me very much, so I looked online and found an article on “Cookbook for R” explaining how scale_fill_brewer works. I ended up using a simple ColorBrewer palette.

I am not completely satisfied with the plot I created, and I haven’t had the time I wanted to explore this assignment properly. I think the plot looks very pretty, but it’s difficult to read or to understand what it’s trying to say. Wind has been on my mind lately; earlier today, after being hit with a powerful gust of wind for the first time in a while, I was wondering which seasons are the windiest. The plot seems to indicate that June is the windiest of the five months in the data, and that wind speeds of about 10 mph occur more frequently than other speeds. Using a stacked histogram was interesting because it allows me to see the data for each row and column at once. However, it’s confusing to try to read.
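Since the stacked histogram is hard to read month by month, one way to check which month looks windiest is to summarise wind speed numerically. This is a minimal base R sketch on the built-in `datasets::airquality`; the names `aq`, `mean_wind`, and `median_wind` are my own, and using the mean as the measure of "windiest" is an assumption, since a median or a count of high-wind days would be equally defensible.

```r
# Built-in data; Month is numeric (5..9) here, so label it first
aq <- datasets::airquality
aq$Month <- factor(aq$Month, levels = 5:9,
                   labels = c("May", "June", "July", "August", "September"))

# Mean and median wind speed (mph) for each month
mean_wind <- tapply(aq$Wind, aq$Month, mean)
median_wind <- tapply(aq$Wind, aq$Month, median)
round(rbind(mean = mean_wind, median = median_wind), 2)

# Month with the highest mean wind speed
names(which.max(mean_wind))
```

Comparing this table with the plot shows whether the impression from the stacked bars holds up.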
External sources used:
Cookbook for R. (n.d.). Colors (ggplot2). Cookbook for R. http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/.