The temps data contains the high (TMAX) and low (TMIN) temperature for each day (DATE), recorded at the South Burlington Airport from December 1940 to September 2023. Snow (SNOW) and rain (PRCP) are also included in the data but won’t be used, and neither will TAVG, since it is missing for almost 80% of the days.
You will be creating a graph that shows the year-to-year difference of the maximum temperature across the twelve months of the year.
Before we create the graph, we need to do a little data cleaning and create two data sets:
Create a data set named temps_month by making the following changes:
Change the column names to lower case
Using the functions in the lubridate package, convert the date column from a character column to a date-type column, then create new columns for the day of the year, the week, the month (as a number and as an abbreviated label), and the year
Remove any days with the daily high temp missing or during 1940
Calculate the average high temperature for each month-by-year combination (e.g., the average high temp for January 1945 was …)
Display the first 10 rows using tibble(temps_month) and make sure that it is not a grouped data frame!
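As a quick illustration of the lubridate helpers named above, here is a minimal sketch on a single date. The raw string is a made-up value and assumes the month/day/year text layout that mdy() expects; the actual layout of the DATE column should be checked before relying on it.

```r
library(lubridate)

# Hypothetical raw value (not taken from the temps data);
# assumes month/day/year text, which is what mdy() parses
raw_date <- "12/25/1940"
d <- mdy(raw_date)

class(d)                # "Date": no longer a character value
yday(d)                 # day of the year
week(d)                 # week of the year
month(d)                # month as a number
month(d, label = TRUE)  # month as an abbreviated, ordered label
year(d)                 # four-digit year
```

Each helper takes the parsed date and returns one component, which is why they can all sit inside a single mutate() call.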
temps |>
# Making all the names lowercase:
janitor::clean_names() |>
# Converting the date column into days and year
mutate(
# Need to convert date column to a date type column with mdy()
date = mdy(date),
# find the day of the year with yday()
day = yday(date),
# Find the week of the year with week()
week = week(date),
# Find the month of the year with number
month = month(date),
# Find the month of the year using the month abbreviation
month_label = month(date, label = TRUE),
# find the year with year()
year = year(date)
) |>
# Only keeping the days with a non-missing tmax and after 1940
filter(
!is.na(tmax),
year > 1940
) |>
# Averaging the max temp across the month and year
summarize(
.by = c(month, month_label, year),
tmax_avg = mean(tmax)
) ->
temps_month
tibble(temps_month)
## # A tibble: 993 × 4
## month month_label year tmax_avg
## <dbl> <ord> <dbl> <dbl>
## 1 1 Jan 1941 22.6
## 2 2 Feb 1941 28.5
## 3 3 Mar 1941 32.8
## 4 4 Apr 1941 61.2
## 5 5 May 1941 69.1
## 6 6 Jun 1941 81.7
## 7 7 Jul 1941 83.5
## 8 8 Aug 1941 77.5
## 9 9 Sep 1941 73.5
## 10 10 Oct 1941 57.0
## # ℹ 983 more rows
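One way to confirm that a summarized result is not grouped is dplyr's is_grouped_df(). The sketch below uses a small made-up data set (the toy values are assumptions, not the Burlington data) to contrast the .by argument, which needs dplyr 1.1.0 or later, with group_by() followed by summarize().

```r
library(dplyr)

# Toy daily highs (hypothetical values) standing in for the cleaned data
toy <- tibble(
  month = c(1, 1, 2, 2),
  year  = c(1941, 1941, 1941, 1941),
  tmax  = c(20, 24, 30, 28)
)

# .by groups only for this one call, so the result comes back ungrouped
by_result <- toy |>
  summarize(.by = c(month, year), tmax_avg = mean(tmax))

# group_by() + summarize() drops only the last grouping level by default,
# so this result is still grouped by month
grouped_result <- toy |>
  group_by(month, year) |>
  summarize(tmax_avg = mean(tmax))

is_grouped_df(by_result)
is_grouped_df(grouped_result)
```

If a result does turn out to be grouped, piping it through ungroup() removes the grouping.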
Using the temps_month data set from part 1A, find the highest and lowest average temperature for each month and save the results in a data frame named temps_min_max. Display all 12 rows in the knitted document.
temps_month |>
# Finding the minimum and maximum average high temp for each month (keeping the label)
summarize(
.by = c(month, month_label),
min_tmax = min(tmax_avg),
max_tmax = max(tmax_avg)
) ->
temps_min_max
head(temps_min_max, n = 12)
## month month_label min_tmax max_tmax
## 1 1 Jan 14.64516 37.58065
## 2 2 Feb 16.10714 40.32143
## 3 3 Mar 30.58065 52.77419
## 4 4 Apr 45.30000 61.16667
## 5 5 May 57.90323 76.19355
## 6 6 Jun 68.93333 83.43333
## 7 7 Jul 75.96774 87.41935
## 8 8 Aug 73.19355 84.67742
## 9 9 Sep 64.93333 79.00000
## 10 10 Oct 51.22581 69.03226
## 11 11 Nov 37.70000 52.13333
## 12 12 Dec 17.45161 45.70968
Create the graph seen in the solutions in Brightspace. You’ll need to use geom_ribbon() and the temps_min_max data set to create the grey region in the graph (it should also be the only geom_ for part 2A). The help menu can be helpful! Save it as gg_temps_ribbon and make sure to display the results in the knitted document.
ggplot(
#data = temps_min_max,
mapping = aes(x = month)
) +
# Adding the shaded region using the temps_min_max data set
geom_ribbon(
data = temps_min_max,
mapping = aes(
ymin = min_tmax, # Bottom part of the ribbon
ymax = max_tmax, # Top part of the ribbon
xmin = month, # Left part of the ribbon
xmax = month # right part of the ribbon
),
fill = "grey70",
alpha = 0.55
) +
# Making the graph look better
labs(
#x = NULL, #"Month of the year",
y = "Temperature",
title = "Average Monthly High Temperatures",
subtitle = "Burlington, VT: 1940 - 2023",
caption = "Data: ncdc.noaa.gov"
) ->
gg_temps_ribbon
gg_temps_ribbon
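For a vertical ribbon, geom_ribbon() needs only x, ymin, and ymax. The sketch below shows that minimal form on made-up monthly ranges; the data frame and the names toy_range, lo, hi, and ribbon_sketch are hypothetical and used only for illustration.

```r
library(ggplot2)

# Toy monthly ranges (hypothetical values, not the Burlington data)
toy_range <- data.frame(
  month = 1:12,
  lo = c(15, 16, 31, 45, 58, 69, 76, 73, 65, 51, 38, 17),
  hi = c(38, 40, 53, 61, 76, 83, 87, 85, 79, 69, 52, 46)
)

# x runs along the months; ymin and ymax are the bottom and top of the band
ribbon_sketch <- ggplot(toy_range, aes(x = month)) +
  geom_ribbon(aes(ymin = lo, ymax = hi), fill = "grey70", alpha = 0.55)

ribbon_sketch
```

Because x is inherited from the ggplot() call, the ribbon layer only has to supply the two y limits.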
Add lines for the month-to-month change for the years 1942, 1982, and 2022. See the graph on Brightspace for what the results should look like. Save the graph as gg_tempsB and make sure that it appears in the knitted document
gg_temps_ribbon +
geom_line(
data = temps_month |> filter(year %in% c(1942, 1982, 2022)),
mapping = aes(y = tmax_avg,
color = factor(year)),
linewidth = 1,
show.legend = FALSE
) +
# Adding the year at the end of the 3 lines
geom_text(
data = temps_month |> filter(year %in% c(1942, 1982, 2022), month == 12),
mapping = aes(y = tmax_avg,
color = factor(year),
label = year),
show.legend = FALSE,
nudge_x = 0.35,
fontface = "bold"
) ->
gg_tempsB
gg_tempsB
Use gg_tempsB and the appropriate functions to make x, y, and color match the graph in Brightspace, making sure to remove the labels for the x and y axes. To get the colors to match, you’ll need to add scale_color_tq() from the tidyquant package and choose the right value for the theme argument in said function. Save the results as gg_tempsC.
To get the degree F symbol to appear, use "\u00b0F"
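The degree labels can be built by pasting the Unicode escape onto the break values. This is a minimal sketch; breaks_f, labels_f, and degree_f are illustrative names, not part of the assignment.

```r
# "\u00b0" is the Unicode escape for the degree sign
breaks_f <- seq(from = 20, to = 80, by = 20)
labels_f <- paste0(breaks_f, "\u00b0F")
labels_f

# The same idea as a reusable function; the labels argument of
# scale_y_continuous() also accepts a function of the breaks
degree_f <- function(x) paste0(x, "\u00b0F")
degree_f(breaks_f)
```

Passing a function as labels keeps the labels in sync if the breaks are changed later.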
gg_tempsB +
# Changing the colors
scale_color_tq(
theme = "dark"
) +
# Changing the x-axis
scale_x_continuous(
expand = c(0, 0, 0.05, 0),
breaks = seq(from = 1, to = 12, by = 1),
labels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"),
minor_breaks = NULL,
position = "top"
) +
# Changing the y-axis
scale_y_continuous(
breaks = seq(from = 20, to = 80, by = 20),
labels = paste0(seq(from = 20, to = 80, by = 20), "\u00b0F")
) +
# Removing the x and y-axis labels
labs(
x = NULL,
y = NULL
) ->
gg_tempsC
gg_tempsC
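As an optional shortcut, base R ships the constant month.abb, which holds the same twelve abbreviations typed out by hand for the x-axis labels. A small check (manual_labels is an illustrative name):

```r
# The hand-typed labels used for the x-axis
manual_labels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                   "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# month.abb is a built-in base R constant with the same values
identical(month.abb, manual_labels)
```

So labels = month.abb could replace the typed vector in scale_x_continuous() without changing the graph.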
Finally, make the needed changes to have the graph match what is in Brightspace. You’ll need to add theme_tq() from the tidyquant package, then make the additional needed changes.
All of the lines and text use white, even if they look a little gray in the graph!
The title is size 16 and the subtitle is size 12; all other labels have the default size.
gg_tempsC +
# Changing the pre-packaged theme
theme_tq() +
# Changing the other theme options
theme(
# Changing the title, subtitle and caption
plot.title = element_text(hjust = 0.5,
size = 16),
plot.subtitle = element_text(hjust = 0.5,
size = 12),
plot.caption = element_text(hjust = 0,
face = "italic"),
# Changing the caption location to be on the "plot" instead of "panel"
plot.caption.position = "plot",
# Changing the background color of the plot
plot.background = element_rect(fill = "black",
color = "black"),
# Changing all the text to white
text = element_text(color = "white",
face = "bold"),
# Changing the panel background to black
panel.background = element_rect(fill = "black"),
#axis.line = element_line(color = "white"),
# Changing the panel lines to be white
panel.border = element_rect(color = "white"),
panel.grid.major = element_line(color = "white"),
panel.grid.minor = element_line(color = "white")
)