Assignment 3B: Temperature Window Functions
# import csv
temperatures <- read.csv("temperature.csv")
Approach
I will begin by asking an LLM to generate a data set of daily high and low temperatures for two cities over the past eighteen months. I will then use window functions in dplyr, grouped by city and ordered by date, to calculate the year-to-date (YTD) average and the six-day moving average for each daily weather record. One challenge I anticipate is deciding how to analyze both the high and low temperatures while making sure the window functions are applied the way I intend.
Code Base
Introduction
An LLM generated a data set with the high and low temperatures in Phoenix, Arizona and Seattle, Washington over the past 18 months (August 1st 2024 - February 15th 2026). This data is stored in the file temperature.csv.
Body
# load dplyr
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
# convert date column data type
temperatures$Date <- as.Date(temperatures$Date)
Year to Date Averages
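# keep only the current year (2026), then compute running means per city
temperatures <- temperatures %>%
  filter(format(Date, "%Y") == "2026") %>%
  group_by(City) %>%
  arrange(Date) %>%
  mutate(
    # cumulative mean from January 1 up to and including each row's date
    YTD_high = cummean(High_Temp_F),
    YTD_low = cummean(Low_Temp_F)
  ) %>%
  ungroup()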
temperatures %>%
  group_by(City) %>%
  summarise(
    overall_avg_high = mean(High_Temp_F),
    overall_avg_low = mean(Low_Temp_F)
  )
# A tibble: 2 × 3
  City       overall_avg_high overall_avg_low
  <chr>                 <dbl>           <dbl>
1 Phoenix_AZ             70.7            50.2
2 Seattle_WA             43.5            31.9
For 2026 through February 15th, the YTD average high temperature is 70.7°F in Phoenix and 43.5°F in Seattle. The YTD average low temperatures are 50.2°F in Phoenix and 31.9°F in Seattle.
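The summary above reports plain means, which match the YTD values on the most recent date because a cumulative mean ends at the overall mean of the period. A minimal base-R sketch of that relationship follows; the vector `high` holds made-up values (not the assignment data), and `cumsum(x) / seq_along(x)` stands in for `cummean()`:

```r
# Hypothetical daily highs for one city, already sorted by date
high <- c(70, 72, 68, 74)

# Running (year-to-date) mean: cumulative sum divided by the number of days so far
ytd_high <- cumsum(high) / seq_along(high)

# The running values, one per day
print(ytd_high)

# The last running value equals the overall mean of the period
print(all.equal(ytd_high[length(high)], mean(high)))
```

With real data the same check can be run per city by comparing the last YTD_high value in each group against the summarised mean.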
Six Day Moving Averages
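# trailing six-day mean per city; sides = 1 uses only the current and five previous days,
# so the first five rows in each city are NA
temperatures <- temperatures %>%
  filter(format(Date, "%Y") == "2026") %>%
  group_by(City) %>%
  arrange(Date) %>%
  mutate(
    MA_high = stats::filter(High_Temp_F, rep(1/6, 6), sides = 1),
    MA_low = stats::filter(Low_Temp_F, rep(1/6, 6), sides = 1)
  ) %>%
  ungroup()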
temperatures %>%
  arrange(Date) %>%
  slice_tail(n = 6) %>%
  select(City, Date, MA_high, MA_low)
# A tibble: 6 × 4
  City       Date       MA_high MA_low
  <chr>      <date>       <dbl>  <dbl>
1 Phoenix_AZ 2026-02-13    76.4   54.9
2 Seattle_WA 2026-02-13    47.1   34.8
3 Phoenix_AZ 2026-02-14    76.0   54.6
4 Seattle_WA 2026-02-14    47.1   35.2
5 Phoenix_AZ 2026-02-15    75.1   54.4
6 Seattle_WA 2026-02-15    47.6   35.3
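Above are the six-day moving averages for the three most recent days (February 13th - 15th 2026) in each city. The output has six rows because it shows three days for each of the two cities.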
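To make the window itself concrete, here is a small sketch of what `stats::filter()` with six equal weights and `sides = 1` computes. The vector `x` is made up for illustration (not the assignment data), and `as.numeric()` is used only to drop the time-series class for printing:

```r
# Hypothetical daily highs for one city, already sorted by date
x <- c(60, 62, 64, 66, 68, 70, 72)

# Six equal weights of 1/6; sides = 1 means the window covers the current
# value and the five values before it (a trailing window)
ma6 <- as.numeric(stats::filter(x, rep(1/6, 6), sides = 1))

# The first five positions have fewer than six values available, so they are NA
print(ma6)

# Position 6 averages x[1:6]; position 7 averages x[2:7]
print(all.equal(ma6[6], mean(x[1:6])))
print(all.equal(ma6[7], mean(x[2:7])))
```

Because the window only looks backward, each moving average in the table depends on that day and the five days before it, never on later dates.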
Conclusion
This assignment was good practice with dplyr window functions. To expand on it, the temperatures in the two cities could be compared directly, the standard deviation could be calculated, or future temperatures could be forecast.