Temperature Sensitive Demand
This tutorial will walk you through how to analyze the relationship between temperature and electricity demand. The first thing we always need to do is load in our data.
Read in your data and store it as a variable called full_data.
# Load the tidyverse, which provides read_csv(), the dplyr verbs, and ggplot2
library(tidyverse)
full_data <- read_csv("full_data.csv")
Visualize the Temperature vs Energy
To begin we’ll visualize the daily temperature versus the electricity demand. To do this we’ll use a scatterplot which in {ggplot2} world is geom_point().
Use glimpse() to remind yourself of the columns in the data and to consider what you want to graph.
glimpse(full_data)
If there are NAs in year, run this line of code. (Might as well just run it.)
full_data <- full_data |>
  mutate(year = year(date))
We want to graph only the days that we actually have energy values for, so let’s filter to remove anything where energy is NA. To do this you can use !, which means “not”, in conjunction with the is.na() function.
See if you can add to the code to make the filter.
energy_record <- full_data |>
_______________
The filter would look like !is.na(energy).
energy_record <- full_data |>
filter(!is.na(energy))
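To see what ! does to is.na(), here is a minimal sketch on a small made-up data frame. The names toy_data and toy_record and all of the values are invented for illustration and are not part of full_data.

```r
library(dplyr)

# Hypothetical toy data: three days, one with a missing energy value
toy_data <- data.frame(
  day = 1:3,
  energy = c(120, NA, 95)
)

# is.na() flags the missing value; ! flips each TRUE/FALSE
is.na(toy_data$energy)
!is.na(toy_data$energy)

# filter() keeps only the rows where the condition is TRUE
toy_record <- toy_data |>
  filter(!is.na(energy))
```

The row with the missing energy value is dropped, and the other rows are kept unchanged.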
Add to this code to create the scatterplot.
ggplot(data = ______, aes( x = ______, y = ______)) +
______
ggplot(data = energy_record, aes(x = temp, y = energy)) +
geom_point()
Fitting a Model
We can add a trend line to our data by adding to the code for the plot. The typical trend line you may have seen is a linear model, which is just a straight line through the points. Our data obviously wouldn’t fit a straight line. We can make a line that better fits our data using a loess regression instead of linear regression. loess stands for LOcally Estimated Scatterplot Smoothing, and it doesn’t assume any specific shape.
We can add a loess regression line to our graph with the function geom_smooth(). We just need to specify that the method of smoothing we want is “loess”.
ggplot(data = energy_record, aes(x = temp, y = energy)) +
geom_point() +
  geom_smooth(method = "loess")
Now we’ll actually create the model. This will calculate the actual line that is being drawn. If it were a linear model, the line formula would be y = mx + b. A loess model doesn’t have as straightforward an equation, but it is essentially doing the same thing. The code below stores the fitted model as loess_model.
# Fit loess model on historical data
loess_model <- loess(energy ~ temp, data = energy_record)
Finding our Balance Point from the Loess Curve
From the loess curve we can find the balance point: the temperature at the lowest part of the curve. Because the loess curve doesn’t have traditional parameters like a y = mx + b line, the easiest way to find the lowest point is to use the model to predict values for the same time period and then take the lowest of those values.
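As a preview of the idea, here is a minimal sketch on made-up data. The toy data frame, the U shape centered near 60, and the noise level are all assumptions invented for illustration; only loess(), predict(), and which.min() are real base R functions.

```r
# Hypothetical toy data: a U-shaped relationship with some noise
set.seed(42)
toy <- data.frame(temp = seq(20, 100, by = 1))
toy$energy <- 0.05 * (toy$temp - 60)^2 + 50 + rnorm(nrow(toy), sd = 2)

# Fit a loess curve to the toy data
toy_model <- loess(energy ~ temp, data = toy)

# Predict an energy value for every observed temperature
toy$predicted_energy <- predict(toy_model, newdata = data.frame(temp = toy$temp))

# The row with the smallest prediction marks the bottom of the curve
toy[which.min(toy$predicted_energy), c("temp", "predicted_energy")]
```

The tutorial follows the same steps on the real data, using dplyr verbs instead of which.min().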
To do this we’ll create a new dataset called predicted_data from our historical data that we will add the predicted curve values to.
predicted_data <- full_data |>
  filter(scenario == "historical")
Now we’ll use the predict() function to fill in the energy values. For our input temperature values we’ll use the tmax temperatures in the temp column.
input_temps <- data.frame(temp = predicted_data$temp)
predicted_energy_values <- predict(loess_model, newdata = input_temps)
Now we have a vector (the same as a single column, just not yet in a table) of energy predictions that we can add to our predicted dataset.
predicted_data <- predicted_data |>
  mutate(predicted_energy = predicted_energy_values)
Now we’ll add the predicted energy usage to the graph. We can do this just by adding another geom_point() statement.
ggplot(data = energy_record, aes(x = temp, y = energy)) +
  geom_point() +
  geom_smooth(method = "loess") +
  # Predicted values in red; color sits outside aes() because it is a fixed color, not a column
  geom_point(data = predicted_data, aes(x = temp, y = predicted_energy), color = "red")
You can see that the predicted values fit the curve exactly.
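One detail worth noting in layered plots like this is where a fixed color goes. The sketch below, on made-up data (toy, p_mapped, and p_fixed are invented names), contrasts putting "red" inside aes() with putting it outside.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration
toy <- data.frame(temp = c(40, 60, 80), energy = c(70, 50, 68))

# Inside aes(): "red" is treated as a data value, so ggplot2 adds a legend
# and picks the point color from its default palette
p_mapped <- ggplot(toy, aes(x = temp, y = energy)) +
  geom_point(aes(color = "red"))

# Outside aes(): "red" is a fixed setting, so the points are drawn in red
p_fixed <- ggplot(toy, aes(x = temp, y = energy)) +
  geom_point(color = "red")
```

Printing p_mapped and p_fixed side by side shows the difference.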
Now we can find the minimum of these values and that will be our balance point temperature. The energy usage at this point reflects our lowest level of energy use.
To find the lowest value, we use slice(), which keeps the rows at the positions we give it. We’ll put our data in order of energy use, lowest first, and slice off the first row.
balance_point <- predicted_data |>
arrange(predicted_energy) |>
  slice(1)
Now we can run select() on balance_point to see just the relevant information.
balance_point |>
  select(temp, predicted_energy)
From this you should see the temp value, which is our estimated balance point temperature, and predicted_energy, which is the estimated minimum amount of energy usage.
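As an aside, dplyr also offers slice_min(), which combines the sort and the slice. The sketch below uses a made-up table (toy_predictions and its values are invented for illustration) to show that both routes pick the same row.

```r
library(dplyr)

# Hypothetical toy predictions, invented for illustration
toy_predictions <- data.frame(
  temp = c(50, 55, 60, 65, 70),
  predicted_energy = c(58, 53, 51, 54, 60)
)

# arrange() sorts ascending, so slice(1) keeps the smallest prediction
via_arrange <- toy_predictions |>
  arrange(predicted_energy) |>
  slice(1)

# slice_min() does the same in one step; with_ties = FALSE returns a single row
via_slice_min <- toy_predictions |>
  slice_min(predicted_energy, n = 1, with_ties = FALSE)
```

Either approach works here; arrange() plus slice() makes the sorting step explicit, which can be easier to follow when you are learning.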
Find the Extreme Heat and Cold Degree Days
We’re using a balance point temperature of 60 to find our cooling and heating degree days, which show how far above or below the balance point the temperature is. To do this we’ll subtract 60 from each temperature and then categorize the positive differences as cooling and the negative ones as heating.
full_data <- full_data |>
  # Positive: warmer than the balance point; negative: cooler
  mutate(degrees_off = temp - 60) |>
  mutate(day_type = case_when(
    degrees_off > 0 ~ "cooling",
    degrees_off < 0 ~ "heating",
    .default = NA
  ))
The default applies to the days not being set to heating or cooling: the 0s, where the temperature is exactly 60, plus any days with a missing temperature.
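A quick way to check the categories is to tally them. This sketch runs on made-up temperatures (toy_days and day_counts are invented names) and assumes the difference is computed as temperature minus the balance point, so that positive values mean cooling.

```r
library(dplyr)

# Hypothetical toy temperatures, invented for illustration
toy_days <- data.frame(temp = c(45, 60, 72, 88, NA))

toy_days <- toy_days |>
  mutate(degrees_off = temp - 60) |>
  mutate(day_type = case_when(
    degrees_off > 0 ~ "cooling",
    degrees_off < 0 ~ "heating",
    .default = NA
  ))

# Tally the days in each category; the 60-degree day and the missing
# temperature both end up as NA
day_counts <- toy_days |>
  count(day_type)
```

Running count(day_type) on full_data in the same way shows how many days fall into each group.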
Now we’ll visualize these days versus the electricity demand. To do this we’ll use a scatterplot which in {ggplot2} world is geom_point().
Add to this code to create the scatterplot.
ggplot(data = ______, aes( x = ______, y = ______)) +
______
ggplot(data = full_data, aes(x = degrees_off, y = energy)) +
geom_point()
We can add a color option to our aes() to color the days by whether they are heating or cooling.
Add a color option with color = and set it equal to day_type.
ggplot(data = full_data, aes(x = degrees_off, y = energy, color = day_type)) +
  geom_point()
Make your plot look nicer. At minimum, make sure the axis labels are informative, but you can also try picking your own colors with manual coloring or changing the background from gray.
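If you want a starting point, here is one possible sketch on made-up data. The toy_days values, the label text, and the color choices are all assumptions for illustration; swap in full_data and your own wording.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration
toy_days <- data.frame(
  degrees_off = c(-20, -8, 5, 18),
  energy = c(72, 58, 55, 70),
  day_type = c("heating", "heating", "cooling", "cooling")
)

polished_plot <- ggplot(toy_days, aes(x = degrees_off, y = energy, color = day_type)) +
  geom_point() +
  # Informative axis, legend, and title text
  labs(
    x = "Degrees from balance point (60)",
    y = "Electricity demand",
    color = "Day type",
    title = "Electricity demand by distance from the balance point"
  ) +
  # Manual colors, keyed by the values in day_type
  scale_color_manual(values = c(cooling = "steelblue", heating = "firebrick")) +
  # A theme with a white background instead of the default gray
  theme_minimal()

polished_plot
```

labs() handles the text, scale_color_manual() handles the colors, and the theme handles the background, so you can change any one of them without touching the others.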