Author

Penelope Pooler Eisenbies

Published

April 16, 2026

Setup

Run the following chunk of R code to install and load the packages needed for this assignment.

Click green triangle in upper right corner of the setup chunk to run the setup code.

Note that setup code will not appear in the rendered HTML file.

NOTES:

  • Data used in HW 10 include vlaues through December of 2024.

  • These datasets were last updated in the Spring of 2025 when the HW 10 demo videos were created.

    • Forecast HW Demo videos are updated every other year.

The Alaska data used in class has been updated to December 2025 so the HW data differs slightly.

Median Marriage Data

Code
```{r}
#|label: Import Marriage Data


# import and examine data
med_mar <- read_csv("data/med_marriage.csv", 
                    show_col_types = F) |>
  mutate(Date = mdy(Date)) |>
  glimpse(width=60)
```
Rows: 78
Columns: 4
$ Date    <date> 1947-12-31, 1948-12-31, 1949-12-31, 1950-…
$ Year    <dbl> 1947, 1948, 1949, 1950, 1951, 1952, 1953, …
$ Males   <dbl> 23.7, 23.3, 22.7, 22.8, 22.9, 23.0, 22.8, …
$ Females <dbl> 20.5, 20.4, 20.3, 20.3, 20.4, 20.2, 20.2, …

Marriage Age Time Series Plot

Female Time Series

  • Create time series using female data

    • Specify freq = 1 - one observation per year

    • Specify start = 1947 - first year in dataset

  • Model data using auto.arima function

    • Specify ic = aic - aic is the information criterion used to determine model.

    • Specify seasonality = F - no seasonal (repeating) pattern in the data.

Create Time Series (ts) and Model

Code
```{r}
#|label: Create female time series

# create time series for forecast
female_ts <- ts(med_mar$Females, freq=1, start=1947)

# model data using auto.arima function
female_model <- auto.arima(female_ts, ic="aic", seasonal=F)
```

Female Model Forecast

  • Create forecasts (until 2040)

    • h = 16 indicates we want to forecast 16 years

    • Most recent year in data is 2024

    • 2040 - 2024 - 16

  • Forecasts become less accurate the further into the future you specify.

Code
```{r}
#|label: female model forecast and plot

# create forecasts (until 2040)
female_forecast <- forecast(female_model, h=16)

# plot forecasts with 2 prediction interval bounds shown
autoplot(female_forecast) + 
  labs(y = "Median Age of First Marriage (Females)") +
  theme_classic()
```

Female Model Fit

Code
```{r}
#|label: numeric female model forecasts

# examine numerical forecasts 
# and prediction intervals
female_forecast
```
     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2025       28.64900 28.43747 28.86053 28.32549 28.97250
2026       28.70875 28.42177 28.99573 28.26985 29.14765
2027       28.76851 28.39436 29.14266 28.19629 29.34073
2028       28.82826 28.35721 29.29932 28.10784 29.54869
2029       28.88802 28.31154 29.46450 28.00637 29.76967
2030       28.94778 28.25818 29.63737 27.89313 30.00243
2031       29.00753 28.19772 29.81735 27.76903 30.24604
2032       29.06729 28.13062 30.00396 27.63478 30.49980
2033       29.12705 28.05726 30.19683 27.49095 30.76314
2034       29.18680 27.97795 30.39566 27.33802 31.03559
2035       29.24656 27.89294 30.60018 27.17638 31.31674
2036       29.30631 27.80247 30.81016 27.00638 31.60624
2037       29.36607 27.70674 31.02540 26.82834 31.90379
2038       29.42583 27.60593 31.24572 26.64253 32.20912
2039       29.48558 27.50020 31.47097 26.44920 32.52196
2040       29.54534 27.38969 31.70098 26.24856 32.84211
Code
```{r}
#|label: formatted table of female model forecasts

female_forecast |> kable() |> kable_styling(full_width = F)
```
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2025 28.64900 28.43747 28.86053 28.32549 28.97250
2026 28.70875 28.42177 28.99573 28.26985 29.14765
2027 28.76851 28.39436 29.14266 28.19629 29.34073
2028 28.82826 28.35721 29.29932 28.10784 29.54869
2029 28.88802 28.31154 29.46450 28.00637 29.76967
2030 28.94778 28.25818 29.63737 27.89313 30.00243
2031 29.00753 28.19772 29.81735 27.76903 30.24604
2032 29.06729 28.13062 30.00396 27.63478 30.49980
2033 29.12705 28.05726 30.19683 27.49095 30.76314
2034 29.18680 27.97795 30.39566 27.33802 31.03559
2035 29.24656 27.89294 30.60018 27.17638 31.31674
2036 29.30631 27.80247 30.81016 27.00638 31.60624
2037 29.36607 27.70674 31.02540 26.82834 31.90379
2038 29.42583 27.60593 31.24572 26.64253 32.20912
2039 29.48558 27.50020 31.47097 26.44920 32.52196
2040 29.54534 27.38969 31.70098 26.24856 32.84211
Code
```{r}
#|label: accuracy of female model

# fit statistics only
accuracy(female_forecast)
```
                     ME      RMSE      MAE        MPE      MAPE      MASE
Training set 0.00774556 0.1607687 0.118513 0.04393575 0.4934645 0.8221173
                    ACF1
Training set -0.01682794

Female Model Residuals

  • Top Plot: No spikes should be too large

  • ACF stands for auto-correlation function.

    • Ideally all or most values are with dashed lines
  • Histogram: Distribution of residuals should be approx. normal

Code
```{r}
#|label: female forecast model residuals


checkresiduals(female_forecast)
```

    Ljung-Box test

data:  Residuals from ARIMA(0,2,2)
Q* = 7.3446, df = 8, p-value = 0.4999

Model df: 2.   Total lags used: 10

AK Elec. Revenue Data

Code
```{r}
#|label: Import AK data


# a little data mgmt (if you are interested)
ak_res <- read_csv("data/AK_Residential_Electricity_Revenue_2025.csv", 
                   show_col_types = F,
                   skip=5, 
                   col_names = c("quarter","Revenue")) |>
  separate(quarter, c("Quarter", "Year")) |>
  mutate(Date=yq(paste(Year, Quarter))) |>
  select(Date, Year, Quarter, Revenue) |>
  arrange(Date) |>
  glimpse()
```
Rows: 96
Columns: 4
$ Date    <date> 2001-01-01, 2001-04-01, 2001-07-01, 2001-10-01, 2002-01-01, 2…
$ Year    <chr> "2001", "2001", "2001", "2001", "2002", "2002", "2002", "2002"…
$ Quarter <chr> "Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q…
$ Revenue <dbl> 542.9275, 424.4111, 394.4869, 529.6425, 570.5655, 439.6120, 40…

AK Resid Time Series Plot

Code
```{r}
#|label: Create AK dygraph


# convert to xts (extensible time series)
ak_res_xts <- xts(x = ak_res[,4], 
                   order.by = ak_res$Date)

# ak-res time_series 
(ak_dg <- dygraph(ak_res_xts, "Alaska Residential Electric Utility Revenue") |>
    dySeries("Revenue", label="Revenue ($M)", color= "darkmagenta") |>
    dyAxis("y", label = "", drawGrid = FALSE) |>
    dyAxis("x", label = "", drawGrid = FALSE) |>
    dyShading(from="2020-3-12", to="2021-6-14", color = "lightgrey") |>
    dyRangeSelector())
```

AK Resid. Rev. Time Series

  • Creat time series using ak_res data

    • Specify freq = 4 - four observations per year

    • Specify start = c(2001,1) - first year/Q in dataset

  • Model data using auto.arima function

    • Specify ic = aic - aic is the information criterion used to determine model.

    • Specify seasonality = T - seasonal (repeating) pattern is present in these data.

  • This chunk will create and save the model to be used below.

Create Time Series (ts) and Model

Code
```{r}
#|label: Create AK time series


# create time series for forecast
ak_res_ts <- ts(ak_res$Revenue, 
                freq=4, 
                start=c(2001, 1))

# model data using auto.arima function
ak_res_model <- auto.arima(ak_res_ts, ic="aic", seasonal=T)
```

AK Revenue Model Forecast

  • Create forecasts (until 2027)

    • h = 12 indicates we want to forecast 12 quarters

    • Most recent year in data is 2024

    • \(2027-2024 = 3 \times 4Qtrs. = 12Qtrs.\)

  • Note that forecasts become less accurate the further into the future you specify.

Code
```{r}
#|label: AK model forecast and plot


# create forecasts (until end of 2027)
ak_res_forecast <- forecast(ak_res_model, h=12)

# plot forecasts with 2 prediction interval bounds shown
autoplot(ak_res_forecast) + 
  labs(y = "AK Resid. Elec. Revenue ($ Mill)") +
  theme_classic()
```

AK Revenue Model Fit

Code
```{r}
#|label: numeric AK model forecasts


# examine numerical forecasts 
# and prediction intervals

# examine numerical forecasts 
# and prediction intervals

ak_res_forecast
```
        Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2025 Q1       612.2996 592.9063 631.6930 582.6401 641.9592
2025 Q2       449.4930 428.3375 470.6485 417.1384 481.8475
2025 Q3       432.7463 410.5294 454.9631 398.7685 466.7240
2025 Q4       561.7324 538.8579 584.6070 526.7488 596.7160
2026 Q1       610.6503 583.6543 637.6464 569.3634 651.9372
2026 Q2       452.1051 423.9903 480.2199 409.1072 495.1030
2026 Q3       432.7534 403.9408 461.5660 388.6884 476.8184
2026 Q4       557.7100 528.4567 586.9632 512.9709 602.4490
2027 Q1       617.3684 585.9695 648.7674 569.3479 665.3890
2027 Q2       451.6921 419.6352 483.7490 402.6653 500.7189
2027 Q3       433.3990 400.9252 465.8727 383.7347 483.0632
2027 Q4       558.8483 526.1089 591.5878 508.7777 608.9190
Code
```{r}
#|label: formatted table of AK model forecasts


female_forecast |> kable() |> kable_styling(full_width = F)
```
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
2025 28.64900 28.43747 28.86053 28.32549 28.97250
2026 28.70875 28.42177 28.99573 28.26985 29.14765
2027 28.76851 28.39436 29.14266 28.19629 29.34073
2028 28.82826 28.35721 29.29932 28.10784 29.54869
2029 28.88802 28.31154 29.46450 28.00637 29.76967
2030 28.94778 28.25818 29.63737 27.89313 30.00243
2031 29.00753 28.19772 29.81735 27.76903 30.24604
2032 29.06729 28.13062 30.00396 27.63478 30.49980
2033 29.12705 28.05726 30.19683 27.49095 30.76314
2034 29.18680 27.97795 30.39566 27.33802 31.03559
2035 29.24656 27.89294 30.60018 27.17638 31.31674
2036 29.30631 27.80247 30.81016 27.00638 31.60624
2037 29.36607 27.70674 31.02540 26.82834 31.90379
2038 29.42583 27.60593 31.24572 26.64253 32.20912
2039 29.48558 27.50020 31.47097 26.44920 32.52196
2040 29.54534 27.38969 31.70098 26.24856 32.84211
Code
```{r}
#|label: accuracy of AK model


# fit statistics only
accuracy(ak_res_forecast)
```
                    ME     RMSE     MAE       MPE     MAPE      MASE
Training set 0.9936991 14.48846 10.7029 0.1383641 1.989123 0.8121866
                    ACF1
Training set -0.01750432

AK Revenue Model Residuals

  • Top Plot: No spikes should be too large

  • ACF stands for auto-correlation function.

    • Ideally all or most values are with dashed lines
  • Histogram: Distribution of residuals should be approx. normal

Code
```{r}
#|label: AK forecast model residuals


checkresiduals(ak_res_forecast)
```

    Ljung-Box test

data:  Residuals from ARIMA(1,0,1)(2,1,0)[4]
Q* = 2.44, df = 4, p-value = 0.6554

Model df: 4.   Total lags used: 8

When you are done…

  1. Save your changes to this file.(Ctrl + S or Cmd + S)
  2. OPTIONAL: Click Render button to update html file with your changes.
  3. Close R/RStudio on your laptop or close Posit Cloud Browser.