Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
Yes, all three ACF plots are consistent with white noise; the differences in appearance are due to sample size. Apparent spikes in the ACF at small n are expected from sampling variability, and as n increases the sample ACF more closely resembles the theoretical ACF of white noise (zero at all lags).
Why are the critical values (blue dashed lines) at different distances from zero?
The dashed lines show the 95% critical values for the sample autocorrelations under white noise, calculated as ±1.96/√n. The larger the sample size n, the narrower the bounds: for n = 36 the bounds are ≈ ±0.327, for n = 360 ≈ ±0.103, and for n = 1,000 ≈ ±0.062. As n increases the limits shrink toward zero, because with more data the autocorrelations can be estimated more precisely.
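A one-line check of these values (a quick sketch, not part of the original output):
# 95% critical values for white-noise autocorrelations at each sample size
n_obs <- c(36, 360, 1000)
round(1.96 / sqrt(n_obs), 3)
# approximately 0.327, 0.103 and 0.062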
Why are the autocorrelations different in each figure if they all represent white noise?
Even though all the data are white noise (no true autocorrelation), the sample autocorrelations can still vary due to randomness. With small n, there’s more variation just by chance, so spikes are more common. With large n, the random fluctuations get smaller, so the ACF looks flatter. All three plots are consistent with white noise — the differences are just due to sample size.
A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.1
## ✔ lubridate 1.9.4 ✔ fable 0.4.1
## ✔ ggplot2 3.5.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(ggplot2)
library(dplyr)
library(purrr)
library(distributional)
library(tidyr)
# Amazon stock
amazon_stock <- gafa_stock %>%
filter(Symbol == "AMZN")
# Plot - daily closing prices
autoplot(amazon_stock, Close) +
labs(title = "Amazon Daily Closing Prices", y = "Price (USD)", x = "Date")
The plot shows a strong upward trend over time, with increasing variability. This indicates the series is non-stationary — the mean and variance are not constant.
# Plot ACF
amazon_stock %>%
ACF(Close) %>%
autoplot() +
labs(title = "ACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
The ACF stays close to 1 across all lags and decays very slowly. This is a typical pattern of a non-stationary time series with a trend.
# Plot PACF
amazon_stock %>%
PACF(Close) %>%
autoplot() +
labs(title = "PACF of Amazon Stock Prices")
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
The PACF has a large spike at lag 1 and smaller but still noticeable spikes at other lags. This pattern supports the need to difference the series to make it stationary.
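As an additional check (a sketch not shown in the original output), the ACF of the differenced closing prices should fall largely within the critical bounds once the trend has been removed:
# ACF of the first-differenced closing prices
amazon_stock %>%
  ACF(difference(Close)) %>%
  autoplot() +
  labs(title = "ACF of Differenced Amazon Closing Prices")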
For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
# Turkish GDP
turkey_gdp <- global_economy %>% filter(Country == "Turkey")
# Box-Cox lambda
turkey_gdp %>%
features(GDP, features = guerrero)
## # A tibble: 1 × 2
## Country lambda_guerrero
## <fct> <dbl>
## 1 Turkey 0.157
# Plot GDP
turkey_gdp %>% autoplot(GDP) + labs(title = "Turkish GDP")
# Box-Cox transformation using the Guerrero lambda from above (≈ 0.157)
turkey_gdp <- turkey_gdp %>%
mutate(GDP_transformed = box_cox(GDP, 0.157))
# Plot ACF of transformed GDP
turkey_gdp %>%
ACF(GDP_transformed) %>%
autoplot()
# first differences
turkey_gdp %>%
mutate(diff1 = difference(GDP_transformed)) %>%
ACF(diff1) %>%
autoplot()
To make Turkish GDP stationary: apply a Box-Cox transformation with λ ≈ 0.16, then take a first difference (no seasonal differencing is needed, since the data are annual).
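The choice of a single first difference can be confirmed with the unitroot_ndiffs feature from feasts (a quick sketch; the returned count is not reproduced here):
# number of first differences suggested by a unit-root test
turkey_gdp %>%
  features(GDP_transformed, unitroot_ndiffs)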
# Tasmania data
tas_accom <- aus_accommodation %>%
filter(State == "Tasmania")
# Box-Cox lambda
tas_accom %>%
features(Takings, features = guerrero)
## # A tibble: 1 × 2
## State lambda_guerrero
## <chr> <dbl>
## 1 Tasmania 0.00182
# Box-Cox transformation (use lambda ≈ 0 → log transformation)
tas_accom <- tas_accom %>%
mutate(Takings_log = log(Takings))
# seasonal (lag = 4) and regular differencing
tas_accom_diff <- tas_accom %>%
mutate(diff_seasonal = difference(Takings_log, lag = 4),
diff_full = difference(diff_seasonal))
# Plot ACF of differenced series
tas_accom_diff %>%
ACF(diff_full) %>%
autoplot()
The Box-Cox lambda was approximately 0.002, suggesting a log transformation. The data are quarterly, so seasonal differencing with lag = 4 is appropriate, and a regular difference was added to remove the remaining trend. The final ACF plot is consistent with stationarity, with autocorrelations near zero.
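The same conclusions can be checked with unit-root features (a sketch; the returned counts are not reproduced here):
# suggested number of seasonal differences for the log series
tas_accom %>%
  features(Takings_log, unitroot_nsdiffs)
# suggested number of first differences after seasonal differencing
tas_accom_diff %>%
  features(diff_seasonal, unitroot_ndiffs)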
# Souvenir sales
souvenirs %>%
features(Sales, features = guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 0.00212
Box-Cox lambda is approximately 0.002, which is close to 0. This suggests a log transformation is appropriate to stabilize variance:
# log transformation
souvenirs <- souvenirs %>%
mutate(Sales_log = log(Sales))
# seasonal and regular differencing
souvenirs_diff <- souvenirs %>%
mutate(diff_seasonal = difference(Sales_log, lag = 12),
diff_full = difference(diff_seasonal))
# Plot ACF of differenced series
souvenirs_diff %>%
ACF(diff_full) %>%
autoplot()
To make the souvenir sales series stationary:
Use a log transformation (Box-Cox λ ≈ 0). Apply seasonal differencing (lag = 12) and regular differencing. The final ACF plot supports that the series is now stationary.
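A KPSS test on the fully differenced series provides an additional check (a sketch; the test statistic is not reproduced here):
# KPSS test: a small statistic / large p-value is consistent with stationarity
souvenirs_diff %>%
  features(diff_full, unitroot_kpss)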
For your retail data (from Exercise 7 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
set.seed(13579)
myseries_id <- sample(aus_retail$`Series ID`, 1)
myseries <- aus_retail %>%
filter(`Series ID` == myseries_id)
myseries_id
## [1] "A3349434X"
unique(myseries$State)
## [1] "Western Australia"
unique(myseries$Industry)
## [1] "Department stores"
# original series
myseries %>%
autoplot(Turnover) +
labs(title = "Original Series", subtitle = paste("Series ID:", myseries_id))
# Box-Cox lambda
myseries %>%
features(Turnover, features = guerrero)
## # A tibble: 1 × 3
## State Industry lambda_guerrero
## <chr> <chr> <dbl>
## 1 Western Australia Department stores 0.0816
# lambda ≈ 0.08 is close to 0, so a log transformation is used
myseries <- myseries %>%
mutate(Turnover_log = log(Turnover))
# seasonal and regular differencing
myseries_diff <- myseries %>%
mutate(seasonal_diff = difference(Turnover_log, lag = 12),
diff_full = difference(seasonal_diff))
# Plot ACF
myseries_diff %>%
ACF(diff_full) %>%
autoplot() +
labs(title = "ACF of Differenced Series")
The original time plot shows strong upward trend and clear annual seasonality (monthly data). A seasonal difference (lag = 12) was applied to remove seasonality. A regular difference was then applied to remove the remaining trend.
The ACF of the fully differenced series shows a large negative spike at lag 1, which is typical after first differencing; no strong seasonal autocorrelation remains at lag 12 or its multiples; and the remaining autocorrelations fall mostly within the confidence bounds, suggesting the series is now stationary.
The resulting ACF plot confirms that these steps have effectively removed trend and seasonality.
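A combined display of the differenced series with its ACF and PACF gives a compact final check (a sketch, not part of the original output):
# time plot, ACF and PACF of the fully differenced series in one display
myseries_diff %>%
  gg_tsdisplay(diff_full, plot_type = "partial")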
Simulate and plot data from simple ARIMA models.
y <- numeric(100)
e <- rnorm(100)
for (i in 2:100)
y[i] <- 0.6 * y[i - 1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
# time plot of AR(1) series with phi = 0.6
sim %>%
autoplot(y) +
labs(title = "Simulated AR(1) Process (ϕ₁ = 0.6)", y = "y", x = "Time index")
The series shows moderate persistence. Shocks to the system have a temporary effect that fades gradually. The series fluctuates around zero with relatively quick mean reversion and moderate autocorrelation.
# experimenting with other values of ϕ₁ by modifying the simulation step
# phi = 0.95 (stronger persistence)
phi <- 0.95
y2 <- numeric(100)
e2 <- rnorm(100)
for (i in 2:100) {
y2[i] <- phi * y2[i - 1] + e2[i]
}
sim2 <- tsibble(idx = seq_len(100), y = y2, index = idx)
autoplot(sim2, y) +
labs(title = "Simulated AR(1) Process (ϕ₁ = 0.95)", y = "y", x = "Time index")
The series is highly persistent. Shocks have long-lasting effects, and the series exhibits longer swings and a smoother appearance. This behavior is close to non-stationarity — it looks more like a random walk with memory.
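The difference in persistence also shows up in the sample ACFs of the two simulated series (a quick sketch using the objects created above):
# sample ACFs of the two simulated AR(1) series: slower decay for phi = 0.95
sim %>% ACF(y) %>% autoplot() + labs(title = "ACF of AR(1), phi = 0.6")
sim2 %>% ACF(y) %>% autoplot() + labs(title = "ACF of AR(1), phi = 0.95")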
set.seed(123)
n <- 100
theta <- 0.6
e <- rnorm(n)
y <- numeric(n)
# Generate MA(1)
for (i in 2:n) {
y[i] <- e[i] + theta * e[i - 1]
}
# to tsibble
ma1_sim <- tsibble(idx = seq_len(n), y = y, index = idx)
# first few values
head(ma1_sim)
## # A tsibble: 6 x 2 [1]
## idx y
## <int> <dbl>
## 1 1 0
## 2 2 -0.566
## 3 3 1.42
## 4 4 1.01
## 5 5 0.172
## 6 6 1.79
ma1_sim %>%
autoplot(y) +
labs(title = "Simulated MA(1) Process (θ1 = 0.6)", y = "y", x = "Time index")
The series shows moderate local smoothing. Each value depends on its own white noise term and the previous one, resulting in small runs and visible randomness. The effect of shocks is short-lived — usually only for one time step.
# MA(1) with theta = 0.95
theta2 <- 0.95
e2 <- rnorm(n)
y2 <- numeric(n)
for (i in 2:n) {
y2[i] <- e2[i] + theta2 * e2[i - 1]
}
ma1_sim2 <- tsibble(idx = seq_len(n), y = y2, index = idx)
autoplot(ma1_sim2, y) +
labs(title = "Simulated MA(1) Process (θ1 = 0.95)", y = "y", x = "Time index")
The series appears slightly smoother with longer swings compared to θ₁ = 0.6. The impact of each shock is more strongly felt in the next observation, but still does not persist beyond lag 1. There’s still no long memory like in AR processes.
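Because an MA(1) process has zero theoretical autocorrelation beyond lag 1, the sample ACFs of the simulated series should show a spike mainly at lag 1 (a sketch using the objects created above):
# sample ACFs of the simulated MA(1) series
ma1_sim %>% ACF(y) %>% autoplot() + labs(title = "ACF of MA(1), theta = 0.6")
ma1_sim2 %>% ACF(y) %>% autoplot() + labs(title = "ACF of MA(1), theta = 0.95")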
set.seed(123)
n <- 100
phi <- 0.6
theta <- 0.6
e <- rnorm(n)
y <- numeric(n)
# Generate ARMA(1,1)
for (i in 2:n) {
y[i] <- phi * y[i - 1] + e[i] + theta * e[i - 1]
}
# to tsibble
arma11_sim <- tsibble(idx = seq_len(n), y = y, index = idx)
# Plot
arma11_sim %>%
autoplot(y) +
labs(title = "Simulated ARMA(1,1) Process (ϕ1 = 0.6, θ1 = 0.6)", y = "y", x = "Time index")
The series shows smooth but bounded fluctuations. It is stationary, and each point is influenced by its previous value and noise, but the influence diminishes over time. The process stays within a limited range and reverts to the mean.
set.seed(123)
n <- 100
phi1 <- -0.8
phi2 <- 0.3
e <- rnorm(n)
y <- numeric(n)
# Generate AR(2) series
for (i in 3:n) {
y[i] <- phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
}
# Convert to tsibble
ar2_sim <- tsibble(idx = seq_len(n), y = y, index = idx)
# Plot the AR(2) series
ar2_sim %>%
autoplot(y) +
labs(title = "Simulated AR(2) Process (ϕ1 = -0.8, ϕ2 = 0.3)", y = "y", x = "Time index")
The series exhibits rapidly growing oscillations and is clearly non-stationary. Values stay near zero at first, but after roughly time point 40 to 50 they swing back and forth with rapidly increasing magnitude and the series becomes unstable. This explosive behavior occurs because the AR coefficients violate the stationarity conditions: at least one root of the characteristic equation lies inside the unit circle, so the process diverges.
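This can be checked numerically: stationarity requires all roots of the characteristic polynomial 1 − φ₁z − φ₂z² to lie outside the unit circle (a quick sketch using the parameters defined above):
# roots of the AR(2) characteristic polynomial 1 - phi1*z - phi2*z^2;
# stationarity requires all roots to lie outside the unit circle
ar2_roots <- polyroot(c(1, -phi1, -phi2))
abs(ar2_roots)
# one root has modulus below 1, so the process is non-stationary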
# tsibbles to tibbles
comparison <- bind_rows(
arma11_sim %>% as_tibble() %>% mutate(model = "ARMA(1,1)"),
ar2_sim %>% as_tibble() %>% mutate(model = "AR(2)")
)
# both series
comparison %>%
ggplot(aes(x = idx, y = y)) +
geom_line() +
facet_wrap(~model, ncol = 1, scales = "free_y") +
labs(title = "Comparison of ARMA(1,1) and AR(2) Simulated Series",
x = "Time index", y = "y")
ARMA(1,1): The series is stationary. It exhibits moderate fluctuations around the mean with persistent but decaying memory. Shocks affect the process temporarily and then fade.
AR(2): The series is non-stationary and displays explosive oscillations. Early values are stable, but the effect of instability builds over time, with values diverging rapidly. This is due to AR parameters violating stationarity conditions.
Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011.
# data
aus_airpassengers %>%
autoplot(Passengers) +
labs(title = "Australian Air Passengers (1970–2011)",
y = "Passengers (millions)", x = "Year")
# best ARIMA model automatically
fit_a <- aus_airpassengers %>%
model(ARIMA(Passengers))
report(fit_a)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8963
## s.e. 0.0594
##
## sigma^2 estimated as 4.308: log likelihood=-97.02
## AIC=198.04 AICc=198.32 BIC=201.65
# residuals
fit_a %>%
gg_tsresiduals()
# next 10 years
fc_a <- fit_a %>%
forecast(h = 10)
# Plot
fc_a %>%
autoplot(aus_airpassengers) +
labs(title = "ARIMA Forecast: Australian Air Passengers",
y = "Passengers (millions)", x = "Year")
The automatically selected model is ARIMA(0,2,1), with estimated MA coefficient θ₁ = -0.8963 (see the report above). The residual diagnostics from gg_tsresiduals() show no obvious remaining structure, so the residuals appear consistent with white noise. The forecasts for the next 10 years continue the upward trend, with prediction intervals that widen as the horizon increases.
Backshift operator form: let B be the backshift operator. The model is
(1 − B)²yₜ = (1 + θ₁B)eₜ
and substituting θ₁ = -0.8963 gives
(1 − B)²yₜ = (1 − 0.8963B)eₜ
This is a second-difference MA(1) model, which implies the series has strong trend-like behavior, and only the acceleration (second difference) is stationary.
# ARIMA(0,1,0) with drift
fit_b <- aus_airpassengers %>%
model(ARIMA(Passengers ~ 1 + pdq(0,1,0)))
report(fit_b)
## Series: Passengers
## Model: ARIMA(0,1,0) w/ drift
##
## Coefficients:
## constant
## 1.4191
## s.e. 0.3014
##
## sigma^2 estimated as 4.271: log likelihood=-98.16
## AIC=200.31 AICc=200.59 BIC=203.97
# 10 years ahead
fc_b <- fit_b %>%
forecast(h = 10)
# Plot
fc_b %>%
autoplot(aus_airpassengers) +
labs(title = "Forecast from ARIMA(0,1,0) with Drift",
y = "Passengers (millions)", x = "Year")
The forecast line rises linearly because the drift term adds a constant estimated annual increase (about 1.42 million passengers per year). Forecast intervals widen with the horizon, as uncertainty accumulates.
# ARIMA(2,1,2) with drift
fit_c <- aus_airpassengers %>%
model(ARIMA(Passengers ~ 1 + pdq(2,1,2)))
report(fit_c)
## Series: Passengers
## Model: ARIMA(2,1,2) w/ drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 constant
## -0.5518 -0.7327 0.5895 1.0000 3.2216
## s.e. 0.1684 0.1224 0.0916 0.1008 0.7224
##
## sigma^2 estimated as 4.031: log likelihood=-96.23
## AIC=204.46 AICc=206.61 BIC=215.43
# 10 years ahead
fc_c <- fit_c %>%
forecast(h = 10)
# Plot
fc_c %>%
autoplot(aus_airpassengers) +
labs(title = "ARIMA(2,1,2) with Drift: Forecast",
y = "Passengers (millions)", x = "Year")
The forecast shows continued upward growth. The curve is slightly more flexible than ARIMA(0,1,0) with drift.
# ARIMA(0,2,1) with a constant
fit_e <- aus_airpassengers %>%
model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
## Warning: Model specification induces a quadratic or higher order polynomial trend.
## This is generally discouraged, consider removing the constant or reducing the number of differences.
report(fit_e)
## Series: Passengers
## Model: ARIMA(0,2,1) w/ poly
##
## Coefficients:
## ma1 constant
## -1.0000 0.0571
## s.e. 0.0585 0.0213
##
## sigma^2 estimated as 3.855: log likelihood=-95.1
## AIC=196.21 AICc=196.79 BIC=201.63
# 10 years ahead
fc_e <- fit_e %>%
forecast(h = 10)
# Plot
fc_e %>%
autoplot(aus_airpassengers) +
labs(title = "ARIMA(0,2,1) with Constant: Forecast",
y = "Passengers (millions)", x = "Year")
The forecast increases faster than linearly, showing accelerating growth — a direct result of including a constant in a second-order differenced model.
Among the models considered, ARIMA(0,2,1) with a constant gives the lowest AIC, forecasts that match the shape of the data, and a good balance of simplicity and fit. Two caveats apply: AIC values are only comparable between models with the same order of differencing, and the constant with d = 2 induces a quadratic trend (see the warning above), which may be implausible far into the future. The candidates can also be compared side by side, as in the sketch below.
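A compact way to line the candidates up is to fit them in one mable and inspect glance(); a minimal sketch (the model names are my own labels):
# fit the candidate models in a single mable and compare fit statistics
aus_airpassengers %>%
  model(
    auto_021  = ARIMA(Passengers),
    drift_010 = ARIMA(Passengers ~ 1 + pdq(0, 1, 0)),
    drift_212 = ARIMA(Passengers ~ 1 + pdq(2, 1, 2)),
    const_021 = ARIMA(Passengers ~ 1 + pdq(0, 2, 1))
  ) %>%
  glance() %>%
  select(.model, sigma2, AIC, AICc, BIC)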
For the United States GDP series (from global_economy):
# USA GDP
us_gdp <- global_economy %>%
filter(Country == "United States")
# raw GDP series
us_gdp %>%
autoplot(GDP) +
labs(title = "United States GDP (Raw)", y = "GDP (USD billions)", x = "Year")
# Box-Cox lambda
us_gdp %>%
features(GDP, features = guerrero)
## # A tibble: 1 × 2
## Country lambda_guerrero
## <fct> <dbl>
## 1 United States 0.282
Guerrero’s lambda is ≈ 0.28, which suggests that a Box-Cox transformation is useful.
# Box-Cox transformation using lambda = 0.28
lambda <- 0.28
us_gdp <- us_gdp %>%
mutate(GDP_transformed = box_cox(GDP, lambda))
fit_b <- us_gdp %>%
model(ARIMA(GDP_transformed))
report(fit_b)
## Series: GDP_transformed
## Model: ARIMA(1,1,0) w/ drift
##
## Coefficients:
## ar1 constant
## 0.4588 111.6233
## s.e. 0.1198 8.9762
##
## sigma^2 estimated as 4887: log likelihood=-322.06
## AIC=650.13 AICc=650.58 BIC=656.26
# Plot
fit_b %>%
forecast(h = 10) %>%
autoplot(us_gdp) +
labs(title = "ARIMA Forecast of Transformed U.S. GDP",
y = "Box-Cox Transformed GDP", x = "Year")
A Box-Cox transformation (λ ≈ 0.28) was applied to stabilize variance. The forecast shows a continued upward trend on the transformed GDP scale, and the forecast intervals widen slowly, indicating relatively low uncertainty.
# ARIMA(0,1,1)
# ARIMA(1,1,1)
# ARIMA(2,1,0)
# all models in a single model table
fit_all <- us_gdp %>%
model(
arima_01_1 = ARIMA(GDP_transformed ~ 1 + pdq(0,1,1)),
arima_11_1 = ARIMA(GDP_transformed ~ 1 + pdq(1,1,1)),
arima_21_0 = ARIMA(GDP_transformed ~ 1 + pdq(2,1,0))
)
# Compare
fit_all %>% glance()
## # A tibble: 3 × 9
## Country .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <fct> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 United States arima_01_1 5075. -323. 652. 653. 658. <cpl [0]> <cpl [1]>
## 2 United States arima_11_1 4977. -322. 652. 653. 660. <cpl [1]> <cpl [1]>
## 3 United States arima_21_0 4977. -322. 652. 653. 660. <cpl [2]> <cpl [0]>
ARIMA(1,1,1) and ARIMA(2,1,0) have the lowest AIC/AICc and the lowest residual variance (σ²), and are essentially tied on every criterion; ARIMA(0,1,1) fits slightly worse.
Since both top models have the same number of parameters, either is a reasonable choice; ARIMA(1,1,1) is used for the diagnostic checks below.
But let’s check the diagnostics:
# Extract the ARIMA(1,1,1) model from the fit_all object
fit_best <- fit_all %>% select(arima_11_1)
# Residual diagnostics
fit_best %>%
gg_tsresiduals()
The residuals show no major violations of ARIMA assumptions. The model is well-fitted.
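To make this concrete, a Ljung-Box test on the innovation residuals can be added (a sketch; dof = 2 counts the AR and MA coefficients):
# Ljung-Box test: a large p-value is consistent with white-noise residuals
fit_best %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10, dof = 2)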
fit_auto <- us_gdp %>%
model(ARIMA(GDP_transformed))
# a future_data tsibble for 10 yrs
future_data <- new_data(us_gdp, 10)
# Simulate 1,000 future paths directly from the mable
# (generate() works on the mable itself, so there is no need to extract
#  the underlying model object)
sim_paths <- fit_auto %>%
generate(new_data = future_data, times = 1000)
# simulations into 80% and 95% intervals + mean
sim_summary <- sim_paths %>%
as_tibble() %>%
group_by(Year) %>%
summarise(
lower_80 = quantile(.sim, 0.10, na.rm = TRUE),
upper_80 = quantile(.sim, 0.90, na.rm = TRUE),
lower_95 = quantile(.sim, 0.025, na.rm = TRUE),
upper_95 = quantile(.sim, 0.975, na.rm = TRUE),
mean_sim = mean(.sim, na.rm = TRUE)
)
# Back-transform from Box–Cox
sim_summary <- sim_summary %>%
mutate(
lower_80 = inv_box_cox(lower_80, lambda),
upper_80 = inv_box_cox(upper_80, lambda),
lower_95 = inv_box_cox(lower_95, lambda),
upper_95 = inv_box_cox(upper_95, lambda),
mean_sim = inv_box_cox(mean_sim, lambda)
)
# historical data to tibble
us_gdp_tbl <- us_gdp %>%
as_tibble() %>%
select(Year, GDP)
# Plot
ggplot() +
geom_line(data = us_gdp_tbl, aes(x = Year, y = GDP), color = "black") +
geom_ribbon(data = sim_summary,
aes(x = Year, ymin = lower_95, ymax = upper_95),
fill = "blue", alpha = 0.1) +
geom_ribbon(data = sim_summary,
aes(x = Year, ymin = lower_80, ymax = upper_80),
fill = "blue", alpha = 0.2) +
geom_line(data = sim_summary, aes(x = Year, y = mean_sim), color = "blue") +
labs(title = "Forecast: U.S. GDP (Back-transformed)",
x = "Year", y = "GDP (USD billions)")
With generate() applied directly to the mable, the simulation runs end to end: the 1,000 sample paths are summarised into 80% and 95% intervals, back-transformed with inv_box_cox(), and plotted against the historical series. As expected for an ARIMA model with drift, the back-transformed intervals widen with the forecast horizon around a continued upward trend in GDP.