library(fpp3)
library(latex2exp)
us_economy <- global_economy |> filter(Country == "United States") |>
  autoplot(GDP) + labs(title = "United States GDP")
us_economy
Time Series Assignment 2
2. Graphs of time series (and transformations)
United States Economy
This is a plot of United States GDP over the years 1960 to 2017. It makes sense to apply a transformation to this data, as GDP in a vacuum carries little meaning. GDP per capita is a commonly used metric for gauging economic change, especially because GDP itself depends partly on population.
us_economy_adj <- global_economy |> filter(Country == "United States") |>
  autoplot(GDP/Population) + labs(title = "United States GDP per-capita")
us_economy_adj
Even still, the plot looks roughly the same!
Victorian Cows
aus_cows <- aus_livestock |> filter(Animal == "Bulls, bullocks and steers") |>
  filter(State == "Victoria") |>
  autoplot(Count) + labs(title = "Victorian Cows Count")
aus_cows
cows_dcmp <- aus_livestock |> filter(Animal == "Bulls, bullocks and steers") |>
  filter(State == "Victoria") |>
  model(stl = STL(Count))
components(cows_dcmp) |> autoplot()
In the STL decomposition, we can see that the variation in Count from year to year has strong seasonality that seems to be relatively unrelated to the level of Count itself, suggesting no real need for a transformation of the data.
Victorian Electricity
aus_elec <- vic_elec |>
  autoplot(Demand) + labs(title = "Victorian Electricity Demand")
aus_elec
elec_dcmp <- vic_elec |>
  model(STL(Demand))
components(elec_dcmp) |> autoplot()
There are several seasonal patterns here: yearly, weekly, daily, and hourly. The first three seem to depend on the level of demand, the last on the overall trend. This suggests that a transformation is in order, so we’ll try a Box-Cox transform.
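The Box-Cox family mentioned here can be sketched in a few lines of base R. This is only an illustration of the usual definition (a log when lambda is 0, a shifted and scaled power otherwise); the name box_cox_sketch is hypothetical and is not the fpp3 function used in this report.

```r
# Hypothetical helper illustrating the Box-Cox family; not the fpp3 box_cox().
box_cox_sketch <- function(y, lambda) {
  if (abs(lambda) < 1e-8) {
    # lambda = 0 is the natural log
    log(y)
  } else {
    # sign/abs form so that negative values are handled when lambda > 0
    (sign(y) * abs(y)^lambda - 1) / lambda
  }
}

# lambda = 1 only shifts the data; smaller lambda compresses large values more
box_cox_sketch(c(1, 10, 100), 1)
box_cox_sketch(c(1, 10, 100), 0.5)
box_cox_sketch(c(1, 10, 100), 0)
```

A lambda below 1 shrinks large values more than small ones, which is why it can even out seasonal swings that grow with the level of the series.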
lambda <- vic_elec |>
  features(Demand, features = guerrero) |>
  pull(lambda_guerrero)
aus_elec_t <- vic_elec |>
  autoplot(box_cox(Demand, lambda)) + labs(title = latex2exp::TeX(paste0(
    "Transformed Victorian Electricity Demand with $\\lambda$ = ",
    round(lambda,2))))
aus_elec_t
elec_t_dcmp <- vic_elec |>
  model(STL(box_cox(Demand, lambda)))
components(elec_t_dcmp) |> autoplot()
The transform slightly helped, I guess? The daily seasonality has been almost fixed, same with the monthly. Not the yearly, though. This would tell us that the seasonality is more complex than just multiplicative, and further transforms could probably be used (but I don’t know many useful nonlinear transforms).
Australian Gas
aus_gas <- aus_production[, c(1, 7)] |>
  autoplot(Gas) + labs(title = "Australian Gas Production")
aus_gas
gas_dcmp <- aus_production[, c(1, 7)] |>
  model(STL(Gas))
components(gas_dcmp) |> autoplot()
The seasonal variation CLEARLY grows as the level (and trend) of gas production increases. We’ll try Box-Cox again:
lambda_gas <- aus_production[, c(1, 7)] |>
  features(Gas, features = guerrero) |>
  pull(lambda_guerrero)
aus_gas_t <- aus_production[, c(1, 7)] |>
  autoplot(box_cox(Gas, lambda_gas)) + labs(title = latex2exp::TeX(paste0("Transformed Australian Gas Production with $\\lambda$ = ", round(lambda_gas,2))))
aus_gas_t
gas_t_dcmp <- aus_production[, c(1, 7)] |>
  model(STL(box_cox(Gas, lambda_gas)))
components(gas_t_dcmp) |> autoplot()
The Box-Cox transform successfully stabilized the seasonal variance, which previously grew over the years.
7. Australian Gas time series
(a). Plotting the time series
gas <- tail(aus_production, 5*4) |> select(Gas)
autoplot(gas) + labs(title = "Gas Production Over Time")
From an immediate simple plot, we can see a trend of gas production slowly increasing over time, along with a strong seasonality favoring increased production in quarters 2 and 3 relative to quarters 1 and 4.
(b + c). Gas time series decomposition
gas_dcmp <- tail(aus_production, 5*4) |> select(Gas) |>
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) |> components()
gas_dcmp |> autoplot()
This cleanly matches what was seen in part (a): trend is slowly increasing over time, and seasonal is extremely periodic, repeating identically in each 1-year period (as classical decomposition assumes).
(d). Seasonal adjustment of Gas data
gas_adj = gas
gas_adj[, 1] = gas_adj[, 1]/gas_dcmp$seasonal
autoplot(gas_adj) + labs(title = "Gas Production (Seasonally Adj.)")
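The adjustment in part (d) divides the observed series by the estimated seasonal component. A minimal base R sketch of the same idea follows. It uses a made-up quarterly series, not the aus_production data, and stats::decompose in place of the fpp3 classical_decomposition model.

```r
# Hypothetical quarterly series: a linear trend times a fixed seasonal pattern.
season <- c(0.9, 1.1, 1.2, 0.8)
y <- ts((200 + 2 * (1:20)) * rep(season, times = 5),
        start = c(2005, 1), frequency = 4)

# Classical multiplicative decomposition from the stats package.
dcmp <- decompose(y, type = "multiplicative")

# Seasonal adjustment: divide the observed values by the seasonal component.
y_adj <- y / dcmp$seasonal

# The adjusted series should vary far less from quarter to quarter.
sd(diff(y))
sd(diff(y_adj))
```

Because the seasonal indices are normalized to average 1 over a year, dividing by them removes the within-year pattern without changing the overall level of the series.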
(e). Changing the data? Making an outlier?
Let’s redo the above, but make a middle point (like 2008 Q1) an outlier. We can do so by artificially adding 300 to it. In theory, this should screw up the moving averages:
gas_weird = gas
gas_weird[11, 1] = gas_weird[11, 1] + 300
gas_weird_dcmp <- gas_weird |>
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) |> components()
gas_weird_dcmp |> autoplot()
gas_weird_adj = gas_weird
gas_weird_adj[, 1] = gas_weird_adj[, 1]/gas_weird_dcmp$seasonal
autoplot(gas_weird_adj) + labs(title = "Gas Production (With Outlier, Seasonally Adj.)")
Interestingly, adding the outlier makes the seasonal fluctuations persist, even after seasonal adjustment. With the massive outlier present, the initial calculation of trend is harmed severely, being based on moving averages (many of which pick up the outlier). The averages allow the outlier to have more influence on the trend than it otherwise would.
The trend is then separated from the raw data, allowing for seasonal fluctuation calculations. As the outlier overstates its influence in the trend, dividing out the trend from the raw data understates the seasonal component of the data. Thus, when we seasonally adjust the data, the adjustment is not enough, which still leaves seasonal artifacts present (visible in the plot above).
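The moving-average argument can be checked with a small base R sketch. It assumes the trend is estimated with a centred 2x4 moving average (the usual choice for quarterly classical decomposition) and uses a made-up 20-point series rather than the Gas data. It counts how many trend estimates move when 300 is added to one observation.

```r
# Hypothetical stand-in series: 20 quarterly values with mild trend and seasonality.
y <- 200 + 2 * (1:20) + rep(c(-20, 15, 30, -25), times = 5)

# Weights of a centred 2x4 moving average.
w <- c(1, 2, 2, 2, 1) / 8

# Trend estimate; the first two and last two values are NA.
trend_of <- function(x) as.numeric(stats::filter(x, w, sides = 2))

# Add an artificial outlier of the given size at position i.
add_outlier <- function(x, i, size = 300) {
  x[i] <- x[i] + size
  x
}

base_trend <- trend_of(y)
mid_trend  <- trend_of(add_outlier(y, 11))  # outlier in the middle
edge_trend <- trend_of(add_outlier(y, 1))   # outlier at the first point

# Number of trend estimates changed by each outlier.
n_mid  <- sum(abs(mid_trend - base_trend) > 1e-9, na.rm = TRUE)
n_edge <- sum(abs(edge_trend - base_trend) > 1e-9, na.rm = TRUE)
n_mid   # 5
n_edge  # 1
```

In this sketch a mid-series outlier falls inside five averaging windows, while an outlier at the first observation falls inside only one, and there only with the smallest weight (1/8).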
(f). Outlier on the edges
If the outlier is on the edges, fewer moving-average windows should contain it, meaning that seasonal adjustment should be better:
gas_weird = gas
gas_weird[1, 1] = gas_weird[1, 1] + 300
gas_weird_dcmp <- gas_weird |>
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) |> components()
gas_weird_dcmp |> autoplot()
gas_weird_adj = gas_weird
gas_weird_adj[, 1] = gas_weird_adj[, 1]/gas_weird_dcmp$seasonal
autoplot(gas_weird_adj) + labs(title = "Gas Production (With End Outlier, Seasonally Adj.)")
And this is exactly what we see: the seasonally adjusted data show much less leftover seasonality than in part (e), where strong seasonal artifacts remained.
8. Decompose Chapter 2 Time Series with X-11
set.seed(2468)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
x11_dcmp <- myseries |>
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
  components()
x11_dcmp |> autoplot()
I don’t see any outliers with the aus_retail data selection. In fact, the X-11 decomposition makes the data look incredibly predictable and stable. We see tightly bounded, consistent values in the seasonal component. The trend is stably increasing across the 4 present decades. Additionally, the irregular component is quite minor, indicating few random fluctuations.