For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
- Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock
a <- aus_livestock %>%
  filter(Animal == "Bulls, bullocks and steers") %>%
  select(!Animal) %>%
  index_by(Month) %>%
  summarise(Count = sum(Count)) # sums over all states, which drops the State variable
autoplot(a, Count)
The variance seems to change a little over time, but it is hard to judge by eye. Let us find the best lambda for a Box-Cox transformation using the Guerrero method.
features(a,Count,features = guerrero)
## # A tibble: 1 x 1
## lambda_guerrero
## <dbl>
## 1 0.869
As the output above shows, the best lambda is roughly 0.869. This is close to 1, and a lambda of 1 leaves the shape of the series unchanged, so the transformation has little effect. We could argue that it is small enough not to use.
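To see why a lambda near 1 changes so little, it can help to write the transformation out. The following is a minimal base R sketch for strictly positive data; `box_cox_manual` and the example vector `y` are illustrative names and values, not part of the original solution (the fpp3 packages provide their own `box_cox()` function).

```r
# Box-Cox transformation, assuming all values of y are strictly positive.
# lambda = 0 is defined as the natural log; otherwise (y^lambda - 1) / lambda.
box_cox_manual <- function(y, lambda) {
  if (abs(lambda) < 1e-8) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

# Illustrative values only (not taken from aus_livestock).
y <- c(50, 100, 200, 400)

box_cox_manual(y, 1)      # equals y - 1: a pure shift, shape unchanged
box_cox_manual(y, 0.869)  # lambda close to 1: only a mild compression
box_cox_manual(y, 0)      # natural log: the strongest compression of the three
```

The closer lambda is to 0, the more the larger values are compressed relative to the smaller ones, which is why a lambda near 1 is often not worth applying.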
- Victorian Electricity Demand from vic_elec.
vic_elec %>%
autoplot(Demand)
The variance looks like it might be decreasing slightly over time, so a transformation is worth checking, although the change does not look large.
features(vic_elec,Demand,features = guerrero)
## # A tibble: 1 x 1
## lambda_guerrero
## <dbl>
## 1 0.0999
As the output above shows, the best lambda is roughly 0.1.
- Gas production from aus_production
aus_production %>%
autoplot(Gas)
The variance does look like it increases over time so we should make a transformation.
features(aus_production,Gas,features = guerrero)
## # A tibble: 1 x 1
## lambda_guerrero
## <dbl>
## 1 0.121
As the output above shows, the best lambda is roughly 0.121.
Why is a Box-Cox transformation unhelpful for the canadian_gas data?
canadian_gas %>% autoplot(Volume)
A Box-Cox transformation is unhelpful here because the beginning and the end of the series have similar variance, while the middle of the series has a different variance. A Box-Cox transformation only helps when the variance is monotonically increasing or monotonically decreasing over time.
Consider the last five years of the Gas data from aus_production.
gas <- tail(aus_production, 5*4) %>% select(Gas)
- Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
gas <- tail(aus_production, 5*4) %>% select(Gas)
gas %>% autoplot(Gas)
There is a clear seasonal pattern within each year. The data are quarterly: production is lowest in Q1, rises to a peak in Q3, and then falls through Q4 to the following Q1, where it starts to climb again. We don’t have enough data to see any potential business cycles, but there does seem to be a general upward trend.
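The seasonal pattern and the upward trend can also be checked numerically. The following is a small base R sketch; the variable names are illustrative, and the 20 Gas values are copied from the decomposition output printed further down (2005 Q3 to 2010 Q2).

```r
# Quarterly Gas values for 2005 Q3 to 2010 Q2, copied from the printed output.
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# The series starts in Q3, so the quarter labels cycle Q3, Q4, Q1, Q2.
quarter_label <- rep(c("Q3", "Q4", "Q1", "Q2"), times = 5)

# Average production by quarter: the lowest and highest averages
# indicate the seasonal trough and peak.
quarter_means <- tapply(gas_values, quarter_label, mean)
quarter_means

# Compare the last four quarters with the same quarters four years earlier;
# positive differences indicate an upward trend.
yearly_change <- gas_values[17:20] - gas_values[1:4]
yearly_change
```

Comparing like quarters avoids confusing the seasonal swing with the trend.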
- Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
gas_dcmp <- gas %>%
  model(
    classical_decomposition(Gas, type = "multiplicative")
  ) %>%
  components()
gas_dcmp #looks better in console
## # A dable: 20 x 7 [1Q]
## # Key: .model [1]
## # : Gas = trend * seasonal * random
## .model Quarter Gas trend seasonal random season_adjust
## <chr> <qtr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "classical_decomposition(G~ 2005 Q3 221 NA 1.13 NA 196.
## 2 "classical_decomposition(G~ 2005 Q4 180 NA 0.925 NA 195.
## 3 "classical_decomposition(G~ 2006 Q1 171 200. 0.875 0.974 195.
## 4 "classical_decomposition(G~ 2006 Q2 224 204. 1.07 1.02 209.
## 5 "classical_decomposition(G~ 2006 Q3 233 207 1.13 1.00 207.
## 6 "classical_decomposition(G~ 2006 Q4 192 210. 0.925 0.987 208.
## 7 "classical_decomposition(G~ 2007 Q1 187 213 0.875 1.00 214.
## 8 "classical_decomposition(G~ 2007 Q2 234 216. 1.07 1.01 218.
## 9 "classical_decomposition(G~ 2007 Q3 245 219. 1.13 0.996 218.
## 10 "classical_decomposition(G~ 2007 Q4 205 219. 0.925 1.01 222.
## 11 "classical_decomposition(G~ 2008 Q1 194 219. 0.875 1.01 222.
## 12 "classical_decomposition(G~ 2008 Q2 229 219 1.07 0.974 213.
## 13 "classical_decomposition(G~ 2008 Q3 249 219 1.13 1.01 221.
## 14 "classical_decomposition(G~ 2008 Q4 203 220. 0.925 0.996 219.
## 15 "classical_decomposition(G~ 2009 Q1 196 222. 0.875 1.01 224.
## 16 "classical_decomposition(G~ 2009 Q2 238 223. 1.07 0.993 222.
## 17 "classical_decomposition(G~ 2009 Q3 252 225. 1.13 0.994 224.
## 18 "classical_decomposition(G~ 2009 Q4 210 226 0.925 1.00 227.
## 19 "classical_decomposition(G~ 2010 Q1 205 NA 0.875 NA 234.
## 20 "classical_decomposition(G~ 2010 Q2 236 NA 1.07 NA 220.
- Do the results support the graphical interpretation from part a?
Looking at gas_dcmp in the console, the seasonal component repeats every 4 observations, which agrees with the annual seasonal pattern identified above. The trend component increases gradually (from about 200 to 226), which also agrees with the upward trend noted above.
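The trend and seasonal columns in the table above can be checked by hand. The following is a minimal base R sketch, assuming the standard classical method: a centred 2x4 moving average for the trend, then quarter averages of the detrended series rescaled to have mean 1. The variable names are illustrative, and the Gas values are copied from the printed output.

```r
# Quarterly Gas values for 2005 Q3 to 2010 Q2, copied from the printed output.
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
quarter_id <- rep(c(3, 4, 1, 2), times = 5)  # the series starts in Q3

# Trend-cycle: centred 2x4 moving average (weights 1/8, 1/4, 1/4, 1/4, 1/8).
# stats::filter is called explicitly because dplyr masks filter().
trend_manual <- as.numeric(
  stats::filter(gas_values, c(1/8, 1/4, 1/4, 1/4, 1/8), sides = 2)
)

# Multiplicative model: divide by the trend, then average by quarter.
detrended <- gas_values / trend_manual
raw_idx <- tapply(detrended, quarter_id, mean, na.rm = TRUE)

# Rescale so the four seasonal indices average to 1.
seasonal_idx <- raw_idx / mean(raw_idx)

# Seasonally adjusted series: data divided by the matching seasonal index.
seasonal_manual <- as.numeric(seasonal_idx[as.character(quarter_id)])
season_adjust_manual <- gas_values / seasonal_manual

trend_manual[3:6]  # compare with the trend column (rows 3 to 6)
seasonal_idx       # compare with the seasonal column
```

The first two and last two trend values are NA because the moving average needs two observations on each side, which matches the NA entries in the table.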
- Compute and plot the seasonally adjusted data.
gas_dcmp %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, color = "Data")) +
  geom_line(aes(y = season_adjust, color = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, color = "Trend")) +
  labs(title = "Gas Production") +
  scale_colour_manual(
    values = c("gray", "#0072B2", "#D55E00"),
    breaks = c("Data", "Seasonally Adjusted", "Trend")
  )
This exercise uses the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).
- Plot the data using autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time. What do you think is causing it to change so much?
canadian_gas %>% autoplot(Volume)
From autoplot() we can see that the seasonal variation in Volume is larger between roughly 1974 and 1990 than in the rest of the series.
canadian_gas %>% gg_subseries(Volume)
The graph above shows a dip in volume in the middle of the graph. This is due to the variation described above.
canadian_gas %>% gg_season(Volume)
From this graph we can see that there is more variability in the middle of the graph, indicating that the years between 1975 and 1990 have more variability in their seasons.
- Do an STL decomposition of the data. You will need to choose a seasonal window to allow for the changing shape of the seasonal component.
STL_dcmp <- canadian_gas %>%
  model(
    STL(Volume ~ trend(window = 12) +
          season(window = "periodic"),
        robust = TRUE)) %>%
  components()
STL_dcmp
## # A dable: 542 x 7 [1M]
## # Key: .model [1]
## # : Volume = trend + season_year + remainder
## .model Month Volume trend season_year remainder season_adjust
## <chr> <mth> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "STL(Volume ~ tren~ 1960 Jan 1.43 0.750 1.02 -0.341 0.409
## 2 "STL(Volume ~ tren~ 1960 Feb 1.31 0.881 0.207 0.218 1.10
## 3 "STL(Volume ~ tren~ 1960 Mar 1.40 1.01 0.618 -0.229 0.784
## 4 "STL(Volume ~ tren~ 1960 Apr 1.17 1.13 -0.0292 0.0703 1.20
## 5 "STL(Volume ~ tren~ 1960 May 1.12 1.24 -0.288 0.159 1.40
## 6 "STL(Volume ~ tren~ 1960 Jun 1.01 1.34 -0.899 0.567 1.91
## 7 "STL(Volume ~ tren~ 1960 Jul 0.966 1.44 -0.637 0.162 1.60
## 8 "STL(Volume ~ tren~ 1960 Aug 0.977 1.41 -0.573 0.141 1.55
## 9 "STL(Volume ~ tren~ 1960 Sep 1.03 1.38 -0.768 0.423 1.80
## 10 "STL(Volume ~ tren~ 1960 Oct 1.25 1.31 0.0606 -0.114 1.19
## # ... with 532 more rows
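Note that season(window = "periodic") forces the seasonal component to be identical in every year, whereas the question asks for a window that lets the seasonal shape change. The difference can be sketched in base R with stl(), which takes the same choice through its s.window argument. This is an illustration under assumptions: the built-in AirPassengers series stands in for canadian_gas (which needs the fpp3 packages), and the window of 13 is an arbitrary odd value, not a tuned choice. In the code above, the analogous change would be a finite odd value such as season(window = 13).

```r
# Stand-in data: log of the built-in monthly AirPassengers series (144 months).
air_log <- log(AirPassengers)

# Fixed seasonality: the same seasonal pattern is used for every year.
fit_fixed <- stl(air_log, s.window = "periodic")

# Flexible seasonality: a finite odd window lets the pattern evolve over time.
fit_flex <- stl(air_log, s.window = 13)

seasonal_fixed <- as.numeric(fit_fixed$time.series[, "seasonal"])
seasonal_flex <- as.numeric(fit_flex$time.series[, "seasonal"])

first_year <- 1:12
last_year <- 133:144

# Largest difference between the first and last year's seasonal pattern.
max(abs(seasonal_fixed[first_year] - seasonal_fixed[last_year]))  # no change allowed
max(abs(seasonal_flex[first_year] - seasonal_flex[last_year]))    # pattern has shifted
```

A smaller seasonal window lets the seasonal shape change more quickly, while a larger one gives a smoother, more stable pattern.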
- How does the seasonal shape change over time? [Hint: Try plotting the seasonal component using gg_season().]
STL_dcmp %>% gg_season(season_adjust) #couldn't plot season_year
As the graph above shows, there is more variability in the blue and green lines (the center of the time series).
- Can you produce a plausible seasonally adjusted series?
STL_dcmp %>%
autoplot(season_adjust) #by adjusted did you mean transformation
- Compare the results with those obtained using SEATS and X11. How are they different?
x11_dcmp <- canadian_gas %>%
model( x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
components()
x11_dcmp %>% autoplot()
seats_dcmp <- canadian_gas %>%
model(seats = X_13ARIMA_SEATS(Volume ~ seats())) %>%
components()
seats_dcmp %>% autoplot()
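X11 and SEATS produce a different seasonal component from the STL decomposition above, which used a fixed (periodic) seasonal window. Their seasonal component is allowed to change over time, so they try to account for the change in variance we are seeing in the middle of our series.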