Forecasting: Principles and
Practice (3rd ed)
Chapter 3 Time series Decomposition
Consider the GDP information in global_economy. Plot the
GDP per capita for each country over time. Which country has the highest
GDP per capita? How has this changed over time?
top_countries_df = global_economy |>
mutate(GDP_per_capita = GDP/Population) |>
arrange(desc(GDP_per_capita))
# identify the top 10 countries with the highest GDP per capita to display in
# the legend of the plot
top_countries = top_countries_df |>
distinct(Country) |>
head(10) |>
pull(Country)
# plot the GDP per capita for each country over time
global_economy |>
autoplot(GDP/Population) +
labs(title= "GDP per capita", y = "$US") +
# keep the default colours and list only the top 10 countries in the legend
scale_colour_discrete(breaks = as.character(top_countries))
# plot the country with the highest GDP per capita over time
top_countries_year_df = top_countries_df |>
index_by(Year) |>
filter(GDP_per_capita == max(GDP_per_capita, na.rm = TRUE)) |>
arrange(desc(Year))
ggplot(top_countries_year_df, aes(x = Year, y = GDP_per_capita, fill = Country)) +
geom_col() +
labs(title= "Max GDP per capita per Year", y = "$US")
knit_table(top_countries_df |> head(10), 'Top GDP per Capita Countries & Years')
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_capita |
|---|---|---|---|---|---|---|---|---|---|
| Monaco | MCO | 2014 | 7060236168 | 7.1796368 | NA | NA | NA | 38132 | 185152.5 |
| Monaco | MCO | 2008 | 6476490406 | 0.7318007 | NA | NA | NA | 35853 | 180640.1 |
| Liechtenstein | LIE | 2014 | 6657170923 | NA | NA | NA | NA | 37127 | 179308.1 |
| Liechtenstein | LIE | 2013 | 6391735894 | NA | NA | NA | NA | 36834 | 173528.2 |
| Monaco | MCO | 2013 | 6553372278 | 9.5707988 | NA | NA | NA | 37971 | 172588.9 |
| Monaco | MCO | 2016 | 6468252212 | 3.2138488 | NA | NA | NA | 38499 | 168010.9 |
| Liechtenstein | LIE | 2015 | 6268391521 | NA | NA | NA | NA | 37403 | 167590.6 |
| Monaco | MCO | 2007 | 5867916781 | 14.4288990 | NA | NA | NA | 35111 | 167124.7 |
| Liechtenstein | LIE | 2016 | 6214633651 | NA | NA | NA | NA | 37666 | 164993.2 |
| Monaco | MCO | 2015 | 6258178995 | 4.9423298 | NA | NA | NA | 38307 | 163369.1 |
knit_table(top_countries_year_df, 'Max GDP per Capita per Year')
| Country | Code | Year | GDP | Growth | CPI | Imports | Exports | Population | GDP_per_capita |
|---|---|---|---|---|---|---|---|---|---|
| Luxembourg | LUX | 2017 | 6.240446e+10 | 2.2979413 | 111.41323 | 193.969823 | 230.016430 | 599449 | 104103.037 |
| Monaco | MCO | 2016 | 6.468252e+09 | 3.2138488 | NA | NA | NA | 38499 | 168010.915 |
| Liechtenstein | LIE | 2015 | 6.268392e+09 | NA | NA | NA | NA | 37403 | 167590.608 |
| Monaco | MCO | 2014 | 7.060236e+09 | 7.1796368 | NA | NA | NA | 38132 | 185152.527 |
| Liechtenstein | LIE | 2013 | 6.391736e+09 | NA | NA | NA | NA | 36834 | 173528.150 |
| Monaco | MCO | 2012 | 5.743030e+09 | 0.9849603 | NA | NA | NA | 37783 | 152000.362 |
| Monaco | MCO | 2011 | 6.080345e+09 | 7.0737008 | NA | NA | NA | 37497 | 162155.499 |
| Monaco | MCO | 2010 | 5.362649e+09 | 2.0542939 | NA | NA | NA | 37094 | 144569.176 |
| Monaco | MCO | 2009 | 5.451653e+09 | -11.3175072 | NA | NA | NA | 36534 | 149221.362 |
| Monaco | MCO | 2008 | 6.476490e+09 | 0.7318007 | NA | NA | NA | 35853 | 180640.125 |
| Monaco | MCO | 2007 | 5.867917e+09 | 14.4288990 | NA | NA | NA | 35111 | 167124.741 |
| Monaco | MCO | 2006 | 4.582988e+09 | 5.8039365 | NA | NA | NA | 34408 | 133195.429 |
| Monaco | MCO | 2005 | 4.202980e+09 | 1.8954147 | NA | NA | NA | 33793 | 124374.268 |
| Monaco | MCO | 2004 | 4.110348e+09 | 2.4704857 | NA | NA | NA | 33314 | 123382.016 |
| Monaco | MCO | 2003 | 3.588989e+09 | 1.0875362 | NA | NA | NA | 32933 | 108978.489 |
| Monaco | MCO | 2002 | 2.905973e+09 | 1.0264955 | NA | NA | NA | 32629 | 89061.051 |
| Monaco | MCO | 2001 | 2.671401e+09 | 2.1877308 | NA | NA | NA | 32360 | 82552.567 |
| Monaco | MCO | 2000 | 2.647884e+09 | 3.9102306 | NA | NA | NA | 32082 | 82534.874 |
| Monaco | MCO | 1999 | 2.906009e+09 | 3.3005427 | NA | NA | NA | 31800 | 91383.940 |
| Monaco | MCO | 1998 | 2.934579e+09 | 3.5032655 | NA | NA | NA | 31523 | 93093.260 |
| Monaco | MCO | 1997 | 2.840182e+09 | 2.2372502 | NA | NA | NA | 31251 | 90882.923 |
| Monaco | MCO | 1996 | 3.137849e+09 | 1.1107282 | NA | NA | NA | 30967 | 101328.795 |
| Monaco | MCO | 1995 | 3.130271e+09 | 2.1171526 | NA | NA | NA | 30691 | 101993.122 |
| Monaco | MCO | 1994 | 2.720298e+09 | 2.2156235 | NA | NA | NA | 30427 | 89404.075 |
| Monaco | MCO | 1993 | 2.574440e+09 | -0.9139168 | NA | NA | NA | 30138 | 85421.727 |
| Monaco | MCO | 1992 | 2.737067e+09 | 1.3666397 | NA | NA | NA | 29863 | 91654.121 |
| Monaco | MCO | 1991 | 2.480498e+09 | 1.0151285 | NA | NA | NA | 29624 | 83732.701 |
| Monaco | MCO | 1990 | 2.481316e+09 | 2.6434542 | NA | NA | NA | 29439 | 84286.696 |
| Monaco | MCO | 1989 | 2.010117e+09 | 4.1636593 | NA | NA | NA | 29312 | 68576.585 |
| Monaco | MCO | 1988 | 2.000675e+09 | 4.5986015 | NA | NA | NA | 29235 | 68434.228 |
| Monaco | MCO | 1987 | 1.839096e+09 | 2.4845742 | NA | NA | NA | 29172 | 63043.178 |
| Monaco | MCO | 1986 | 1.515210e+09 | 2.4523647 | NA | NA | NA | 29041 | 52174.842 |
| Monaco | MCO | 1985 | 1.082851e+09 | 1.7086997 | NA | NA | NA | 28835 | 37553.358 |
| Monaco | MCO | 1984 | 1.037315e+09 | 1.4845156 | NA | NA | NA | 28512 | 36381.697 |
| Monaco | MCO | 1983 | 1.092552e+09 | 1.1949733 | NA | NA | NA | 28095 | 38887.766 |
| Monaco | MCO | 1982 | 1.143229e+09 | 2.4323895 | NA | NA | NA | 27624 | 41385.356 |
| Monaco | MCO | 1981 | 1.205166e+09 | 0.9223784 | NA | NA | NA | 27164 | 44366.295 |
| Monaco | MCO | 1980 | 1.378131e+09 | 1.6854574 | NA | NA | NA | 26745 | 51528.547 |
| Monaco | MCO | 1979 | 1.209898e+09 | 3.5342049 | NA | NA | NA | 26395 | 45838.162 |
| Monaco | MCO | 1978 | 1.000536e+09 | 3.9535513 | NA | NA | NA | 26087 | 38353.806 |
| United Arab Emirates | ARE | 1977 | 2.487178e+10 | 21.4393302 | NA | NA | NA | 748117 | 33245.836 |
| United Arab Emirates | ARE | 1976 | 1.921302e+10 | 16.5268565 | NA | NA | NA | 646943 | 29698.169 |
| Monaco | MCO | 1975 | 7.119230e+08 | -0.9726375 | NA | NA | NA | 25197 | 28254.276 |
| Monaco | MCO | 1974 | 5.639397e+08 | 4.4746111 | NA | NA | NA | 24835 | 22707.456 |
| Monaco | MCO | 1973 | 5.235528e+08 | 6.5537575 | NA | NA | NA | 24439 | 21422.841 |
| Monaco | MCO | 1972 | 4.024603e+08 | 4.6483096 | NA | NA | NA | 24051 | 16733.622 |
| Monaco | MCO | 1971 | 3.276515e+08 | 5.2280600 | NA | NA | NA | 23720 | 13813.301 |
| Monaco | MCO | 1970 | 2.930739e+08 | NA | NA | NA | NA | 23484 | 12479.725 |
| United States | USA | 1969 | 1.019900e+12 | 3.1000000 | 16.82293 | 4.951466 | 5.088734 | 202677000 | 5032.145 |
| United States | USA | 1968 | 9.425000e+11 | 4.8000000 | 15.95160 | 4.944297 | 5.082228 | 200706000 | 4695.923 |
| United States | USA | 1967 | 8.617000e+11 | 2.5000000 | 15.29809 | 4.630382 | 5.048161 | 198712000 | 4336.427 |
| Kuwait | KWT | 1966 | 2.391487e+09 | 12.3155611 | NA | 24.355972 | 65.690866 | 524856 | 4556.463 |
| Kuwait | KWT | 1965 | 2.097452e+09 | NA | NA | 23.097463 | 67.690254 | 473554 | 4429.171 |
| United States | USA | 1964 | 6.858000e+11 | 5.8000000 | 14.22421 | 4.097404 | 5.103529 | 191889000 | 3573.941 |
| United States | USA | 1963 | 6.386000e+11 | 4.4000000 | 14.04459 | 4.087065 | 4.870028 | 189242000 | 3374.515 |
| United States | USA | 1962 | 6.051000e+11 | 6.1000000 | 13.87261 | 4.131549 | 4.809122 | 186538000 | 3243.843 |
| United States | USA | 1961 | 5.633000e+11 | 2.3000000 | 13.70828 | 4.029824 | 4.899698 | 183691000 | 3066.563 |
| United States | USA | 1960 | 5.433000e+11 | NA | 13.56306 | 4.196576 | 4.969630 | 180671000 | 3007.123 |
Monaco had the highest GDP per capita of any country in any year, at $185,152.5 in 2014. In the 1960s, the United States had the highest GDP per capita in every year except 1965 and 1966, when it was Kuwait. From 1970 onwards, Monaco had the highest GDP per capita in almost every year. The United Arab Emirates surpassed Monaco in 1976 and 1977, Liechtenstein took the lead in 2013 and 2015, and Luxembourg leads in 2017, the last year in the table, at a much lower $104,103.
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
global_economy
global_economy |>
filter(Country == 'United States') |>
autoplot(GDP/Population) +
labs(title= "United States GDP per Capita", y = "$US")
Transformed to GDP per capita to account for population changes.
aus_livestock
aus_livestock |>
filter(
Animal == 'Bulls, bullocks and steers',
State == 'Victoria'
) |>
mutate(Year = year(Month)) |>
group_by(Year) |>
mutate(Avg_Count = mean(Count)) |>
distinct(Year, Avg_Count) |>
as_tsibble(index=Year) |>
autoplot(Avg_Count) +
labs(title= "Average Slaughter of Victorian “Bulls, bullocks and steers” per year", y = "Average Count")
Averaged the number of animals slaughtered per year to remove the variation between months, part of which comes from months having different numbers of days.
vic_elec
vic_elec |>
mutate(Date = date(Time)) |>
group_by(Date) |>
mutate(Avg_Demand = mean(Demand)) |>
distinct(Date, Avg_Demand) |>
as_tsibble(index=Date) |>
autoplot(Avg_Demand) +
labs(title= "Average Victorian Electricity Demand per day", y = "Average Demand (MWh)")
Averaged the half-hourly demand within each day to remove the large variation in demand that occurs over a typical day.
vic_elec |>
mutate(Month = yearmonth(Time)) |>
group_by(Month) |>
mutate(Avg_Demand = mean(Demand)) |>
distinct(Month, Avg_Demand) |>
as_tsibble(index=Month) |>
autoplot(Avg_Demand) +
labs(title= "Average Victorian Electricity Demand per Month", y = "Average Demand (MWh)")
Because large variation remains at the daily level, the demand was also averaged per month to further simplify the seasonal patterns.
aus_production
lambda <- aus_production |>
features(Gas, features = guerrero) |>
pull(lambda_guerrero)
aus_production |>
autoplot(box_cox(Gas, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed gas production with $\\lambda$ = ",
round(lambda,2))))
Because the variation in the data increases with the level of the series, the data was transformed using \(\lambda\) = 0.11 to make the seasonal variation about the same across the whole series.
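For reference, the transformation applied by box_cox() can be written out. The form below is the modified Box-Cox family as given in the textbook's section on transformations, stated here from that definition rather than from the package source; \(y_t\) is the original observation and \(w_t\) the transformed one.

\[
w_t =
\begin{cases}
\log(y_t) & \text{if } \lambda = 0, \\
\left(\operatorname{sign}(y_t)\,|y_t|^{\lambda} - 1\right)/\lambda & \text{otherwise.}
\end{cases}
\]

With \(\lambda\) = 1 the data is only shifted, so its shape is unchanged, while values near 0, such as the 0.11 chosen here, behave much like a logarithm and compress the larger values the most.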
Why is a Box-Cox transformation unhelpful for the
canadian_gas data?
canadian_gas |>
autoplot(Volume) +
labs(title='Canadian gas production')
lambda <- canadian_gas |>
features(Volume, features = guerrero) |>
pull(lambda_guerrero)
canadian_gas |>
autoplot(box_cox(Volume, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Canadian gas production with $\\lambda$ = ",
round(lambda,2))))
A Box-Cox transformation helps when the variation in the data
increases or decreases with the level of the series. The
canadian_gas data does not follow this pattern: the seasonal
variation is larger between 1975 and 1990 than at either end of the
series, even though the level keeps rising. Thus, the Box-Cox
transformation is unhelpful for the canadian_gas data. The
transformed plot supports this, since the seasonal variation is still
not consistent across the series.
What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?
set.seed(87654321)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries |>
autoplot(Turnover) +
labs(y=' $Million AUD', title='Australian retail trade turnover')
lambda <- myseries |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
myseries |>
autoplot(box_cox(Turnover, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Australian retail trade turnover with $\\lambda$ = ",
round(lambda,2))))
myseries |>
autoplot(box_cox(Turnover, 0)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Australian retail trade turnover with $\\lambda$ = ",
0)))
The guerrero feature output an optimal lambda value of -0.08. Because this value is negative and produces results very similar to a lambda value of 0 (a log transformation), I would select \(\lambda\) = 0.
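The claim that \(\lambda\) = -0.08 and \(\lambda\) = 0 give nearly the same shape can be checked numerically. The base-R sketch below assumes the textbook's modified Box-Cox definition; box_cox_manual is a hypothetical helper written for illustration (not the fabletools function), and y is an illustrative range of positive values, not the retail series.

```r
# Hypothetical helper: Box-Cox family written out by hand
# (log for lambda = 0, power form otherwise)
box_cox_manual <- function(y, lambda) {
  if (lambda == 0) log(y) else (sign(y) * abs(y)^lambda - 1) / lambda
}

# Illustrative positive values only; this is not the retail data
y <- seq(10, 500, by = 10)

bc_guerrero <- box_cox_manual(y, -0.08)  # lambda suggested by guerrero
bc_log      <- box_cox_manual(y, 0)      # plain log transform

# A correlation close to 1 means the two transformed series
# have almost the same shape, differing mainly in scale
cor(bc_guerrero, bc_log)
```

If the correlation is very close to 1, little is lost by choosing the simpler log transformation.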
For the following series, find an appropriate Box-Cox transformation
in order to stabilize the variance. Tobacco from
aus_production, Economy class passengers between Melbourne
and Sydney from ansett, and Pedestrian counts at Southern
Cross Station from pedestrian.
aus_production
aus_production |>
autoplot(Tobacco) +
labs(title='Australia Tobacco production')
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
lambda <- aus_production |>
features(Tobacco, features = guerrero) |>
pull(lambda_guerrero)
aus_production |>
autoplot(box_cox(Tobacco, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Australia Tobacco production with $\\lambda$ = ",
round(lambda,2))))
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
ansett
ansett |>
filter(
Class == 'Economy',
Airports == "MEL-SYD"
) |>
autoplot(Passengers) +
labs(title='Economy passengers between Melbourne and Sydney')
lambda <- ansett |>
filter(
Class =='Economy',
Airports == "MEL-SYD"
) |>
features(Passengers, features = guerrero) |>
pull(lambda_guerrero)
ansett |>
filter(
Class == 'Economy',
Airports == "MEL-SYD"
) |>
autoplot(box_cox(Passengers, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Economy passengers between MEL-SYD with $\\lambda$ = ",
round(lambda,2))))
pedestrian
# adjusted to be the average pedestrian count per week
pedestrian_df = pedestrian |>
filter(Sensor == "Southern Cross Station") |>
mutate(Week = yearweek(Date)) |>
group_by(Week) |>
mutate(Avg_Count = mean(Count)) |>
distinct(Week, Avg_Count) |>
as_tsibble(index=Week)
pedestrian_df |>
autoplot(Avg_Count) +
labs(title='Average Pedestrian Count at Southern Cross Station per Week')
lambda <- pedestrian_df |>
features(Avg_Count, features = guerrero) |>
pull(lambda_guerrero)
pedestrian_df |>
autoplot(box_cox(Avg_Count, lambda)) +
labs(y = "",
title = latex2exp::TeX(paste0(
"Transformed Average Pedestrian per Week with $\\lambda$ = ",
round(lambda,2))))
Consider the last five years of the Gas data from aus_production.
gas <- tail(aus_production, 5*4) |> select(Gas)
Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
gas |>
autoplot(Gas)
The series appears to have an overall increasing trend with a strong seasonality pattern. Gas production increases significantly in the second and third quarters of the year and declines during the first and fourth quarters.
Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
gas_decomp = gas |>
model(classical_decomposition(Gas, type = "multiplicative")) |>
components()
gas_decomp |>
autoplot() +
labs(title = "Classical multiplicative decomposition of last five years of the Gas data")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
Do the results support the graphical interpretation from part a?
Yes, the results support the graphical interpretation from part a. The components display an increasing trend with annual seasonality.
Compute and plot the seasonally adjusted data.
gas_decomp |>
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas production",
title = "Quarterly production of Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
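The season_adjust column plotted above follows from the multiplicative model. As a short worked relation, assuming the textbook's notation with \(y_t\) the data, \(S_t\) the seasonal component, \(T_t\) the trend-cycle and \(R_t\) the remainder:

\[
y_t = S_t \times T_t \times R_t
\qquad\Longrightarrow\qquad
\frac{y_t}{S_t} = T_t \times R_t .
\]

So the seasonally adjusted series is the data divided by the seasonal index for that quarter. It still contains the remainder, which is why it is not as smooth as the trend.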
Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
gas1 = gas
gas1$Gas[10] = gas1$Gas[10] + 300
gas_decomp1 = gas1 |>
model(classical_decomposition(Gas, type = "multiplicative")) |>
components()
gas_decomp1 |>
autoplot() +
labs(title = "Classical multiplicative decomposition of Gas data with outlier")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
gas_decomp1 |>
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas production",
title = "Quarterly production of Gas in Australia with Outlier") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
knit_table(gas_decomp1, 'Seasonally Adjusted Data with Outlier in the Middle')
| .model | Quarter | Gas | trend | seasonal | random | season_adjust |
|---|---|---|---|---|---|---|
| classical_decomposition(Gas, type = “multiplicative”) | 2005 Q3 | 221 | NA | 1.0570139 | NA | 209.0796 |
| classical_decomposition(Gas, type = “multiplicative”) | 2005 Q4 | 180 | NA | 1.1236007 | NA | 160.1993 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q1 | 171 | 200.500 | 0.8209216 | 1.0389151 | 208.3025 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q2 | 224 | 203.500 | 0.9984638 | 1.1024306 | 224.3446 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q3 | 233 | 207.000 | 1.0570139 | 1.0648904 | 220.4323 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q4 | 192 | 210.250 | 1.1236007 | 0.8127430 | 170.8792 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q1 | 187 | 213.000 | 0.8209216 | 1.0694496 | 227.7928 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q2 | 234 | 253.625 | 0.9984638 | 0.9240415 | 234.3600 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q3 | 245 | 293.625 | 1.0570139 | 0.7893914 | 231.7850 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q4 | 505 | 293.875 | 1.1236007 | 1.5293847 | 449.4479 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q1 | 194 | 293.750 | 0.8209216 | 0.8044928 | 236.3198 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q2 | 229 | 256.500 | 0.9984638 | 0.8941611 | 229.3523 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q3 | 249 | 219.000 | 1.0570139 | 1.0756588 | 235.5693 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q4 | 203 | 220.375 | 1.1236007 | 0.8198260 | 180.6692 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q1 | 196 | 221.875 | 0.8209216 | 1.0760836 | 238.7560 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q2 | 238 | 223.125 | 0.9984638 | 1.0683078 | 238.3662 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q3 | 252 | 225.125 | 1.0570139 | 1.0590004 | 238.4075 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q4 | 210 | 226.000 | 1.1236007 | 0.8269873 | 186.8991 |
| classical_decomposition(Gas, type = “multiplicative”) | 2010 Q1 | 205 | NA | 0.8209216 | NA | 249.7193 |
| classical_decomposition(Gas, type = “multiplicative”) | 2010 Q2 | 236 | NA | 0.9984638 | NA | 236.3631 |
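When an outlier is placed in the middle of the gas data, the seasonally adjusted series spikes at the outlier (2007 Q4). It also shows much more variation before and after the outlier. In a multiplicative decomposition the seasonally adjusted value is the data divided by the seasonal index, so this extra variation comes from the seasonal indices, which the outlier distorts: the Q4 index is estimated at 1.12 even though Q4 is a low-production quarter, and every other Q4 is therefore over-corrected downwards. The trend is inflated for the five quarters from 2007 Q2 to 2008 Q2, and the random component becomes more variable as well.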
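How the outlier changes the seasonal indices can be checked outside fable. The following base-R sketch assumes that the classical method uses a centred 2x4 moving average for the trend and rescales the quarterly mean ratios so that they average 1; the Gas values are copied from the tables in this exercise, and seasonal_indices is a hypothetical helper, not a feasts function.

```r
# Gas values for 2005 Q3 to 2010 Q2, copied from the tables in this exercise
gas <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
         194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
quarter <- rep(c("Q3", "Q4", "Q1", "Q2"), times = 5)

# Hypothetical helper: multiplicative seasonal indices, classical style
seasonal_indices <- function(y) {
  # centred 2x4 moving average: weights 1/8, 1/4, 1/4, 1/4, 1/8
  trend <- as.numeric(stats::filter(y, c(1, 2, 2, 2, 1) / 8, sides = 2))
  ratios <- y / trend                               # detrended series
  idx <- tapply(ratios, quarter, mean, na.rm = TRUE) # mean ratio per quarter
  idx / mean(idx)                                    # rescale to average 1
}

# Same outlier as in the exercise: add 300 to observation 10 (2007 Q4)
gas_outlier <- gas
gas_outlier[10] <- gas_outlier[10] + 300

round(seasonal_indices(gas), 3)
round(seasonal_indices(gas_outlier), 3)
```

With the outlier, the Q4 index should match the 1.1236 shown in the table above, whereas for the original data it is below 1.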
Does it make any difference if the outlier is near the end rather than in the middle of the time series?
gas2 = gas
gas2$Gas[1] = gas2$Gas[1] + 300
gas_decomp2 = gas2 |>
model(classical_decomposition(Gas, type = "multiplicative")) |>
components()
gas_decomp2 |>
autoplot() +
labs(title = "Classical multiplicative decomposition of Gas data with outlier")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
gas_decomp2 |>
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas production",
title = "Quarterly production of Gas in Australia with Outlier") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
knit_table(gas_decomp2, 'Seasonally Adjusted Data with Outlier at the Beginning')
| .model | Quarter | Gas | trend | seasonal | random | season_adjust |
|---|---|---|---|---|---|---|
| classical_decomposition(Gas, type = “multiplicative”) | 2005 Q3 | 521 | NA | 1.1352158 | NA | 458.9436 |
| classical_decomposition(Gas, type = “multiplicative”) | 2005 Q4 | 180 | NA | 0.9329010 | NA | 192.9465 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q1 | 171 | 238.000 | 0.8488157 | 0.8464587 | 201.4572 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q2 | 224 | 203.500 | 1.0830675 | 1.0163144 | 206.8200 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q3 | 233 | 207.000 | 1.1352158 | 0.9915329 | 205.2473 |
| classical_decomposition(Gas, type = “multiplicative”) | 2006 Q4 | 192 | 210.250 | 0.9329010 | 0.9788805 | 205.8096 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q1 | 187 | 213.000 | 0.8488157 | 1.0343050 | 220.3070 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q2 | 234 | 216.125 | 1.0830675 | 0.9996669 | 216.0530 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q3 | 245 | 218.625 | 1.1352158 | 0.9871606 | 215.8180 |
| classical_decomposition(Gas, type = “multiplicative”) | 2007 Q4 | 205 | 218.875 | 0.9329010 | 1.0039733 | 219.7446 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q1 | 194 | 218.750 | 0.8488157 | 1.0448171 | 228.5537 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q2 | 229 | 219.000 | 1.0830675 | 0.9654635 | 211.4365 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q3 | 249 | 219.000 | 1.1352158 | 1.0015596 | 219.3415 |
| classical_decomposition(Gas, type = “multiplicative”) | 2008 Q4 | 203 | 220.375 | 0.9329010 | 0.9874115 | 217.6008 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q1 | 196 | 221.875 | 0.8488157 | 1.0407210 | 230.9100 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q2 | 238 | 223.125 | 1.0830675 | 0.9848570 | 219.7462 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q3 | 252 | 225.125 | 1.1352158 | 0.9860487 | 221.9842 |
| classical_decomposition(Gas, type = “multiplicative”) | 2009 Q4 | 210 | 226.000 | 0.9329010 | 0.9960366 | 225.1043 |
| classical_decomposition(Gas, type = “multiplicative”) | 2010 Q1 | 205 | NA | 0.8488157 | NA | 241.5130 |
| classical_decomposition(Gas, type = “multiplicative”) | 2010 Q2 | 236 | NA | 1.0830675 | NA | 217.8996 |
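Yes. With the outlier at the beginning of the series, the seasonally adjusted data still spikes at the outlier (2005 Q3) but shows far less variation afterwards, because the trend and random components are mostly consistent for the majority of the series after the outlier occurs. The first observation has no trend estimate, so the outlier affects only one trend value (2006 Q1) and distorts the seasonal indices much less than an outlier in the middle.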
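Why the position matters can be seen from the moving average alone. This is a base-R sketch, assuming the trend is a centred 2x4 moving average with weights 1/8, 1/4, 1/4, 1/4, 1/8 (the trend columns in the tables above are consistent with this); trend_ma and affected are hypothetical helper names.

```r
# Gas values for 2005 Q3 to 2010 Q2, copied from the tables in this exercise
gas <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
         194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# Centred 2x4 moving average; the first and last two values are NA
trend_ma <- function(y) {
  as.numeric(stats::filter(y, c(1, 2, 2, 2, 1) / 8, sides = 2))
}

# Number of trend estimates that change when 300 is added at position pos
affected <- function(pos) {
  y <- gas
  y[pos] <- y[pos] + 300
  sum(abs(trend_ma(y) - trend_ma(gas)) > 1e-8, na.rm = TRUE)
}

affected(10)  # outlier in the middle (2007 Q4)
affected(1)   # outlier at the start (2005 Q3)
```

The middle outlier changes five trend estimates and the one at the start changes a single estimate, in line with the trend columns of the two tables.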
Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?
# x11_dcmp <- myseries |>
# model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
# components()
#
# autoplot(x11_dcmp) +
# labs(title =
# "Decomposition of retail trade turnover using X-11.")
I am getting an error when trying to run this code. I have attempted to troubleshoot and have not been successful in resolving this issue.
Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.
Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.
The decomposition of the number of persons in the civilian labor force in Australia from February 1978 to August 1995 shows an increasing trend over time with an annual seasonal pattern. The seasonal pattern peaks about three times a year, around March, September, and December. The seasonal component has the smallest scale, which means its variation is the smallest relative to the variation in the data. This could be due to the increased variance/noise in the remainder component and the sharp decline in the labor force in the early 1990s.
Is the recession of 1991/1992 visible in the estimated components?
Yes, the recession of 1991/1992 is visible in the remainder component, which contains what is left over when the seasonal and trend-cycle components have been subtracted from the data.