Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
global_economy %>% autoplot(GDP/Population, show.legend = FALSE)
## Warning: Removed 3242 rows containing missing values (`geom_line()`).
Highest GDP per capita?
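# GDP_per_capita is not a column of global_economy, so compute it first
mu_global_economy <- global_economy %>% mutate(GDP_per_capita = GDP / Population)
arrange(mu_global_economy, desc(GDP_per_capita))
## Warning: Current temporal ordering may yield unexpected results.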
## ℹ Suggest to sort by `Country`, `Year` first.
## # A tsibble: 15,150 x 10 [1Y]
## # Key: Country [263]
## Country Code Year GDP Growth CPI Imports Exports Population
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Monaco MCO 2014 7060236168. 7.18 NA NA NA 38132
## 2 Monaco MCO 2008 6476490406. 0.732 NA NA NA 35853
## 3 Liechtenstein LIE 2014 6657170923. NA NA NA NA 37127
## 4 Liechtenstein LIE 2013 6391735894. NA NA NA NA 36834
## 5 Monaco MCO 2013 6553372278. 9.57 NA NA NA 37971
## 6 Monaco MCO 2016 6468252212. 3.21 NA NA NA 38499
## 7 Liechtenstein LIE 2015 6268391521. NA NA NA NA 37403
## 8 Monaco MCO 2007 5867916781. 14.4 NA NA NA 35111
## 9 Liechtenstein LIE 2016 6214633651. NA NA NA NA 37666
## 10 Monaco MCO 2015 6258178995. 4.94 NA NA NA 38307
## # ℹ 15,140 more rows
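## # ℹ 1 more variable: GDP_per_capita <dbl>
Monaco has the highest GDP per capita in the data (2014). Monaco and Liechtenstein between them account for all ten of the largest values shown above, all of which fall in 2007 or later.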
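To see how the leading country changes over time, one option is to pick the country with the largest GDP per capita in each year. The sketch below assumes the fpp3 package is installed; the name `top_by_year` is introduced here for illustration and is not part of the original analysis.

```r
library(fpp3)

# For every year, keep the country with the largest GDP per capita.
top_by_year <- global_economy |>
  as_tibble() |>                                  # drop the tsibble structure
  mutate(GDP_per_capita = GDP / Population) |>
  filter(!is.na(GDP_per_capita)) |>               # ignore missing values
  group_by(Year) |>
  slice_max(GDP_per_capita, n = 1) |>
  ungroup() |>
  select(Year, Country, GDP_per_capita)

# How often each country has been in first place.
top_by_year |> count(Country, sort = TRUE)
```

Printing `top_by_year` in year order shows when the lead passes from one country to another.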
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP from global_economy. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock. Victorian Electricity Demand from vic_elec. Gas production from aus_production.
global_economy |>
filter(Country == "United States") |>
autoplot(GDP/Population) +
labs(title = "GDP per capita", y = "$US")
global_economy |>
filter(Country == "Australia") |>
autoplot(Exports) +
labs(y = "% of GDP", title = "Total Australian exports")
Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.
aus_livestock %>%
filter(Animal == "Bulls, bullocks and steers", State == "Victoria") %>%
autoplot() +
labs(title = "Bulls, bullocks and steers")
## Plot variable not specified, automatically selected `.vars = Count`
Victorian Electricity Demand from vic_elec.
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
## Time Demand Temperature Date Holiday
## <dttm> <dbl> <dbl> <date> <lgl>
## 1 2012-01-01 00:00:00 4383. 21.4 2012-01-01 TRUE
## 2 2012-01-01 00:30:00 4263. 21.0 2012-01-01 TRUE
## 3 2012-01-01 01:00:00 4049. 20.7 2012-01-01 TRUE
## 4 2012-01-01 01:30:00 3878. 20.6 2012-01-01 TRUE
## 5 2012-01-01 02:00:00 4036. 20.4 2012-01-01 TRUE
## 6 2012-01-01 02:30:00 3866. 20.2 2012-01-01 TRUE
autoplot(vic_elec) +
labs(title = "Victoria Electricity Demand")
## Plot variable not specified, automatically selected `.vars = Demand`
Gas production from aus_production.
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
autoplot(aus_production, Gas) +
labs(title = "Australian Gas Production")
Why is a Box-Cox transformation unhelpful for the canadian_gas data?
canadian_gas %>% autoplot()
## Plot variable not specified, automatically selected `.vars = Volume`
lambda <- canadian_gas |>
features(Volume, features = guerrero) |>
pull(lambda_guerrero)
canadian_gas |>
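autoplot(box_cox(Volume, lambda))
The plot looks essentially the same with or without the transformation. A Box-Cox transformation can only stabilise variation that grows or shrinks steadily with the level of the series. In canadian_gas the seasonal variation is largest in the middle of the series rather than at its highest levels, so no single value of lambda evens it out.
## 3.4 What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?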
set.seed(100)
ausdata <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
ausdata
## # A tsibble: 441 x 5 [1M]
## # Key: State, Industry [1]
## State Industry `Series ID` Month Turnover
## <chr> <chr> <chr> <mth> <dbl>
## 1 New South Wales Supermarket and grocery stores A3349335T 1982 Apr 303.
## 2 New South Wales Supermarket and grocery stores A3349335T 1982 May 298.
## 3 New South Wales Supermarket and grocery stores A3349335T 1982 Jun 298
## 4 New South Wales Supermarket and grocery stores A3349335T 1982 Jul 308.
## 5 New South Wales Supermarket and grocery stores A3349335T 1982 Aug 299.
## 6 New South Wales Supermarket and grocery stores A3349335T 1982 Sep 305.
## 7 New South Wales Supermarket and grocery stores A3349335T 1982 Oct 318
## 8 New South Wales Supermarket and grocery stores A3349335T 1982 Nov 334.
## 9 New South Wales Supermarket and grocery stores A3349335T 1982 Dec 390.
## 10 New South Wales Supermarket and grocery stores A3349335T 1983 Jan 311.
## # ℹ 431 more rows
ausdata %>%
autoplot(Turnover)
lambda <- ausdata |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
ausdata %>%
autoplot(box_cox(Turnover, lambda))
Comparing the plots before and after applying the Box-Cox transformation, the variation differs mainly in the middle and at the end of the series.
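For reference, the effect of lambda can be checked by hand. The following is a minimal base R sketch of the Box-Cox family for positive data; `box_cox_manual` is a hypothetical helper written for this illustration (it is not the fpp3 `box_cox()` function), and the input values are made up rather than taken from aus_retail.

```r
# Hypothetical helper (not part of fpp3): Box-Cox transform for positive data.
box_cox_manual <- function(y, lambda) {
  if (abs(lambda) < 1e-8) {
    log(y)                     # limiting case as lambda approaches 0
  } else {
    (y^lambda - 1) / lambda
  }
}

# Illustrative values only, spanning several orders of magnitude.
y <- c(1, 10, 100, 1000)

box_cox_manual(y, 1)    # lambda = 1: a shift by one, shape unchanged
box_cox_manual(y, 0.5)  # square-root-like compression of large values
box_cox_manual(y, 0)    # natural logarithm
```

A Guerrero estimate close to 1 therefore suggests the transformation changes little, while an estimate close to 0 behaves much like taking logs.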
For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance: Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.
Tobacco from aus_production
aus_production %>%
autoplot(Tobacco)
## Warning: Removed 24 rows containing missing values (`geom_line()`).
lambda <- aus_production |>
features(Tobacco, features = guerrero) |>
pull(lambda_guerrero)
aus_production %>% autoplot(box_cox(Tobacco, lambda))
## Warning: Removed 24 rows containing missing values (`geom_line()`).
Economy class passengers between Melbourne and Sydney from ansett
ansett %>%
filter(Class=="Economy", Airports=="MEL-SYD") -> economy
economy %>% autoplot(Passengers)
lambda <- economy |>
features(Passengers, features = guerrero) |>
pull(lambda_guerrero)
# use the Guerrero estimate computed above rather than a hard-coded value
economy %>% autoplot(box_cox(Passengers, lambda))
Pedestrian counts at Southern Cross Station from pedestrian
pedest <- pedestrian %>%
filter(Sensor == "Southern Cross Station")
pedest %>% autoplot(Count)
lambda <- pedest %>% features(Count, features = guerrero) %>% pull(lambda_guerrero)
pedest %>% autoplot(box_cox(Count, lambda))
Consider the last five years of the Gas data from aus_production.
gas <- tail(aus_production, 5*4) |> select(Gas)
gas
## # A tsibble: 20 x 2 [1Q]
## Gas Quarter
## <dbl> <qtr>
## 1 221 2005 Q3
## 2 180 2005 Q4
## 3 171 2006 Q1
## 4 224 2006 Q2
## 5 233 2006 Q3
## 6 192 2006 Q4
## 7 187 2007 Q1
## 8 234 2007 Q2
## 9 245 2007 Q3
## 10 205 2007 Q4
## 11 194 2008 Q1
## 12 229 2008 Q2
## 13 249 2008 Q3
## 14 203 2008 Q4
## 15 196 2009 Q1
## 16 238 2009 Q2
## 17 252 2009 Q3
## 18 210 2009 Q4
## 19 205 2010 Q1
## 20 236 2010 Q2
Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
gas %>%
autoplot(Gas)
Over the last 5 years of Australian gas production we observe strong seasonal fluctuations and an upward trend.
Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
gas %>% model(classical_decomposition(Gas, type = "mult")) %>%
components() %>%
autoplot() +
labs(title = "Classical multiplicative decomposition of Australian gas production")
## Warning: Removed 2 rows containing missing values (`geom_line()`).
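The seasonal indices can also be inspected as numbers rather than read off the plot. The sketch below uses base R's `decompose()`, which implements the classical method but is a stand-in for the feasts `classical_decomposition()` call above, so the two need not agree to the last digit. The Gas values are copied from the 20-row printout earlier in this exercise.

```r
# Gas values copied from the printout above (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
gas_ts <- ts(gas_values, start = c(2005, 3), frequency = 4)

# Classical multiplicative decomposition: data = trend * seasonal * random.
fit <- decompose(gas_ts, type = "multiplicative")

# One seasonal index per quarter; the series starts in Q3.
seasonal_index <- fit$seasonal[1:4]
names(seasonal_index) <- c("Q3", "Q4", "Q1", "Q2")
round(seasonal_index, 3)

# The centred moving-average trend is undefined at both ends of the series.
which(is.na(fit$trend))
```

Indices above 1 mark quarters that sit above the trend and indices below 1 mark quarters that sit below it.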
Do the results support the graphical interpretation from part a?
Yes, the results support the graphical interpretation from part a: the decomposition shows both an upward trend and clear seasonality.
c. Compute and plot the seasonally adjusted data.
decomp <- gas %>%
model(stl = STL(Gas))
#Compute and plot the seasonally adjusted data
components(decomp) %>%
as_tsibble() %>%
autoplot(Gas, colour = "gray") +
geom_line(aes(y=season_adjust), colour = "#0072B2") +
labs(y = "Gas production",
title = "Australian Gas Production")
The gray line in the plot above shows the original series, while the blue line shows the seasonally adjusted data.
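In an additive decomposition such as STL, the seasonally adjusted series is simply the data minus the seasonal component. The sketch below shows this with base R's `stl()` and a fixed ("periodic") seasonal pattern; both choices are assumptions made for illustration and need not match the defaults of `STL()` in feasts. The Gas values are copied from the printout earlier in this exercise.

```r
# Gas values copied from the printout above (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)
gas_ts <- ts(gas_values, start = c(2005, 3), frequency = 4)

# Assumption: base R stl() with a seasonal pattern that is the same every year.
fit <- stl(gas_ts, s.window = "periodic")
comp <- fit$time.series            # columns: seasonal, trend, remainder

# Seasonally adjusted data = original series minus the seasonal component,
# which is the same as trend plus remainder.
season_adjust <- gas_ts - comp[, "seasonal"]
round(season_adjust, 1)
```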
e. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
#change one observation to be an outlier
gaso <- gas
gaso$Gas[3] <- gaso$Gas[3] + 700
#seasonally adjusted data
decomp <- gaso %>%
model(stl = STL(Gas))
#Compute and plot the seasonally adjusted data
components(decomp) %>%
as_tsibble() %>%
autoplot(Gas, colour = "gray") +
geom_line(aes(y=season_adjust), colour = "#0072B2") +
labs(y = "Gas production",
title = "Australian Gas Production")
The outlier changed both the level and the shape of the seasonally adjusted series: this single observation created a large early peak in gas production.
f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
#change one observation to be an outlier
gas1 <- gas
gas1$Gas[16] <- gas1$Gas[16] + 700
#recompute the seasonally adjusted data
# STL decomposition
decomp <- gas1 %>%
model(stl = STL(Gas))
#Compute and plot the seasonally adjusted data
components(decomp) %>%
as_tsibble() %>%
autoplot(Gas, colour = "gray") +
geom_line(aes(y=season_adjust), colour = "#0072B2") +
labs(y = "Gas production",
title = "Australian Gas Production")
It doesn’t seem to make a difference whether the outlier is near the beginning, middle, or end of the time series: whenever an outlier is present, the seasonally adjusted plot is altered. With the outlier added, the original series looks smoother than the seasonally adjusted series.
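The comparison between outlier positions can also be made numerically rather than by eye. The sketch below is one way to do that, under the same assumptions as a base R `stl()` fit with a periodic seasonal pattern (not necessarily what `STL()` in feasts does by default). `adjust_with_outlier` is a hypothetical helper written for this illustration, and the Gas values are copied from the printout earlier in this exercise.

```r
# Gas values copied from the printout above (2005 Q3 to 2010 Q2).
gas_values <- c(221, 180, 171, 224, 233, 192, 187, 234, 245, 205,
                194, 229, 249, 203, 196, 238, 252, 210, 205, 236)

# Hypothetical helper: add `bump` to observation `pos` (if given), refit,
# and return the seasonally adjusted series (data minus STL seasonal part).
adjust_with_outlier <- function(values, pos = NULL, bump = 700) {
  if (!is.null(pos)) values[pos] <- values[pos] + bump
  x <- ts(values, start = c(2005, 3), frequency = 4)
  fit <- stl(x, s.window = "periodic")
  as.numeric(x - fit$time.series[, "seasonal"])
}

base  <- adjust_with_outlier(gas_values)            # no outlier
early <- adjust_with_outlier(gas_values, pos = 3)   # outlier near the start
late  <- adjust_with_outlier(gas_values, pos = 16)  # outlier near the end

# Change in the seasonally adjusted series caused by each outlier.
round(early - base, 1)
round(late - base, 1)
```

Comparing the two difference vectors shows how much of the distortion stays at the outlier itself and how much leaks into the other quarters through the seasonal component.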
Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?
set.seed(200)
ausretseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
x11_decomp <- ausretseries %>%
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
components()
autoplot(x11_decomp) +
labs(title = "Decomposition of Australian Retail Turnover using X-11.")
We observe an outlier around 1991, visible as a significant spike in the irregular component.
Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labor force in Australia each month from February 1978 to August 1995.
a. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.
In the value panel we observe a general upward trend: the figures rise from below 7000 before January 1980 to just under 9000 by January 1995.
STL Decomposition: There’s a clear upward trend in the trend component. In the season-year component we observe numerous peaks and drops: an uptick at the start of every year, followed by a sharp decline and a small uptick before the next year begins. This pattern repeats every year. The remainder component shows a larger drop in employment figures, marked by two clearly visible downward spikes.
b. Is the recession of 1991/1992 visible in the estimated components?
Yes, the recession is visible in the estimated components.