library(fpp3)
## Warning: package 'fpp3' was built under R version 4.2.3
## ── Attaching packages ────────────────────────────────────────────── fpp3 0.5 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.4
## ✔ dplyr 1.1.3 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.0 ✔ feasts 0.3.1
## ✔ lubridate 1.9.3 ✔ fable 0.3.3
## ✔ ggplot2 3.4.4 ✔ fabletools 0.4.0
## Warning: package 'tibble' was built under R version 4.2.3
## Warning: package 'dplyr' was built under R version 4.2.3
## Warning: package 'tidyr' was built under R version 4.2.3
## Warning: package 'lubridate' was built under R version 4.2.3
## Warning: package 'ggplot2' was built under R version 4.2.3
## Warning: package 'tsibble' was built under R version 4.2.3
## Warning: package 'tsibbledata' was built under R version 4.2.3
## Warning: package 'feasts' was built under R version 4.2.3
## Warning: package 'fabletools' was built under R version 4.2.3
## Warning: package 'fable' was built under R version 4.2.3
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(dplyr)
global_economy
## # A tsibble: 15,150 x 9 [1Y]
## # Key: Country [263]
## Country Code Year GDP Growth CPI Imports Exports Population
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan AFG 1960 537777811. NA NA 7.02 4.13 8996351
## 2 Afghanistan AFG 1961 548888896. NA NA 8.10 4.45 9166764
## 3 Afghanistan AFG 1962 546666678. NA NA 9.35 4.88 9345868
## 4 Afghanistan AFG 1963 751111191. NA NA 16.9 9.17 9533954
## 5 Afghanistan AFG 1964 800000044. NA NA 18.1 8.89 9731361
## 6 Afghanistan AFG 1965 1006666638. NA NA 21.4 11.3 9938414
## 7 Afghanistan AFG 1966 1399999967. NA NA 18.6 8.57 10152331
## 8 Afghanistan AFG 1967 1673333418. NA NA 14.2 6.77 10372630
## 9 Afghanistan AFG 1968 1373333367. NA NA 15.2 8.90 10604346
## 10 Afghanistan AFG 1969 1408888922. NA NA 15.0 10.1 10854428
## # ℹ 15,140 more rows
global_economy %>%
autoplot(GDP/Population, show.legend=F)
## Warning: Removed 3242 rows containing missing values (`geom_line()`).
global_economy2 <- global_economy
global_economy2 %>%
group_by(Country) %>%
mutate(GDP_Capita = GDP/Population) %>%
arrange(desc(GDP_Capita))
## Warning: Current temporal ordering may yield unexpected results.
## ℹ Suggest to sort by `Country`, `Year` first.
## # A tsibble: 15,150 x 10 [1Y]
## # Key: Country [263]
## # Groups: Country [263]
## Country Code Year GDP Growth CPI Imports Exports Population GDP_Capita
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Monaco MCO 2014 7.06e9 7.18 NA NA NA 38132 185153.
## 2 Monaco MCO 2008 6.48e9 0.732 NA NA NA 35853 180640.
## 3 Liecht… LIE 2014 6.66e9 NA NA NA NA 37127 179308.
## 4 Liecht… LIE 2013 6.39e9 NA NA NA NA 36834 173528.
## 5 Monaco MCO 2013 6.55e9 9.57 NA NA NA 37971 172589.
## 6 Monaco MCO 2016 6.47e9 3.21 NA NA NA 38499 168011.
## 7 Liecht… LIE 2015 6.27e9 NA NA NA NA 37403 167591.
## 8 Monaco MCO 2007 5.87e9 14.4 NA NA NA 35111 167125.
## 9 Liecht… LIE 2016 6.21e9 NA NA NA NA 37666 164993.
## 10 Monaco MCO 2015 6.26e9 4.94 NA NA NA 38307 163369.
## # ℹ 15,140 more rows
Monaco has the highest GDP per capita which occurs in 2014. In general, the overall trend for GDP per capita is increasing over time. From the graph, due to the amount of countries, it’s difficult to determine which line corresponds to which without the legend.
global_economy %>%
filter(Country == "United States") %>%
autoplot(GDP)
No transformations seem necessary. Appears to follow a normal trend,
with little variation. However, 3.1 of the text states “Any data that
are affected by population changes can be adjusted to give per-capita
data.” Therefore, we can apply that population transformation, which
appears almost the same.
global_economy %>%
filter(Country == "United States") %>%
autoplot(GDP/Population)
aus_livestock %>%
filter(State == "Victoria"& Animal == "Bulls, bullocks and steers") %>%
autoplot(Count)
A lot of seasonal variation. Will apply box-cox transformations for normality.
lambda <- aus_livestock %>%
filter(State == "Victoria"& Animal == "Bulls, bullocks and steers") %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
aus_livestock %>%
filter(State == "Victoria"& Animal == "Bulls, bullocks and steers") %>%
autoplot(box_cox(Count, lambda))
Looking closely you can see changes in variations of the plot. With a
lambda value close to 0, it appears the power transformation performed
was a log transform.
vic_elec %>%
autoplot(Demand)
Appears to have trends, cycles and seasonality, however difficult to
read with the data being captured every 30 minutes. Let’s try to group
by month based on the Date field.
#library(lubridate)
#vic_elec %>%
# mutate(M_Y= format_ISO8601(Date,precision="ym")) %>%
# index_by(M_Y) %>%
# summarise(Demand=sum(Demand)) %>%
# autoplot(Demand)
##Getting errors
Attempted to group by month and trend, did not work, as intended.
aus_production %>%
autoplot(Gas)
Applying another box-cox transformation:
lambda <- aus_production %>%
features(Gas, features = guerrero) %>%
pull(lambda_guerrero)
aus_production %>%
autoplot(box_cox(Gas, lambda))
This aims to make the size of the seasonal variation the same across the
entire series, which would be important for modeling.
canadian_gas %>%
autoplot(Volume)
lambda <- canadian_gas %>%
features(Volume, features = guerrero) %>%
pull(lambda_guerrero)
canadian_gas %>%
autoplot(box_cox(Volume, lambda))
The plots look almost identical and the seasonality does not appear normalized by the peaks and dips in the chart.
#code from 2.10 exercise 7
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
lambda <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
This returns a lambda of 0.271, which based on section 3.1 is closer to 1/2 than 0, therefore probably a square root plus linear transformation (lambda = 0.5)
#tobacco from aus_production
lambda_a <- aus_production %>%
features(Tobacco, features = guerrero) %>%
pull(lambda_guerrero)
#Economy class passengers between Melourne and Sydney from ansett
lambda_b <- ansett %>%
filter(Airports == 'MEL-SYD' & Class == 'Economy') %>%
features(Passengers, features = guerrero) %>%
pull(lambda_guerrero)
#Pedestrian counts at Southern Cross Station from pedestrian
lambda_c <- pedestrian %>%
filter(Sensor == 'Southern Cross Station') %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
-Tobacco from aus_production = 0.92 -#Economy class passengers between Melbourne and Sydney from ansett = 2 -Pedestrian counts at Southern Cross Station from pedestrian = -0.25
gas <- tail(aus_production, 5*4) |> select(Gas)
gas %>%
autoplot(Gas)
Upward trend year over year with seasonal highs in the summers and lows
in the winters.
gas %>%
model(
classical_decomposition(Gas, type = "multiplicative")
) %>%
components() %>%
autoplot() +
labs(title = "Classical decomposition with type = multiplicative")
## Warning: Removed 2 rows containing missing values (`geom_line()`).
### Do the results support the graphical interpretation from part a?
Yes, when decomposed, you can clearly see the upward overall trend and
seasonal component as in part a.
gas_decomp <- gas %>%
model(
classical_decomposition(Gas, type = "multiplicative")
) %>%
components()
#section 3.5
gas_decomp %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas",
title = "Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values (`geom_line()`).
Not 100% sure how to compute. The code above was leveraged from section
3.5
gas_mod <- gas
gas_mod$Gas[3] <- gas_mod$Gas[3] + 300
gas_mod_dc <- gas_mod %>%
model(
classical_decomposition(Gas, type = "multiplicative")
) %>%
components()
gas_mod_dc %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = Gas, colour = "Data")) +
geom_line(aes(y = season_adjust,
colour = "Seasonally Adjusted")) +
geom_line(aes(y = trend, colour = "Trend")) +
labs(y = "Gas",
title = "Gas in Australia") +
scale_colour_manual(
values = c("gray", "#0072B2", "#D55E00"),
breaks = c("Data", "Seasonally Adjusted", "Trend")
)
## Warning: Removed 4 rows containing missing values (`geom_line()`).
You can see the outlier in Q1 2006. This skews the seasonality and takes
on a downward trend.
Adding the outlier at the end should keep the trend since it is upward, however, it should still impact the seasonality as it would not follow the normal patterns shown. Any outlier would have to be addressed and normalized for modeling.
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
#3.5
library(seasonal)
## Warning: package 'seasonal' was built under R version 4.2.3
##
## Attaching package: 'seasonal'
## The following object is masked from 'package:tibble':
##
## view
x11_dcmp <- myseries |>
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
components()
autoplot(x11_dcmp) +
labs(title =
"Decomposition of Australia retail using X-11.")
The overall trend of turnover is increasing while overall seasonality is
decreasing over time.
There is an overall positive trend in the labor force. There is more of a pickup between 1987 and 1992. However, the seasonal pattern is pretty consistent, but I do notice slightly larger peaks and low points between 1990-95. However, due to the overall increase in the workforce, the total trend is not significantly impacted due to this.
The recession is visible in the remainder component but not in the trend or seasonal components. Per 3.6 STL is “robust to outliers, so that occasional unusual observations will not affect the estimates of the trend-cycle and seasonal components. They will, however, affect the remainder component.