Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
Monaco has the highest GPD per capita, and it had increased with a small dip from time to time.
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.1
## ✔ lubridate 1.9.4 ✔ fable 0.4.1
## ✔ ggplot2 3.5.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
global_economy |>
autoplot(GDP / Population, show.legend = FALSE)
## Warning: Removed 3242 rows containing missing values or values outside the scale range
## (`geom_line()`).
global_economy |>
mutate(GDP_per_capita = GDP / Population) |>
filter(GDP_per_capita == max(GDP_per_capita, na.rm = TRUE)) |>
select(Country, GDP_per_capita)
global_economy |>
filter(Country == "Monaco") |>
autoplot(GDP/Population)
## Warning: Removed 11 rows containing missing values or values outside the scale range
## (`geom_line()`).
For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP from global_economy.
Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.
Victorian Electricity Demand from vic_elec.
Gas production from aus_production.
There is no transformation for global_economy since I don’t think the population trend affected the GDP, aside from changing the dollars by trillions to make it easier to read.
No transformation done for aus_livestock here. The trend has alot of up and down, but overall, decreasing.
There is a transformation for vic_elec, to do it monthly to make it easier to see the seasonal change overtime and it seems like there are drastic increase around the middle of the year.
There is no transformation for aus_producton, since the gas is consistently rising over the decades.
global_economy |>
filter(Country == "United States") |>
autoplot(GDP / 10^12) +
labs(y="trillions")
aus_livestock |>
filter(Animal == "Bulls, bullocks and steers") |>
filter(State == "Victoria") |>
autoplot(Count)
vic_data <- vic_elec |>
group_by(Date) |>
mutate(Demand = sum(Demand)) |>
distinct(Date, Demand)
vic_data |>
as_tsibble(index = Date) |>
autoplot(Demand)
aus_production |>
autoplot(Gas)
Why is a Box-Cox transformation unhelpful for the
canadian_gas
data?
It’s unhelpful for the canadian_gas data because the data is already stable in term of variance before applying the box-cox tranformation, so there is not much different before or after the transformation.
canadian_gas |>
autoplot(Volume)
lambda <- canadian_gas |>
features(Volume, features = guerrero) |>
pull(lambda_guerrero)
canadian_gas |>
autoplot(box_cox(Volume, lambda)) +
labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
What Box-Cox transformation would you select for your retail data (from Exercise 7 in Section 2.10)?
I would select lambda = 0.08 since it helps stabilize the seasonal variation.
set.seed(12345678)
seed <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(seed, Turnover)
lambda <- seed |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
seed |>
autoplot(box_cox(Turnover, lambda)) +
labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
For the following series, find an appropriate Box-Cox transformation
in order to stabilise the variance. Tobacco from
aus_production
, Economy class passengers between Melbourne
and Sydney from ansett
, and Pedestrian counts at Southern
Cross Station from pedestrian
.
For aus_production, the box-cox transformation didn’t seem to do much changes, so it is not very effective here, lambda being close to 1 as well.
For the ansett, lambda is 2, so the variance should be better.
The lambda is negative, so applying the box cox transformation seems unreliable.
autoplot(aus_production, Tobacco)+
labs(title = "Tobacco and Cigarette")
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
lambda <- aus_production |>
features(Tobacco, features = guerrero) |>
pull(lambda_guerrero)
aus_production |>
autoplot(box_cox(Tobacco, lambda)) +
labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
ansett_data <- ansett |>
filter(Class == "Economy", Airports == "MEL-SYD")
autoplot(ansett_data, Passengers)+
labs(title = "Economy class")
lambda <- ansett_data |>
features(Passengers, features = guerrero) |>
pull(lambda_guerrero)
ansett_data |>
autoplot(box_cox(Passengers, lambda)) +
labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
pede_data <- pedestrian |>
filter(Sensor == "Southern Cross Station")
autoplot(pede_data, Count)+
labs(title = "Hourly Pedestrian")
lambda <- pede_data |>
features(Count, features = guerrero) |>
pull(lambda_guerrero)
pede_data |>
autoplot(box_cox(Count, lambda)) +
labs(y = "", title = paste0("at lambda = ",round(lambda,2)))
Consider the last five years of the Gas data from
aus_production
.
gas <- tail(aus_production, 5*4) |> select(Gas)
Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
Use classical_decomposition
with
type=multiplicative
to calculate the trend-cycle and
seasonal indices.
Do the results support the graphical interpretation from part a?
Compute and plot the seasonally adjusted data.
Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
Does it make any difference if the outlier is near the end rather than in the middle of the time series?
For question 1, an increasing trend that increases at the beginning of the year, then decreases at the middle to end of the year.
For question 3, the results do support the graphical interpretation from part a, because the trend is increasing, and seasonality has the up and down similar to the first graph.
For question 5, there was a change in the overall data when applying the change to the outlier, the flow seems to have change, from up to down.
I don’t see much difference, seems to be showing the overall trend.
gas <- tail(aus_production, 5*4) |> select(Gas)
autoplot(gas, Gas)
clas_decom <- gas |>
model(classical_decomposition(Gas, type = "multiplicative"))
components(clas_decom) |>
autoplot()
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
components(clas_decom) |>
as_tsibble() |>
autoplot(Gas) +
geom_line(aes(y=season_adjust))
gas |>
mutate(Gas = ifelse(Quarter == yearquarter("2008Q1"), Gas + 300, Gas)) |>
model(classical_decomposition(Gas, type = "multiplicative")) |>
components() |> #extracting decomposed components(trend, seasonal, remainder)
as_tsibble() |>
autoplot(Gas) +
geom_line(aes(y=season_adjust))
gas |>
mutate(Gas = ifelse(Quarter == yearquarter("2010Q1"), Gas + 300, Gas)) |>
model(classical_decomposition(Gas, type = "multiplicative")) |>
components() |>
as_tsibble() |>
autoplot(Gas) +
geom_line(aes(y=season_adjust))
Recall your retail time series data (from Exercise 7 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?
This doesn’t seem to reveal much outlier, it also captures the irregular, which is interesting to see more noises in the graph.
library(seasonal)
##
## Attaching package: 'seasonal'
## The following object is masked from 'package:tibble':
##
## view
x_11 <- seed |>
model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
components()
autoplot(x_11)
Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.
Is the recession of 1991/1992 visible in the estimated components?