This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this: QUESTION 1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?
# 1 -----------------------------------
country_gdp <- global_economy %>%
tsibble(key = Code, index = Year)%>%
autoplot(GDP/Population, show.legend = FALSE) +
labs(title= "GDP per capita",
y = "US Dollar") +
theme(plot.title = element_text(hjust = 0.5))
monaco_gdp <- global_economy %>%
filter(Country == "Monaco") %>%
autoplot(GDP/Population) +
labs(title= "GDP per capita for Monaco", y = "US Dollar") +
theme(plot.title = element_text(hjust = 0.5))
monaco_gdp
## Warning: Removed 11 row(s) containing missing values (geom_path).
Monaco has the highest GDP and seen in the second graph above, it has been increasing over time. Most notably, it has exponentially grown after 2000. ————————————— QUESTION 2 For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.
United States GDP from global_economy. Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock. Victorian Electricity Demand from vic_elec. Gas production from aus_production.
# a
us_gdp <- global_economy %>% filter(Country == "United States") %>%
autoplot(GDP/10^12) + labs(title = "US GDP", y = "USD (trillions)", x = "Year") +
theme(plot.title = element_text(hjust = 0.5))
us_gdp
No transformations needed here. There seems to be a steady increase in GDP over time.
# b
head(aus_livestock)
## # A tsibble: 6 x 4 [1M]
## # Key: Animal, State [1]
## Month Animal State Count
## <mth> <fct> <fct> <dbl>
## 1 1976 Jul Bulls, bullocks and steers Australian Capital Territory 2300
## 2 1976 Aug Bulls, bullocks and steers Australian Capital Territory 2100
## 3 1976 Sep Bulls, bullocks and steers Australian Capital Territory 2100
## 4 1976 Oct Bulls, bullocks and steers Australian Capital Territory 1900
## 5 1976 Nov Bulls, bullocks and steers Australian Capital Territory 2100
## 6 1976 Dec Bulls, bullocks and steers Australian Capital Territory 1800
aus_liv <- aus_livestock %>%
filter(State == "Victoria", Animal == "Bulls, bullocks and steers") %>%
autoplot(Count) + labs(title = "Slaughter of Victorian “Bulls, bullocks and steers”", y = "Count", x = "Year") +
theme(plot.title = element_text(hjust = 0.5))
aus_liv
Overall, there is a decrease in the slaughter count. However, there seems to be an uptick during the 90s, then a continued downtrend.
# c
head(vic_elec)
## # A tsibble: 6 x 5 [30m] <Australia/Melbourne>
## Time Demand Temperature Date Holiday
## <dttm> <dbl> <dbl> <date> <lgl>
## 1 2012-01-01 00:00:00 4383. 21.4 2012-01-01 TRUE
## 2 2012-01-01 00:30:00 4263. 21.0 2012-01-01 TRUE
## 3 2012-01-01 01:00:00 4049. 20.7 2012-01-01 TRUE
## 4 2012-01-01 01:30:00 3878. 20.6 2012-01-01 TRUE
## 5 2012-01-01 02:00:00 4036. 20.4 2012-01-01 TRUE
## 6 2012-01-01 02:30:00 3866. 20.2 2012-01-01 TRUE
vic <- vic_elec %>%
autoplot(Demand) + labs(title = "Victorian Electriity Demand", y = "Demand", x = "Year") +
theme(plot.title = element_text(hjust = 0.5))
vic_log <- vic_elec
vic_log$Demand <- log(vic_log$Demand)
vic_log %>%
autoplot(Demand) + labs(title = "Log of Victorian Electriity Demand", y = "Demand", x = "Year") +
theme(plot.title = element_text(hjust = 0.5))
The original graphs showed that there was a decent cyclical trend with some random spikes. I attempted a log transformation to normalize those spikes, but it didn’t do much.
# d
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
## Quarter Beer Tobacco Bricks Cement Electricity Gas
## <qtr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1956 Q1 284 5225 189 465 3923 5
## 2 1956 Q2 213 5178 204 532 4436 6
## 3 1956 Q3 227 5297 208 561 4806 7
## 4 1956 Q4 308 5681 197 570 4418 6
## 5 1957 Q1 262 5577 187 529 4339 5
## 6 1957 Q2 228 5651 214 604 4811 7
aus_gas <- aus_production %>%
autoplot(Gas) +
labs(title = "Australian Gas Production", y = "Production", x = "Year") +
theme(plot.title = element_text(hjust = 0.5))
aus_gas
No transformation was made as there is a clear upward trend. If anything, the variance in seasonality differs. ———————————- QUESTION 3 Why is a Box-Cox transformation unhelpful for the canadian_gas data?
# 3-------------------------------------
canadian_gas %>%
autoplot(Volume) + labs(title = "Original Canadian Gas Production") +
theme(plot.title = element_text(hjust = 0.5))
lambda <- canadian_gas %>%
features(Volume, features = guerrero) %>%
pull(lambda_guerrero)
canadian_gas %>%
autoplot(box_cox(Volume, lambda)) + labs(title = "Transformed Canadian Gas Production") +
theme(plot.title = element_text(hjust = 0.5))
The seasonal variance is not uniform after the Box Cox transformation. The decade between 1980 and 1990 has a higher variance than the previous years and lower variance around the 2000 mark. ————————————————– QUESTION 4 What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?
# 4 ---------------------------------------------
set.seed(12345678)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(myseries, Turnover)+
labs(title = "AUS Retail", y = "$AUD (in millions)") +
theme(plot.title = element_text(hjust = 0.5))
lambda_retail <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
myseries %>%
autoplot(box_cox(Turnover, lambda_retail)) +
labs(title = "AUS Retail Transformed", y = "$AUD (in millions)") +
theme(plot.title = element_text(hjust = 0.5))
I would use the transformed data because it seemed to have smoothed out the variance across all years. It also allows us to see the drop right before 2000. ——————————– QUESTION 5 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.
# 5.2
ansett.data <- ansett
eco_class <- ansett %>%
filter(Airports == "MEL-SYD", Class == "Economy")
eco_class %>%
autoplot(Passengers) + labs(title = "Original Econ Passengers MEL-SYD") +
theme(plot.title = element_text(hjust = 0.5))
lambda5.2 <- eco_class %>%
features(Passengers, features = guerrero) %>%
pull(lambda_guerrero)
lambda5.2
## [1] 1.999927
eco_class %>%
autoplot(box_cox(Passengers, lambda5.2)) + labs(title = "Transformed Econ Passengers MEL-SYD") +
ylab("Passengers") +
theme(plot.title = element_text(hjust = 0.5))
# 5.3
pedestrian_data <- pedestrian
scross_station <-pedestrian %>%
filter(Sensor == "Southern Cross Station") %>%
mutate(Week = yearweek(Date)) %>%
index_by(Week) %>%
summarise(Count = sum(Count))
scross_station %>%
autoplot(Count) + labs(title = "Original Econ Passengers MEL-SYD") +
theme(plot.title = element_text(hjust = 0.5))
lambda5.3 <- scross_station %>%
features(Count, features = guerrero) %>%
pull(lambda_guerrero)
lambda5.3
## [1] 0.7986839
scross_station %>%
autoplot(box_cox(Count, lambda5.3)) + labs(title = "Transformed Econ Passengers MEL-SYD") +
ylab("Count") +
theme(plot.title = element_text(hjust = 0.5))
QUESTION 6 Show that a
3×5 MA is equivalent to a 7-term weighted moving average with weights of 0.067, 0.133, 0.200, 0.200, 0.200, 0.133, and 0.067.
QUESTION 7 A. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?
# 7 -------------------------------------------
# a
aus_gas <- tail(aus_production, 5*4) %>% dplyr::select("Quarter", "Gas")
aus_gas %>%
autoplot(Gas) + labs(title = "Time Series of Australian Gas Production") +
ylab("Bcf") +
theme(plot.title = element_text(hjust = 0.5))
Yes, there seems to be an upwards trend.
B. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
# 7 -------------------------------------------
# b
gas_ts <- ts(aus_gas$Gas, frequency = 4)
gas_decomp = decompose(gas_ts, type = "multiplicative")
plot(gas_decomp)
C. Do the results support the graphical interpretation from part a? Yes, they support the interpretation from part a.
# d
data("aus_production")
ts_gas = ts(aus_gas, frequency = 12, start = 1949)
decompose_gas = decompose(ts_gas, "multiplicative")
adjust_gas = ts_gas / decompose_gas$seasonal
plot(adjust_gas)
D. Compute and plot the seasonally adjusted data.
# d
data("aus_production")
ts_gas = ts(aus_gas, frequency = 12, start = 1949)
decompose_gas = decompose(ts_gas, "multiplicative")
adjust_gas = ts_gas / decompose_gas$seasonal
plot(adjust_gas)
E. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
# e
aus_gas2 <- aus_gas %>% mutate(Gas = if_else(Gas == 187, Gas + 300, Gas))
gas_ts2 <- ts(aus_gas2$Gas, frequency = 4)
gas_decomp2 = decompose(gas_ts2, type = "multiplicative")
plot(gas_decomp2)
adjust_gas2 = gas_ts2 / gas_decomp2$seasonal
plot(adjust_gas2)
The outlier is present and results in a large spike, visibly seen in the graphs.
F. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
# f
aus_gas3 <- aus_gas %>% mutate(Gas = if_else(Quarter == yearquarter("2010Q2"), Gas + 300, Gas))
gas_ts3 <- ts(aus_gas3$Gas, frequency = 4)
gas_decomp3 = decompose(gas_ts3, type = "multiplicative")
plot(gas_decomp3)
adjust_gas3 = gas_ts3 / gas_decomp3$seasonal
plot(adjust_gas3)
No, the outlier still shows a significant surge at the end, similar to the middle. It essentially has the same effect. ————————————————- QUESTION 8
| QUESTION 9 a - The trend component of the decomposition shows that there is an increasing trend of employment. The seasonality chart indicates that there is no seasonal trend as there is not a consistent pattern of employment. The lack of seasonality proves that labor participation is not very correlated to time of the year. |
| b. There was a recession in Australia in 1991 which can be shown in the error graph and seen prominently in the seasonal graph for March, August, and November. |
QUESTION 10 This exercise uses the canadian_gas data (monthly Canadian gas production in billions of cubic metres, January 1960 – February 2005).
A. Plot the data using autoplot(), gg_subseries() and gg_season() to look at the effect of the changing seasonality over time.1
# 10 --------------------------------------------
# a
canadian_gas %>% autoplot(Volume) + labs(title = "Canadian Gas Volume by Month") +
theme(plot.title = element_text(hjust = 0.5))
canadian_gas %>% gg_subseries(Volume) + labs(title = "Canadian Gas Volume by Month") +
theme(plot.title = element_text(hjust = 0.5))
canadian_gas %>% gg_season(Volume) + labs(title = "Canadian Gas Volume by Month") +
theme(plot.title = element_text(hjust = 0.5))
B. Do an STL decomposition of the data. You will need to choose a seasonal window to allow for the changing shape of the seasonal component.
# b
stl_gas <- stl(canadian_gas, s.window = 13, robust = TRUE)
autoplot(stl_gas) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")
stl_gas2 <- stl(canadian_gas, s.window = "periodic", robust = TRUE)
autoplot(stl_gas2) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")
stl_gas3 <- stl(canadian_gas, s.window = 250, robust = TRUE)
autoplot(stl_gas3) + xlab("Month") + ggtitle("STL Decomposition of Canadian Gas Production")
C. How does the seasonal shape change over time? [Hint: Try plotting the seasonal component using gg_season().]
# c
canadian_gas %>% gg_season(Volume) + labs(title = "Canadian Gas Volume by Month") +
theme(plot.title = element_text(hjust = 0.5))
Rightly so, it seems that gas volume decreases during the summer months because the weather is warmer and demand for gas heating drops.
D. Can you produce a plausible seasonally adjusted series?
# d
canadian_gas$Volume <- as.ts(canadian_gas$Volume)
autoplot(canadian_gas$Volume) +
autolayer(trendcycle(stl_gas), series="Trend") +
autolayer(seasadj(stl_gas), series="Seasonally Adjusted") +
xlab("Year") + ylab("New orders index") +
ggtitle("STL Decomposition of Canadian Gas Production") +
scale_colour_manual(values=c("gray","blue","red"),
breaks=c("Data","Seasonally Adjusted","Trend"))
E. Compare the results with those obtained using SEATS and X-11. How are they different?
# e
#canadian_gas %>%
# model(x11 = X_13ARIMA_SEATS(Volume ~ x11())) %>%
#components() %>%
#autoplot()+
#labs(title = "X-11 decomposition of Canadian Gas Production")
#view(canadian_gas)