3.1

Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time?

head(global_economy)
## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414
global_economy %>% 
  tsibble(key = Code, index = Year)%>%
  autoplot(GDP/Population, show.legend= FALSE)
## Warning: Removed 3242 row(s) containing missing values (geom_path).

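# GDP per capita for every country-year
max_data <- global_economy %>%
              mutate(sum = GDP/Population)
# Sum GDP per capita over all years for each country (data.table syntax).
# sum() returns NA for any country with a missing year, and which.max() skips NAs,
# so only countries with complete data are compared.
grouped <- setkey(setDT(max_data), Country)[,list(sum=sum(sum)), by=list(Country)]
max_value <- grouped[which.max(grouped$sum),]
max_value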
##       Country     sum
## 1: Luxembourg 2398911
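
The table above ranks countries by GDP per capita summed over all years. To see how the leader changes over time, one option is to take the maximum within each year instead. A minimal base-R sketch on a made-up toy data frame (the countries and values are hypothetical, not taken from global_economy):

```r
# Toy data: hypothetical values, not taken from global_economy
toy <- data.frame(
  Country    = c("A", "B", "C", "A", "B", "C"),
  Year       = c(2000, 2000, 2000, 2001, 2001, 2001),
  GDP        = c(100, 300, NA, 120, 280, 500),
  Population = c(10, 20, 5, 10, 20, 5)
)
toy$gdp_pc <- toy$GDP / toy$Population

# For each year, keep the row with the largest GDP per capita.
# which.max() skips NA values, so a country with a missing year
# is still compared in the years where it has data.
top_by_year <- do.call(rbind, lapply(split(toy, toy$Year), function(d) {
  d[which.max(d$gdp_pc), c("Year", "Country", "gdp_pc")]
}))
print(top_by_year)
```

The same idea should carry over to the real data by grouping on Year, but that is left as an assumption here rather than a tested result.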

3.2

For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect.

United States GDP from global_economy.

global_economy %>% 
  filter(Country=="United States") %>% 
  autoplot(GDP) + 
  labs(title = "USA GDP") 

Slaughter of Victorian “Bulls, bullocks and steers” in aus_livestock.

aus_livestock %>% 
  filter(Animal == "Bulls, bullocks and steers",
         State == "Victoria") %>% 
  autoplot(Count) + 
  theme_replace() +  
  labs(title = "Slaughter of Victorian “Bulls, bullocks and steers”") 

Victorian Electricity Demand from vic_elec.

vic_elec %>% 
  autoplot(Demand)

Gas production from aus_production.

aus_production %>% 
  autoplot(Gas)

3.3

Why is a Box-Cox transformation unhelpful for the canadian_gas data?

canadian_gas %>% 
  autoplot(Volume)

can_lambda <- canadian_gas %>% 
  features(Volume, features = guerrero) %>% 
  pull(lambda_guerrero)

can_lambda
## [1] 0.3921381
canadian_gas %>%
  autoplot(box_cox(Volume, lambda = can_lambda))

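The plot shows yearly seasonality. The seasonal variation is small in the sixties and seventies, increases from the end of the seventies through the eighties, and starts to decrease again after the nineties.

Because the seasonal variation first increases and then decreases while the level of the series generally rises, a Box-Cox transformation cannot make the seasonal variation uniform. It rescales variation monotonically with the level, so it can stabilise a variance that grows (or shrinks) with the level, but not one that does both.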
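
To illustrate why a single power transformation struggles with this pattern, here is a minimal base-R sketch. `box_cox_manual` is a hypothetical helper written for this example (it is not the `box_cox()` used above), and the levels and swings are made-up numbers chosen so that the relative swing grows and then shrinks while the level keeps rising:

```r
# Hypothetical helper for positive data; not the box_cox() used above
box_cox_manual <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Made-up levels with seasonal swings of +/-10%, +/-30%, +/-10%
level <- c(10, 100, 1000)
swing <- c(0.1, 0.3, 0.1)

# Width of the transformed swing at each level, for a given lambda
swing_width <- function(lambda) {
  box_cox_manual(level * (1 + swing), lambda) -
    box_cox_manual(level * (1 - swing), lambda)
}

# Ratio of widest to narrowest swing; a value of 1 would mean stable variance
lambdas <- seq(-1, 2, by = 0.1)
spread <- sapply(lambdas, function(l) {
  w <- swing_width(l)
  max(w) / min(w)
})
print(round(min(spread), 2))
```

In this toy setup the ratio stays well above 1 for every lambda on the grid, which mirrors the problem with canadian_gas: no single lambda evens out a variance that rises and then falls.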

3.4

What Box-Cox transformation would you select for your retail data (from Exercise 8 in Section 2.10)?

retail <- aus_retail %>% 
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

retail %>% 
  autoplot(Turnover) + theme_replace()

Retail Box Cox

retail_lambda <- retail %>% 
  features(Turnover, features = guerrero) %>% 
  pull(lambda_guerrero)

retail_lambda
retail %>%
  autoplot(box_cox(Turnover, lambda = retail_lambda))

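The seasonal variation appears to increase over time along with the upward trend in the plot, and there is clear yearly seasonality, so the Box-Cox transformation with the lambda selected by the guerrero feature is a reasonable choice.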

3.5

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. Tobacco from aus_production, Economy class passengers between Melbourne and Sydney from ansett, and Pedestrian counts at Southern Cross Station from pedestrian.

#Tobacco
head(aus_production)
## # A tsibble: 6 x 7 [1Q]
##   Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##     <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
## 1 1956 Q1   284    5225    189    465        3923     5
## 2 1956 Q2   213    5178    204    532        4436     6
## 3 1956 Q3   227    5297    208    561        4806     7
## 4 1956 Q4   308    5681    197    570        4418     6
## 5 1957 Q1   262    5577    187    529        4339     5
## 6 1957 Q2   228    5651    214    604        4811     7
prod_lambda <- aus_production %>% 
  features(Tobacco, features = guerrero) %>% 
  pull(lambda_guerrero)

prod_lambda
## [1] 0.9289402
aus_production %>%
  autoplot(box_cox(Tobacco, lambda = prod_lambda))
## Warning: Removed 24 row(s) containing missing values (geom_path).

#Economy class passengers between Melbourne and Sydney
head(ansett)
## # A tsibble: 6 x 4 [1W]
## # Key:       Airports, Class [1]
##       Week Airports Class    Passengers
##     <week> <chr>    <chr>         <dbl>
## 1 1989 W28 ADL-PER  Business        193
## 2 1989 W29 ADL-PER  Business        254
## 3 1989 W30 ADL-PER  Business        185
## 4 1989 W31 ADL-PER  Business        254
## 5 1989 W32 ADL-PER  Business        191
## 6 1989 W33 ADL-PER  Business        136
pas_lambda <- ansett %>% 
  filter(Class == "Economy",Airports == "MEL-SYD")%>%
  features(Passengers, features = guerrero) %>% 
  pull(lambda_guerrero)

pas_lambda
## [1] 1.999927
ansett %>%
  filter(Class == "Economy",Airports == "MEL-SYD")%>%
  autoplot(box_cox(Passengers, lambda = pas_lambda)) 

#Pedestrian counts at Southern Cross Station
head(pedestrian)
## # A tsibble: 6 x 5 [1h] <Australia/Melbourne>
## # Key:       Sensor [1]
##   Sensor         Date_Time           Date        Time Count
##   <chr>          <dttm>              <date>     <int> <int>
## 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630
## 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826
## 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567
## 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264
## 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139
## 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77
ped_lambda <- pedestrian %>% 
  filter(Sensor == "Southern Cross Station")%>%
  features(Count, features = guerrero) %>% 
  pull(lambda_guerrero)

ped_lambda
## [1] -0.2255423
pedestrian %>%
  filter(Sensor == "Southern Cross Station")%>%
  autoplot(box_cox(Count, lambda = ped_lambda)) 

3.7

Consider the last five years of the Gas data from aus_production.

gas <- tail(aus_production, 5*4) %>% select(Gas)

  a. Plot the time series. Can you identify seasonal fluctuations and/or a trend-cycle?

gas %>% 
  autoplot(Gas)

The seasonality in the plot above is very clear.

  b. Use classical_decomposition with type=multiplicative to calculate the trend-cycle and seasonal indices.
gas %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot()
## Warning: Removed 2 row(s) containing missing values (geom_path).

From the plot above we can see the upward trend starting in 2006, and the seasonality on a yearly basis is very clear.

  c. Do the results support the graphical interpretation from part a?

Yes, the results support part a. Because classical multiplicative decomposition relies on moving averages, the trend-cycle estimate is missing for the first two and last two quarters.
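
The missing values at both ends come from the centred moving average used to estimate the trend-cycle for quarterly data. A minimal base-R sketch with a made-up quarterly series (the values are hypothetical, not the Gas data):

```r
# Made-up quarterly series: 12 observations, hypothetical values
y <- c(10, 14, 18, 12, 11, 15, 19, 13, 12, 16, 20, 14)

# 2x4-MA weights: a 4-term average followed by a 2-term average
w <- c(1/8, 1/4, 1/4, 1/4, 1/8)

# sides = 2 centres the window on each observation
trend <- stats::filter(y, filter = w, sides = 2)
print(trend)

# Positions where the window runs off the series and no estimate exists
na_idx <- which(is.na(as.numeric(trend)))
print(na_idx)
```

With a five-term window, two observations at each end have no trend estimate, which is why the plotting calls in this exercise warn about removed rows.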

  d. Compute and plot the seasonally adjusted data.

gas_dec <- gas %>% 
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components()

gas_dec %>%   
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend"))
## Warning: Removed 4 row(s) containing missing values (geom_path).

  e. Change one observation to be an outlier (e.g., add 300 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
gass <- gas
gass$Gas[10] <- gass$Gas[10]+300

gass %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot() 
## Warning: Removed 2 row(s) containing missing values (geom_path).

gass %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend"))
## Warning: Removed 4 row(s) containing missing values (geom_path).

Adding 300 to the 10th observation produces a spike in the seasonally adjusted data at that point, and the outlier also distorts the estimated seasonal component.

  f. Does it make any difference if the outlier is near the end rather than in the middle of the time series?
gasss <- gas
# Add 300 to the last (20th) observation itself
gasss$Gas[20] <- gasss$Gas[20]+300

gasss %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  autoplot()
## Warning: Removed 2 row(s) containing missing values (geom_path).

gasss %>%
  model(classical_decomposition(Gas,type = "multiplicative")) %>%
  components() %>%
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, colour = "Data")) +
  geom_line(aes(y = season_adjust,
                colour = "Seasonally Adjusted")) +
  geom_line(aes(y = trend, colour = "Trend")) 
## Warning: Removed 4 row(s) containing missing values (geom_path).

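It does. The classical trend-cycle is not estimated for the last two quarters, so an outlier at the end leaves most of the trend estimate unchanged and the trend is clearer, while the extra 300 shows up as a spike at the end of the seasonally adjusted data.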
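
One way to check the effect of the outlier's position is to compare how far the seasonal indices move in each case. The sketch below is only an illustration: it uses base R's `decompose()` rather than the `classical_decomposition()` model used above, and the series is made up (a linear trend times fixed seasonal multipliers), not the Gas data.

```r
# Made-up quarterly series: linear trend times fixed seasonal multipliers
# (hypothetical values, not the aus_production Gas data)
t_idx  <- 1:20
mult   <- rep(c(0.9, 1.1, 1.2, 0.8), times = 5)
clean  <- ts((200 + 2 * t_idx) * mult, frequency = 4)

# Seasonal indices from base R's classical multiplicative decomposition
seasonal_indices <- function(x) decompose(x, type = "multiplicative")$figure

# Add 300 to an observation in the middle and, separately, at the end
middle <- clean
middle[10] <- middle[10] + 300
late <- clean
late[20] <- late[20] + 300

# Largest change in any seasonal index caused by each outlier
mid_shift <- max(abs(seasonal_indices(middle) - seasonal_indices(clean)))
end_shift <- max(abs(seasonal_indices(late) - seasonal_indices(clean)))
print(c(middle = mid_shift, end = end_shift))
```

In this toy series the middle outlier moves the seasonal indices more than the end outlier does, because the final observation never enters the detrended values that the indices are averaged from.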

3.8

Recall your retail time series data (from Exercise 8 in Section 2.10). Decompose the series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

retail %>%
  model(classical_decomposition(Turnover,type = "multiplicative")) %>%
  components() %>%
  autoplot() 
## Warning: Removed 6 row(s) containing missing values (geom_path).

retail %>%
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) %>%
  components() %>%
  autoplot()

Comparing both decompositions, the X-11 trend has captured the sudden fall in the trend between 2000 and 2010.

3.9

Figures 3.19 and 3.20 show the result of decomposing the number of persons in the civilian labour force in Australia each month from February 1978 to August 1995.

a. Write about 3–5 sentences describing the results of the decomposition. Pay particular attention to the scales of the graphs in making your interpretation.

The trend increases throughout the period, apart from a short spell in the early 90s when it stalls. In the early 90s some months show noticeably more variation than others. The seasonality is very clear throughout the years, and the upward trend is clear as well from the 80s onwards.

b. Is the recession of 1991/1992 visible in the estimated components?

Yes, the recession is very clear in the remainder component, while it is not that clear in the seasonal and trend components.