Q1

Use the following graphics functions: autoplot(), gg_lag(), ACF() and explore features from the following time serie: Bricks from aus_production

  • Can you spot any seasonality, cyclicity and trend?
  • What do you learn about the series?
Bricks <- aus_production %>% 
    select(Quarter, Beer)
Bricks %>% autoplot(.vars = Beer)

From the plot above we can see that there are seasonal trends since the cycles are about a year before repeating.

Bricks %>% gg_lag(y = Beer)

As we can see from the lagged plot above and the Autocorrelation graph below there is a strong seasonal trend every 4 quarters peaking at 4Q and 8Q.

Bricks %>% ACF(var = Beer) %>% 
    autoplot()

From the graphs we learned that there is a strong yearly seasonal trend. The first graph showed a strong trend upwards followed by a slower downward trend. And from the correlogram above we can see that there beer production is on a downward trend.

Q2

You can compute the daily changes in the Google stock price in 2018 using

dgoog <- gafa_stock %>%
    filter(Symbol == "GOOG", year(Date) >= 2018) %>% 
    mutate(trading_day = row_number()) %>% 
    update_tsibble(index=trading_day, regular=TRUE) %>%
    mutate(diff = difference(Close)) 
dgoog
## # A tsibble: 251 x 10 [1]
## # Key:       Symbol [1]
##    Symbol Date        Open  High   Low Close Adj_Close Volume trading_day   diff
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>  <dbl>       <int>  <dbl>
##  1 GOOG   2018-01-02 1048. 1067. 1045. 1065      1065  1.24e6           1 NA    
##  2 GOOG   2018-01-03 1064. 1086. 1063. 1082.     1082. 1.43e6           2 17.5  
##  3 GOOG   2018-01-04 1088  1094. 1084. 1086.     1086. 1.00e6           3  3.92 
##  4 GOOG   2018-01-05 1094  1104. 1092  1102.     1102. 1.28e6           4 15.8  
##  5 GOOG   2018-01-08 1102. 1111. 1102. 1107.     1107. 1.05e6           5  4.71 
##  6 GOOG   2018-01-09 1109. 1111. 1101. 1106.     1106. 9.02e5           6 -0.680
##  7 GOOG   2018-01-10 1097. 1105. 1096. 1103.     1103. 1.04e6           7 -3.65 
##  8 GOOG   2018-01-11 1106. 1107. 1100. 1106.     1106. 9.78e5           8  2.91 
##  9 GOOG   2018-01-12 1102. 1124. 1101. 1122.     1122. 1.72e6           9 16.7  
## 10 GOOG   2018-01-16 1133. 1140. 1118. 1122.     1122. 1.58e6          10 -0.5  
## # ... with 241 more rows
dgoog %>% autoplot(.vars = diff)

dgoog %>% ACF(diff) %>% autoplot()

Do the changes in the stock prices look like white noise?

The graph above does look like white noise. And the correlogram above also indicates that the change in daily price is white noise.

Q3

  1. Download tourism.xlsx from the book website and read it into R using read_excel() from the readxl package.
url <- "C:/Users/seanc/OneDrive/Documents/Fall21/STAT 427/Week 2/tourism.xlsx"
Tour <- readxl::read_excel(url) 
Tour
## # A tibble: 24,320 x 5
##    Quarter    Region   State           Purpose  Trips
##    <chr>      <chr>    <chr>           <chr>    <dbl>
##  1 1998-01-01 Adelaide South Australia Business  135.
##  2 1998-04-01 Adelaide South Australia Business  110.
##  3 1998-07-01 Adelaide South Australia Business  166.
##  4 1998-10-01 Adelaide South Australia Business  127.
##  5 1999-01-01 Adelaide South Australia Business  137.
##  6 1999-04-01 Adelaide South Australia Business  200.
##  7 1999-07-01 Adelaide South Australia Business  169.
##  8 1999-10-01 Adelaide South Australia Business  134.
##  9 2000-01-01 Adelaide South Australia Business  154.
## 10 2000-04-01 Adelaide South Australia Business  169.
## # ... with 24,310 more rows
  1. Create a tsibble which is identical to the tourism tsibble from the tsibble package.
Tour <- Tour%>% 
    mutate(Quarter = yearquarter(as_date(Quarter)),
           .keep = "unused") %>% 
    as_tsibble(index = Quarter,
               key = c(Region,State,Purpose))
Tour
## # A tsibble: 24,320 x 5 [1Q]
## # Key:       Region, State, Purpose [304]
##    Quarter Region   State           Purpose  Trips
##      <qtr> <chr>    <chr>           <chr>    <dbl>
##  1 1998 Q1 Adelaide South Australia Business  135.
##  2 1998 Q2 Adelaide South Australia Business  110.
##  3 1998 Q3 Adelaide South Australia Business  166.
##  4 1998 Q4 Adelaide South Australia Business  127.
##  5 1999 Q1 Adelaide South Australia Business  137.
##  6 1999 Q2 Adelaide South Australia Business  200.
##  7 1999 Q3 Adelaide South Australia Business  169.
##  8 1999 Q4 Adelaide South Australia Business  134.
##  9 2000 Q1 Adelaide South Australia Business  154.
## 10 2000 Q2 Adelaide South Australia Business  169.
## # ... with 24,310 more rows
tourism
## # A tsibble: 24,320 x 5 [1Q]
## # Key:       Region, State, Purpose [304]
##    Quarter Region   State           Purpose  Trips
##      <qtr> <chr>    <chr>           <chr>    <dbl>
##  1 1998 Q1 Adelaide South Australia Business  135.
##  2 1998 Q2 Adelaide South Australia Business  110.
##  3 1998 Q3 Adelaide South Australia Business  166.
##  4 1998 Q4 Adelaide South Australia Business  127.
##  5 1999 Q1 Adelaide South Australia Business  137.
##  6 1999 Q2 Adelaide South Australia Business  200.
##  7 1999 Q3 Adelaide South Australia Business  169.
##  8 1999 Q4 Adelaide South Australia Business  134.
##  9 2000 Q1 Adelaide South Australia Business  154.
## 10 2000 Q2 Adelaide South Australia Business  169.
## # ... with 24,310 more rows
identical(Tour,tourism) #not sure why this says false
## [1] FALSE
  1. Find what combination of Region and Purpose had the maximum number of overnight trips on average.
Tour %>% 
    group_by(Region,Purpose) %>% 
    summarise(Avg_num_trips = sum(Trips)/n()) %>% 
    arrange(desc(Avg_num_trips)) 
## # A tsibble: 24,320 x 4 [1Q]
## # Key:       Region, Purpose [304]
## # Groups:    Region [76]
##    Region          Purpose  Quarter Avg_num_trips
##    <chr>           <chr>      <qtr>         <dbl>
##  1 Melbourne       Visiting 2017 Q4          985.
##  2 Sydney          Business 2001 Q4          948.
##  3 Sydney          Visiting 2016 Q4          921.
##  4 Sydney          Visiting 2017 Q4          920.
##  5 Sydney          Visiting 2017 Q1          916.
##  6 South Coast     Holiday  1998 Q1          915.
##  7 North Coast NSW Holiday  2016 Q1          906.
##  8 Sydney          Business 2017 Q3          892.
##  9 Sydney          Business 2017 Q2          884.
## 10 Sydney          Visiting 2013 Q4          882.
## # ... with 24,310 more rows

As we can see Sydney Visiting had the maximum number of overnight trips on average.

  1. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
Tour %>% 
    group_by(Region,Purpose) %>% 
    summarise(total_trips = sum(Trips),
              Date = Quarter) 
## # A tsibble: 24,320 x 5 [1Q]
## # Key:       Region, Purpose [304]
## # Groups:    Region [76]
##    Region   Purpose  Quarter total_trips    Date
##    <chr>    <chr>      <qtr>       <dbl>   <qtr>
##  1 Adelaide Business 1998 Q1        135. 1998 Q1
##  2 Adelaide Business 1998 Q2        110. 1998 Q2
##  3 Adelaide Business 1998 Q3        166. 1998 Q3
##  4 Adelaide Business 1998 Q4        127. 1998 Q4
##  5 Adelaide Business 1999 Q1        137. 1999 Q1
##  6 Adelaide Business 1999 Q2        200. 1999 Q2
##  7 Adelaide Business 1999 Q3        169. 1999 Q3
##  8 Adelaide Business 1999 Q4        134. 1999 Q4
##  9 Adelaide Business 2000 Q1        154. 2000 Q1
## 10 Adelaide Business 2000 Q2        169. 2000 Q2
## # ... with 24,310 more rows

To have a tsibble object we need to have a date column so we keep that in our pipe above.

Q4

The aus_arrivals data set comprises quarterly international arrivals (in thousands) to Australia from Japan, New Zealand, UK and the US. Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries. Can you identify any unusual observations?

aus_arrivals %>% autoplot()
## Plot variable not specified, automatically selected `.vars = Arrivals`

As we can see from the first plot above there seems to be yearly seasonal trends in all countries. Then for for all countries except japan there seems to be a strong upward trend. For japan, the trend originally was going up before it started to go down. This could be a very long cyclical pattern that we can’t see in our data.

aus_arrivals %>% gg_season()
## Plot variable not specified, automatically selected `y = Arrivals`

As for the second plot it is very clear that there is a strong yearly seasonal trend. And if we look closely we see that all countries except japan have had an upward trend in arrivals.

aus_arrivals %>% gg_subseries()
## Plot variable not specified, automatically selected `y = Arrivals`

Lastly from this graph we see that all countries except for Japan have had a steady increase in arrivals. Though not as clear, there does still seem to be a yearly seasonal trend.

Q5

Textbook 2.10, Ex 10

The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.

1 == B
2 == A
3 == D
4 == C