Use the following graphics functions: autoplot(), gg_lag(), ACF() and explore features from the following time series: Bricks from aus_production
- Can you spot any seasonality, cyclicity and trend?
- What do you learn about the series?
Bricks <- aus_production %>%
select(Quarter, Bricks) # the exercise asks for the Bricks series
Bricks %>% autoplot(Bricks)
From the time plot above we can see clear seasonality: the pattern repeats about once a year, i.e. every four quarters.
Bricks %>% gg_lag(Bricks)
The lag plot above and the autocorrelation graph below both show strong seasonality with a period of four quarters: the relationships are strongest at lags 4 and 8.
Bricks %>% ACF(Bricks) %>%
autoplot()
From these graphs we learn that the series has strong annual (four-quarter) seasonality. The time plot shows a strong upward trend until around 1980, followed by a slower decline with what appear to be cyclic dips around economic downturns. The slowly decaying, uniformly positive autocorrelations in the correlogram are the signature of that trend.
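As a quick cross-check (a sketch, not part of the original answer), gg_season() lays the years on top of each other and makes the within-year pattern explicit:
Bricks %>% gg_season(Bricks)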
You can compute the daily changes in the Google stock price in 2018 using:
dgoog <- gafa_stock %>%
filter(Symbol == "GOOG", year(Date) >= 2018) %>%
mutate(trading_day = row_number()) %>% # re-index by trading day...
update_tsibble(index = trading_day, regular = TRUE) %>% # ...so weekend/holiday gaps disappear
mutate(diff = difference(Close)) # day-over-day change in the closing price
dgoog
## # A tsibble: 251 x 10 [1]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume trading_day diff
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl>
## 1 GOOG 2018-01-02 1048. 1067. 1045. 1065 1065 1.24e6 1 NA
## 2 GOOG 2018-01-03 1064. 1086. 1063. 1082. 1082. 1.43e6 2 17.5
## 3 GOOG 2018-01-04 1088 1094. 1084. 1086. 1086. 1.00e6 3 3.92
## 4 GOOG 2018-01-05 1094 1104. 1092 1102. 1102. 1.28e6 4 15.8
## 5 GOOG 2018-01-08 1102. 1111. 1102. 1107. 1107. 1.05e6 5 4.71
## 6 GOOG 2018-01-09 1109. 1111. 1101. 1106. 1106. 9.02e5 6 -0.680
## 7 GOOG 2018-01-10 1097. 1105. 1096. 1103. 1103. 1.04e6 7 -3.65
## 8 GOOG 2018-01-11 1106. 1107. 1100. 1106. 1106. 9.78e5 8 2.91
## 9 GOOG 2018-01-12 1102. 1124. 1101. 1122. 1122. 1.72e6 9 16.7
## 10 GOOG 2018-01-16 1133. 1140. 1118. 1122. 1122. 1.58e6 10 -0.5
## # ... with 241 more rows
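The same computation generalises to the other stocks in gafa_stock (AAPL, AMZN, FB). The helper below is a hypothetical sketch; the daily_changes name and its from_year argument are not part of the original:
daily_changes <- function(symbol, from_year = 2018) {
  gafa_stock %>%
    filter(Symbol == symbol, year(Date) >= from_year) %>%
    mutate(trading_day = row_number()) %>% # same re-indexing as above
    update_tsibble(index = trading_day, regular = TRUE) %>%
    mutate(diff = difference(Close))
}
daily_changes("AAPL") # identical structure, for Apple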
dgoog %>% autoplot(.vars = diff)
dgoog %>% ACF(diff) %>% autoplot()
Do the changes in the stock prices look like white noise?
Yes. The time plot of the daily changes looks like white noise, and the correlogram agrees: the autocorrelations at all lags are small and fall within the significance bounds, so the daily changes are consistent with white noise.
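A Ljung-Box test backs this up numerically. This mirrors the book's white-noise check (lag = 10 is a conventional choice for non-seasonal data); a large p-value means the diff series is indistinguishable from white noise:
dgoog %>% features(diff, ljung_box, lag = 10)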
- Download tourism.xlsx from the book website and read it into R using read_excel() from the readxl package.
path <- "C:/Users/seanc/OneDrive/Documents/Fall21/STAT 427/Week 2/tourism.xlsx" # local copy of the file
Tour <- readxl::read_excel(path)
Tour
## # A tibble: 24,320 x 5
## Quarter Region State Purpose Trips
## <chr> <chr> <chr> <chr> <dbl>
## 1 1998-01-01 Adelaide South Australia Business 135.
## 2 1998-04-01 Adelaide South Australia Business 110.
## 3 1998-07-01 Adelaide South Australia Business 166.
## 4 1998-10-01 Adelaide South Australia Business 127.
## 5 1999-01-01 Adelaide South Australia Business 137.
## 6 1999-04-01 Adelaide South Australia Business 200.
## 7 1999-07-01 Adelaide South Australia Business 169.
## 8 1999-10-01 Adelaide South Australia Business 134.
## 9 2000-01-01 Adelaide South Australia Business 154.
## 10 2000-04-01 Adelaide South Australia Business 169.
## # ... with 24,310 more rows
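If a local copy is not available, the file can be fetched directly. This is a sketch: the URL below is the book site's extrafiles path, and it is an assumption that the file is still hosted there:
temp <- tempfile(fileext = ".xlsx") # temporary download target
download.file("https://otexts.com/fpp3/extrafiles/tourism.xlsx", temp, mode = "wb")
Tour <- readxl::read_excel(temp)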
- Create a tsibble which is identical to the tourism tsibble from the tsibble package.
Tour <- Tour %>%
mutate(Quarter = yearquarter(as_date(Quarter)), # convert the date strings to quarters
.keep = "unused") %>%
as_tsibble(index = Quarter,
key = c(Region, State, Purpose))
Tour
## # A tsibble: 24,320 x 5 [1Q]
## # Key: Region, State, Purpose [304]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
## 7 1999 Q3 Adelaide South Australia Business 169.
## 8 1999 Q4 Adelaide South Australia Business 134.
## 9 2000 Q1 Adelaide South Australia Business 154.
## 10 2000 Q2 Adelaide South Australia Business 169.
## # ... with 24,310 more rows
tourism
## # A tsibble: 24,320 x 5 [1Q]
## # Key: Region, State, Purpose [304]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
## 7 1999 Q3 Adelaide South Australia Business 169.
## 8 1999 Q4 Adelaide South Australia Business 134.
## 9 2000 Q1 Adelaide South Australia Business 154.
## 10 2000 Q2 Adelaide South Australia Business 169.
## # ... with 24,310 more rows
identical(Tour, tourism) # FALSE: identical() compares every attribute, and internal metadata can differ even when the data match
## [1] FALSE
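A looser comparison can show whether only metadata differs (a sketch using base R's all.equal(), which tolerates some attribute differences that identical() does not):
all.equal(Tour, tourism)
str(attributes(Tour)) # inspecting the attributes directly reveals where the objects diverge
str(attributes(tourism))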
- Find what combination of Region and Purpose had the maximum number of overnight trips on average.
Tour %>%
as_tibble() %>% # drop the Quarter index so the summary averages over all quarters
group_by(Region, Purpose) %>%
summarise(Avg_num_trips = mean(Trips), .groups = "drop") %>%
arrange(desc(Avg_num_trips))
(Output: one row per Region/Purpose combination, sorted by decreasing average trips, with Sydney Visiting at the top.)
As we can see, Sydney Visiting had the maximum number of overnight trips on average. Note that summarising the tsibble directly would keep the Quarter index and return per-quarter values rather than averages, which is why the index is dropped with as_tibble() first.
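Equivalently, dplyr's slice_max() returns just the winning combination instead of the full sorted table (a sketch using the same as_tibble() approach as above):
Tour %>%
as_tibble() %>%
group_by(Region, Purpose) %>%
summarise(Avg_num_trips = mean(Trips), .groups = "drop") %>%
slice_max(Avg_num_trips, n = 1)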
- Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
state_trips <- Tour %>%
group_by(State) %>%
summarise(Trips = sum(Trips)) # sums over Region and Purpose within each State and Quarter
state_trips
(Output: a quarterly tsibble keyed by State [8], giving total Trips per quarter for each state.)
Because Tour is a tsibble, summarise() automatically keeps the Quarter index, so the result is still a tsibble: one quarterly series of total trips per State.
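A quick plot confirms the structure (assuming the state_trips name introduced above); autoplot() draws one total-trips line per state:
state_trips %>% autoplot(Trips)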
The aus_arrivals data set comprises quarterly international arrivals (in thousands) to Australia from Japan, New Zealand, UK and the US. Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries. Can you identify any unusual observations?
aus_arrivals %>% autoplot()
## Plot variable not specified, automatically selected `.vars = Arrivals`
The first plot shows yearly seasonality for all four countries. All countries except Japan also show a strong upward trend. Arrivals from Japan rose at first and then declined; this could be part of a long cycle that is too slow to confirm within the span of the data.
aus_arrivals %>% gg_season()
## Plot variable not specified, automatically selected `y = Arrivals`
The season plot makes the strong yearly seasonality very clear, and comparing the year-by-year lines shows that all countries except Japan have had an upward trend in arrivals.
aus_arrivals %>% gg_subseries()
## Plot variable not specified, automatically selected `y = Arrivals`
Lastly, the subseries plots show that all countries except Japan have had a steady increase in arrivals, with the yearly seasonal pattern still visible, though less prominent. As for unusual observations, there appears to be a spike in arrivals around 2000 Q3, which coincides with the Sydney Olympics.
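To zoom in on that candidate outlier (a sketch restricting the data to 1999-2001; the yearquarter() bounds are my choice):
aus_arrivals %>%
filter(Quarter >= yearquarter("1999 Q1"), Quarter <= yearquarter("2001 Q4")) %>%
autoplot(Arrivals)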
Textbook 2.10, Ex 10
The following time plots and ACF plots correspond to four different time series. Your task is to match each time plot in the first row with one of the ACF plots in the second row.
1 == B
2 == A
3 == D
4 == C
The matches follow from comparing each time plot's trend and seasonal period with its ACF: a strong trend shows up as a slowly decaying ACF, and seasonality of period m shows up as peaks at lags m, 2m, and so on.