https://otexts.com/fpp3/graphics-exercises.html Exercises 2.1, 2.2, 2.3, 2.4, 2.5 and 2.8 Rpubs link as well as attach the .PDF file with your code.
library(fpp3)
## Warning: package 'fpp3' was built under R version 4.3.3
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.0 ✔ feasts 0.4.1
## ✔ lubridate 1.9.2 ✔ fable 0.4.1
## ✔ ggplot2 3.5.0
## Warning: package 'dplyr' was built under R version 4.3.3
## Warning: package 'ggplot2' was built under R version 4.3.3
## Warning: package 'tsibble' was built under R version 4.3.3
## Warning: package 'tsibbledata' was built under R version 4.3.3
## Warning: package 'feasts' was built under R version 4.3.3
## Warning: package 'fabletools' was built under R version 4.3.3
## Warning: package 'fable' was built under R version 4.3.3
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(dplyr)
library(tsibble)
1. Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec. |
---|
2. Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock. |
```r Peak_Close <- gafa_stock %>% select(Close, Symbol, Date) %>% group_by(Symbol) %>% filter(Close == max(Close, na.rm = TRUE)) |
print(Peak_Close) ``` |
## # A tsibble: 4 x 3 [!] ## # Key: Symbol [4] ## # Groups: Symbol [4] ## Close Symbol Date ## <dbl> <chr> <date> ## 1 232. AAPL 2018-10-03 ## 2 2040. AMZN 2018-09-04 ## 3 218. FB 2018-07-25 ## 4 1268. GOOG 2018-07-26 |
# Load data - change the file location if you're reviwing this code
tute1 <- readr::read_csv("C:/Users/Ron/OneDrive/Documents/tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(tute1)
## # A tibble: 6 × 4
## Quarter Sales AdBudget GDP
## <date> <dbl> <dbl> <dbl>
## 1 1981-03-01 1020. 659. 252.
## 2 1981-06-01 889. 589 291.
## 3 1981-09-01 795 512. 291.
## 4 1981-12-01 1004. 614. 292.
## 5 1982-03-01 1058. 647. 279.
## 6 1982-06-01 944. 602 254
mytimeseries <- tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
4. The USgas package contains data on the demand for natural gas in the US. Install the USgas package. Create a tsibble from us_total with year as the index and state as the key. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island). |
---|
5. Download tourism.xlsx from the book website and read it into R using readxl::read_excel(). Create a tsibble which is identical to the tourism tsibble from the tsibble package. Find what combination of Region and Purpose had the maximum number of overnight trips on average. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State. |
r # Load data - change the file location if you're reviwing this code tourism_data <- readxl::read_excel("C:/Users/Ron/OneDrive/Documents/tourism.xlsx") head(tourism_data) |
## # A tibble: 6 × 5 ## Quarter Region State Purpose Trips ## <chr> <chr> <chr> <chr> <dbl> ## 1 1998-01-01 Adelaide South Australia Business 135. ## 2 1998-04-01 Adelaide South Australia Business 110. ## 3 1998-07-01 Adelaide South Australia Business 166. ## 4 1998-10-01 Adelaide South Australia Business 127. ## 5 1999-01-01 Adelaide South Australia Business 137. ## 6 1999-04-01 Adelaide South Australia Business 200. |
r #?tourism head(tourism) |
## # A tsibble: 6 x 5 [1Q] ## # Key: Region, State, Purpose [1] ## Quarter Region State Purpose Trips ## <qtr> <chr> <chr> <chr> <dbl> ## 1 1998 Q1 Adelaide South Australia Business 135. ## 2 1998 Q2 Adelaide South Australia Business 110. ## 3 1998 Q3 Adelaide South Australia Business 166. ## 4 1998 Q4 Adelaide South Australia Business 127. ## 5 1999 Q1 Adelaide South Australia Business 137. ## 6 1999 Q2 Adelaide South Australia Business 200. |
```r ################################################################################ # Create a tsibble which is identical to the tourism tsibble from the tsibble package. # I will make the Keys Region, State and Purpose. I’ll also convert the Quarter character, into a yearquarter tourism_tsibble <- tourism_data %>% mutate(Quarter = yearquarter(Quarter)) %>% as_tsibble(index = Quarter, key = c(Region, State, Purpose)) |
head(tourism_tsibble) ``` |
## # A tsibble: 6 x 5 [1Q] ## # Key: Region, State, Purpose [1] ## Quarter Region State Purpose Trips ## <qtr> <chr> <chr> <chr> <dbl> ## 1 1998 Q1 Adelaide South Australia Business 135. ## 2 1998 Q2 Adelaide South Australia Business 110. ## 3 1998 Q3 Adelaide South Australia Business 166. ## 4 1998 Q4 Adelaide South Australia Business 127. ## 5 1999 Q1 Adelaide South Australia Business 137. ## 6 1999 Q2 Adelaide South Australia Business 200. |
r head(tourism) |
## # A tsibble: 6 x 5 [1Q] ## # Key: Region, State, Purpose [1] ## Quarter Region State Purpose Trips ## <qtr> <chr> <chr> <chr> <dbl> ## 1 1998 Q1 Adelaide South Australia Business 135. ## 2 1998 Q2 Adelaide South Australia Business 110. ## 3 1998 Q3 Adelaide South Australia Business 166. ## 4 1998 Q4 Adelaide South Australia Business 127. ## 5 1999 Q1 Adelaide South Australia Business 137. ## 6 1999 Q2 Adelaide South Australia Business 200. |
```r # Now they’re identical |
################################################################################ # Find what combination of Region and Purpose had the maximum number of overnight trips on average. Region_Purpose_trips <- tourism_tsibble %>% group_by(Region, Purpose) %>% summarise(avg_overnight_trips = mean(Trips, na.rm = TRUE)) %>% arrange(desc(avg_overnight_trips)) ``` |
## Warning: Current temporal ordering may yield unexpected results. ## ℹ Suggest to sort by `Region`, `Purpose`, `Quarter` first. |
```r # This gives me the max for each Region # Region_Purpose_trips_max <- Region_Purpose_trips %>% # filter(avg_overnight_trips == max(avg_overnight_trips)) |
max_avg_trips <- max(Region_Purpose_trips$avg_overnight_trips, na.rm = TRUE) |
Region_Purpose_trips_max <- Region_Purpose_trips %>% filter(avg_overnight_trips == max_avg_trips) |
Region_Purpose_trips_max ``` |
## # A tsibble: 1 x 4 [1Q] ## # Key: Region, Purpose [1] ## # Groups: Region [1] ## Region Purpose Quarter avg_overnight_trips ## <chr> <chr> <qtr> <dbl> ## 1 Melbourne Visiting 2017 Q4 985. |
```r ################################################################################ # Create a new tsibble which combines the Purposes and Regions, and just has total trips by State. |
# Group by State and sum trips total_trips_by_state <- tourism_tsibble %>% group_by(State) %>% summarise(total_trips = sum(Trips, na.rm = TRUE)) %>% ungroup() # To remove any remaining grouping structure |
# I’ll convert it to a tibble to remove Quarter as an index total_trips_by_state <- as_tibble(total_trips_by_state) |
total_trips_by_state <- total_trips_by_state %>% select(-Quarter) |
total_trips_by_state <- total_trips_by_state %>% group_by(State) %>% summarise(Trips = sum(total_trips, na.rm = TRUE)) %>% ungroup() # Now we have the total trips by state. ``` |
#####The combination of Region and Purpose with the maximum number of overnight trips on average is Melbourne, Visiting, 2017Q4, with 985.2784 trips on average. #####The last section (total_trips_by_state) lists the total number of trips by state, although it also includes the Quarter which is needed as a time-index for a tsibble. I’ve also made a tibble of just states and trips, just to exclude the Quarter breakdown. |
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
Can you spot any seasonality, cyclicity and trend? What do you learn about the series? What can you say about the seasonal patterns? Can you identify any unusual years?
us_employment %>%
filter(Title == 'Total Private') %>%
autoplot()
## Plot variable not specified, automatically selected `.vars = Employed`
us_employment %>%
filter(Title == 'Total Private') %>%
gg_season(Employed, labels = "both")
us_employment %>%
filter(Title == 'Total Private') %>%
gg_subseries(Employed, labels = "both")
## Warning in geom_line(...): Ignoring unknown parameters: `labels`
us_employment %>%
filter(Title == 'Total Private') %>%
gg_lag(Employed, labels = "both", geom="point")
## Warning in geom_point(...): Ignoring unknown parameters: `labels`
us_employment %>% ACF(Employed)
## # A tsibble: 3,838 x 3 [1M]
## # Key: Series_ID [148]
## Series_ID lag acf
## <chr> <cf_lag> <dbl>
## 1 CEU0500000001 1M 0.997
## 2 CEU0500000001 2M 0.993
## 3 CEU0500000001 3M 0.990
## 4 CEU0500000001 4M 0.986
## 5 CEU0500000001 5M 0.983
## 6 CEU0500000001 6M 0.980
## 7 CEU0500000001 7M 0.977
## 8 CEU0500000001 8M 0.974
## 9 CEU0500000001 9M 0.971
## 10 CEU0500000001 10M 0.968
## # ℹ 3,828 more rows
aus_production %>%
select(Bricks) %>%
autoplot()
## Plot variable not specified, automatically selected `.vars = Bricks`
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
aus_production %>%
gg_season(Bricks, labels = "both")
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_text()`).
## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_text()`).
aus_production %>%
gg_subseries(Bricks, labels = "both")
## Warning in geom_line(...): Ignoring unknown parameters: `labels`
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
aus_production %>%
gg_lag(Bricks, labels = "both", geom="point")
## Warning: Removed 20 rows containing missing values (gg_lag).
## Warning in geom_point(...): Ignoring unknown parameters: `labels`
aus_production %>% ACF(Bricks)
## # A tsibble: 22 x 2 [1Q]
## lag acf
## <cf_lag> <dbl>
## 1 1Q 0.900
## 2 2Q 0.815
## 3 3Q 0.813
## 4 4Q 0.828
## 5 5Q 0.720
## 6 6Q 0.642
## 7 7Q 0.655
## 8 8Q 0.692
## 9 9Q 0.609
## 10 10Q 0.556
## # ℹ 12 more rows
pelt %>%
select(Hare) %>%
autoplot()
## Plot variable not specified, automatically selected `.vars = Hare`
pelt %>% gg_subseries(Hare)
pelt %>% gg_lag(Hare, geom="point")
pelt %>% ACF(Hare)
## # A tsibble: 19 x 2 [1Y]
## lag acf
## <cf_lag> <dbl>
## 1 1Y 0.658
## 2 2Y 0.214
## 3 3Y -0.155
## 4 4Y -0.401
## 5 5Y -0.493
## 6 6Y -0.401
## 7 7Y -0.168
## 8 8Y 0.113
## 9 9Y 0.307
## 10 10Y 0.340
## 11 11Y 0.296
## 12 12Y 0.206
## 13 13Y 0.0372
## 14 14Y -0.153
## 15 15Y -0.285
## 16 16Y -0.295
## 17 17Y -0.202
## 18 18Y -0.0676
## 19 19Y 0.0956
#####The hunting of hares and harvesting them for their pelts has occured from the mid 1840’s to the mid 1930’s, at which point it seems to have stopped. To no surprise, there appears to be a cyclical and seasonal trend, with peaks occuring once a decade, and periods of a few years before the hunting begins again. There are 2 outlier years during which hare pelt harvesting was at its highest, in 1862 and in 1896.
us_gasoline %>% autoplot(Barrels)
us_gasoline %>%
gg_season(Barrels, labels = "both")
us_gasoline %>% gg_subseries(Barrels)
us_gasoline %>% gg_lag(Barrels, geom="point")
us_gasoline %>% ACF(Barrels)
## # A tsibble: 31 x 2 [1W]
## lag acf
## <cf_lag> <dbl>
## 1 1W 0.893
## 2 2W 0.882
## 3 3W 0.873
## 4 4W 0.866
## 5 5W 0.847
## 6 6W 0.844
## 7 7W 0.832
## 8 8W 0.831
## 9 9W 0.822
## 10 10W 0.808
## # ℹ 21 more rows
H02 <- PBS %>%
filter(ATC2 == "H02")
H02 %>% autoplot(Cost)
H02 %>% gg_season(Cost)
H02 %>% gg_subseries(Cost)
#H02 %>% gg_lag(Cost)
H02 %>% ACF(Cost)
## # A tsibble: 92 x 6 [1M]
## # Key: Concession, Type, ATC1, ATC2 [4]
## Concession Type ATC1 ATC2 lag acf
## <chr> <chr> <chr> <chr> <cf_lag> <dbl>
## 1 Concessional Co-payments H H02 1M 0.834
## 2 Concessional Co-payments H H02 2M 0.679
## 3 Concessional Co-payments H H02 3M 0.514
## 4 Concessional Co-payments H H02 4M 0.352
## 5 Concessional Co-payments H H02 5M 0.264
## 6 Concessional Co-payments H H02 6M 0.219
## 7 Concessional Co-payments H H02 7M 0.253
## 8 Concessional Co-payments H H02 8M 0.337
## 9 Concessional Co-payments H H02 9M 0.464
## 10 Concessional Co-payments H H02 10M 0.574
## # ℹ 82 more rows