Use ? (or help()) to find out about the data in each series.
#?aus_production
#?pelt
#?gafa_stock
#?vic_elec
What is the time interval of each series?
interval(aus_production) # time interval is quarterly
## <interval[1]>
## [1] 1Q
interval(pelt) # time interval is yearly
## <interval[1]>
## [1] 1Y
interval(gafa_stock) # time interval is irregular days
## <interval[1]>
## [1] !
interval(vic_elec) # time interval is half-hourly (30 minutes)
## <interval[1]>
## [1] 30m
Use autoplot() to produce a time plot of each series.
autoplot(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
autoplot(pelt, Lynx)
autoplot(gafa_stock, Close)
# For the last plot, modify the axis labels and title.
autoplot(vic_elec, Demand) +
labs(title = "Demand for electricity in Victoria, Australia",
x = "Year",
y = "Demand")
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
peak_cday <- gafa_stock |>
group_by(Symbol) |>
filter(Close == max(Close))
peak_cday
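As a minimal sketch of how the grouped filter above behaves, the snippet below applies the same pattern to a small made-up table. The name toy_stock and all of its values are hypothetical and only stand in for gafa_stock. Within each Symbol group, filter() keeps the rows whose Close equals that group's maximum, so a tie would return more than one row per stock.

```r
library(dplyr)

# Hypothetical toy data standing in for gafa_stock (values are made up).
toy_stock <- tibble(
  Symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  Date   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03",
                     "2020-01-01", "2020-01-02", "2020-01-03")),
  Close  = c(10, 12, 11, 20, 19, 25)
)

# max(Close) is evaluated separately inside each Symbol group.
toy_peaks <- toy_stock |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  ungroup() |>
  select(Symbol, Date, Close)

print(toy_peaks)
```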
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
# Read tute1.csv (the path is relative to the working directory)
read_tute1 <- read_csv("tute1.csv")
## Rows: 100 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (3): Sales, AdBudget, GDP
## date (1): Quarter
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# b Convert the data to time series
tute1_timeseries <- read_tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
# c Construct time series plots of each of the three series
tute1_timeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid(). Without facet_grid(), all three series (AdBudget, GDP and Sales) are drawn as separate lines in a single panel on a shared y-axis. Because the series are on different scales, the combined plot is harder to read than the faceted version.
tute1_timeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line()
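Both plots rely on pivot_longer(-Quarter) to stack the three measurement columns into a name/value pair before plotting. The sketch below shows that reshaping on a tiny made-up table. The name toy_wide and its values are hypothetical and are not taken from tute1.csv.

```r
library(tidyr)
library(tibble)

# Hypothetical wide table with the same column layout as tute1 (values are made up).
toy_wide <- tibble(
  Quarter  = c("1981 Q1", "1981 Q2"),
  Sales    = c(100, 110),
  AdBudget = c(50, 55),
  GDP      = c(200, 205)
)

# Every column except Quarter is stacked into two new columns:
# `name` (the old column name) and `value` (the old cell value).
toy_long <- pivot_longer(toy_wide, -Quarter)

print(toy_long)
```

With the data in this long form, colour = name gives one line per series, and facet_grid(name ~ .) gives one panel per series.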
The USgas package contains data on the demand for natural gas in the US.
Install the USgas package.
library(USgas)
## Warning: package 'USgas' was built under R version 4.5.2
tsibble_us_tot <- us_total |>
as_tsibble(index = year, key = state) |>
print()
## # A tsibble: 1,266 x 3 [1Y]
## # Key: state [53]
## year state y
## <int> <chr> <int>
## 1 1997 Alabama 324158
## 2 1998 Alabama 329134
## 3 1999 Alabama 337270
## 4 2000 Alabama 353614
## 5 2001 Alabama 332693
## 6 2002 Alabama 379343
## 7 2003 Alabama 350345
## 8 2004 Alabama 382367
## 9 2005 Alabama 353156
## 10 2006 Alabama 391093
## # ℹ 1,256 more rows
C. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).
new_engl <- c("Maine", "Vermont", "New Hampshire",
"Massachusetts", "Connecticut", "Rhode Island")
tsibble_us_tot |>
filter(state %in% new_engl) |>
ggplot(aes(x = year, y = y, colour = state)) +
geom_line(linewidth = 1) +
labs(title = "New England Annual Gas Demand",
x = "Year",
y = "Total",
colour = "State") +
facet_grid(state ~ ., scales = "free_y") +
theme(strip.text = element_text(size = 4.5, angle = 45, hjust = 1))
Download tourism.xlsx from the book website and read it into R using readxl::read_excel().
read_tourism <- read_excel("tourism.xlsx")
print(read_tourism)
## # A tibble: 24,320 × 5
## Quarter Region State Purpose Trips
## <chr> <chr> <chr> <chr> <dbl>
## 1 1998-01-01 Adelaide South Australia Business 135.
## 2 1998-04-01 Adelaide South Australia Business 110.
## 3 1998-07-01 Adelaide South Australia Business 166.
## 4 1998-10-01 Adelaide South Australia Business 127.
## 5 1999-01-01 Adelaide South Australia Business 137.
## 6 1999-04-01 Adelaide South Australia Business 200.
## 7 1999-07-01 Adelaide South Australia Business 169.
## 8 1999-10-01 Adelaide South Australia Business 134.
## 9 2000-01-01 Adelaide South Australia Business 154.
## 10 2000-04-01 Adelaide South Australia Business 169.
## # ℹ 24,310 more rows
tour_tsib <- read_tourism |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(key = c(Region, Purpose), index = Quarter)
print(tour_tsib)
## # A tsibble: 24,320 x 5 [1Q]
## # Key: Region, Purpose [304]
## Quarter Region State Purpose Trips
## <qtr> <chr> <chr> <chr> <dbl>
## 1 1998 Q1 Adelaide South Australia Business 135.
## 2 1998 Q2 Adelaide South Australia Business 110.
## 3 1998 Q3 Adelaide South Australia Business 166.
## 4 1998 Q4 Adelaide South Australia Business 127.
## 5 1999 Q1 Adelaide South Australia Business 137.
## 6 1999 Q2 Adelaide South Australia Business 200.
## 7 1999 Q3 Adelaide South Australia Business 169.
## 8 1999 Q4 Adelaide South Australia Business 134.
## 9 2000 Q1 Adelaide South Australia Business 154.
## 10 2000 Q2 Adelaide South Australia Business 169.
## # ℹ 24,310 more rows
overnight_trip_avg <- tour_tsib |>
group_by(Region, Purpose) |>
summarise(Average = mean(Trips, na.rm = TRUE)) |>
filter(Average == max(Average))
print(overnight_trip_avg)
## # A tsibble: 76 x 4 [1Q]
## # Key: Region, Purpose [76]
## # Groups: Region [76]
## Region Purpose Quarter Average
## <chr> <chr> <qtr> <dbl>
## 1 Adelaide Visiting 2017 Q1 270.
## 2 Adelaide Hills Visiting 2002 Q4 81.1
## 3 Alice Springs Holiday 1998 Q3 76.5
## 4 Australia's Coral Coast Holiday 2014 Q3 198.
## 5 Australia's Golden Outback Business 2017 Q3 174.
## 6 Australia's North West Business 2016 Q3 297.
## 7 Australia's South West Holiday 2016 Q1 612.
## 8 Ballarat Visiting 2004 Q1 103.
## 9 Barkly Holiday 1998 Q3 37.9
## 10 Barossa Holiday 2006 Q1 51.0
## # ℹ 66 more rows
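The result above has one row per Region rather than a single winner. summarise() on a tsibble keeps the Quarter index, and the output is still grouped by Region when filter() runs. If the goal is the one Region and Purpose combination with the highest average, a possible approach is to drop the time index first (for a tsibble, by calling as_tibble() before group_by()) and then take the overall maximum. The sketch below illustrates this on a plain tibble. The name toy_trips and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data standing in for the tourism table (values are made up).
toy_trips <- tibble(
  Quarter = rep(c("2000 Q1", "2000 Q2"), times = 4),
  Region  = rep(c("North", "South"), each = 4),
  Purpose = rep(rep(c("Business", "Holiday"), each = 2), times = 2),
  Trips   = c(10, 20, 30, 50, 5, 15, 60, 80)
)

# Average over all quarters for each Region/Purpose pair, drop the grouping,
# then keep only the pair with the largest average.
toy_top <- toy_trips |>
  group_by(Region, Purpose) |>
  summarise(Average = mean(Trips), .groups = "drop") |>
  slice_max(Average, n = 1)

print(toy_top)
```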
D. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
trip_tot_tsib <- tour_tsib |>
group_by(State) |>
summarise(Total_Trips = sum(Trips))
print(trip_tot_tsib)
## # A tsibble: 640 x 3 [1Q]
## # Key: State [8]
## State Quarter Total_Trips
## <chr> <qtr> <dbl>
## 1 ACT 1998 Q1 551.
## 2 ACT 1998 Q2 416.
## 3 ACT 1998 Q3 436.
## 4 ACT 1998 Q4 450.
## 5 ACT 1999 Q1 379.
## 6 ACT 1999 Q2 558.
## 7 ACT 1999 Q3 449.
## 8 ACT 1999 Q4 595.
## 9 ACT 2000 Q1 600.
## 10 ACT 2000 Q2 557.
## # ℹ 630 more rows
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
From Employed:
US private employment shows a clear upward trend: more people are employed as the years go on.
The series has a strong seasonal pattern, reflecting recurring hiring cycles within each year. The unusual periods are the sharp downturns, for example around 1991 and 2010.
us_employment |>
filter(Title == "Total Private") |>
autoplot(Employed)
us_employment |>
filter(Title == "Total Private") |>
gg_season(Employed)
## Warning: `gg_season()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_season()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
us_employment |>
filter(Title == "Total Private") |>
gg_subseries(Employed)
## Warning: `gg_subseries()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_subseries()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
us_employment |>
filter(Title == "Total Private") |>
gg_lag(Employed)
## Warning: `gg_lag()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_lag()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
us_employment |>
filter(Title == "Total Private") |>
ACF(Employed) |>
autoplot()
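To help read an ACF plot like the one above, here is a minimal sketch using base R's acf() on a synthetic series rather than on us_employment. The series toy_series is made up: a linear trend plus a 12-period cycle, with no noise. In a series like this, the autocorrelation at the seasonal lag (12) should be higher than at the half-cycle lag (6), which is the kind of pattern to look for in monthly data.

```r
# Synthetic monthly series (made up for illustration): linear trend plus a 12-month cycle.
month_index <- 1:120
toy_series <- 0.05 * month_index + 5 * sin(2 * pi * month_index / 12)

# Base R sample autocorrelations; element 1 is lag 0, so lag k is element k + 1.
toy_acf <- acf(toy_series, lag.max = 24, plot = FALSE)$acf[, 1, 1]

lag6  <- toy_acf[7]   # half a seasonal cycle apart
lag12 <- toy_acf[13]  # one full seasonal cycle apart

print(round(c(lag6 = lag6, lag12 = lag12), 2))
```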
From Bricks:
The trend in brick production appears to decline over the later years of the series. There is some seasonality, with higher production in Q2 and Q3. The early 2000s appear to show an unusual decline.
autoplot(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_season(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_subseries(aus_production, Bricks)
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_line()`).
gg_lag(aus_production, Bricks)
## Warning: Removed 20 rows containing missing values (gg_lag).
From Hare:
There does not appear to be any upward or downward trend in this series.
Because the data are annual, no seasonal pattern can be observed. Instead, the series rises and falls repeatedly, with troughs in many years.
autoplot(pelt, Hare)
gg_subseries(pelt, Hare)
From Cost:
For this series, costs increase over time.
The trend is gradually upward. The seasonal pattern does not look very strong, though costs may be slightly higher in winter. There are many sudden spikes in the series, which may relate to price changes.
PBS |>
filter(ATC2 == "H02") |>
autoplot(Cost)
PBS |>
filter(ATC2 == "H02") |>
gg_season(Cost)
PBS |>
filter(ATC2 == "H02") |>
gg_subseries(Cost)
PBS |>
filter(ATC2 == "H02") |>
ACF(Cost) |>
autoplot()
From Barrels:
This series is fairly stable, with a gradual increase over time.
The seasonal pattern is clear: more gasoline is supplied in the US during summer. There is a noticeable decline around 2010. The us_gasoline data end in early 2017, so the series does not cover 2020 or the COVID-19 pandemic.
autoplot(us_gasoline, Barrels)
gg_season(us_gasoline, Barrels)
gg_subseries(us_gasoline, Barrels)
gg_lag(us_gasoline, Barrels)