library(fpp3)
Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
The time interval for Bricks is quarterly.
bricks <-
aus_production |>
select(Quarter, Bricks)
head(bricks)
bricks |>
autoplot(Bricks) +
geom_point()
The time interval for Lynx is yearly.
lynx <-
pelt |>
select(Year, Lynx)
head(lynx)
lynx |>
autoplot(Lynx) +
geom_point()
The time interval for Close is daily, but only trading days are recorded, so the series is stored as an irregular tsibble.
close <-
gafa_stock |>
select(Date, Close)
head(close)
close |>
autoplot(Close)
The time interval for Demand is every half-hour.
demand <-
vic_elec |>
select(Time, Demand)
head(demand)
demand |>
autoplot(Demand) +
labs(title = 'Electricity Demand for Victoria, Australia',
x = 'Time (30 Minutes)',
y = 'Demand (MWh)')
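The intervals stated above can also be read directly from the data rather than from the plots. A minimal sketch, assuming the tsibble helpers `interval()` and `is_regular()`; the object names `series`, `intervals` and `regular` are illustrative and not part of the original solution:

```r
library(fpp3)

# The four data sets that hold the series explored above
series <- list(
  Bricks = aus_production,
  Lynx   = pelt,
  Close  = gafa_stock,
  Demand = vic_elec
)

# Namespace-qualified because lubridate also exports an interval() function
intervals <- lapply(series, tsibble::interval)
intervals

# TRUE when observations are equally spaced in time
regular <- vapply(series, tsibble::is_regular, logical(1))
regular
```

`gafa_stock` is reported as irregular because it only contains trading days.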
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
# Rows where each stock's closing price reached its maximum
gafa_stock |>
group_by(Symbol) |>
filter(Close == max(Close)) |>
select(Symbol, Date, Close)
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
tute1 <- readr::read_csv("tute1.csv")
head(tute1)
mytimeseries <- tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
head(mytimeseries)
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid().
Without facet_grid(), the three time series are drawn on a single panel with a shared y-axis instead of in separate subplots that each have their own scale.
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line()
The USgas package contains data on the demand for natural gas in the US.
# Install USgas if it is missing, then load it in either case
if (!requireNamespace("USgas", quietly = TRUE)) {
  install.packages("USgas")
}
library(USgas)
us_gas_time_series <-
us_total |>
as_tsibble(key = state, index = year)
head(us_gas_time_series)
library(ggh4x)
us_gas_time_series |>
filter(state %in% c('Maine', 'Vermont', 'New Hampshire', 'Massachusetts', 'Connecticut', 'Rhode Island')) |>
ggplot(aes(x = year, y = y, color = state)) +
geom_point() +
geom_line() +
facet_grid(rows=vars(state),
scales = "free_y",
labeller = labeller(state = label_wrap_gen(10))) +
theme_bw() +
theme(legend.position="none")
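When a filter compares a column against several state names, `==` and `%in%` behave differently: `==` recycles the shorter vector and compares element by element, while `%in%` tests membership. A minimal base R sketch with made-up values (the vectors `states` and `targets` are hypothetical):

```r
# Hypothetical column of state names and the states we want to keep
states  <- c("Maine", "Vermont", "Maine", "Vermont", "Texas", "Maine")
targets <- c("Vermont", "Maine")

# `==` recycles `targets` and compares position by position,
# so most matching rows are missed
matches_eq <- states == targets

# `%in%` checks whether each element appears anywhere in `targets`
matches_in <- states %in% targets

sum(matches_eq)  # 1
sum(matches_in)  # 5
```

For multi-value filters, `%in%` is the safer choice because it does not depend on row order.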
library(readxl)
tourism <-
read_excel("tourism.xlsx")
head(tourism)
tourism_time_series <-
tourism |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter, key = c(Region, State, Purpose))
head(tourism_time_series)
top_region_purpose <-
tourism_time_series |>
as_tibble() |> # drop the time index so the mean is taken over all quarters
group_by(Region, Purpose) |>
summarise(Average_Trips = mean(Trips), .groups = "drop") |>
arrange(desc(Average_Trips))
head(top_region_purpose, 1)
tourism_combined <-
tourism |>
group_by(Quarter, State) |>
summarise(Total_Trips = sum(Trips), .groups = "drop") |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(key = State, index = Quarter)
head(tourism_combined)
Use the following graphics functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() and explore features from the following time series: “Total Private” Employed from us_employment, Bricks from aus_production, Hare from pelt, “H02” Cost from PBS, and Barrels from us_gasoline.
There is a strong, positive overall trend across the whole series (roughly 1940 to 2020).
private_employment <-
us_employment |>
filter(Title == "Total Private")
private_employment |>
autoplot(Employed)
Plotting the current value \(y_t\) against its lagged value \(y_{t-k}\) for several lags k again shows a strong positive relationship at every lag displayed.
private_employment |>
gg_lag(Employed, geom = "point")
Another check is the autocorrelation at each lag. The autocorrelations decrease slowly as the lag increases but stay positive, which confirms the trend we are seeing. They are also significantly different from zero, so the series is not white noise.
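The autocorrelation check described above can be sketched as follows. This assumes the default lag settings of `ACF()` from feasts (attached by fpp3); the object name `emp_acf` is illustrative:

```r
library(fpp3)

private_employment <-
  us_employment |>
  filter(Title == "Total Private")

# Autocorrelation of Employed at each lag (default number of lags)
emp_acf <-
  private_employment |>
  ACF(Employed)

# Correlogram; the dashed lines mark the approximate white-noise bounds
emp_acf |>
  autoplot()
```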
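Between 1960 and 1980 there was a strong positive trend, followed by a cycle in which production drops steeply and then begins to increase again. After 1980 the series is cyclic, with growth and drops in production recurring every 5-7 years. Because these swings do not have a fixed calendar period, they are cyclic behaviour rather than seasonality.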
brick_production <-
aus_production |>
select(Quarter, Bricks)
brick_production |>
autoplot(Bricks)
Breaking the series down by year and comparing quarters, brick production increases from Q1 to Q2/Q3, then tapers off and drops in Q4. This indicates a seasonal pattern within each year, with production peaking mid-year. Since this is Australian data, Q2 and Q3 fall in autumn and winter, so the mid-year peak is not a warm-weather effect.
brick_production |>
gg_season(Bricks)
In the lag plots, lag 1 shows a strong positive linear relationship. As the lag increases, the points spread out more widely around the diagonal, so the relationship weakens at longer lags.
brick_production |>
gg_lag(Bricks, geom = "point")
There is no clear trend over the years. The series is cyclic: roughly every five years there is a steep drop in trades followed by growth again. Because the data are annual, this is cyclic behaviour rather than seasonality.
hare_trades <-
pelt |>
select(Year, Hare)
hare_trades |>
autoplot(Hare)
The lag plots show no strong linear relationship between the current value and most of its lags, which is consistent with the absence of a trend noted earlier.
hare_trades |>
gg_lag(Hare, geom = "point")
The autocorrelation plot shows the peaks and drops about every five lags (years) that we saw earlier, confirming the cyclic pattern.
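The autocorrelation plot referred to above can be produced along these lines. This is a sketch that assumes the default lag settings of `ACF()`; the object name `hare_acf` is illustrative:

```r
library(fpp3)

hare_trades <-
  pelt |>
  select(Year, Hare)

# Autocorrelation of Hare at each yearly lag
hare_acf <-
  hare_trades |>
  ACF(Hare)

# Correlogram; alternating runs of positive and negative
# autocorrelations point to cyclic behaviour
hare_acf |>
  autoplot()
```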