2.3
Download the file tute1.csv from the book website, open it in Excel (or
some other spreadsheet application), and review its contents. You should
find four columns of information. Columns B through D each contain a
quarterly series, labelled Sales, AdBudget and GDP. Sales contains the
quarterly sales for a small company over the period 1981-2005. AdBudget
is the advertising budget and GDP is the gross domestic product. All
series have been adjusted for inflation.
a.
You can read the data into R with the following script:
# tute1_csv is assumed to hold the path to the downloaded tute1.csv
df_tute1 <- readr::read_csv(tute1_csv)
head(df_tute1,20)
b.
Convert the data to time series
mytimeseries <- df_tute1 |>
mutate(Quarter = yearquarter(Quarter)) |>
as_tsibble(index = Quarter)
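Before running the conversion on the full file, it can help to try the same steps on a tiny hand-made table. This is a minimal sketch, assuming the dplyr and tsibble packages are installed; the data frame toy and its values are made up for illustration and are not taken from tute1.csv.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: four quarters stored as date strings
toy <- data.frame(
  Quarter = c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01"),
  Sales   = c(1020.2, 889.2, 795.0, 1003.9)
)

toy_ts <- toy |>
  mutate(Quarter = yearquarter(Quarter)) |>  # character -> yearquarter
  as_tsibble(index = Quarter)                # single series, so no key

is_tsibble(toy_ts)  # checks that the result is a tsibble
has_gaps(toy_ts)    # reports whether any quarter is missing
```

If has_gaps() reports a gap on the real data, a quarter is missing or the Quarter column was parsed incorrectly.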
c.
Construct time series plots of each of the three series
mytimeseries |>
pivot_longer(-Quarter) |>
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")

mytimeseries %>%
pivot_longer(-Quarter)%>%
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() #+

# facet_grid(name ~ ., scales = "free_y")
Without facet_grid(), all three series are drawn in a single panel and
share one y-axis.
2.4
The USgas package contains data on the demand for natural gas in the US.
- Install the USgas package.
- Create a tsibble from us_total with year as the index and state as the key.
- Plot the annual natural gas consumption by state for the New England
area (comprising the states of Maine, Vermont, New Hampshire,
Massachusetts, Connecticut and Rhode Island).
i
str(USgas::us_total)
'data.frame': 1266 obs. of 3 variables:
$ year : int 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 ...
$ state: chr "Alabama" "Alabama" "Alabama" "Alabama" ...
$ y : int 324158 329134 337270 353614 332693 379343 350345 382367 353156 391093 ...
ii
Example template from Forecasting: Principles & Practice, 2.1 tsibble objects
mydata <- tsibble(
year = 2015:2019,
y=c(123,39,78,52,110),
index = year
)
mydata
# Convert the data frame directly; USgas:: avoids relying on the package being attached
mydata <- USgas::us_total %>%
rename(value = y) %>%
as_tsibble(index = year, key = state) %>%
filter(state %in% c("Maine", "Vermont", "New Hampshire", "Massachusetts", "Connecticut", "Rhode Island"))
iii
ggplot(mydata, aes(x = year, y = value, color = state)) +
geom_line() +
labs(
title = "Annual Natural Gas Consumption for New England Area (by state)",
x = "Year",
y = "Natural Gas Consumption",
color = "State"
)

2.5
a.
Download tourism.xlsx from the book website and read it into R
using readxl::read_excel().
PATH<-"C:/Users/Lenny/Documents/GitableGabe/Data624_Data/"
tourism_str <- paste(PATH,"tourism.xlsx", sep = "")
df_tourism <- readxl::read_excel(tourism_str)
rm(tourism_str)
tourism
b.
Create a tsibble which is identical to the tourism
tsibble from the tsibble package.
str(df_tourism)
tibble [24,320 × 5] (S3: tbl_df/tbl/data.frame)
$ Quarter: chr [1:24320] "1998-01-01" "1998-04-01" "1998-07-01" "1998-10-01" ...
$ Region : chr [1:24320] "Adelaide" "Adelaide" "Adelaide" "Adelaide" ...
$ State : chr [1:24320] "South Australia" "South Australia" "South Australia" "South Australia" ...
$ Purpose: chr [1:24320] "Business" "Business" "Business" "Business" ...
$ Trips : num [1:24320] 135 110 166 127 137 ...
Example template from Forecasting: Principles & Practice, 2.1 tsibble objects
prison <- readr::read_csv("data/prison_population.csv") %>%
mutate(Quarter = yearquarter(date)) %>%
select(-date) %>%
as_tsibble(
index = Quarter,
key=c(state,gender,legal,indigenous)
)
tibble_tourism <- df_tourism %>%
mutate(Quarter = yearquarter(Quarter)) %>%
as_tsibble(index=Quarter,
key = c("Region", "State", "Purpose"))
tibble_tourism
c.
Find what combination of Region and Purpose had the maximum number of
overnight trips on average.
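# Drop to a plain tibble first: on a tsibble, summarise() keeps the Quarter
# index, so the mean would be taken per quarter rather than over all quarters
tibble_tourism %>%
as_tibble() %>%
group_by(Region, Purpose) %>%
summarise(TripsAvg = mean(Trips), .groups = "drop") %>%
filter(TripsAvg == max(TripsAvg))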
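Averaging "over time" on a tsibble is easy to get wrong, because summarise() on a tsibble always keeps the index. The sketch below shows the difference on a made-up four-row table, assuming dplyr and tsibble are installed; toy, Region "A"/"B" and the Trips values are hypothetical and are not from tourism.xlsx.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two regions observed over two quarters
toy <- tsibble(
  Quarter = yearquarter(rep(c("2020 Q1", "2020 Q2"), times = 2)),
  Region  = rep(c("A", "B"), each = 2),
  Trips   = c(10, 30, 5, 7),
  index   = Quarter,
  key     = Region
)

# On a tsibble the index is kept, so nothing is averaged across quarters
by_quarter <- toy |>
  group_by(Region) |>
  summarise(TripsAvg = mean(Trips))
nrow(by_quarter)  # one row per Region and Quarter

# Converting to a tibble first averages over all quarters
overall <- toy |>
  as_tibble() |>
  group_by(Region) |>
  summarise(TripsAvg = mean(Trips))
nrow(overall)     # one row per Region
```

The same reasoning applies in the other direction for part d: keeping the tsibble there is deliberate, since the totals by State are meant to stay a time series.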
d.
Create a new tsibble which combines the Purposes and
Regions, and just has total trips by State.
tibble_tourism_v2 <- tibble_tourism %>%
group_by(State)%>%
summarize(Total=sum(Trips))
tibble_tourism_v2
2.8
Use the following graphics functions: autoplot(), gg_season(),
gg_subseries(), gg_lag(), ACF() and explore features from the
following time series: “Total Private” Employed from us_employment,
Bricks from aus_production, Hare from pelt, “H02” Cost from PBS,
and Barrels from us_gasoline.
- Can you spot any seasonality, cyclicity and trend?
- What do you learn about the series?
- What can you say about the seasonal patterns?
- Can you identify any unusual years?
Total Private
Example
vic_elec |> gg_season(Demand, period = "day") +
theme(legend.position = "none") +
labs(y="MWh", title="Electricity demand: Victoria")
us_employment
us_employment%>%
filter(Title=="Total Private")%>%
autoplot(Employed)

us_employment%>%
filter(Title=="Total Private")%>%
gg_season(Employed, polar = FALSE)

us_employment%>%
filter(Title=="Total Private")%>%
gg_season(Employed, polar = TRUE)

us_employment%>%
filter(Title=="Total Private")%>%
gg_subseries(Employed)

us_employment%>%
filter(Title=="Total Private")%>%
gg_lag(Employed)

us_employment%>%
filter(Title=="Total Private")%>%
ACF(Employed)%>%
autoplot()

i
The data shows a clear, gradual upward trend.
ii
Growth has been consistent without any extreme spike or drop.
iii
No seasonality is noted, which suggests that no particular season has
a positive or negative effect on employment.
iv
There is a small dip around 2010, which I believe aligns with the
recession.
Bricks
aus_production%>%
select(Bricks)%>%
autoplot(Bricks)

aus_production%>%
select(Bricks)%>%
gg_season( polar = FALSE)
Plot variable not specified, automatically selected `y = Bricks`

aus_production%>%
select(Bricks)%>%
gg_season( polar = TRUE)
Plot variable not specified, automatically selected `y = Bricks`

aus_production%>%
select(Bricks)%>%
gg_subseries()
Plot variable not specified, automatically selected `y = Bricks`

aus_production%>%
select(Bricks)%>%
gg_lag()
Plot variable not specified, automatically selected `y = Bricks`Warning: Removed 20 rows containing missing values (gg_lag).

aus_production%>%
select(Bricks)%>%
ACF(Bricks)%>%
autoplot()

i
There is a lot of cyclicity, with frequent spikes and dips, but the
cycles do not appear to follow a fixed period. There is an upward
trend over the long term.
ii
Because the data is quarterly, the resolution may limit how well we can
assess seasonality. Even so, a seasonal pattern does seem to be present.
iii
There seems to be some seasonality, most visibly in Q1 and Q3.
iv
There is a significant dip in the early 1980s, and I would be curious
to understand what may have caused it.
Hare
pelt%>%
select(Hare)%>%
autoplot(Hare)

# gg_season() is not possible here: the data is annual, so there is no seasonal period to plot
# pelt%>%
# select(Hare)%>%
# gg_season( polar = FALSE)
#
# pelt%>%
# select(Hare)%>%
# gg_season( polar = TRUE)
pelt%>%
select(Hare)%>%
gg_subseries()
Plot variable not specified, automatically selected `y = Hare`

pelt%>%
select(Hare)%>%
gg_lag()
Plot variable not specified, automatically selected `y = Hare`

pelt%>%
select(Hare)%>%
ACF()%>%
autoplot()
Response variable not specified, automatically selected `var = Hare`

i
The data is definitely cyclical, but it is not possible to tell whether
there is seasonality because the data is annual.
ii
The data shows no trend and varies a great deal, but it seems to follow
a pattern at roughly a 5-year interval.
iv
I'm curious what caused the peak in the early 1860s.
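One way to put a number on a cycle length is to read it off the autocorrelation function rather than judging it by eye. The sketch below uses only base R and a simulated series; the 10-step cycle, the noise level and the variable names are assumptions chosen for illustration and say nothing about the pelt data itself.

```r
# Hypothetical annual series with a 10-step cycle plus noise (not the pelt data)
set.seed(1)
n <- 90
x <- 50 + 30 * sin(2 * pi * seq_len(n) / 10) + rnorm(n, sd = 5)

# Sample autocorrelations for lags 0..20 (base R, no plot)
lags <- 0:20
r <- acf(x, lag.max = 20, plot = FALSE)$acf[, 1, 1]

# The most negative autocorrelation sits near half the cycle length
trough_lag <- lags[which.min(r)]

# The largest autocorrelation after that trough sits near the full cycle length
after_trough <- lags > trough_lag
peak_lag <- lags[after_trough][which.max(r[after_trough])]

c(trough = trough_lag, peak = peak_lag)
```

For a cyclic series, the lag of the first trough in the ACF plot is therefore about half the cycle, and the lag of the following peak is about the whole cycle. The same reading can be applied to the ACF() plot above.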
Cost
PBS%>%
filter(ATC2=="H02")%>%
autoplot(Cost)

PBS %>%
filter(ATC2 == "H02") %>%
gg_season(Cost, polar = FALSE)

PBS %>%
filter(ATC2 == "H02") %>%
gg_subseries(Cost)

# PBS %>%
# filter(ATC2 == "H02") %>%
# gg_lag(Cost)
PBS %>%
filter(ATC2 == "H02") %>%
ACF(Cost)%>%
autoplot()

i
The data is hard to interpret, but it appears to trend upwards with
both cyclicity and seasonality.
ii
The data is very volatile, but the spikes appear mainly at the end of
the year.
iii
The seasonality is at the end of the year.
iv
No year stands out, outside of the latest year having the highest
cost.
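A claim such as "the spikes are at the end of the year" can be checked by averaging the series within each calendar month, which is what the horizontal mean lines in gg_subseries() summarise. The sketch below does this in base R on a simulated series; the December spike, the trend and all names are assumptions for illustration and are not the PBS data.

```r
# Hypothetical monthly series with a December spike (not the PBS data)
set.seed(42)
months <- rep(month.abb, times = 8)        # 8 years of monthly data
cost <- 100 + 0.5 * seq_along(months) +    # gentle upward trend
  ifelse(months == "Dec", 40, 0) +         # year-end spike
  rnorm(length(months), sd = 5)            # noise

# Average level per calendar month, kept in calendar order
month_means <- tapply(cost, factor(months, levels = month.abb), mean)
round(month_means, 1)

# Month with the highest average level
peak_month <- names(which.max(month_means))
peak_month
```

If one month's mean stands clearly above the rest, the seasonal peak is located in that month.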
Barrels
us_gasoline%>%
select(Barrels)%>%
autoplot()
Plot variable not specified, automatically selected `.vars = Barrels`

us_gasoline%>%
select(Barrels)%>%
gg_season(polar = FALSE)
Plot variable not specified, automatically selected `y = Barrels`

us_gasoline%>%
select(Barrels)%>%
gg_season( polar = TRUE)
Plot variable not specified, automatically selected `y = Barrels`

us_gasoline%>%
select(Barrels)%>%
gg_subseries()
Plot variable not specified, automatically selected `y = Barrels`

us_gasoline%>%
select(Barrels)%>%
gg_lag()
Plot variable not specified, automatically selected `y = Barrels`

us_gasoline %>%
select(Barrels) %>%
ACF()%>%
autoplot()
Response variable not specified, automatically selected `var = Barrels`

i
The series shows primarily an upward trend, with a dip near the most recent year.
ii
It's possible the barrels value is impacted by supply.
iii
There does not appear to be seasonality or cyclicity.
iv
The most recent dip is interesting, and I wonder if it's just a data
collection issue.
