Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
#Loading required R package and datasets:
library(fpp3)
data("aus_production")
data("pelt")
data("gafa_stock")
data("vic_elec")
Use ? (or help()) to find out about the data in each series.
help("aus_production")
What is the time interval of each series?
aus_production: quarterly (every three months)
pelt: yearly (every 12 months)
gafa_stock: daily, but on trading days only, so the index is irregular (shown as [!] in the tsibble header)
vic_elec: half-hourly (every 30 minutes)
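The intervals can also be checked programmatically rather than read off the help pages. The following is a minimal sketch, assuming the fpp3 packages are installed; it uses `interval()` and `is_regular()` from tsibble, with the namespace written out because lubridate also exports a function called `interval()`.

```r
library(fpp3)

# Report the interval that tsibble has detected for each data set
tsibble::interval(aus_production)
tsibble::interval(pelt)
tsibble::interval(gafa_stock)
tsibble::interval(vic_elec)

# gafa_stock only has rows for trading days, so it is stored as irregular
tsibble::is_regular(gafa_stock)
tsibble::is_regular(vic_elec)
```

An irregular index matters later on: functions that expect evenly spaced observations may need the series to be re-indexed or have its gaps filled first.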
Use autoplot() to produce a time plot of each series.
autoplot(aus_production, Bricks) +
ggtitle("Quarterly Production of Bricks in Australia")
autoplot(pelt, Lynx) +
ggtitle("Canadian Lynx Pelts Traded 1845-1935")
autoplot(gafa_stock, Close) +
ggtitle("Closing GAFA Stock Prices from 2014-2018")
autoplot(vic_elec, Demand) +
ggtitle("Electricity Demand for Victoria, Australia")
For the last plot, modify the axis labels and title.
library(ggplot2)
autoplot(vic_elec, Demand, color = "blue") +
labs(title = "Electricity Demand in Victoria", x = "Time", y = "Demand (MW)") # Time plot of electricity demand in Victoria, with the variable named explicitly
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
library(dplyr)
data(gafa_stock)
gafa_stock %>%
group_by(Symbol) %>%
filter(Close == max(Close)) %>% # Within each Symbol group, keep the rows where Close equals that stock's maximum
select(Symbol, Date, Close)
## # A tsibble: 4 x 3 [!]
## # Key: Symbol [4]
## # Groups: Symbol [4]
## Symbol Date Close
## <chr> <date> <dbl>
## 1 AAPL 2018-10-03 232.
## 2 AMZN 2018-09-04 2040.
## 3 FB 2018-07-25 218.
## 4 GOOG 2018-07-26 1268.
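As a cross-check on the result above, the same peak days can be found with `slice_max()` from dplyr. This is only an alternative sketch, not the approach the exercise asks for; it converts the tsibble to a plain tibble first, which is an assumption made here to keep the row slicing simple.

```r
library(fpp3)

# Alternative sketch: take the row with the highest Close for each stock
peaks <- gafa_stock %>%
  as_tibble() %>%                 # drop the tsibble structure before slicing
  group_by(Symbol) %>%
  slice_max(Close, n = 1) %>%     # ties, if any, are kept by default
  ungroup() %>%
  select(Symbol, Date, Close)

peaks
```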
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
###a. You can read the data into R with the following script:
tute1 <- readr::read_csv("tute1.csv")
View(tute1)
###b. Convert the data to time series
mytimeseries <- tute1 %>%
mutate(Quarter=yearquarter(Quarter)) %>%
as_tsibble(index = Quarter)
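The conversion can be tried without the CSV file on a small stand-in data frame. In the sketch below, `toy` and all of its values are made up for illustration; only the column names follow tute1.csv, and storing Quarter as a date is an assumption about how the file is read in.

```r
library(fpp3)

# Hypothetical stand-in for tute1.csv: invented values, same column names
toy <- tibble(
  Quarter  = as.Date(c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01")),
  Sales    = c(1000, 1050, 980, 1100),
  AdBudget = c(500, 520, 490, 530),
  GDP      = c(270, 272, 271, 275)
)

# Turn the date column into a quarterly index, then declare it as the index
toy_ts <- toy %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)

is_tsibble(toy)            # the raw data frame is not a tsibble
is_tsibble(toy_ts)         # the converted object is
tsibble::interval(toy_ts)  # interval detected from the yearquarter index
```

Checking `is_tsibble()` after the conversion is a quick way to confirm that the object carries an index, since `as_tibble()` would silently return an ordinary tibble.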
###c. Construct time series plots of each of the three series
mytimeseries %>%
pivot_longer(-Quarter) %>%
ggplot(aes(x = Quarter, y = value, colour = name)) +
geom_line() +
facet_grid(name ~ ., scales = "free_y")
Check what happens when you don’t include facet_grid()
mytimeseries %>%
pivot_longer(-Quarter) %>%
ggplot(aes(x = Quarter, y = value, colour = name)) + geom_line()
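Without facet_grid(), all three series are drawn in a single panel and share one y-axis.
This makes the plot hard to interpret when the series have different magnitudes, because the smaller series are compressed and their variation is hard to see.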
The aus_arrivals data set comprises quarterly international arrivals to Australia from Japan, New Zealand, UK and the US.
Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries.
autoplot(aus_arrivals, Arrivals) +
xlab("Year") + ylab("") +
ggtitle("Total Quarterly Arrivals to Australia")
gg_season(aus_arrivals, Arrivals)
gg_subseries(aus_arrivals, Arrivals)
###b. Can you identify any unusual observations?
The UK series appears to be the most seasonal, with arrivals high in Q1 and Q4. US arrivals also peak in Q1 and Q4.