1) Use the help function to explore what the series gafa_stock, PBS, vic_elec and pelt represent.

gafa_stock:

Historical stock prices for Google, Amazon, Facebook and Apple in $USD from 2014 to 2018. A time series tsibble containing the opening price, the highest trading price, the lowest price, the closing price, the adjusted closing price and the volume traded.

PBS:

Monthly Medicare Australia prescription data. A monthly tsibble with the total number of scripts and the cost of scripts in $AUD.

vic_elec:

Half-hourly electricity demand for Victoria, Australia. A half-hourly tsibble with total electricity demand, temperature in Melbourne, and an indicator of whether the day is a public holiday.

pelt:

Pelt trading records of the Hudson Bay Company for Snowshoe Hare and Canadian Lynx from 1845 to 1935. A time series tsibble with the number of snowshoe hare pelts traded and the number of lynx pelts traded.

a. Use autoplot() to plot some of the series in these data sets.

I’ll look specifically at adjusted closing price for the gafa_stock dataset to give the best idea of stock performance over time.

For the PBS dataset, the keys disaggregate the data into too many series to read on one plot, so I will only look at the cost of scripts over time for concessional scripts paid via co-payments with ATC1 = A and ATC2 = A01.

For the vic_elec dataset I created a ratio of demand to temperature to get a better understanding of how demand changes with temperature, rather than looking at each one separately over time. I also limited the plot to a single week so the pattern within the week is easier to see.

For the pelt dataset I wanted to see what it would default to if I did not give the autoplot function any parameters. It defaulted to Hare pelt sales over time with no title. This is not a good way to use the autoplot function.

b. What is the time interval of each series?

gafa_stock is a daily time series (trading days only, so the tsibble reports it as irregular, shown as [!]). PBS is a monthly time series. vic_elec is a half-hourly time series. pelt is a yearly time series.
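
As a quick cross-check on the intervals above, here is a minimal sketch, assuming the tsibble and tsibbledata packages are installed. interval() reports the interval that tsibble detected for each data set, and is_regular() shows which ones are stored as irregular. The names data_sets, interval_labels and regular are my own and are only for illustration.

```r
library(tsibble)
library(tsibbledata)

data_sets <- list(gafa_stock = gafa_stock, PBS = PBS,
                  vic_elec = vic_elec, pelt = pelt)

# interval() returns the interval tsibble detected for each data set;
# format() turns it into the short label shown in the tsibble header
interval_labels <- sapply(data_sets, function(x) format(interval(x)))

# is_regular() is FALSE when the observations are not equally spaced
regular <- sapply(data_sets, is_regular)

print(data.frame(interval = interval_labels, regular = regular))
```

gafa_stock should come back as not regular, because weekends and market holidays are missing from the index.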

#help("gafa_stock")
#help(PBS)
#help("vic_elec")
#help("pelt")


autoplot(gafa_stock, Adj_Close) +
  labs(title = "Adjusted Close of GAFA stocks from 2014-2018",
       y = "Adjusted Close Price $USD")

PBS %>%
  filter(Concession == "Concessional", Type == "Co-payments",
         ATC1 == "A", ATC2 == "A01") %>%
  autoplot(Cost) +
  labs(title = "Cost of Concessional Co-payment Medicare Australia Scripts over time",
       subtitle = "ATC1 = A & ATC2 = A01",
       y = "Script Cost $AUD")

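# A single formula gives a closed range. Passing "2012-01-01" and "2012-01-07" ~ .
# as two arguments would keep 1 January plus everything from 7 January onwards.
autoplot(vic_elec %>% filter_index("2012-01-01" ~ "2012-01-07"), Demand/Temperature) +
  labs(title = "Electricity Demand to Temperature Ratio for Victoria Australia",
       subtitle = "January 1 2012 through January 7 2012",
       y = "Electricity Demand to Temperature")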

autoplot(pelt)
## Plot variable not specified, automatically selected `.vars = Hare`
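
Since the default call only shows Hare, here is a minimal sketch of one way to plot both pelt series with a title, assuming the tsibble, tsibbledata, dplyr, tidyr and ggplot2 packages are installed. The names pelt_long, pelt_plot, Species and Pelts are my own choices for illustration.

```r
library(tsibble)
library(tsibbledata)
library(dplyr)
library(tidyr)
library(ggplot2)

# Drop the tsibble class first so this is an ordinary tidyr reshape,
# then stack the Hare and Lynx columns into one long column
pelt_long <- pelt %>%
  as_tibble() %>%
  pivot_longer(c(Hare, Lynx), names_to = "Species", values_to = "Pelts")

# One line per species, with an explicit title and y-axis label
pelt_plot <- ggplot(pelt_long, aes(x = Year, y = Pelts, colour = Species)) +
  geom_line() +
  labs(title = "Hudson Bay Company pelt trading records, 1845-1935",
       y = "Pelts traded")

pelt_plot
```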

2) Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.

AAPL: 2018-10-03, AMZN: 2018-09-04, FB: 2018-07-25, GOOG: 2018-07-26

max_close <- gafa_stock %>%
  group_by(Symbol) %>%
  filter(Close == max(Close))
max_close
## # A tsibble: 4 x 8 [!]
## # Key:       Symbol [4]
## # Groups:    Symbol [4]
##   Symbol Date        Open  High   Low Close Adj_Close   Volume
##   <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
## 1 AAPL   2018-10-03  230.  233.  230.  232.      230. 28654800
## 2 AMZN   2018-09-04 2026. 2050. 2013  2040.     2040.  5721100
## 3 FB     2018-07-25  216.  219.  214.  218.      218. 58954200
## 4 GOOG   2018-07-26 1251  1270. 1249. 1268.     1268.  2405600

3) Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.

a. You can read the data into R with the following script:

tute1 <- readr::read_csv("tute1.csv")
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   Quarter = col_date(format = ""),
##   Sales = col_double(),
##   AdBudget = col_double(),
##   GDP = col_double()
## )
View(tute1)

b. Convert the data to time series

# The data are quarterly, so use yearquarter() to get a quarterly index
mytimeseries <- tute1 %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)

c. Construct time series plots of each of the three series

When facet_grid() is not included, all three series are plotted on the same axes instead of in separate panels.

mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y")

mytimeseries %>%
  pivot_longer(-Quarter) %>%
  ggplot(aes(x = Quarter, y = value, colour = name)) +
  geom_line()

4) The USgas package contains data on the demand for natural gas in the US.

a. Install the USgas package.

b. Create a tsibble from us_total with year as the index and state as the key.

c. Plot the annual natural gas consumption by state for the New England area (comprising the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island).

#install.packages("USgas")
library(USgas)

us_total_tsib <- tsibble::as_tsibble(us_total, index = "year", key = "state")

st <- c("Maine", "Vermont", "New Hampshire", "Massachusetts", "Connecticut", "Rhode Island")

autoplot(filter(us_total_tsib, state %in% st), y) +
  labs(title = "Yearly Total Natural Gas Consumption for New England",
       subtitle = "Maine, Vermont, New Hampshire, Massachusetts, Connecticut and Rhode Island",
       y = "Million Cubic Feet")

5)

a. Download tourism.xlsx from the book website and read it into R using readxl::read_excel().

b. Create a tsibble which is identical to the tourism tsibble from the tsibble package.

c. Find what combination of Region and Purpose had the maximum number of overnight trips on average.

The combination of Sydney (Region) and Visiting (Purpose) had the maximum number of overnight trips on average.

d. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

I’m a little confused by the wording “Create a new tsibble which combines the Purposes and Regions, and just has total trips by State”. I am not sure whether this means a tsibble with only total trips by State, or one that still includes Purpose and Region in the calculation, in which case it would not really be trips by State. I decided to compute total trips by State for each quarter.

tourism <- readxl::read_excel("tourism.xlsx")
tourism2 <- tsibble::tourism

max_night <- aggregate(tourism[, 5], list(tourism$Region, tourism$Purpose), mean) %>%
  arrange(desc(Trips))

head(max_night)
##           Group.1  Group.2    Trips
## 1          Sydney Visiting 747.2700
## 2       Melbourne Visiting 618.8975
## 3          Sydney Business 602.0439
## 4 North Coast NSW  Holiday 587.8966
## 5          Sydney  Holiday 550.3269
## 6      Gold Coast  Holiday 528.3399

state_tourism <- aggregate(tourism[, 5], list(tourism$Quarter, tourism$State), sum)
names(state_tourism)[names(state_tourism) == "Group.1"] <- "Quarter"
names(state_tourism)[names(state_tourism) == "Group.2"] <- "State"

head(state_tourism)
##      Quarter State    Trips
## 1 1998-01-01   ACT 551.0019
## 2 1998-04-01   ACT 416.0256
## 3 1998-07-01   ACT 436.0290
## 4 1998-10-01   ACT 449.7984
## 5 1999-01-01   ACT 378.5728
## 6 1999-04-01   ACT 558.1781
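
The aggregate() calls above return plain data frames, so here is a minimal sketch of how parts b to d could be done with tsibble objects instead, assuming the tsibble and dplyr packages are installed. To keep the sketch runnable without the spreadsheet, tourism_raw is a stand-in built from tsibble::tourism; it is an assumption that it has the same columns as the result of read_excel("tourism.xlsx"). The names tourism_raw, tourism_ts, avg_trips and state_trips are my own.

```r
library(dplyr)
library(tsibble)

# Stand-in for readxl::read_excel("tourism.xlsx") (assumption): a plain
# tibble with Quarter stored as a date rather than a yearquarter
tourism_raw <- tsibble::tourism %>%
  as_tibble() %>%
  mutate(Quarter = as.Date(Quarter))

# b. Rebuild the tsibble: quarterly index, keyed by Region, State and Purpose
tourism_ts <- tourism_raw %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter, key = c(Region, State, Purpose))

# c. Average trips per Region and Purpose; as_tibble() drops the time index
#    so summarise() averages over all quarters
avg_trips <- tourism_ts %>%
  as_tibble() %>%
  group_by(Region, Purpose) %>%
  summarise(Trips = mean(Trips), .groups = "drop") %>%
  arrange(desc(Trips))

# d. Total trips by State; summarise() on a tsibble keeps the Quarter index,
#    so the result is still a tsibble keyed by State
state_trips <- tourism_ts %>%
  group_by(State) %>%
  summarise(Trips = sum(Trips))

head(avg_trips)
state_trips
```

Keeping the result of part d as a tsibble means it can be passed straight to autoplot(state_trips, Trips).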

8) Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):

Explore your chosen retail time series using the following functions: autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() %>% autoplot()

Can you spot any seasonality, cyclicity and trend? What do you learn about the series?

There is a seasonal uptick in turnover that starts in November and peaks in December. This makes sense, as this is the holiday season and the busiest time of the year for retail. There is also a steady upward trend in turnover from year to year, with consistent seasonality. I could identify one area of potential cyclicity: the upward trend seems to flatten out from around 2008 to 2015 before picking up again. A potential explanation is the Great Recession, which occurred around this time and could have caused growth in retail turnover to stagnate.

set.seed(19865)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(myseries, Turnover) +
  labs(title = "Retail Turnover",
       y = "$Million AUD")

gg_season(myseries, Turnover)

gg_subseries(myseries, Turnover)

gg_lag(myseries, Turnover)

# ACF() needs the variable name, and autoplot() takes the ACF output directly
ACF(myseries, Turnover) %>% autoplot()