DATA 624 - HOMEWORK 1
library(tidyverse)
library(fpp2)
library(readxl)
library(rio)
library(gridExtra)
library(ggpubr)
library(TSstudio)
1 Question - 2.1
Use the help function to explore what the series gold, woolyrnq and gas represent.
gold
woolyrnq
gas
1.1 a.
Use autoplot() to plot each of these in separate plots.
gold
autoplot(gold) +
ylab('Price in US Dollars') +
ggtitle('Time Series Autoplot: gold\nDaily Morning Gold Prices')
woolyrnq
autoplot(woolyrnq) +
ylab('Woollen Yarn Production in Tonnes') +
ggtitle('Time Series Autoplot: woolyrnq\nQuarterly Production of Woollen Yarn in Australia')
gas
autoplot(gas) +
ylab('Gas Production') +
ggtitle('Time Series Autoplot: gas\nAustralian Monthly Gas Production')
1.2 b.
What is the frequency of each series? Hint: apply the frequency() function.
gold
Answer: Frequency: 1. The observations are daily, but the series has no seasonal period defined, so frequency() returns 1.
## [1] 1
woolyrnq
Answer: Frequency: Quarterly
## [1] 4
gas
Answer: Frequency: Monthly
## [1] 12
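As a small illustration of what frequency() reports, the sketch below builds three toy series with base R's ts(). The object names and values are made up for this example and are not the gold, woolyrnq or gas data. It suggests that frequency() returns the number of observations per seasonal period stored in the object, which defaults to 1 when none is given.

```r
# Toy series (hypothetical values), built with base R only
no_period <- ts(c(5, 7, 6, 8))                               # frequency not specified
quarterly <- ts(1:8,  start = c(2000, 1), frequency = 4)     # 4 observations per year
monthly   <- ts(1:24, start = c(2000, 1), frequency = 12)    # 12 observations per year

# frequency() reads back the seasonal period stored in each object
print(frequency(no_period))
print(frequency(quarterly))
print(frequency(monthly))
```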
1.3 c.
Use which.max() to spot the outlier in the gold series. Which observation was it?
gold
Answer: The outlier is the 770th observation, with a value of 593.7.
## [1] 770
## [1] 593.7
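The two values above can be obtained with which.max() followed by indexing. A minimal sketch on a made-up series follows; only the value 593.7 is taken from the answer, and the other numbers, including the missing value, are assumptions for illustration. which.max() discards missing values, so a series containing NA can be passed directly.

```r
# Hypothetical short series with one missing value and one obvious spike
x <- ts(c(380.2, 382.1, NA, 379.5, 593.7, 381.0))

idx <- which.max(x)   # position of the largest non-missing value
print(idx)
print(x[idx])         # the value at that position
```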
2 Question - 2.2
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
2.1 a.
You can read the data into R with the following script:
tute1 <- read.csv('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/HW_1-WEEK_2/tute1.csv', header = TRUE)
#view(tute1)
tute1
2.2 b.
Convert the data to time series
## Sales AdBudget GDP
## 1981 Q1 1020.2 659.2 251.8
## 1981 Q2 889.2 589.0 290.9
## 1981 Q3 795.0 512.5 290.8
## 1981 Q4 1003.9 614.1 292.4
## 1982 Q1 1057.7 647.2 279.1
## 1982 Q2 944.4 602.0 254.0
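A conversion that produces output of this shape can be sketched with base R's ts(). The data frame below is a stand-in rebuilt from the six rows printed above, and the name mytimeseries is an assumption. With the full tute1 data, the first (date) column would need to be dropped first, for example with tute1[, -1].

```r
# Stand-in for tute1 without its date column: the six rows printed above
tute1_head <- data.frame(
  Sales    = c(1020.2, 889.2, 795.0, 1003.9, 1057.7, 944.4),
  AdBudget = c(659.2, 589.0, 512.5, 614.1, 647.2, 602.0),
  GDP      = c(251.8, 290.9, 290.8, 292.4, 279.1, 254.0)
)

# Quarterly series starting in 1981 Q1
mytimeseries <- ts(tute1_head, start = 1981, frequency = 4)
print(mytimeseries)
```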
2.3 c.
Construct time series plots of each of the three series
Check what happens when you don’t include facets=TRUE
3 Question - 2.3
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
3.1 a.
You can read the data into R with the following script:
#retaildata <- read_excel('retail.xlsx', skip=1)
retaildata <- import('https://raw.githubusercontent.com/oggyluky11/DATA624-SPRING-2021/main/HW_1-WEEK_2/retail.xlsx', skip=1)
retaildata
3.2 b.
Select one of the time series as follows (but replace the column name with your own chosen column):
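A minimal sketch of this selection step, using a made-up data frame in place of retaildata. The column name series_a, the values, and the April 1982 start month are assumptions for illustration; with the real data, the chosen column name from the spreadsheet would be used instead.

```r
# Hypothetical stand-in for the imported spreadsheet: a month column
# plus one made-up series column
n_obs <- 24
retail_demo <- data.frame(
  Month    = seq(as.Date("1982-04-01"), by = "month", length.out = n_obs),
  series_a = round(100 + 2 * seq_len(n_obs) + 10 * sin(2 * pi * seq_len(n_obs) / 12), 1)
)

# Pick one column and declare it monthly (start month is an assumption)
myts <- ts(retail_demo[, "series_a"], frequency = 12, start = c(1982, 4))
print(window(myts, end = c(1982, 12)))
```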
3.3 c.
Explore your chosen retail time series using the following functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
autoplot()
ggseasonplot()
ggsubseriesplot()
gglagplot()
ggAcf()
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
Seasonality is visible in the lag plot: sales are strongly positively related to their lagged values from lag 1 through lag 16, most tightly at lag 12, which reflects very strong seasonality in the data.
Trend is visible in the ACF plot: the autocorrelations are positive and decrease slowly as the lag increases, which is typical of a trended series. The correlations are significantly different from zero, confirming that the series is not white noise.
No cyclicity is visible in the autoplot, as there is no obvious business cycle lasting at least 2 years.
From the plots above, we learn that this time series has an increasing trend and strong seasonality in monthly data (frequency = 12), and covers the years 1982 to 2013.
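The reading of the ACF above can be checked on simulated data. The sketch below uses base R's acf() rather than ggAcf(), and the series is synthetic (a linear trend plus a 12-month sine wave plus noise, all values assumed), not the retail data. Under these assumptions the autocorrelations stay positive because of the trend and are higher at lag 12 than at lag 6 because of the seasonality.

```r
set.seed(1)
n  <- 240
tt <- seq_len(n)

# Synthetic monthly series: trend + seasonal pattern + noise (assumed values)
x <- ts(0.5 * tt + 30 * sin(2 * pi * tt / 12) + rnorm(n), frequency = 12)

# Sample autocorrelations for lags 1..24 (the first element is lag 0, dropped)
r <- acf(x, lag.max = 24, plot = FALSE)$acf[-1]

# Compare a half-period lag with full-period lags
print(round(r[c(6, 12, 18, 24)], 2))
```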
4 Question - 2.6
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
a. Can you spot any seasonality, cyclicity and trend?
b. What do you learn about the series?
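# Arrange four diagnostic plots for one time series in a single figure
tsplot <- function(ts){
  # Capture the name of the object passed in, for use in the plot titles
  ts_name <- deparse(substitute(ts))
  # Evaluate a plot expression; if it fails (e.g. the series is not seasonal),
  # return a blank placeholder plot instead
  snl_err_handle <- function(fn){
    return(
      tryCatch({fn},
               error = function(e){ggplot()+ggtitle('Seasonality plot not applicable') })
    )
  }
  return(
    ggarrange(
      # Left column: time plot, subseries plot and ACF plot, stacked
      ggarrange(
        autoplot(ts)+ggtitle(paste0('autoplot: ',ts_name)),
        snl_err_handle(ggsubseriesplot(ts)+ggtitle(paste0('ggsubseriesplot: ',ts_name))),
        ggAcf(ts)+ggtitle(paste0('ACF plot: ',ts_name)),
        nrow=3
      ),
      # Right column: polar seasonal plot
      snl_err_handle(ggseasonplot(ts,polar = TRUE)+ggtitle(paste0('ggseasonplot: ',ts_name))),
      ncol=2
    )
  )
}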
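The error handling in tsplot relies on R's lazy evaluation: the plot expression passed to snl_err_handle is only evaluated inside tryCatch(), so a failure there can be replaced by a placeholder. A stripped-down sketch of that pattern in base R, where with_fallback is a hypothetical helper name used only for this illustration:

```r
# Evaluate `expr` lazily inside tryCatch(); return `fallback` if it errors
with_fallback <- function(expr, fallback) {
  tryCatch(expr, error = function(e) fallback)
}

ok     <- with_fallback(1 + 1, "placeholder")                  # expression succeeds
failed <- with_fallback(stop("not seasonal"), "placeholder")   # expression errors

print(ok)
print(failed)
```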
4.1 hsales
a. Can you spot any seasonality, cyclicity and trend?
Answer:
Seasonality can be spotted, with peaks in March and troughs in December.
Cyclicity can be spotted, with a business cycle of 6-9 years.
There is no apparent trend in the data over this period.
b. What do you learn about the series?
Answer:
This series has a seasonal pattern with peaks every March and troughs every December. It also shows cycles of 6-9 years: sales dropped to a trough in 1975, rose to a peak in 1978, dropped to a trough in 1982, peaked again in 1986, reached another trough in 1991, and then increased again. No apparent trend is observed over this period.
4.2 usdeaths
a. Can you spot any seasonality, cyclicity and trend?
Answer:
Seasonality can be spotted, with peaks in July and troughs in February.
There is no apparent cyclicity in the data over this period.
There is no apparent trend in the data over this period.
b. What do you learn about the series?
Answer:
This series has a seasonal pattern with peaks every July and troughs every February. According to the ACF plot, the correlations show no slow decay, which indicates no trend in the data. There is also no long-period cyclic pattern, so the series shows no apparent cyclicity over this period.
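The seasonal peak and trough can also be checked numerically by averaging each calendar month. The sketch below uses base R's built-in USAccDeaths series, which appears to cover the same monthly US accidental deaths data as usdeaths. That equivalence is an assumption here, not something verified in this report.

```r
# Base R dataset: monthly accidental deaths in the USA, 1973-1978
# (assumed to match fpp2's usdeaths)
deaths <- USAccDeaths

# Average of each calendar month across years; cycle() gives the month index 1..12
monthly_mean <- tapply(deaths, cycle(deaths), mean)

peak_month   <- unname(which.max(monthly_mean))
trough_month <- unname(which.min(monthly_mean))
print(month.name[c(peak_month, trough_month)])
```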
4.3 bricksq
a. Can you spot any seasonality, cyclicity and trend?
Answer:
Seasonality can be spotted, with peaks in Q3 and troughs in Q1.
Cyclicity can be spotted after 1975, with a business cycle of approximately 9 years.
An increasing trend is visible in the data over this period.
b. What do you learn about the series?
Answer:
This series has a seasonal pattern with peaks every Q3 and troughs every Q1. It also shows cycles of roughly 9 years, starting in 1975. The ACF plot shows correlations that are positive and slowly decreasing, which signals a significant trend.
4.4 sunspotarea
a. Can you spot any seasonality, cyclicity and trend?
Answer:
Seasonality cannot be spotted from the plots.
Cyclicity can be spotted, with a cycle of approximately 11-13 years.
Trend is not significant in the data over this period.
b. What do you learn about the series?
Answer:
This series does not show apparent seasonality. However, according to the autoplot, it has cycles of 11-13 years. The ACF plot shows no significant trend in the data during this period.
4.5 gasoline
a. Can you spot any seasonality, cyclicity and trend?
Answer:
Seasonality can be spotted in the ACF plot, with peaks at the end of each year and troughs in the middle of each year.
There is no apparent cyclicity in the data.
A trend is visible in the data over this period.
b. What do you learn about the series?
Answer:
This series shows annual seasonality, with peaks at the end of each year and troughs in the middle of each year. It does not show apparent cyclicity. The ACF plot shows decreasing correlations, which indicates a significant trend in the data during this period.