Data 624 HW1: Time Series
1 HW1: Time Series
Please submit exercises 2.1, 2.2, 2.3 and 2.6 from the Hyndman online Forecasting book. Please submit your RPubs link and attach the .rmd file with your code.
1.1 Ex. 2.1
Use the help function to explore what the series gold, woolyrnq and gas represent.
Use autoplot() to plot each of these in separate plots.
What is the frequency of each series? Hint: apply the frequency() function.
Use which.max() to spot the outlier in the gold series. Which observation was it?
1.1.1 Part a
# Daily morning gold prices in US dollars. 1 January 1985 – 31 March 1989.
autoplot(gold) +
  ggtitle("Daily morning gold prices in US dollars (Jan 1985 to Mar 1989)") +
  xlab("Days") +
  ylab("US Dollar($)")
# Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994.
autoplot(woolyrnq) +
  ggtitle("Quarterly production of woollen yarn in Australia (Mar 1965 – Sep 1994)") +
  xlab("Year") +
  ylab("Tonnes")
# Australian monthly gas production: 1956–1995.
autoplot(gas) +
  ggtitle("Australian monthly gas production (1956–1995)") +
  xlab("Year") +
  ylab("Gas Volume")
1.1.2 Part b
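The chunk that produced the three values below is not shown. It was presumably equivalent to the following sketch, which assumes the forecast package (which ships the gold, woolyrnq and gas series) is installed.

```r
library(forecast)  # provides the gold, woolyrnq and gas series

# frequency() returns the number of observations per seasonal period
frequency(gold)      # daily index, no seasonal period declared
frequency(woolyrnq)  # quarterly
frequency(gas)       # monthly
```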
## [1] 1
## [1] 4
## [1] 12
The frequency of gold is 1.
The frequency of woolyrnq is 4.
The frequency of gas is 12.
1.1.3 Part c
## [1] 770
which.max() identifies observation 770 as the outlier (the maximum value) in the gold series.
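The index can be reproduced, and the corresponding price inspected, along these lines. This is a sketch assuming the forecast package, not necessarily the original chunk; the name `idx` is introduced here for illustration.

```r
library(forecast)

idx <- which.max(gold)  # position of the largest value; missing values are skipped
idx
gold[idx]               # the price recorded at that position
```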
1.2 Ex. 2.2
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
1.2.1 Part a
Read the data into R.
tute1 <- read.csv("https://raw.githubusercontent.com/shirley-wong/Data-624/main/HW1/tute1.csv", header=TRUE)
head(tute1)
1.2.2 Part b
Convert the data to time series
## Sales AdBudget GDP
## 1981 Q1 1020.2 659.2 251.8
## 1981 Q2 889.2 589.0 290.9
## 1981 Q3 795.0 512.5 290.8
## 1981 Q4 1003.9 614.1 292.4
## 1982 Q1 1057.7 647.2 279.1
## 1982 Q2 944.4 602.0 254.0
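The conversion chunk itself is not shown. A minimal sketch that is consistent with the output above follows. It rebuilds the six printed rows as a stand-in data frame (the name `tute1_head` and the date labels in the first column are assumptions) and drops that first column before calling ts().

```r
# Stand-in for the first six rows of tute1. In the real file the first column
# holds dates; the labels used here are placeholders.
tute1_head <- data.frame(
  X        = c("Q1-81", "Q2-81", "Q3-81", "Q4-81", "Q1-82", "Q2-82"),
  Sales    = c(1020.2, 889.2, 795.0, 1003.9, 1057.7, 944.4),
  AdBudget = c(659.2, 589.0, 512.5, 614.1, 647.2, 602.0),
  GDP      = c(251.8, 290.9, 290.8, 292.4, 279.1, 254.0)
)

# Drop the date column and declare a quarterly series starting in 1981 Q1
mytimeseries <- ts(tute1_head[, -1], start = 1981, frequency = 4)
mytimeseries
```

With the full file, the same call on tute1[, -1] would cover the whole 1981 to 2005 period.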
1.2.3 Part c
Construct time series plots of each of the three series
Check what happens when you don’t include facets=TRUE:
- Without facets=TRUE, the three series are drawn together on one shared set of axes instead of in separate panels.
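The plotting calls are not shown. The comparison can be sketched as follows, assuming the forecast and ggplot2 packages and, as a stand-in for the full data, only the six quarters printed in Part b. The names `p_facets` and `p_single` are introduced here for illustration.

```r
library(forecast)
library(ggplot2)

# Stand-in series built from the six quarters shown in the Part b output
mytimeseries <- ts(
  cbind(
    Sales    = c(1020.2, 889.2, 795.0, 1003.9, 1057.7, 944.4),
    AdBudget = c(659.2, 589.0, 512.5, 614.1, 647.2, 602.0),
    GDP      = c(251.8, 290.9, 290.8, 292.4, 279.1, 254.0)
  ),
  start = 1981, frequency = 4
)

p_facets <- autoplot(mytimeseries, facets = TRUE)  # one panel per series
p_single <- autoplot(mytimeseries)                 # all series on shared axes

p_facets
p_single
```

Separate panels are usually easier to read here because the three series sit at different levels.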
1.3 Ex. 2.3
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
1.3.1 Part a
Read the data into R.
# import() is assumed to be rio::import(); this Excel sheet has two header rows
retail <- import("https://raw.githubusercontent.com/shirley-wong/Data-624/main/HW1/retail.xlsx",
                 skip=1)
head(retail)
1.3.2 Part b
Select one of the time series as follows (but replace the column name with your own chosen column):
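The selection chunk is not shown. The textbook's template looks like the sketch below. The column ID `A3349873A` is the book's example and is not necessarily the ID of the Western Australia series plotted in Part c, and the small `retail` data frame is a made-up stand-in so that the snippet runs on its own.

```r
# Made-up stand-in for the imported sheet: 24 placeholder values in one column.
# In the real analysis, `retail` comes from Part a and has one column per series.
retail <- data.frame(A3349873A = seq(100, by = 5, length.out = 24))

# Monthly series starting in April 1982, as in the textbook template
myts <- ts(retail[, "A3349873A"], frequency = 12, start = c(1982, 4))
window(myts, end = c(1982, 6))
```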
1.3.3 Part c
Explore your chosen retail time series using the following functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf().
#autoplot()
autoplot(myts) +
  ggtitle("Turnover-Western Australia-Total(Industry) Time Series") +
  xlab("Time") +
  ylab("Sales")
#ggseasonplot()
ggseasonplot(myts, polar=TRUE) +
  ggtitle("Turnover-Western Australia-Total(Industry) Time Series") +
  ylab("Sales")
#ggsubseriesplot()
ggsubseriesplot(myts) +
  ggtitle("Turnover-Western Australia-Total(Industry) Time Series") +
  ylab("Sales")
#gglagplot()
gglagplot(myts) +
  ggtitle("Turnover-Western Australia-Total(Industry) Time Series") +
  ylab("Sales")
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
The gglagplot shows seasonality: December sales stand clearly apart from the other months.
The ggsubseriesplot marks each month's mean sales with a blue line and also shows higher sales in December.
The ggseasonplot likewise peaks in December every year.
Taken together, the graphs point to a seasonal spending peak in December.
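The exercise also lists ggAcf(), which does not appear among the plots above. The sketch below shows how it could support the seasonality reading. It assumes the forecast package and uses the built-in AirPassengers series as a stand-in for myts (an assumption; any monthly ts object would do).

```r
library(forecast)

myts_demo <- AirPassengers  # stand-in monthly series, not the retail data
p_acf <- ggAcf(myts_demo, lag.max = 36)
p_acf

# The same autocorrelations as numbers: a seasonal monthly series typically
# shows local peaks at lags 12, 24 and 36
acf_vals <- stats::acf(myts_demo, lag.max = 36, plot = FALSE)$acf
acf_vals[c(12, 24, 36) + 1]  # element 1 is lag 0
```

Slowly decaying autocorrelations suggest a trend, while spikes at multiples of 12 suggest yearly seasonality.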
1.4 Ex. 2.6
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
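The plotting chunks for most of these series are not shown. One way to produce the same set of plots for each series is a small helper like the sketch below. The function `explore_series` is hypothetical, and the code assumes the fpp2 package, which attaches forecast, ggplot2 and the datasets.

```r
library(fpp2)  # attaches forecast, ggplot2 and the datasets used in this exercise

# Hypothetical helper: returns whichever of the five plots apply to a series
explore_series <- function(x) {
  plots <- list(time = autoplot(x), lag = gglagplot(x), acf = ggAcf(x))
  if (frequency(x) > 1) {
    # Seasonal plots need a seasonal period; if one still fails for a given
    # series, that plot is skipped rather than stopping the whole call
    plots$season    <- tryCatch(ggseasonplot(x), error = function(e) NULL)
    plots$subseries <- tryCatch(ggsubseriesplot(x), error = function(e) NULL)
  }
  plots
}

hsales_plots <- explore_series(hsales)
names(hsales_plots)
hsales_plots$season  # print one plot at a time
```

For an annual series such as sunspotarea, the helper returns only the time, lag and ACF plots.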
1.4.1 hsales
Q: Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
Home sales tend to peak in March and then gradually decrease.
Sales are lowest in winter (Nov to Feb).
1.4.2 usdeaths
Q: Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
Deaths tend to be higher in summer, usually peaking in July.
The peak coincides with the school summer vacation. Note that usdeaths records accidental deaths, so any link to teenage suicide would not appear in this series.
1.4.3 bricksq
Q: Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
There is an increasing trend from the 1950s to the mid-1970s, after which the series shows a cyclic pattern.
1.4.4 sunspotarea
#ggseasonplot(sunspotarea, polar=TRUE) --- data is not seasonal
#ggsubseriesplot(sunspotarea) --- data is not seasonal
gglagplot(sunspotarea)
Q: Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
The data are not seasonal.
The data show cyclicity, with the largest peak around the 1950s.
1.4.5 gasoline
Q: Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
Answer:
It shows an increasing trend from 1990 to 2005.
The ggseasonplot shows seasonality: gasoline peaks from about week 28 to week 38 (roughly July to September, i.e. summer) and falls during winter.