DATA 624 Homework 1
Question 2.1
Use the help function to explore what the series gold, woolyrnq and gas represent.
gold is the daily morning gold price in US dollars from 1 January 1985 to 31 March 1989. woolyrnq is the quarterly production of woollen yarn in Australia, in tonnes, from March 1965 to September 1994. gas is the monthly gas production in Australia from 1956 to 1995.
- Use autoplot() to plot each of these in separate plots.
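A minimal sketch of the plotting step, assuming the fpp2 package is installed (it supplies the three series and autoplot()). The object names and axis titles are illustrative choices.

```r
library(fpp2)

# One plot object per series, so each is drawn in its own figure.
p_gold <- autoplot(gold) +
  ggtitle("Daily morning gold prices") + xlab("Day") + ylab("US dollars")
p_wool <- autoplot(woolyrnq) +
  ggtitle("Quarterly woollen yarn production, Australia") + ylab("Tonnes")
p_gas <- autoplot(gas) +
  ggtitle("Monthly gas production, Australia")

print(p_gold)
print(p_wool)
print(p_gas)
```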
- What is the frequency of each series? Hint: apply the frequency() function.
[1] 1
[1] 4
[1] 12
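woolyrnq is a quarterly time series (frequency 4) and gas is a monthly time series (frequency 12). gold has a frequency of 1, which here does not mean annual data: the observations are daily, and the series is simply stored without a seasonal period.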
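The three values printed above can be reproduced with a short sketch like the following, again assuming fpp2 is installed. The name freqs is only illustrative.

```r
library(fpp2)

# frequency() returns the number of observations per seasonal period
# recorded in each ts object.
freqs <- c(
  gold     = frequency(gold),
  woolyrnq = frequency(woolyrnq),
  gas      = frequency(gas)
)
freqs
```

A frequency of 1 only says that no seasonal period is attached to the object; the help page is what tells you how often the data were actually observed.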
- Use which.max() to spot the outlier in the gold series. Which observation was it?
It is the 770th observation, where the price of gold was 593.7 US dollars.
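A minimal sketch of how the outlier can be located, assuming fpp2 is installed. The names peak_index and peak_price are illustrative.

```r
library(fpp2)

# which.max() returns the position of the largest value;
# missing values in the series are discarded when searching.
peak_index <- which.max(gold)

# Look up the price at that position.
peak_price <- gold[peak_index]

peak_index
peak_price
```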
Question 2.2
Download the file tute1.csv from the book website, open it in Excel (or some other spreadsheet application), and review its contents. You should find four columns of information. Columns B through D each contain a quarterly series, labelled Sales, AdBudget and GDP. Sales contains the quarterly sales for a small company over the period 1981-2005. AdBudget is the advertising budget and GDP is the gross domestic product. All series have been adjusted for inflation.
- You can read the data into R with the following script:
if(!file.exists("tute1.csv")){
download.file("http://otexts.com/fpp2/extrafiles/tute1.csv", "tute1.csv")
}
tute1 <- read.csv("tute1.csv", header=TRUE)
View(tute1)
- Convert the data to time series
(The [,-1] removes the first column which contains the quarters as we don’t need them now.)
- Construct time series plots of each of the three series
Check what happens when you don’t include facets=TRUE.
Without facets=TRUE, all three series are drawn in a single panel on a shared y-axis instead of as small multiples with one panel per series.
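The conversion and plotting steps can be sketched as follows. This assumes fpp2 is installed and takes the start year and frequency from the data description above (quarterly data beginning in 1981). The object name mytimeseries is an assumption borrowed from the textbook's wording.

```r
library(fpp2)

# Same download guard and read step as the script above.
if (!file.exists("tute1.csv")) {
  download.file("http://otexts.com/fpp2/extrafiles/tute1.csv", "tute1.csv")
}
tute1 <- read.csv("tute1.csv", header = TRUE)

# Drop the first column (the quarter labels) and declare a
# quarterly series that starts in 1981.
mytimeseries <- ts(tute1[, -1], start = 1981, frequency = 4)

# One panel per series, each with its own y-axis.
autoplot(mytimeseries, facets = TRUE)

# All three series in one panel on a shared y-axis.
autoplot(mytimeseries)
```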
Question 2.3
Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
- You can read the data into R with the following script:
if(!file.exists("retail.xlsx")){
download.file("https://otexts.com/fpp2/extrafiles/retail.xlsx", "retail.xlsx", mode="wb")
}
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
The second argument (skip=1) is required because the Excel sheet has two header rows.
- Select one of the time series as follows (but replace the column name with your own chosen column):
- Explore your chosen retail time series using the following functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
There is a clear seasonal increase in retail sales from October to the end of the year, which is the Christmas shopping season. There is also an upward trend in retail sales over time. The trend rose until the 2000s, when it flattened out for roughly a decade. Since 2010 the trend appears to be increasing again.
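A minimal sketch of the selection and exploration steps, assuming fpp2 and readxl are installed. The column ID "A3349873A" is the textbook's placeholder and is not necessarily the series discussed above; the start date of April 1982 is likewise taken from the textbook example.

```r
library(fpp2)

# Download guard as in the script above; mode = "wb" keeps the
# binary Excel file intact on Windows.
if (!file.exists("retail.xlsx")) {
  download.file("https://otexts.com/fpp2/extrafiles/retail.xlsx",
                "retail.xlsx", mode = "wb")
}
retaildata <- readxl::read_excel("retail.xlsx", skip = 1)

# Placeholder column ID from the textbook; swap in your own column.
myts <- ts(retaildata[, "A3349873A"], frequency = 12, start = c(1982, 4))

autoplot(myts)         # overall level and trend
ggseasonplot(myts)     # one line per year, by month
ggsubseriesplot(myts)  # one mini-series per month, with monthly means
gglagplot(myts)        # scatterplots of the series against its lags
ggAcf(myts)            # autocorrelation by lag
```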
Question 2.6
Use the following graphics functions: autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf() and explore features from the following time series: hsales, usdeaths, bricksq, sunspotarea, gasoline.
hsales
- Can you spot any seasonality, cyclicity and trend?
One-family home sales in the US tend to be highest in March. The ACF plot suggests that there is some annual cycle, but it is noisy.
- What do you learn about the series?
If I were a realtor, I would not expect to be busy in the winter months. Early spring (March through May) would be my busy time of the year.
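One way to check the seasonal reading for hsales is sketched below, assuming fpp2 is installed. The monthly averages are a supplementary check added here, and monthly_mean is an illustrative name.

```r
library(fpp2)

ggseasonplot(hsales)         # one line per year, by month
ggsubseriesplot(hsales)      # monthly sub-series with their means
ggAcf(hsales, lag.max = 48)  # look for peaks at lags 12, 24, 36, 48

# Average sales for each calendar month across all years.
monthly_mean <- tapply(as.numeric(hsales), as.integer(cycle(hsales)), mean)
names(monthly_mean) <- month.abb
round(monthly_mean, 1)
```

If the seasonal pattern described above holds, the largest averages should fall in the spring months.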
usdeaths
- Can you spot any seasonality, cyclicity and trend?
There appears to be a seasonal pattern to the data.
- What do you learn about the series?
Accidental deaths in the US tend to be highest in July.
bricksq
- Can you spot any seasonality, cyclicity and trend?
There isn’t much variation from Q2 through Q4. The trend was generally increasing until about the 1980s.
- What do you learn about the series?
Q1 is a slow quarter for Australian clay brick producers. There has been quite a bit more irregularity since 1975.
sunspotarea
- Can you spot any seasonality, cyclicity and trend?
There appears to be a cycle in the data, with a period of about 10 to 11 years. Because the series is annual, there is no seasonality to look for.
- What do you learn about the series?
If the pattern holds, 2020 should be a year with low sunspot area.
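The cycle length read off the plots can be cross-checked against the autocorrelation function. The sketch below assumes fpp2 is installed. The search window of 5 to 20 years is an arbitrary choice made here to skip the short-lag autocorrelations, and the names are illustrative.

```r
library(fpp2)

# Autocorrelations at lags 0 to 30; lags are in years because the
# series is annual.
sun_acf <- acf(sunspotarea, lag.max = 30, plot = FALSE, na.action = na.pass)

# Element 1 of sun_acf$acf is lag 0, so lag k sits at position k + 1.
lags <- 5:20
peak_lag <- lags[which.max(sun_acf$acf[lags + 1])]
peak_lag

ggAcf(sunspotarea, lag.max = 30)
```

A peak_lag close to the cycle length estimated by eye would support the visual reading.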
gasoline
- Can you spot any seasonality, cyclicity and trend?
There is a trend and some seasonality in the data. I expected to see some cyclical behavior coinciding with U.S. recession dates, but that is not present.
- What do you learn about the series?
The supply of gasoline has generally trended upward. It also rises slightly during the summer months.