2.10 Question No. 1

  1. Use the help function to explore what the series gold, woolyrnq and gas represent.

ANSWER:

Question 1 asks us to explore each time series with the help function and basic exploratory techniques, and then to visualize it.

Gold

We can confirm that the gold dataset is a time series by applying the class() function. The result “ts” confirms that the object is a time series.

## [1] "ts"

The function help gives more details about the time series.

Below is the output from the help documentation on the series:

Gold

Daily morning gold prices
Description
Daily morning gold prices in US dollars. 1 January 1985- 31 March 1989.

Usage
gold
Format
Time series data

The “summary” function gives us basic descriptive statistics for the time series, including the number of missing values (“NA's”), which is 34.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   285.0   337.7   403.2   392.5   443.7   593.7      34

The “head” and “tail” functions show us the first and last six entries as well as the frequency of the time series. Here the frequency is 1: the observations are daily, but the series is stored without a seasonal period.

## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
## [1] 306.25 299.50 303.45 296.75 304.40 298.35
## Time Series:
## Start = 1103 
## End = 1108 
## Frequency = 1 
## [1]     NA     NA 391.25 383.30 384.00 382.30
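The exploration steps above can be reproduced with a few calls. This is a minimal sketch, assuming the gold series comes from the forecast package (the fpp2 package attaches it as well); the variable name n_missing is only illustrative.

```r
# Assumption: gold is provided by the forecast package.
library(forecast)

# help("gold") opens the documentation page quoted above.

class(gold)    # object class of the series
summary(gold)  # descriptive statistics, including the NA count
head(gold)     # first six observations
tail(gold)     # last six observations

# Count missing values directly instead of reading them off summary().
n_missing <- sum(is.na(gold))
n_missing
```

Counting the NA values with is.na() gives the same figure that summary() reports, which is a quick cross-check before plotting.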



The same exploratory analysis is performed on the other time series.

woolyrnq

## [1] "ts"

Quarterly production of woollen yarn in Australia
Description
Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994.

Usage
woolyrnq
Format
Time series data

Source Time Series Data Library. https://pkg.yangzhuoranyang.com/tsdl/

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3324    4882    5466    5658    6646    7819
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1965 6172 6709 6633 6660
## 1966 6786 6800
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1993      4588 5309 4732
## 1994 4837 6135 6396

We see that this is a quarterly time series running from 1965 through the 3rd quarter of 1994.



gas

## [1] "ts"

Australian monthly gas production
Description
Australian monthly gas production: 1956–1995.

Usage
gas
Format
Time series data

Source
Australian Bureau of Statistics.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1646    2675   16788   21415   38629   66600
##       Jan  Feb  Mar  Apr  May  Jun
## 1956 1709 1646 1794 1878 2173 2321
##        Mar   Apr   May   Jun   Jul   Aug
## 1995 46287 49013 56624 61739 66600 60054

We see that the time series is monthly from 1956 to 1995.

Next, we use autoplot() to plot each of these in separate plots.

GOLD:

WOOLYRNQ:

GAS:
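A minimal sketch of how the three plots could be produced, assuming the series come from the forecast package and that ggplot2 is available for the labels. The titles are taken from the help pages quoted earlier; the axis labels and object names (p_gold, p_wool, p_gas) are illustrative choices, not necessarily those used for the plots in this report.

```r
# Assumption: gold, woolyrnq and gas are provided by the forecast package.
library(forecast)
library(ggplot2)

p_gold <- autoplot(gold) +
  ggtitle("Daily morning gold prices") +
  xlab("Day") + ylab("US dollars")

p_wool <- autoplot(woolyrnq) +
  ggtitle("Quarterly production of woollen yarn in Australia") +
  xlab("Year") + ylab("Tonnes")

p_gas <- autoplot(gas) +
  ggtitle("Australian monthly gas production") +
  xlab("Year") + ylab("Production")

# Each plot is drawn separately; gold contains NAs, so a warning
# about missing values may appear when it is printed.
print(p_gold)
print(p_wool)
print(p_gas)
```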

b. What is the frequency of each series? Hint: apply the frequency() function

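The frequency() values for the gold, woolyrnq, and gas time series are 1, 4, and 12 respectively. The gold data are daily observations stored with a frequency of 1, woolyrnq is quarterly, and gas is monthly.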

## [1] 1
## [1] 4
## [1] 12
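The three values above can be collected in one step. A small sketch, again assuming the series are loaded from the forecast package; the vector name freqs is illustrative.

```r
# Assumption: gold, woolyrnq and gas are provided by the forecast package.
library(forecast)

# frequency() returns the number of observations per seasonal period.
freqs <- c(
  gold     = frequency(gold),
  woolyrnq = frequency(woolyrnq),
  gas      = frequency(gas)
)
freqs
```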

c. Use which.max() to spot the outlier in the gold series. Which observation was it?

## [1] "The observation at: 770 shows that the maximum price of gold (outlier) is: 593.7"
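A short sketch of how the message above could be built, assuming gold is loaded from the forecast package. The names idx and peak are illustrative.

```r
# Assumption: gold is provided by the forecast package.
library(forecast)

# which.max() returns the position of the largest value; NAs are discarded.
idx  <- which.max(gold)
peak <- gold[idx]

paste("The observation at:", idx,
      "shows that the maximum price of gold (outlier) is:", peak)
```

Because the series has a frequency of 1, the index returned is simply the observation number, not a calendar date.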

2.10 Question No. 2

For this question, the author asks us to load data from a csv file and convert it into a time series using the ts() function.

Here, the frequency argument of the ts function is set to 4 because the data are quarterly.
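The conversion can be sketched as follows. Since the csv file itself is not reproduced here, the data frame below is a made-up stand-in with the same three column names; the start date of 1981 Q1 and the names quarterly_df and quarterly_ts are assumptions for illustration only.

```r
set.seed(1)

# Hypothetical stand-in for the csv contents (values are simulated).
# With a real file this would be something like read.csv(<path to file>),
# dropping any date column before calling ts().
quarterly_df <- data.frame(
  Sales    = round(rnorm(20, mean = 1000, sd = 50), 1),
  AdBudget = round(rnorm(20, mean = 500,  sd = 30), 1),
  GDP      = round(rnorm(20, mean = 280,  sd = 10), 1)
)

# frequency = 4 marks the data as quarterly; the start date is assumed.
quarterly_ts <- ts(quarterly_df, start = c(1981, 1), frequency = 4)

frequency(quarterly_ts)
head(quarterly_ts)
```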

When we plot the series with autoplot and set the “facets” argument to TRUE, each of the Sales, AdBudget, and GDP series is drawn in its own panel with its own y-axis scale.

When the facets argument is set to FALSE, there’s a common y-axis for all three datasets.
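The two settings can be compared side by side. This sketch uses simulated quarterly data with the same three column names, because the original file is not included; demo_ts, p_facets and p_shared are illustrative names, and the start date is an assumption.

```r
# Assumption: autoplot() for multivariate ts objects comes from forecast.
library(forecast)
library(ggplot2)

set.seed(1)
demo_ts <- ts(
  cbind(
    Sales    = rnorm(20, mean = 1000, sd = 50),
    AdBudget = rnorm(20, mean = 500,  sd = 30),
    GDP      = rnorm(20, mean = 280,  sd = 10)
  ),
  start = c(1981, 1), frequency = 4
)

# One panel per series, each with its own y-axis scale.
p_facets <- autoplot(demo_ts, facets = TRUE)

# All series in one panel sharing a single y-axis.
p_shared <- autoplot(demo_ts, facets = FALSE)

print(p_facets)
print(p_shared)
```

With a shared axis, a series on a much smaller scale can look almost flat, which is why the faceted version is often easier to read when the units differ.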

2.10 Question No. 3

For this question, the author asks us to read in an Excel file containing Australian retail data.

The autoplot shows monthly liquor sales from April 1982 to December 2013. The plot shows an increasing trend in sales along with a strong seasonality component which grows in strength.

The ggseasonplot and ggsubseriesplot show the seasonality of liquor sales. In the seasonal plot, the earlier years starting from 1982 are at the bottom with the later years at the top. The graph shows the yearly increase in sales, and sales tend to spike in the last two months of the year.

Setting the polar argument to TRUE for the ggseasonplot further demonstrates the seasonal nature of liquor sales, with a spike in December sales.

The next two plots, the gglagplot and the ggAcf plot, show how observations relate to earlier observations (lags) of the same series. The strong relationship at lag 12 in the lag plot, which also appears in the ggAcf plot, supports the finding that sales spike every 12th month.
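The sequence of plots discussed in this question can be sketched as below. The retail Excel file is not reproduced here, so the series is simulated: a monthly series from April 1982 to December 2013 with an upward trend and a December spike that grows over time. The name sales and all numeric constants are made up for illustration; the plotting functions are from the forecast package.

```r
# Assumption: the plotting helpers come from the forecast package.
library(forecast)
library(ggplot2)

set.seed(42)
n      <- 381                      # months from Apr 1982 to Dec 2013
t_idx  <- seq_len(n)
months <- as.vector(cycle(ts(numeric(n), start = c(1982, 4), frequency = 12)))
is_dec <- months == 12

# Simulated stand-in: trend + growing December spike + noise.
sales <- ts(
  50 + 0.5 * t_idx + ifelse(is_dec, 0.4 * t_idx, 0) + rnorm(n, sd = 5),
  start = c(1982, 4), frequency = 12
)

p_time   <- autoplot(sales)                  # trend and seasonality over time
p_season <- ggseasonplot(sales)              # one line per year
p_polar  <- ggseasonplot(sales, polar = TRUE)
p_sub    <- ggsubseriesplot(sales)           # one mini-series per month
p_lag    <- gglagplot(sales)                 # y(t) against y(t - k)
p_acf    <- ggAcf(sales, lag.max = 36)       # autocorrelation by lag

print(p_season)
print(p_acf)
```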

2.10 Question No. 6

Sales of one-family houses

Sales of one-family houses
Description
Monthly sales of new one-family houses sold in the USA since 1973.

Usage
hsales
Format
Time series data

Source
Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, John Wiley & Sons: New York. Chapter 3.

##      Jan Feb Mar Apr May Jun
## 1973  55  60  68  63  65  61
##      Jun Jul Aug Sep Oct Nov
## 1995  64  64  63  55  54  44
##         Peak     Trough
## 1 1973-11-01 1975-03-01
## 2 1980-01-01 1980-07-01
## 3 1981-07-01 1982-11-01
## 4 1990-07-01 1991-03-01

Analysis of Sales of one-family house

The above plots show no long-term trend in the sales of one-family houses from 1973 to 1995. However, there are cyclic and seasonal patterns. The downswings in sales match downturns in the economy. The seasonal plots show clearly that sales are strongest in March, April, and May. Finally, the lag plots show that the current period is strongly correlated with the most recent prior period (lag 1). Lags that are further out, especially lags 18 to 21, are much less correlated. In other words, recent sales are the most informative about current sales.
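The comparison between lag 1 and the more distant lags can also be checked numerically. A minimal sketch, assuming hsales is loaded from the fma package (fpp2 attaches it too); the vector name r is illustrative.

```r
# Assumption: hsales is provided by the fma package.
library(fma)

# acf() stores lag 0 in the first position, so lag k is element k + 1.
r <- as.vector(acf(hsales, lag.max = 24, plot = FALSE)$acf)
names(r) <- 0:24

r["1"]                   # autocorrelation with the previous month
r[as.character(18:21)]   # autocorrelations at the more distant lags
```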

Accidental Deaths in USA

Accidental deaths in USA
Description
Monthly accidental deaths in USA.

Usage
usdeaths
Format
Time series data

Source
Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, John Wiley & Sons: New York. Exercises 2.3 and 2.4.

##        Jan   Feb   Mar   Apr   May   Jun
## 1973  9007  8106  8928  9137 10017 10826
##        Jul   Aug   Sep   Oct   Nov   Dec
## 1978 10484  9827  9110  9070  8633  9240
## [1] 12

Analysis of Accidental deaths in USA

The strongest signal in the data is the seasonal pattern: accidental deaths peak in the summer months. Intuitively, more people are on the roads during summer in most of the USA, and driving is probably the leading cause of accidental deaths. There does not appear to be a cyclic pattern or a clear trend.
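The seasonal peak can be confirmed by averaging the series by calendar month. This sketch uses USAccDeaths from base R's datasets package, on the assumption that it is the same 1973 to 1978 series as usdeaths (its first six values match the output above); monthly_mean and peak_month are illustrative names.

```r
# Assumption: USAccDeaths (base R) holds the same data as usdeaths.
deaths <- USAccDeaths

# Average deaths for each calendar month across all years.
monthly_mean <- tapply(as.vector(deaths), as.vector(cycle(deaths)), mean)
names(monthly_mean) <- month.abb
round(monthly_mean)

# Month with the highest average.
peak_month <- names(which.max(monthly_mean))
peak_month
```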

Quarterly clay brick production

Quarterly clay brick production
Description
Australian quarterly clay brick production: 1956–1994.

Usage
bricksq
Format
Time series data

Source
Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, John Wiley & Sons: New York. Chapter 1 and Exercise 2.3.

##      Qtr1 Qtr2 Qtr3 Qtr4
## 1956  189  204  208  197
## 1957  187  214
##      Qtr1 Qtr2 Qtr3 Qtr4
## 1993       462  476  443
## 1994  421  472  494
## [1] 4

Analysis of Quarterly clay brick production

There is a long-term growth trend in clay brick production from 1956 to 1994, along with a slight seasonal pattern showing higher production in quarters 2 and 3. The most interesting feature is the strong autocorrelation with prior periods: even at a lag of 20 quarters, the autocorrelation is still above 0.5.
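The lag-20 autocorrelation mentioned above can be read directly from the acf() output rather than from the plot. A small sketch, assuming bricksq is loaded from the fma package; the names r and lag20 are illustrative.

```r
# Assumption: bricksq is provided by the fma package.
library(fma)

# Element 1 is lag 0, so the lag-20 autocorrelation is element 21.
r     <- as.vector(acf(bricksq, lag.max = 20, plot = FALSE)$acf)
lag20 <- r[21]
lag20
```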

Annual average sunspot area (1875-2015)

Annual average sunspot area (1875-2015)
Description
Annual averages of the daily sunspot areas (in units of millionths of a hemisphere) for the full sun. Sunspots are magnetic regions that appear as dark spots on the surface of the sun. The Royal Greenwich Observatory compiled daily sunspot observations from May 1874 to 1976. Later data are from the US Air Force and the US National Oceanic and Atmospheric Administration. The data have been calibrated to be consistent across the whole history of observations.

Format
Annual time series of class ts.

Source
NASA

## Time Series:
## Start = 1875 
## End = 1880 
## Frequency = 1 
## [1] 213.13333 109.28333  92.85833  22.21667  36.33333 446.75000
## [1] "ts"
## Time Series:
## Start = 2010 
## End = 2015 
## Frequency = 1 
## [1]  214.2917  749.5667  796.8833  860.7750 1252.1500  618.8083
## [1] 1

Analysis of Annual average sunspot area (1875-2015)

Because the data are collected annually, there is no seasonal component. There is no long-term trend, but there is a clear cycle: peaks and troughs recur roughly every 10 to 11 years, in line with the well-known solar cycle, rather than following odd- or even-numbered years. The autocorrelations are negative at lags of about half a cycle.
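One rough way to estimate the length of the cycle is to look for the lag at which the autocorrelation peaks again after lag 0. This is a sketch under assumptions: sunspotarea is taken from the fpp2 package, the search window of lags 5 to 20 is an arbitrary choice, and the names r, search_lags and cycle_len are illustrative.

```r
# Assumption: sunspotarea is provided by the fpp2 package.
library(fpp2)

# Drop lag 0 so that element k corresponds to lag k.
r <- as.vector(acf(sunspotarea, lag.max = 30, plot = FALSE)$acf)[-1]

# Skip the short lags, where correlation is high simply because
# neighbouring years are similar, and find the next peak.
search_lags <- 5:20
cycle_len   <- search_lags[which.max(r[search_lags])]
cycle_len
```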

US finished motor gasoline product supplied

US finished motor gasoline product supplied.
Description
Weekly data beginning 2 February 1991, ending 20 January 2017. Units are “million barrels per day”.

Format
Time series object of class ts.

Source
US Energy Information Administration.

## Time Series:
## Start = 1991.1 
## End = 1991.19582477755 
## Frequency = 52.1785714285714 
## [1] 6.621 6.433 6.582 7.224 6.875 6.947
## Time Series:
## Start = 2016.95352498289 
## End = 2017.04934976044 
## Frequency = 52.1785714285714 
## [1] 9.269 9.278 8.465 8.470 8.069 8.039
## [1] 52.17857

Analysis of US finished motor gasoline product supplied

There is an increasing trend in the US finished motor gasoline product supplied from 1991 to 2017. The seasonal pattern shows supply rising in the middle of the year during the summer months and falling in the winter months. Perhaps the most important feature is the very strong autocorrelation across lags.
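The non-integer frequency of 52.17857 printed above is the average number of weeks in a year, 365.25 / 7. The sketch below shows the arithmetic and builds a simulated weekly series with that frequency; the values and the name weekly are made up for illustration.

```r
# Average number of weeks per year, allowing for leap years.
weeks_per_year <- 365.25 / 7
weeks_per_year

# Simulated weekly series using the same start and frequency as the output above.
set.seed(7)
weekly <- ts(rnorm(104), start = 1991.1, frequency = weeks_per_year)

frequency(weekly)
```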

