1. Use the help function to explore what the series gold, woolyrnq and gas represent.
Use autoplot() to plot each of these in separate plots.
What is the frequency of each series? Hint: apply the frequency() function.
Use which.max() to spot the outlier in the gold series. Which observation was it?
I will download and install the fpp2 package, which contains datasets and functions for forecasting using the principles outlined in the book “Forecasting: Principles and Practice” by Rob J Hyndman and George Athanasopoulos.
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## ── Attaching packages ────────────────────────────────────────────── fpp2 2.5 ──
## ✔ ggplot2 3.5.1 ✔ fma 2.5
## ✔ forecast 8.20 ✔ expsmooth 2.3
##
We’ll start with the built-in gold dataset from the fpp2 package. Calling head(gold) shows the first six observations:
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 306.25 299.50 303.45 296.75 304.40 298.35
If you need the data in tabular form, you can convert the time series object to a data frame. That is not recommended here, since the data is already a time series and the functions used below expect one.
## time price
## 1 1 306.25
## 2 2 299.50
## 3 3 303.45
## 4 4 296.75
## 5 5 304.40
## 6 6 298.35
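The table above can be rebuilt with base R alone. This is a minimal sketch, not necessarily the code that produced the output: it assumes only the six prices shown above, and the names `gold_head` and `gold_df` are illustrative.

```r
# First six gold prices, copied from the output above (not the full series).
gold_head <- ts(c(306.25, 299.50, 303.45, 296.75, 304.40, 298.35))

# time() returns the time index of a ts object; as.numeric() drops the
# ts attributes so both columns end up as plain numeric vectors.
gold_df <- data.frame(
  time  = as.numeric(time(gold_head)),
  price = as.numeric(gold_head)
)

print(gold_df)
```

With the fpp2 package loaded, replacing `gold_head` with `gold` should give the full table.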
The tsdisplay(gold) function generates a series of plots including a time series plot, an autocorrelation function (ACF) plot, and a partial autocorrelation function (PACF) plot, providing a comprehensive visual summary of the gold time series data’s patterns and dependencies.
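To see the numbers behind the ACF and PACF panels, the base R functions acf() and pacf() can be called with plot = FALSE. The sketch below is only an illustration on the six prices shown earlier (the variable names are assumptions); six points are far too few to draw conclusions from.

```r
# First six gold prices, copied from the earlier output (illustration only).
gold_head <- ts(c(306.25, 299.50, 303.45, 296.75, 304.40, 298.35))

# Autocorrelations at lags 0..3, returned as numbers instead of a plot.
gold_acf <- as.numeric(acf(gold_head, lag.max = 3, plot = FALSE)$acf)

# Partial autocorrelations at lags 1..3 (there is no lag 0 for the PACF).
gold_pacf <- as.numeric(pacf(gold_head, lag.max = 3, plot = FALSE)$acf)

print(gold_acf)
print(gold_pacf)
```

The lag-0 autocorrelation is always 1, and the lag-1 partial autocorrelation equals the lag-1 autocorrelation, which is a quick way to check that the two vectors line up.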
The next step is to apply the autoplot() function to the gold time series. Adding a title and axis labels makes the graph easier to explain.

autoplot(gold) + ggtitle("Plot for Gold Time Series") + xlab("Count of Day") + ylab("Price in US Dollars")

Two features stand out in this plot: the sharp spike around the 700 mark, and the dip that starts at approximately day 30 or 40. Unlike the spikes in the other valleys, this dip appears to hold a consistent pattern.
The next task is to use the frequency() function to determine the frequency of the gold series.
gold_frequency <- frequency(gold)
print(paste0("The frequency for the gold time series data is: ", gold_frequency, "."))
## [1] "The frequency for the gold time series data is: 1."
We already saw this frequency in the head(gold) output earlier, but repeating it was good extra practice. Now let’s “spot the outlier” in this series.
max_index_of_large_value <- which.max(gold)
print(paste0("The outlier is: ", max_index_of_large_value, "."))
## [1] "The outlier is: 770."
max_index_of_large_value <- which.max(gold)
value_of_outlier <- gold[max_index_of_large_value]
print(paste0("The value of the outlier at index ", max_index_of_large_value, " is: ", value_of_outlier, "."))
## [1] "The value of the outlier at index 770 is: 593.7."
This output shows that the maximum value in the gold series is 593.7, at index 770. Strictly speaking, which.max() returns the position of the largest value; here that position coincides with the spike seen in the plot, which is why we treat it as the outlier. The same call works on any other time series.
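Two details of which.max() are worth checking on a small example: it skips missing values, and it reports the largest value whether or not that value is unusual. The sketch below uses hypothetical data (six prices from the earlier output plus an inserted NA and an inserted spike). The median/MAD score is one common rule of thumb, added here as an assumption and not part of the original analysis.

```r
# Hypothetical series: six gold prices from the earlier output, with one
# NA and one made-up spike (593.70) inserted to mimic the outlier.
prices <- ts(c(306.25, 299.50, NA, 303.45, 593.70, 296.75, 304.40, 298.35))

max_index <- which.max(prices)   # NA entries are ignored
max_value <- prices[max_index]

# Rough check of how unusual the maximum is: its distance from the
# median, measured in units of the median absolute deviation (MAD).
vals <- as.numeric(prices)
outlier_score <- (max_value - median(vals, na.rm = TRUE)) /
  mad(vals, na.rm = TRUE)

print(max_index)
print(max_value)
print(outlier_score)
```

A score far above 3 or so suggests the maximum is a genuine outlier and not just the top of an ordinary range.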
Let’s move on to the woolyrnq data.
The help page tells us that woolyrnq contains the “quarterly production of woollen yarn in Australia” from March 1965 to September 1994. The wool is measured in tonnes.
## Qtr1 Qtr2 Qtr3 Qtr4
## 1965 6172 6709 6633 6660
## 1966 6786 6800
Next comes the autoplot() function, which shows a decreasing overall trend in the production of wool. I’m wondering what happened in 1975 to cause the huge dip in production.
autoplot(woolyrnq) + ggtitle("Plot for Wool Timeseries") + xlab("Year") + ylab("Weight Produced in Tonnes")
wf <- frequency(woolyrnq)
print(paste0("The frequency for the woolyrnq time series data is: ", wf, " or quarterly"))
## [1] "The frequency for the woolyrnq time series data is: 4 or quarterly"
Lastly, we’ll take a look at the gas series, using the help and autoplot functions to see what we can gather.
The gas series contains Australia’s monthly gas production from 1956 through 1995. The autoplot shows what I think is a dramatic overall increase in gas production that began in 1970.
gasf <- frequency(gas)
print(paste0("The frequency for the gas time series data is: ", gasf, " or monthly"))
## [1] "The frequency for the gas time series data is: 12 or monthly"
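The three frequencies found above (1, 4 and 12) are set when a ts object is created, and frequency() simply reads them back. The sketch below shows this with base R. The wool values are the six shown in the earlier output; `monthly_demo` and `plain_demo` are made-up series used only for comparison.

```r
# First six woolyrnq values, copied from the earlier output.
wool_head <- ts(c(6172, 6709, 6633, 6660, 6786, 6800),
                start = c(1965, 1), frequency = 4)

# Hypothetical series for comparison (made-up values).
monthly_demo <- ts(1:24, start = c(1956, 1), frequency = 12)
plain_demo   <- ts(1:10)   # frequency defaults to 1

freqs <- c(wool    = frequency(wool_head),
           monthly = frequency(monthly_demo),
           plain   = frequency(plain_demo))
print(freqs)

# cycle() gives the position within the year (quarter 1 to 4 here),
# and time() gives the time index as a decimal year.
wool_quarters <- as.numeric(cycle(wool_head))
wool_times    <- as.numeric(time(wool_head))
print(wool_quarters)
print(wool_times)
```

A frequency of 1 on its own does not say what the time unit is, so it is worth reading the help page for a series like gold before interpreting its x-axis.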