library(fpp2)
library(tidyverse)
Use the help function to explore what the series gold, woolyrnq and gas represent.
a. Use autoplot() to plot each of these in separate plots.
b. What is the frequency of each series? Hint: apply the frequency() function.
c. Use which.max() to spot the outlier in the gold series. Which observation was it?
#help(gold)
# Daily Morning gold prices in US dollars. 1 January 1985-31 March 1989
The help page for gold describes the series as daily morning gold prices in US dollars, covering 1 January 1985 through 31 March 1989.
autoplot(gold,col="orange") + ggtitle("Plot for Gold Time Series") + xlab("Count of Day") + ylab("Price in US Dollars")

Frequency of gold
goldfreq <- frequency(gold)
print(goldfreq)
## [1] 1
The frequency is 1: the series holds one observation per day and has no seasonal period defined.
Outlier in gold series
The outlier in the gold series is the 770th observation, which falls in 1987. In 1987 there was a 25% price hike in gold.
goldout <- which.max(gold)
print(goldout)
## [1] 770
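The value returned by which.max() is a position in the series, not a date. A minimal sketch on a made-up quarterly series (the values and the name toy are illustrative and are not taken from gold) shows how that position could be mapped back to the time axis with time() and cycle():

```r
# Toy quarterly series starting in 1965 Q1 (values are made up for illustration)
toy <- ts(c(10, 12, 11, 40, 13, 12, 11, 10), start = c(1965, 1), frequency = 4)

peak_pos   <- which.max(toy)        # position of the largest value
peak_value <- toy[peak_pos]         # the value at that position
peak_time  <- time(toy)[peak_pos]   # the same position expressed on the time axis
peak_qtr   <- cycle(toy)[peak_pos]  # quarter within the year

print(c(position = peak_pos, value = peak_value, time = peak_time, quarter = peak_qtr))
```

The same pattern should carry over to any ts object, since time() and cycle() only depend on the start and frequency attributes.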
Quarterly production of woollen yarn in Australia
woolyrnq provides data on the “quarterly production of woollen yarn in Australia” for the period March 1965 to September 1994. Production is measured in tonnes.
help(woolyrnq)
autoplot(woolyrnq,col="skyblue") + ggtitle("Plot for Wool Timeseries") + xlab("Year") + ylab("Weight Produced in Tonnes")

The graph displays a decreasing trend in wool production in Australia. A significant decrease is visible in the year 1975.
Wool Frequency
The frequency of the wool production data is 4, i.e. quarterly.
woolfreq <- frequency(woolyrnq)
print(woolfreq)
## [1] 4
woolout <- which.max(woolyrnq)
print(woolout)
## [1] 21
Australian monthly gas production
help(gas)
The gas series records Australia’s monthly gas production from 1956 through 1995. The plot shows production increasing from about 1970 onward.
autoplot(gas,col='blue') + ggtitle("Plot for Gas Time Series") + xlab("Year") + ylab("Amount Produced")

gasfreq <- frequency(gas)
print(gasfreq)
## [1] 12
The frequency of the gas series is 12, i.e. monthly.
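The three frequency values reported above (1, 4 and 12) come from how each series was built with ts(). A small sketch with placeholder data and hypothetical object names shows how the frequency argument relates to each layout:

```r
vals <- 1:24  # placeholder data, not real observations

no_season <- ts(vals, start = 1, frequency = 1)            # one observation per time unit, no seasonal period
quarterly <- ts(vals, start = c(1965, 1), frequency = 4)   # 4 observations per year
monthly   <- ts(vals, start = c(1956, 1), frequency = 12)  # 12 observations per year

# frequency() simply reads back the value stored with the series
sapply(list(no_season = no_season, quarterly = quarterly, monthly = monthly), frequency)
```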
2.3 Download some monthly Australian retail data from the book website. These represent retail sales in various categories for different Australian states, and are stored in a MS-Excel file.
a. You can read the data into R with the following script:
retaildata <- readxl::read_excel("C:\\Users\\malia\\Downloads\\retail.xlsx", skip=1)
b. Select one of the time series as follows (but replace the column name with your own chosen column):
I have selected the “New South Wales: Food Retailing” column. Its column name in the Excel file is A3349398A, and the resulting time series is stored as my_col.
my_col <- ts(retaildata[, "A3349398A"], frequency = 12, start = c(1992, 4))
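To check that frequency = 12 and start = c(1992, 4) place the observations where expected, the same call can be tried on a small stand-in table. The data frame retail_toy and its columns are hypothetical and only mimic the shape of the spreadsheet:

```r
# Hypothetical stand-in for the spreadsheet: an index column and one series column
retail_toy <- data.frame(
  period   = 1:24,
  turnover = seq(100, by = 2.5, length.out = 24)
)

# Extract the column as a plain numeric vector, then attach the monthly time index
toy_ts <- ts(retail_toy[["turnover"]], frequency = 12, start = c(1992, 4))

start(toy_ts)  # year and month of the first observation
end(toy_ts)    # year and month of the last observation
```

Here [[ ]] returns a plain vector, whereas single-bracket indexing on a tibble keeps a one-column table. ts() accepts either, so this is a stylistic choice rather than a correction.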
c. Explore your chosen retail time series using the following functions:
autoplot(), ggseasonplot(), ggsubseriesplot(), gglagplot(), ggAcf()
An upward trend can be seen in the food retailing turnover. However, no cyclic behaviour is apparent in the data.
autoplot(my_col) + ggtitle("Turnover ; New South Wales ; Food Retailing") +
  xlab("Year") + ylab("Sales")

Seasonal Plot
Seasonality can be seen in the food retailing data. Turnover dips in February and rises in December. The holiday season may contribute to the higher December turnover.
ggseasonplot(my_col, year.labels = TRUE, year.labels.left = TRUE) +
  ylab("Sales") + ggtitle("Seasonal Plot")
Seasonal Subseries Plot
ggsubseriesplot(my_col) + ylab("Sales") +
  ggtitle("Seasonal Subseries Plot")

The seasonal subseries plot also confirms that average food retail turnover is higher in the month of December than other months.
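The monthly averages that the subseries plot summarises can also be computed directly. The sketch below uses a synthetic series with an artificial December bump (made-up numbers, not the retail data) and groups the values by month with cycle():

```r
# Synthetic monthly series: gentle trend plus a December bump (made-up numbers)
n_years <- 5
month   <- rep(1:12, times = n_years)
toy <- ts(100 + 0.5 * seq_along(month) + ifelse(month == 12, 20, 0),
          frequency = 12, start = c(2000, 1))

# Mean of each calendar month across all years
monthly_means <- tapply(as.numeric(toy), as.integer(cycle(toy)), mean)

print(round(monthly_means, 1))
which.max(monthly_means)  # month with the highest average
```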
Lag plot
gglagplot(my_col, lags = 12)

We can notice a strong positive relationship between the series and its lagged values, strongest at the seasonal lag of 12.
ggAcf(my_col)

The ACF shows positive correlation between the series and its lagged values at every lag shown. The series is both trended and seasonal: the autocorrelations at small lags tend to be large and positive, and the autocorrelation is larger at the seasonal lag of 12.
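The pattern described above (positive autocorrelations throughout, with a bump at the seasonal lag) can be reproduced on a synthetic series. This is only an illustration: the series toy is artificial, and acf_at is a hypothetical helper introduced here to index the result of stats::acf().

```r
# Synthetic monthly series with a linear trend and a 12-month seasonal cycle (no real data)
t_idx <- 1:120
toy <- ts(0.5 * t_idx + 10 * sin(2 * pi * t_idx / 12), frequency = 12)

# acf() returns lag 0 first, so lag k sits at position k + 1
r <- as.numeric(acf(toy, lag.max = 24, plot = FALSE)$acf)
acf_at <- function(k) r[k + 1]

# Compare a short lag, the half-season lag, and the seasonal lags
round(c(lag1 = acf_at(1), lag6 = acf_at(6), lag12 = acf_at(12), lag24 = acf_at(24)), 3)
```

In this toy case the trend keeps every autocorrelation positive, while the seasonal cycle lifts the value at lag 12 above the value at lag 6.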
The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years.
Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle?
help("plastics")
The plastics dataset contains monthly sales of product A for a plastics manufacturer.
autoplot(plastics, col = "red") + ggtitle("Monthly Sales of Product A for a plastics manufacturer") +
  xlab("Year") + ylab("Sales")

An increasing trend can be observed in the data.
b. Use a classical multiplicative decomposition to calculate the trend-cycle and seasonal indices.
decompose(plastics, type = "multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Multiplicative decomposition\nof plastics dataset index")

Do the results support the graphical interpretation from part a?
The results support the graphical interpretation: the trend-cycle component shows the upward trend, and the evenly spaced peaks in the seasonal component indicate seasonality.
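To see where the trend-cycle and seasonal indices come from, the classical multiplicative steps can be reproduced by hand and compared with decompose(). The series below is synthetic (made-up level and seasonal pattern), so the numbers say nothing about plastics. stats::filter is called explicitly because dplyr::filter masks it once tidyverse is attached.

```r
# Synthetic monthly series: linear level times a fixed seasonal pattern (made-up numbers)
pattern <- c(0.8, 0.7, 0.9, 1.0, 1.1, 1.2, 1.2, 1.3, 1.3, 1.2, 1.0, 0.9)
t_idx   <- 1:60
toy     <- ts((1000 + 5 * t_idx) * rep(pattern, 5), frequency = 12, start = c(1, 1))

# Step 1: trend-cycle via a centred 2x12 moving average
trend_manual <- stats::filter(toy, c(0.5, rep(1, 11), 0.5) / 12, sides = 2)

# Step 2: detrend by division (multiplicative model)
detrended <- toy / trend_manual

# Step 3: average the detrended values month by month, then rescale so the indices average to 1
raw_idx     <- tapply(as.numeric(detrended), as.integer(cycle(toy)), mean, na.rm = TRUE)
seas_manual <- as.numeric(raw_idx / mean(raw_idx))

# Compare with stats::decompose()
fit <- decompose(toy, type = "multiplicative")
all.equal(as.numeric(trend_manual), as.numeric(fit$trend))
all.equal(seas_manual, as.numeric(fit$figure))
```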
Compute and plot the seasonally adjusted data.
mult_decomp <- plastics %>%
  decompose(type = "multiplicative")
autoplot(plastics, series = "Original Data") +
  autolayer(seasadj(mult_decomp), series = "Seasonally Adjusted") +
  ggtitle("Sales of Product A for a Plastics Manufacturer") +
  ylab("Monthly Sales of Product A")

The seasonally adjusted plot shows the monthly sales of product A after removing the seasonal component. The seasonally adjusted series combines the upward trend and the remainder.
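In a multiplicative model the data are the product of trend, seasonal and remainder components, so dividing by the seasonal component should leave trend times remainder. The sketch below checks this on a synthetic series using base R only. It assumes that seasadj() amounts to this division in the multiplicative case.

```r
# Synthetic monthly series: linear level times a fixed seasonal pattern (made-up numbers)
pattern <- c(0.8, 0.7, 0.9, 1.0, 1.1, 1.2, 1.2, 1.3, 1.3, 1.2, 1.0, 0.9)
toy     <- ts((1000 + 5 * (1:60)) * rep(pattern, 5), frequency = 12, start = c(1, 1))

fit <- decompose(toy, type = "multiplicative")

# data = trend * seasonal * remainder, so data / seasonal = trend * remainder
adjusted <- toy / fit$seasonal

# The trend (and hence the remainder) is NA for the first and last six months
ok <- !is.na(as.numeric(fit$trend))
all.equal(as.numeric(adjusted)[ok], as.numeric(fit$trend * fit$random)[ok])
```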
Change one observation to be an outlier (e.g., add 500 to one observation), and recompute the seasonally adjusted data. What is the effect of the outlier?
plastic_outlier <- plastics
plastic_outlier[50] <- plastic_outlier[50] + 500
mult_decomp2 <- plastic_outlier %>%
  decompose(type = "multiplicative")
autoplot(mult_decomp2) +
  ggtitle("Sales of Product A for a Plastics Manufacturer")

autoplot(plastic_outlier, series="Original Data") +
autolayer(seasadj(mult_decomp2), series="Seasonally Adjusted") +
ggtitle("Sales of Product A for a Plastics Manufacturer") +
ylab("Monthly Sales of Product A")

After adding 500 to the 50th data point we can see a large spike in the seasonally adjusted data. The addition has a relatively small effect on the seasonal component because each seasonal index is averaged over all years and only one observation was changed.
Does it make any difference if the outlier is near the end rather than in the middle of the time series?
plastic_outlier2 <- plastics
plastic_outlier2[60] <- plastic_outlier2[60] + 500
mult_decomp3 <- plastic_outlier2 %>%
  decompose(type = "multiplicative")
autoplot(mult_decomp3) +
  ggtitle("Sales of Product A for a Plastics Manufacturer")

autoplot(plastic_outlier2, series="Original Data") +
autolayer(seasadj(mult_decomp3), series="Seasonally Adjusted") +
ggtitle("Sales of Product A for a Plastics Manufacturer") +
ylab("Monthly Sales of Product A")

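The outlier near the end of the series has a smaller effect on the seasonal component, similar to the effect seen in part e. The shape of the seasonal pattern changes only slightly. This is consistent with how classical decomposition works: the trend-cycle is not estimated for the last six observations, so observation 60 does not enter the seasonal index for its month directly.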
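The comparison between a mid-series and an end-of-series outlier can be quantified by measuring how far the seasonal indices move in each case. The sketch below does this on a synthetic series. The helper index_shift and the positions 30 and 60 are choices made for this illustration, not part of the exercise.

```r
# Synthetic monthly series: linear level times a fixed seasonal pattern (made-up numbers)
pattern <- c(0.8, 0.7, 0.9, 1.0, 1.1, 1.2, 1.2, 1.3, 1.3, 1.2, 1.0, 0.9)
toy     <- ts((1000 + 5 * (1:60)) * rep(pattern, 5), frequency = 12, start = c(1, 1))

# Largest absolute change in the seasonal indices after adding `bump` at position `pos`
index_shift <- function(series, pos, bump = 500) {
  modified <- series
  modified[pos] <- modified[pos] + bump
  base_idx <- decompose(series,   type = "multiplicative")$figure
  new_idx  <- decompose(modified, type = "multiplicative")$figure
  max(abs(new_idx - base_idx))
}

shift_middle <- index_shift(toy, 30)  # outlier in the middle of the series
shift_end    <- index_shift(toy, 60)  # outlier at the last observation
c(middle = shift_middle, end = shift_end)
```

On this toy series the end-of-series outlier moves the indices less, because it only reaches them through the moving-average trend at nearby points.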