A time series of 2 years with daily observations
A time series of 6 months with business days observations
#A time series of 2 years with daily observations
rm(list=ls())
library(fpp2)
## Warning: package 'fpp2' was built under R version 3.5.2
## Loading required package: ggplot2
## Loading required package: forecast
## Warning: package 'forecast' was built under R version 3.5.2
## Loading required package: fma
## Warning: package 'fma' was built under R version 3.5.2
## Loading required package: expsmooth
## Warning: package 'expsmooth' was built under R version 3.5.2
data_vector <-rnorm(730)
Series1 <- ts(data_vector, start = 2018, deltat=1/365)
autoplot(Series1)
library(bizdays)
## Warning: package 'bizdays' was built under R version 3.5.2
##
## Attaching package: 'bizdays'
## The following object is masked from 'package:forecast':
##
## bizdays
## The following object is masked from 'package:stats':
##
## offset
library(timeDate)
## Warning: package 'timeDate' was built under R version 3.5.1
Calendar <- timeSequence(as.Date("2019-01-01"), as.Date("2019-06-30"))
busday <- Calendar[isBizday(Calendar)]
# Roughly 260 business days per year, so the series spans about six months on the time axis
Series2 <- ts(rnorm(length(busday)), start = c(2019, 1), frequency = 260)
autoplot(Series2)
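As a cross-check on the business-day series, the weekdays in the same six months can be counted in base R. This is only a sketch: it drops Saturdays and Sundays and nothing else, whereas `isBizday()` also excludes holidays by default, so the two counts can differ. The name `weekdays_only` and the frequency of 260 are assumptions made for illustration.

```r
# All calendar days from 1 January to 30 June 2019
days <- seq(as.Date("2019-01-01"), as.Date("2019-06-30"), by = "day")

# POSIXlt weekday codes: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
wd <- as.POSIXlt(days)$wday
weekdays_only <- days[wd %in% 1:5]

# One random observation per weekday; 260 is an approximate number
# of weekdays per year (assumption), not an exact calendar constant
set.seed(1)
toy_series <- ts(rnorm(length(weekdays_only)), start = 2019, frequency = 260)

length(weekdays_only)
```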
#Chicken
library(fpp2)
library(ggplot2)
#? chicken
autoplot(chicken) +
ggtitle("Price of chicken in US (constant dollars): 1924-1993")+
ylab("Price in $") +
xlab("Year")
frequency(chicken)
## [1] 1
which.max(chicken)
## [1] 22
The data are annual, as shown by frequency = 1. which.max() returns 22, so the maximum price is the 22nd observation, which corresponds to 1945 (1924 + 22 - 1, because the first observation is 1924 itself). After that peak the price trends downward, apart from an outlier around 1972-1973.
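The conversion from an observation index to a calendar year can be checked with `time()`. A minimal sketch on a made-up annual series (the values and the name `toy` are invented for illustration and are not the chicken data):

```r
# Toy annual series starting in 1924 (invented values)
toy <- ts(c(5, 7, 9, 8, 6), start = 1924)

idx <- which.max(toy)                # position of the maximum
peak_year <- time(toy)[idx]          # calendar year at that position
by_hand <- start(toy)[1] + idx - 1   # observation 1 is the start year itself

peak_year
by_hand
```

With fpp2 loaded, the same idea applies to the real data as `time(chicken)[which.max(chicken)]`.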
#dole
#? dole
#tsdisplay(dole)
plot(dole)
autoplot(dole) +
ggtitle("Unemployment benefits in Australia")+
ylab("Total Number of People") +
xlab("Year")
frequency(dole)
## [1] 12
which.max(dole)
## [1] 439
ggseasonplot(dole, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Total Number of People") +
ggtitle("Unemployment benefits in Australia")
The data are monthly, as shown by frequency = 12. The autoplot shows a sudden, steep rise in the number of people receiving unemployment benefits from about 1974 to 1984. The seasonal plot shows a similar month-to-month pattern in most years, regardless of the jump in benefit use from 1974.
#usdeaths
#? usdeaths
plot(usdeaths)
autoplot(usdeaths) +
ggtitle("Accidental deaths in USA")+
ylab("Accidental death Total") +
xlab("Year")
frequency(usdeaths)
## [1] 12
which.max(usdeaths)
## [1] 7
ggseasonplot(usdeaths, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Accidental death Total") +
ggtitle("Accidental deaths in USA")
The plot of accidental deaths in the USA from 1973 to 1978 shows regular ups and downs within each year, so at a high level the series looks seasonal. The seasonal plot confirms this: most accidental deaths occur from May to August, i.e. in summer, and the fewest occur in January and February. which.max() returns 7, i.e. July 1973, which fits this pattern.
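The summer peak read off the seasonal plot can also be checked numerically by averaging each calendar month across years. The sketch below uses an invented monthly series (`toy` and `pattern` are illustrative names, not the usdeaths data):

```r
# Toy monthly series: three years with an invented mid-year peak
pattern <- c(1, 1, 2, 3, 5, 6, 8, 6, 4, 3, 2, 1)
toy <- ts(rep(pattern, 3) + rep(c(0, 1, 2), each = 12),
          start = c(1973, 1), frequency = 12)

# cycle() gives the month number (1-12) of every observation;
# tapply() then averages the values month by month
month_means <- tapply(as.numeric(toy), as.integer(cycle(toy)), mean)
names(month_means) <- month.abb

peak_month <- names(which.max(month_means))
low_month  <- names(which.min(month_means))
```

Replacing `toy` with `usdeaths` would give the monthly averages for the real series.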
#? gold
tsdisplay(gold)
autoplot(gold) +
ggtitle("Daily morning gold prices")+
ylab("Price in $") +
xlab("Days")
frequency(gold)
## [1] 1
which.max(gold)
## [1] 770
The daily morning gold price shows an overall upward trend until roughly observation 800 (counting from 1 January 1985), after which it slowly declines. At observation 770 there is a sudden spike in price that looks like an outlier, and which.max() confirms it as the series maximum. If further investigation justifies treating this spike as an error, the outlier should be removed from the dataset.
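One possible way to flag such a spike before deciding what to do with it is to compare each point with a running median. This is a sketch on invented data, not the method used in this analysis; the window width of 11 and the cutoff of 3 robust standard deviations are assumptions chosen for illustration. The forecast package also offers `tsoutliers()` and `tsclean()` for this purpose.

```r
# Toy series: linear trend plus a deterministic wiggle, with one injected spike
n <- 50
toy <- seq(300, 400, length.out = n) + 3 * sin(seq_len(n))
toy[25] <- toy[25] + 100            # artificial spike at observation 25

med <- runmed(toy, k = 11)          # running median, robust to single spikes
res <- as.numeric(toy - med)        # deviation of each point from the local level
cutoff <- 3 * mad(res)              # assumed threshold: 3 robust standard deviations
flagged <- which(abs(res) > cutoff) # indices of suspicious observations

flagged
```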
#? h02
tsdisplay(h02)
autoplot(h02) +
ggtitle("drug sales in Australia from July 1991 to June 2008")+
ylab("Sale") +
xlab("Year")
frequency(h02)
## [1] 12
which.max(h02)
## [1] 162
ggseasonplot(h02, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("$ Total Sale") +
ggtitle("drug sales in Australia from July 1991 to June 2008")
The autoplot suggests a seasonal pattern, so the data need to be examined at the monthly level for further insight. The seasonal plot gives clear evidence of seasonality: in every year there is a sharp drop from January to February, followed by an upward trend from February to December.
#? gasoline
tsdisplay(gasoline)
head(gasoline)
## Time Series:
## Start = 1991.1
## End = 1991.19582477755
## Frequency = 52.1785714285714
## [1] 6.621 6.433 6.582 7.224 6.875 6.947
autoplot(gasoline) +
ggtitle("Gasoline product supply in US")+
ylab("Sale") +
xlab("Year")
frequency(gasoline)
## [1] 52.17857
which.max(gasoline)
## [1] 1324
ggseasonplot(gasoline, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("$ Total Sale") +
ggtitle("Gasoline product supply in US")
ggAcf(gasoline)
The autoplot shows that gasoline product sales in the US trend upward over the years, except between 2011 and 2013. That is the year-level view. At the weekly level, the seasonal plot shows seasonality between weeks 18 and 35: sales are higher in summer in most years.
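The non-integer frequency printed by head(gasoline) is the average number of weeks in a year, 365.25 / 7. The sketch below rebuilds the six values shown in that output as a small weekly series to make the arithmetic visible (`weeks_per_year` and `toy` are illustrative names):

```r
# Average number of weeks per year, allowing for leap years
weeks_per_year <- 365.25 / 7

# The six values printed by head(gasoline), with the same start point
toy <- ts(c(6.621, 6.433, 6.582, 7.224, 6.875, 6.947),
          start = 1991.1, frequency = weeks_per_year)

# tsp() returns start, end and frequency;
# the end is start + (n - 1) / frequency
tsp(toy)
end_by_hand <- 1991.1 + (length(toy) - 1) / weeks_per_year
```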
Read the data into R and define a ts object of your chosen column
Explore the defined time series using the functions described in the lecture (e.g. autoplot(), ggAcf(), gglagplot())
Can you spot any seasonality, cyclicity and trend? What do you learn about the series?
library("readxl")
## Warning: package 'readxl' was built under R version 3.5.1
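# xlsx file: the sheet has two header rows, so skip the first and use the series IDs as column names
raw_data <- read_excel("C:\\Users\\nidhi\\Downloads\\retail.xlsx", skip = 1)
# The second column is the chosen series; [[ ]] extracts it as a plain numeric vector
my_data <- raw_data[[2]]
Turnover <- ts(my_data, start = c(1982, 4), frequency = 12)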
autoplot(Turnover) +
ggtitle("New South Wales Supermarket & Grocery Stores Turnover")+
ylab("Sale in Thousand $") +
xlab("Year")
ggseasonplot(Turnover, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Sale in Thousand $") +
ggtitle("New South Wales Supermarket & Grocery Stores Turnover")
ggAcf(Turnover)
The autoplot shows that Turnover increases steadily over the years. The regular ups and downs within each year suggest seasonality. To confirm this I checked the seasonal plot, which clearly follows the same January-to-December pattern in every year. The ggAcf() plot also shows large autocorrelations, which supports the seasonality of the data.
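To illustrate how seasonality shows up in an ACF, the sketch below computes the autocorrelations of an invented monthly series with a mild trend and a strong 12-month cycle. The series and the names `idx`, `toy` and `r` are assumptions for illustration, not the retail data; base `acf()` is used here in place of `ggAcf()` so the numbers can be inspected directly.

```r
# Toy monthly series: mild linear trend plus a strong 12-month cycle
idx <- 1:120
toy <- ts(0.02 * idx + 2 * sin(2 * pi * idx / 12),
          start = c(1982, 4), frequency = 12)

# Autocorrelations for lags 0 to 24; element k + 1 holds lag k
r <- acf(toy, lag.max = 24, plot = FALSE)$acf[, 1, 1]

lag12 <- r[13]   # one full seasonal period apart
lag6  <- r[7]    # half a period apart
```

For a seasonal series, the autocorrelation at the seasonal lag (12 for monthly data) is larger than at the half-period lag, which is the pattern to look for in the ggAcf() plot.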
plot(dj)
#?diff
length(dj)
## [1] 292
ddj <- diff(dj)
autoplot(ddj)
ggAcf(ddj)
Most of the ACF spikes lie inside the blue dashed lines, so the differenced series can be classified as white noise. The bounds are at ±2/sqrt(T), where T is the length of the series. Differencing removes one observation from the 292 in dj, so T = 291 and the bounds are ±2/sqrt(291) ≈ ±0.117.
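The bound calculation can be reproduced in a few lines. The sketch below uses simulated white noise of the same length as the differenced series, not the dj data itself; the seed, the 20 lags and the Ljung-Box test with 10 lags are assumptions added for illustration.

```r
# Length of the differenced series: 292 observations minus one
n <- 292 - 1
bound <- 2 / sqrt(n)          # approximate white-noise bound for the ACF

# Simulated white noise of the same length (stand-in for diff(dj))
set.seed(1)
wn <- rnorm(n)

# Autocorrelations at lags 1 to 20 (lag 0 dropped)
r <- acf(wn, lag.max = 20, plot = FALSE)$acf[-1]
outside <- sum(abs(r) > bound)   # number of spikes beyond the bound

# A portmanteau test gives a single p-value instead of a visual check
lb <- Box.test(wn, lag = 10, type = "Ljung-Box")
```

For white noise, only a small share of spikes is expected to fall outside the bounds, so a handful of values in `outside` does not contradict the white-noise reading.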
Compare the differences between the arrivals from these four countries. Can you identify any unusual observations?
autoplot(arrivals)
frequency(arrivals)
## [1] 4
#?arrivals
autoplot(arrivals) +
ggtitle("International Arrivals to Australia")+
ylab("Arrival in Thousands") +
xlab("Year")
autoplot(arrivals, facets = TRUE) +
ggtitle("International Arrivals to Australia")+
ylab("Arrival in Thousands") +
xlab("Year")
japan <- arrivals[, "Japan"]
NZ <- arrivals[, "NZ"]
UK <- arrivals[, "UK"]
US <- arrivals[, "US"]
ggseasonplot(japan, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Arrival in Thousands from Japan") +
ggtitle("International Arrivals to Australia")
ggseasonplot(NZ, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Arrival in Thousands from NZ") +
ggtitle("International Arrivals to Australia")
ggseasonplot(UK, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Arrival in Thousands from UK") +
ggtitle("International Arrivals to Australia")
ggseasonplot(US, year.labels=TRUE,
year.labels.left=TRUE) +
ylab("Arrival in Thousands from US") +
ggtitle("International Arrivals to Australia")
The autoplot shows that the most visitors come from New Zealand. Because the data are quarterly, I split the dataset into one series per country and drew a seasonal plot for each. Japan shows a seasonal pattern, with more visitors in Q3 and fewer in Q2. New Zealand shows a clear seasonal pattern, with the most arrivals in Q3. The UK is also seasonal, with more visitors in Q4 and a low in Q2. The US follows a seasonal pattern too, apart from a few Q4 outliers in a couple of years.
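The comparison between countries can be backed by numbers as well as plots. The sketch below uses an invented two-column quarterly series (columns `A` and `B` are placeholders, not the arrivals data) to show annual totals per series and the quarter with the highest average.

```r
# Toy quarterly multi-series object: two years, two invented "countries"
toy <- ts(cbind(A = c(10, 12, 15, 11, 11, 13, 16, 12),
                B = c(20, 25, 30, 22, 21, 26, 31, 23)),
          start = c(2000, 1), frequency = 4)

# Annual totals per column: sums each block of four quarters
annual <- aggregate(toy, nfrequency = 1, FUN = sum)

# Column with the largest overall total
largest <- names(which.max(colSums(toy)))

# Quarter (1-4) with the highest average, computed per column
quarter <- as.integer(cycle(toy))
peak_quarter <- apply(toy, 2, function(col) which.max(tapply(col, quarter, mean)))
```

With fpp2 loaded, replacing `toy` with `arrivals` would give the same summaries for Japan, NZ, UK and US.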