Discussion #2
Go to Data Market (https://datamarket.com/data/list/?q=cat:ecc%20provider:tsdl). Pick a time series and conduct additive and multiplicative decomposition. Which one worked better? How can you tell? How would you use the results in forecasting (or would you?)
I picked the series "Weekly closings of the Dow-Jones industrial average, July 1971 – August 1974."
# Loading libraries
library(forecast)
library(xts)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(TTR)
library(tseries)
# LOADING DATASET
# weekly-closings-of-the-dowjones.csv
dj<-read.csv(file.choose())
names(dj)
## [1] "Week" "close.price"
names(dj)[2]<-c("close.price")
head(dj)
## Week close.price
## 1 1971-W27 890.19
## 2 1971-W28 901.80
## 3 1971-W29 888.51
## 4 1971-W30 887.78
## 5 1971-W31 858.43
## 6 1971-W32 850.61
Plotting data
# Plot original series
plot(dj$close.price,type="l",xlab="Week",ylab="closing price",main="historical data")
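# boxplots by month/quarter
# NOTE: frequency is the number of observations per seasonal cycle, so for weekly
# data 4 and 12 mean 4-week and 12-week cycles, not calendar quarters or months.
dj.ts.q<-ts(dj$close.price,frequency = 4)
dj.ts.m<-ts(dj$close.price,frequency = 12)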
# boxplot(dj.ts.m[[1]],dj.ts.m["close.price"])
Since the data are weekly, I am still working out how to make the monthly/quarterly boxplots discussed in class tonight. Any suggestions, classmates or Prof?
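One possible way to get there, as a minimal sketch: turn each ISO week label (e.g. "1971-W27") into a calendar date, derive a month and quarter from that date, and group the boxplots on those. The helper name `iso_week_to_date` and the `dj_demo` data frame are my own for illustration; `dj_demo` holds only the six rows printed by `head(dj)` above, and I am assuming the `Week` column uses ISO week numbering.

```r
# Hypothetical helper: convert ISO week labels such as "1971-W27"
# to the date of the Monday that starts that week.
iso_week_to_date <- function(week_label) {
  week_label <- as.character(week_label)          # also handles factor input
  year <- as.integer(substr(week_label, 1, 4))
  week <- as.integer(sub(".*-W", "", week_label))
  jan4 <- as.Date(sprintf("%d-01-04", year))      # 4 January always falls in ISO week 1
  wday <- as.integer(format(jan4, "%u"))          # 1 = Monday ... 7 = Sunday
  jan4 - (wday - 1) + (week - 1) * 7              # Monday of week 1, then step forward
}

# First six rows of the series, copied from head(dj) above
dj_demo <- data.frame(
  Week = c("1971-W27", "1971-W28", "1971-W29", "1971-W30", "1971-W31", "1971-W32"),
  close.price = c(890.19, 901.80, 888.51, 887.78, 858.43, 850.61)
)

dj_demo$date <- iso_week_to_date(dj_demo$Week)
month_num <- as.integer(format(dj_demo$date, "%m"))
# Keep calendar order for the months, dropping months that do not occur
dj_demo$month <- droplevels(factor(month.abb[month_num], levels = month.abb))
dj_demo$quarter <- factor(quarters(dj_demo$date))

# One box per calendar month / quarter
boxplot(close.price ~ month, data = dj_demo, xlab = "Month", ylab = "closing price")
boxplot(close.price ~ quarter, data = dj_demo, xlab = "Quarter", ylab = "closing price")
```

With the full data set, the same steps would apply to `dj` directly, starting from `dj$date <- iso_week_to_date(dj$Week)`.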
Decomposing data
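# NOTE: frequency = 12 treats every 12 weeks as one seasonal cycle.
# For weekly data with a yearly pattern, frequency = 52 would be the usual choice.
mul<-decompose(ts(dj[,2],frequency = 12), type="multiplicative")
add<-decompose(ts(dj[,2],frequency = 12), type="additive")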
Plotting decompositions
# par(mfrow=c(1,2))
plot(mul)
plot(add)
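For reference when reading the two plots, the models behind the additive and multiplicative options of `decompose()` can be written as below. The symbols are my own notation (not from the assignment): $x_t$ is the observed closing price, $T_t$ the trend, $S_t$ the seasonal component and $e_t$ the random component.

```latex
\begin{align*}
\text{additive:} \quad & x_t = T_t + S_t + e_t \\
\text{multiplicative:} \quad & x_t = T_t \times S_t \times e_t
\end{align*}
```

In the additive model $S_t$ and $e_t$ are in the units of the series and vary around 0. In the multiplicative model they are unitless ratios that vary around 1, which matters when the `random` panels of the two plots are compared.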
Results
Which one is better?
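# NOTE: for the multiplicative fit, mul$random is a ratio x/(trend*seasonal)
# centred near 1, not an error in price units, so these values are on a
# different scale from the additive ones below (and mul.mape divides a ratio by x).
mul.me<-mean(na.omit(mul$random))
mul.mae<-mean(na.omit(abs(mul$random)))
mul.mse<-mean(na.omit(mul$random)^2)
mul.rmse<-sqrt(mul.mse)
mul.mape<-mean(abs(na.omit(mul$random/mul$x)))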
add.me<-mean(na.omit(add$random))
add.mae<-mean(na.omit(abs(add$random)))
add.mse<-mean(na.omit(add$random)^2)
add.rmse<-sqrt(add.mse)
add.mape<-mean(abs(na.omit(add$random/add$x)))
r<-c("Mean Error","Mean Absolute Error", "Mean Squared Error", "Root Mean Sq Error", "Mean Absolute Percentage Error")
m<-c(mul.me,mul.mae,mul.mse,mul.rmse,mul.mape)
a<-c(add.me,add.mae,add.mse,add.rmse,add.mape)
bv<-c("Bias","Variance","Variance","Bias","Variance")
result<- data.frame(cbind(m,a,bv))
row.names(result)<-r
result
## m a
## Mean Error 1.00054271742545 0.604398370726569
## Mean Absolute Error 1.00054271742545 14.5855216524216
## Mean Squared Error 1.0014876621613 324.805167241356
## Root Mean Sq Error 1.0007435546439 18.0223518787465
## Mean Absolute Percentage Error 0.0010989590335019 0.0160749863365594
## bv
## Mean Error Bias
## Mean Absolute Error Variance
## Mean Squared Error Variance
## Root Mean Sq Error Bias
## Mean Absolute Percentage Error Variance
# calc_me<- function(objectname) {
# ME<-sum(na.omit(abs(objectname$random)))/length(objectname$random)
# print("Mean Error : ")
# return(ME)
# }
# calc_me(mul)
# calc_me(add)
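One way to put the two fits on the same footing, sketched here under some assumptions: compute the residual in the units of the series for both models, i.e. observed minus `trend + seasonal` for additive and observed minus `trend * seasonal` for multiplicative, and only then compute the error measures. Since the Dow-Jones CSV is not bundled with this post, the sketch uses the built-in `AirPassengers` series as a stand-in; the helper name `err` is my own. The numbers it prints say nothing about the Dow-Jones data.

```r
# Stand-in data: AirPassengers ships with base R (datasets package).
# Replace x with ts(dj$close.price, frequency = ...) for the Dow-Jones series.
x <- AirPassengers

add <- decompose(x, type = "additive")
mul <- decompose(x, type = "multiplicative")

# Residuals expressed in the units of the series for BOTH models
add_resid <- x - (add$trend + add$seasonal)
mul_resid <- x - (mul$trend * mul$seasonal)

# Hypothetical helper: error measures on the non-missing residuals
err <- function(e, x) {
  ok <- !is.na(e)               # decompose() leaves NAs at both ends of the trend
  e <- as.numeric(e)[ok]
  x <- as.numeric(x)[ok]
  c(ME   = mean(e),
    MAE  = mean(abs(e)),
    RMSE = sqrt(mean(e^2)),
    MAPE = 100 * mean(abs(e / x)))   # in percent
}

result <- rbind(additive = err(add_resid, x),
                multiplicative = err(mul_resid, x))
print(round(result, 3))
```

Because both rows are now in the same units, a lower MAE or RMSE can be read directly as a tighter fit, and ME as the bias of each decomposition.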
Conclusions
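Looking at the error measures, I believe multiplicative is better in this case, unless I have done something wrong here, because visually the two decompositions look the same, for this data at least. One caveat: the multiplicative random component is a ratio centred near 1 (which is why its ME, MAE, MSE and RMSE are all close to 1), while the additive random component is in price units, so the two columns of the table are not on the same scale and the comparison is only indicative.
Further study of the residuals (from the table above) shows that the multiplicative model gives residuals with higher bias but much lower variance. On that basis, multiplicative decomposition is preferred over additive.
A closer look at the residual plots also shows that the random and seasonal parts of the time series are much smaller in magnitude for multiplicative than for additive, though part of that gap comes from ratios being compared with price units. This further suggests that the multiplicative model is the better choice in this case.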