Discussion #2
Go to [Data Market](https://datamarket.com/data/list/?q=cat:ecc%20provider:tsdl). Pick a time series and conduct additive and multiplicative decomposition. Which one worked better? How can you tell? How would you use the results in forecasting (or would you?)

I picked the weekly closings of the Dow Jones Industrial Average, July 1971 – August 1974.

# Loading libraries
library(forecast)
library(xts)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(TTR)
library(tseries)

# Loading dataset
# weekly-closings-of-the-dowjones.csv
dj <- read.csv(file.choose())

# Make sure the price column has a short, consistent name
names(dj)[2] <- "close.price"
names(dj)
## [1] "Week"        "close.price"

head(dj)
##       Week close.price
## 1 1971-W27      890.19
## 2 1971-W28      901.80
## 3 1971-W29      888.51
## 4 1971-W30      887.78
## 5 1971-W31      858.43
## 6 1971-W32      850.61

Plotting data

# Plot original series
plot(dj$close.price, type = "l", xlab = "Week", ylab = "Closing price", main = "Historical data")

# Boxplots by month / quarter
dj.ts.q <- ts(dj$close.price, frequency = 4)
dj.ts.m <- ts(dj$close.price, frequency = 12)

# boxplot(as.numeric(dj.ts.m) ~ cycle(dj.ts.m))  # grouping by position in the cycle

Given the weekly data, I am still trying to figure out how to make a monthly/quarterly boxplot, as discussed in class tonight. Any suggestions, pals / Prof? One rough idea is sketched below.
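
A sketch of one possible approach (assuming the Week labels are ISO "YYYY-Www" strings, as head(dj) shows): approximate each week by a date, derive the month, and boxplot the closings by month.

# Sketch: approximate each ISO week by a date, then group closings by month
wk  <- as.integer(sub(".*-W", "", dj$Week))          # week number, e.g. 27
yr  <- as.integer(sub("-W.*", "", dj$Week))          # year, e.g. 1971
dt  <- as.Date(paste0(yr, "-01-01")) + (wk - 1) * 7  # rough week-start date
mon <- factor(months(dt), levels = month.name)       # month as ordered factor
boxplot(dj$close.price ~ mon, xlab = "Month", ylab = "Closing price")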

Decomposing the data

# Classical decomposition, treating every 12 observations (12 weeks)
# as one seasonal cycle
mul <- decompose(ts(dj[, 2], frequency = 12), type = "multiplicative")
add <- decompose(ts(dj[, 2], frequency = 12), type = "additive")
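
A side note on the period: with weekly data, frequency = 12 means a 12-week cycle, not calendar months; a full year is roughly 52 weeks. For comparison, a sketch of an annual-cycle decomposition (decompose() needs at least two full periods, and this series spans about three years):

# Sketch: annual seasonal period (~52 weeks) instead of a 12-week cycle
dj.ts.w <- ts(dj$close.price, start = c(1971, 27), frequency = 52)
mul.52  <- decompose(dj.ts.w, type = "multiplicative")
add.52  <- decompose(dj.ts.w, type = "additive")
plot(add.52)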

Plotting the decompositions

# par(mfrow=c(1,2))
plot(mul)

plot(add)
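
Since tseries is loaded above but otherwise unused, one quick sanity check on the remainder (a sketch): an augmented Dickey-Fuller test, where a small p-value is consistent with a stationary remainder.

# Sketch: stationarity check on the additive remainder
adf.test(na.omit(add$random))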

Results
Which one is better?

# Remainder-based error measures. Note that the two remainders are on
# different scales: the multiplicative remainder is a ratio centered on 1,
# while the additive remainder is in index points centered on 0.
mul.me   <- mean(na.omit(mul$random))
mul.mae  <- mean(na.omit(abs(mul$random)))
mul.mse  <- mean(na.omit(mul$random)^2)
mul.rmse <- sqrt(mul.mse)
mul.mape <- mean(abs(na.omit(mul$random/mul$x)))

add.me   <- mean(na.omit(add$random))
add.mae  <- mean(na.omit(abs(add$random)))
add.mse  <- mean(na.omit(add$random)^2)
add.rmse <- sqrt(add.mse)
add.mape <- mean(abs(na.omit(add$random/add$x)))

# m = multiplicative, a = additive; bv flags whether each measure mainly
# reflects the bias or the variance of the remainder
r  <- c("Mean Error", "Mean Absolute Error", "Mean Squared Error",
        "Root Mean Sq Error", "Mean Absolute Percentage Error")
m  <- c(mul.me, mul.mae, mul.mse, mul.rmse, mul.mape)
a  <- c(add.me, add.mae, add.mse, add.rmse, add.mape)
bv <- c("Bias", "Variance", "Variance", "Variance", "Variance")
result <- data.frame(cbind(m, a, bv))  # cbind coerces the numbers to character
row.names(result) <- r
result
##                                                 m                  a
## Mean Error                       1.00054271742545  0.604398370726569
## Mean Absolute Error              1.00054271742545   14.5855216524216
## Mean Squared Error                1.0014876621613   324.805167241356
## Root Mean Sq Error                1.0007435546439   18.0223518787465
## Mean Absolute Percentage Error 0.0010989590335019 0.0160749863365594
##                                      bv
## Mean Error                         Bias
## Mean Absolute Error            Variance
## Mean Squared Error             Variance
## Root Mean Sq Error             Variance
## Mean Absolute Percentage Error Variance
# calc_me <- function(objectname) {
#   # Mean error of the remainder (abs() here would give the MAE instead)
#   ME <- mean(na.omit(objectname$random))
#   print("Mean Error : ")
#   return(ME)
# }

# calc_me(mul)
# calc_me(add)
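
For reference, a working version of that helper idea (a sketch; it reuses the same formulas as the table above, including the random/x ratio I used as a MAPE-style measure):

# Sketch: all five measures for a decomposed object's remainder in one call
calc_errors <- function(decomp) {
  ok <- !is.na(decomp$random)          # the remainder is NA at both ends
  r  <- as.numeric(decomp$random[ok])
  x  <- as.numeric(decomp$x[ok])
  c(ME = mean(r), MAE = mean(abs(r)), MSE = mean(r^2),
    RMSE = sqrt(mean(r^2)), MAPE = mean(abs(r / x)))
}
calc_errors(mul)
calc_errors(add)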

Conclusions
Looking at the error measures, I believe the multiplicative model is better in this case, unless I have done something wrong here, because visually the two decompositions look much the same, for this data at least. One caveat when reading the table: the two remainder series are on different scales. The multiplicative remainder is a ratio centered on 1, so its mean of about 1.0005 implies near-zero bias, while the additive remainder is in index points centered on 0, so the raw numbers are not directly comparable.

With that caveat in mind, the table shows that the multiplicative model's residuals have low bias and, relative to the level of the series, very low variance. Thus multiplicative decomposition is preferred over additive here.

A closer look at the two decomposition plots also shows that the random and seasonal components are much smaller in magnitude, relative to the trend, for the multiplicative model than for the additive one, which further suggests the multiplicative model is the better fit in this case.
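
On the "how would you use the results in forecasting" part of the prompt: one standard option (a sketch only, not something I ran for this write-up) is to forecast the seasonally adjusted series and then put the seasonal pattern back, using the forecast package loaded above.

# Sketch: forecast the seasonally adjusted series, then reseasonalize
# (additive case; for the multiplicative fit, multiply instead of add)
adj <- seasadj(add)                      # series with the seasonal part removed
fc  <- rwf(adj, h = 12, drift = TRUE)    # random walk with drift, 12 weeks ahead
fc.reseason <- fc$mean + snaive(add$seasonal, h = 12)$mean
plot(fc)
lines(fc.reseason, col = "red")          # reseasonalized point forecasts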