Discussion 3

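library(dplyr)     # count(), %>%
library(zoo)       # as.yearmon()
library(forecast)  # ses(), holt(), hw(), ets(), accuracy()
library(ggplot2)   # autoplot(), ggtitle(), xlab(), ylab(), guides()

crime <- read.csv("crimeinboston.csv")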

caraccident=crime[crime$OFFENSE_CODE_GROUP  == "Motor Vehicle Accident Response",]
caraccident=caraccident[,c(3,8)]
caraccident$OCCURRED_ON_DATE=as.Date(caraccident$OCCURRED_ON_DATE)
#caraccident=caraccident %>% count(caraccident$OCCURRED_ON_DATE)
#data <- as.xts(caraccident[,2],order.by=as.Date(caraccident[,1]))
#caraccident <- apply.monthly(data,sum)
#names(caraccident)[1] <- "Car_Accidents"
#n<-dim(caraccident)[1]
#caraccident<-caraccident[2:(n-1),]

# OCCURRED_ON_DATE was already converted to Date above, so no format string is needed
caraccident$monyr <- as.yearmon(caraccident$OCCURRED_ON_DATE)
caraccident=caraccident %>% count(monyr)
names(caraccident)[2] <- "Car_Accidents"

# Build the series from the monthly counts (column 2); column 1 holds the month index
WCarAccident <- ts(caraccident$Car_Accidents, start = c(2015,07), frequency = 12)

The beginning and end of the data were removed because they are significant outliers and would distort the forecast model.
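
The trimming step is commented out in the code shown here, so the following is only a minimal sketch of one way it could be done. It uses synthetic monthly counts, and the names `counts` and `trimmed` are made up for illustration. It assumes the first and last months are only partially observed.

```r
# Synthetic monthly counts; the first and last months are partial (illustrative values only)
counts <- ts(c(120, 950, 1010, 980, 1040, 990, 60),
             start = c(2015, 6), frequency = 12)

# Drop the first and last observations, keeping only complete months
n <- length(counts)
trimmed <- window(counts, start = time(counts)[2], end = time(counts)[n - 1])

start(trimmed)    # first retained month
length(trimmed)   # number of months kept
```

Using `window()` keeps the result a `ts` object, so the start date moves forward automatically instead of being set by hand.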

autoplot(WCarAccident)+ 
  ggtitle("Motor Vehicle Accident Response in Boston") +
  xlab("Dates") +
  ylab("Accidents")

To better understand the data, I used decompose to look at the seasonal and trend components. Both decompositions show an increasing trend from 2016 to 2017 and a slower decreasing trend from 2017 to 2018. The seasonal component is surprising, showing fewer accidents in the winter.
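
As a rough illustration of what the additive decomposition returns, the sketch below runs `decompose()` on a synthetic monthly series (not the Boston data) and checks that the trend, seasonal, and random components add back up to the original. The series `x` and its parameters are assumptions chosen only for the example.

```r
# Synthetic monthly series: linear trend plus a fixed seasonal pattern (illustrative only)
set.seed(1)
months <- 1:48
x <- ts(1000 + 2 * months + 50 * sin(2 * pi * months / 12) + rnorm(48, sd = 10),
        start = c(2015, 7), frequency = 12)

dec <- decompose(x, type = "additive")

# In the additive model the three components sum back to the observed series
rebuilt <- dec$trend + dec$seasonal + dec$random
ok <- !is.na(rebuilt)           # the moving-average trend is NA at both ends
max(abs(rebuilt[ok] - x[ok]))   # numerically zero
```

In the multiplicative case the components are multiplied instead of added, which suits series whose seasonal swings grow with the level.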

dec1<-decompose(WCarAccident,type="additive")  #decompose additive
dec2<-decompose(WCarAccident,type="multiplicative") #decompose multiplicative
autoplot(dec1)

autoplot(dec2)

sesfc=ses(WCarAccident, h=12)

#forecast
autoplot(sesfc)+
  autolayer(fitted(sesfc), series="ses")+
  ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
  ylab("Accidents")

First, simple exponential smoothing produced a flat-line forecast.
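
The flat line follows from how simple exponential smoothing works: it tracks a single level, and every future forecast equals the last level. The sketch below writes the recursion by hand with a fixed `alpha`. The helper `ses_level` and the input values are hypothetical; `ses()` itself estimates `alpha` and the initial level from the data.

```r
# Simple exponential smoothing by hand (alpha is fixed here; ses() estimates it)
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    # New level is a weighted average of the observation and the previous level
    level <- alpha * obs + (1 - alpha) * level
  }
  level
}

y <- c(10, 12, 11, 13)   # illustrative values only
last_level <- ses_level(y, alpha = 0.5)

# Every h-step-ahead forecast equals the final level, hence the flat line
fc <- rep(last_level, 12)
```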

holtfc = holt(WCarAccident,  h=12)
autoplot(holtfc) +
  autolayer(fitted(holtfc), series="Holt's method") +
  ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
  ylab("Accidents") +
  guides(colour=guide_legend(title="Forecast"))

holtdampfc <- holt(WCarAccident, damped=TRUE, phi = 0.8, h=12)
autoplot(holtdampfc) +
  autolayer(fitted(holtdampfc), series="Damped Holt's method") +
  ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
  ylab("Accidents") +
  guides(colour=guide_legend(title="Forecast"))

With Holt's method, the version without damping produced an upward-trending forecast. With damping, the forecast is very similar to the one from ses.
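
The difference between the two forecasts can be seen from the point-forecast formula: the h-step forecast is the last level plus (phi + phi^2 + ... + phi^h) times the last slope. The sketch below evaluates that formula directly. The helper `holt_forecast` and the level and slope values are assumptions for illustration, not quantities fitted to the accident data.

```r
# h-step-ahead point forecasts from a final level and slope
# phi = 1 gives Holt's linear trend; phi < 1 damps the trend
holt_forecast <- function(level, slope, h, phi = 1) {
  sapply(seq_len(h), function(k) level + sum(phi^(1:k)) * slope)
}

level <- 1000; slope <- 10   # illustrative values, not fitted to the data
linear <- holt_forecast(level, slope, h = 12)
damped <- holt_forecast(level, slope, h = 12, phi = 0.8)

linear[12]   # keeps climbing by one slope per step
damped[12]   # levels off below level + slope * phi / (1 - phi)
```

With `phi = 0.8` the damped forecast can never rise more than four slopes above the last level, which is why it flattens quickly and ends up looking like the ses forecast.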

seasonalholtfc<-hw(WCarAccident, seasonal="additive")
autoplot(seasonalholtfc)+
  autolayer(fitted(seasonalholtfc), series="Seasonal Holt's method")+
  ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
  ylab("Accidents")

The prediction interval from the Holt-Winters forecast is very wide. Based on the 80% interval, the number of accidents could be as high as 1900 or as low as 400. The point forecasts stay around 900-1200, which is within a reasonable range.

etsfc <- ets(WCarAccident)
summary(etsfc)

## ETS(M,A,N) 
## 
## Call:
##  ets(y = WCarAccident) 
## 
##   Smoothing parameters:
##     alpha = 0.2001 
##     beta  = 0.0202 
## 
##   Initial states:
##     l = 2015.3333 
##     b = 0.0833 
## 
##   sigma:  0
## 
##       AIC      AICc       BIC 
## -2113.311 -2111.547 -2104.867 
## 
## Training set error measures:
##                        ME         RMSE          MAE          MPE         MAPE
## Training set 3.979039e-14 5.222175e-13 4.604317e-13 1.964289e-15 2.282554e-14
##                      MASE     ACF1
## Training set 4.604317e-13 0.842781

autoplot(etsfc)

As shown above, the ETS model provides a much better ME and MAE.

etsfc %>% forecast(h=24) %>%
  autoplot() +
  ylab("Motor Vehicle Accident in Boston")

accuracy(sesfc)

##                      ME       RMSE        MAE        MPE       MAPE       MASE
## Training set 0.08126545 0.08229311 0.08126545 0.00402886 0.00402886 0.08126545
##                       ACF1
## Training set -0.0005411516

accuracy(holtfc)

##                         ME        RMSE          MAE           MPE        MAPE
## Training set -3.410605e-14 5.08423e-13 4.433787e-13 -1.695828e-15 2.19822e-14
##                      MASE     ACF1
## Training set 4.433787e-13 0.828114

accuracy(holtdampfc)

##                     ME       RMSE        MAE         MPE         MAPE
## Training set 0.0164461 0.01696504 0.01674054 0.000815348 0.0008299573
##                    MASE       ACF1
## Training set 0.01674054 -0.4400081

accuracy(seasonalholtfc)

##                        ME        RMSE          MAE         MPE         MAPE
## Training set 6.025402e-13 8.83545e-13 8.071765e-13 2.98587e-14 4.001106e-14
##                      MASE      ACF1
## Training set 8.071765e-13 0.8998066

accuracy(etsfc)

##                        ME         RMSE          MAE          MPE         MAPE
## Training set 3.979039e-14 5.222175e-13 4.604317e-13 1.964289e-15 2.282554e-14
##                      MASE     ACF1
## Training set 4.604317e-13 0.842781

As shown above, the Holt-Winters model is more accurate in terms of RMSE and ME. This suggests that seasonality is a strong component of accident counts. This makes sense, since nice weather leads to more driving and a higher chance of accidents. However, as discussed before, the interval produced by the Holt-Winters model is very wide. The other models produce relatively high RMSE and MAE.
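
For reference, the ME, RMSE, and MAE columns reported by `accuracy()` can be reproduced from the actual and fitted values. The sketch below does this by hand on made-up numbers. The helper `error_measures` is hypothetical and covers only these three measures.

```r
# Training-set error measures computed from actual and fitted values
error_measures <- function(actual, fitted) {
  e <- actual - fitted
  c(ME   = mean(e),           # mean error: signed, so errors can cancel
    RMSE = sqrt(mean(e^2)),   # root mean squared error: penalizes large misses
    MAE  = mean(abs(e)))      # mean absolute error
}

actual <- c(100, 110, 120, 130)   # illustrative values only
fitted <- c(98, 113, 118, 133)
m <- error_measures(actual, fitted)
m
```

Because positive and negative errors cancel in ME, a model can have an ME near zero and still miss badly, so RMSE and MAE are usually the safer measures for ranking models.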

Yu Mu

11/11/2020