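library(dplyr)    # count() and %>%
library(zoo)      # as.yearmon()
library(forecast) # ses(), holt(), hw(), ets(), accuracy(), autoplot() for ts objects
library(ggplot2)  # ggtitle(), xlab(), ylab(), guides()
crime <- read.csv("crimeinboston.csv")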
caraccident=crime[crime$OFFENSE_CODE_GROUP == "Motor Vehicle Accident Response",]
caraccident=caraccident[,c(3,8)]
caraccident$OCCURRED_ON_DATE=as.Date(caraccident$OCCURRED_ON_DATE)
#caraccident=caraccident %>% count(caraccident$OCCURRED_ON_DATE)
#data <- as.xts(caraccident[,2],order.by=as.Date(caraccident[,1]))
#caraccident <- apply.monthly(data,sum)
#names(caraccident)[1] <- "Car_Accidents"
#n<-dim(caraccident)[1]
#caraccident<-caraccident[2:(n-1),]
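# OCCURRED_ON_DATE is already a Date (converted above), so map it straight to year-month
caraccident$monyr <- as.yearmon(caraccident$OCCURRED_ON_DATE)
# Count incidents per month
caraccident <- caraccident %>% count(monyr)
names(caraccident)[2] <- "Car_Accidents"
# Build the series from the count column; column 1 holds the month index, not the accident counts
WCarAccident <- ts(caraccident$Car_Accidents, start = c(2015,07), frequency = 12)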
The beginning and end of the data were removed because they are significant outliers and would affect the forecast model.
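One way to drop the first and last observations of a monthly series while keeping its time index is `window()`. This is a minimal sketch on a made-up series; the name `toy` and its values are hypothetical and are not taken from the Boston data.

```r
# Hypothetical monthly counts; the first and last months are partial and look like outliers
toy <- ts(c(120, 950, 1010, 980, 1040, 990, 300), start = c(2015, 6), frequency = 12)

# Drop the first and last observation while keeping the ts attributes
n <- length(toy)
trimmed <- window(toy, start = time(toy)[2], end = time(toy)[n - 1])

start(trimmed)   # year and month of the first remaining observation
length(trimmed)  # two observations fewer than toy
```

Plain indexing such as `toy[2:(n-1)]` would return a bare numeric vector and lose the start date and frequency, which is why `window()` is used here.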
autoplot(WCarAccident)+
ggtitle("Motor Vehicle Accident Response in Boston") +
xlab("Dates") +
ylab("Accidents")
To better understand the data, I used decompose to look at the seasonal and trend components. Both decompositions show an increasing trend from 2016 to 2017 and a slower decreasing trend from 2017 to 2018. The seasonal component is surprising, showing fewer accidents in the winter.
dec1<-decompose(WCarAccident,type="additive") #decompose additive
dec2<-decompose(WCarAccident,type="multiplicative") #decompose multiplicative
autoplot(dec1)
autoplot(dec2)
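To make the difference between the two decomposition types concrete, the sketch below rebuilds a series from its components. It uses the built-in `AirPassengers` data purely as a stand-in, so the numbers say nothing about the accident series.

```r
# Built-in monthly series used only for illustration
x <- AirPassengers

add <- decompose(x, type = "additive")
mul <- decompose(x, type = "multiplicative")

# Additive: observed = trend + seasonal + random
add_rebuilt <- add$trend + add$seasonal + add$random
# Multiplicative: observed = trend * seasonal * random
mul_rebuilt <- mul$trend * mul$seasonal * mul$random

# The moving-average trend is NA at both ends of the series, so compare the rest
ok <- !is.na(add$trend)
all.equal(as.numeric(add_rebuilt[ok]), as.numeric(x[ok]))
all.equal(as.numeric(mul_rebuilt[ok]), as.numeric(x[ok]))

# Seasonal figures, one per month: additive ones sum to about 0,
# multiplicative ones average about 1
round(add$figure, 1)
round(mul$figure, 2)
```

Reading `figure` directly is a quick way to check which months sit below the trend, for example whether the winter months really have negative additive effects.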
sesfc=ses(WCarAccident, h=12)
#forecast
autoplot(sesfc)+
autolayer(fitted(sesfc), series="ses")+
ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
ylab("Accidents")
First, simple exponential smoothing produced a flat-line forecast.
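The flat line follows from how simple exponential smoothing works: it tracks only a level, and every future forecast equals the last level. The hand-rolled version below is a sketch under assumptions: `ses_by_hand`, the fixed `alpha`, the first-observation initialisation and the data are all hypothetical, whereas `ses()` estimates the smoothing parameter and the initial level from the data.

```r
# Simple exponential smoothing by hand:
#   level_t = alpha * y_t + (1 - alpha) * level_{t-1}
ses_by_hand <- function(y, alpha, h) {
  level <- y[1]                      # assumption: start the level at the first observation
  for (obs in y[-1]) {
    level <- alpha * obs + (1 - alpha) * level
  }
  rep(level, h)                      # every h-step-ahead forecast equals the last level
}

y <- c(900, 1000, 950, 1100, 1050)   # hypothetical monthly counts
fc <- ses_by_hand(y, alpha = 0.5, h = 3)
fc                                   # 1037.5 1037.5 1037.5
```

Because there is no trend or seasonal term, nothing in the model can bend the forecast away from that single value.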
holtfc = holt(WCarAccident, h=12)
autoplot(holtfc) +
autolayer(fitted(holtfc), series="Holt's method") +
ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
ylab("Accidents") +
guides(colour=guide_legend(title="Forecast"))
holtdampfc <- holt(WCarAccident, damped=TRUE, phi = 0.8, h=12)
autoplot(holtdampfc) +
autolayer(fitted(holtdampfc), series="Damped Holt's method") +
ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
ylab("Accidents") +
guides(colour=guide_legend(title="Forecast"))
Looking at Holt's method, the version without damping produced an upward-trending forecast. With damping, the forecast is very similar to the ses forecast.
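The reason damping pulls the forecast toward a flat line can be seen from the forecast equation. In this sketch the helper `holt_forecast` and the final level and trend values are hypothetical; only `phi = 0.8` is taken from the call above.

```r
# h-step-ahead forecast from a (damped) Holt model with final level l and trend b:
#   forecast(h) = l + (phi + phi^2 + ... + phi^h) * b
holt_forecast <- function(l, b, h, phi = 1) {
  sapply(seq_len(h), function(k) l + sum(phi^seq_len(k)) * b)
}

l <- 1000; b <- 10                                   # hypothetical final level and trend
undamped <- holt_forecast(l, b, h = 12)              # phi = 1: straight line, +10 per month
damped   <- holt_forecast(l, b, h = 12, phi = 0.8)   # each increment is 0.8 times the previous one

undamped[12]               # 1120
round(damped[12], 1)       # 1037.3
l + b * 0.8 / (1 - 0.8)    # 1040, the value the damped forecast approaches
```

With `phi = 0.8` the trend contributes at most four times `b` in total, so after a few months the damped forecast is nearly constant, which is why it resembles the ses result.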
seasonalholtfc<-hw(WCarAccident, seasonal="additive")
autoplot(seasonalholtfc)+
autolayer(fitted(seasonalholtfc), series="Seasonal Holt's method")+
ggtitle("Motor Vehicle Accident in Boston") + xlab("Date (Month)") +
ylab("Accidents")
The prediction interval from the Holt-Winters forecast is very wide. The number of accidents could reach as high as 1900 or as low as 400 based on the 80% interval. The point forecasts stay around 900-1200, which is within range.
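Interval widths can also be read off numerically rather than from the plot. The sketch below uses base R's `HoltWinters()` and `predict()` on the built-in `AirPassengers` series as a stand-in; this is not the same estimation routine as `hw()`, so the numbers would differ from the forecast above even on the same data.

```r
# Base R Holt-Winters fit on a built-in monthly series (illustration only)
fit <- HoltWinters(AirPassengers, seasonal = "additive")

# 12-month forecast with 80% prediction bounds; columns are fit, upr, lwr
pred <- predict(fit, n.ahead = 12, prediction.interval = TRUE, level = 0.80)

# Width of the interval at each forecast horizon
width <- pred[, "upr"] - pred[, "lwr"]
round(cbind(pred, width), 1)
```

For an object returned by `hw()`, the corresponding bounds are stored in its `lower` and `upper` components.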
etsfc <- ets(WCarAccident)
summary(etsfc)
## ETS(M,A,N)
##
## Call:
##  ets(y = WCarAccident)
##
##   Smoothing parameters:
##     alpha = 0.2001
##     beta  = 0.0202
##
##   Initial states:
##     l = 2015.3333
##     b = 0.0833
##
##   sigma:  0
##
##       AIC      AICc       BIC
## -2113.311 -2111.547 -2104.867
##
## Training set error measures:
##                        ME         RMSE          MAE          MPE         MAPE
## Training set 3.979039e-14 5.222175e-13 4.604317e-13 1.964289e-15 2.282554e-14
##                      MASE     ACF1
## Training set 4.604317e-13 0.842781
autoplot(etsfc)
As shown above, the ets model provides a much better ME and MAE.
etsfc %>% forecast(h=24) %>%
autoplot() +
ylab("Motor Vehicle Accident in Boston")
accuracy(sesfc)
##                      ME       RMSE        MAE        MPE       MAPE       MASE
## Training set 0.08126545 0.08229311 0.08126545 0.00402886 0.00402886 0.08126545
##                       ACF1
## Training set -0.0005411516
accuracy(holtfc)
##                         ME        RMSE          MAE           MPE        MAPE
## Training set -3.410605e-14 5.08423e-13 4.433787e-13 -1.695828e-15 2.19822e-14
##                      MASE     ACF1
## Training set 4.433787e-13 0.828114
accuracy(holtdampfc)
##                     ME       RMSE        MAE         MPE         MAPE
## Training set 0.0164461 0.01696504 0.01674054 0.000815348 0.0008299573
##                    MASE       ACF1
## Training set 0.01674054 -0.4400081
accuracy(seasonalholtfc)
##                        ME        RMSE          MAE         MPE         MAPE
## Training set 6.025402e-13 8.83545e-13 8.071765e-13 2.98587e-14 4.001106e-14
##                      MASE      ACF1
## Training set 8.071765e-13 0.8998066
accuracy(etsfc)
##                        ME         RMSE          MAE          MPE         MAPE
## Training set 3.979039e-14 5.222175e-13 4.604317e-13 1.964289e-15 2.282554e-14
##                      MASE     ACF1
## Training set 4.604317e-13 0.842781
As shown above, Holt-Winters is more accurate looking at RMSE and ME. This means seasonality is a strong component of accident counts. This makes sense, since nice weather leads to more driving and a higher chance of accidents. However, as discussed before, the interval produced by the Holt-Winters model is very wide. The other models produce a relatively high RMSE and MAE.
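For reference, the main error measures compared above can be computed by hand from actual and fitted values. This is a minimal sketch: `error_measures` and the numbers are hypothetical, and MASE and ACF1 are left out for brevity.

```r
# Error measures computed from actual values and fitted values
error_measures <- function(actual, fitted) {
  e <- actual - fitted
  c(ME   = mean(e),                          # mean error (bias)
    RMSE = sqrt(mean(e^2)),                  # root mean squared error
    MAE  = mean(abs(e)),                     # mean absolute error
    MPE  = mean(100 * e / actual),           # mean percentage error
    MAPE = mean(abs(100 * e / actual)))      # mean absolute percentage error
}

actual <- c(1000, 1100, 900, 1050)           # hypothetical monthly counts
fitted <- c( 950, 1120, 930, 1000)           # hypothetical fitted values
round(error_measures(actual, fitted), 2)
```

All of these are in the units of the series (or in percent), so their size should be judged against the scale of the data. Errors on the order of 1e-13 for counts in the hundreds would indicate an essentially perfect fit, so it may be worth confirming with `head(WCarAccident)` that the series holds accident counts before comparing models.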