## Data Acquisition
library(doParallel)
## Loading required package: foreach
## Loading required package: iterators
## Loading required package: parallel
library(neuralnet)
library(caret)
## Loading required package: ggplot2
## Loading required package: lattice
library(nnet)
#Reading the data:
accidents.df <- read.csv("E:/R/Accidentsnn.csv")
#Selecting the variables:
vars <- c("ALCHL_I", "PROFIL_I_R", "VEH_INVL")
str(accidents.df)
## 'data.frame': 999 obs. of 5 variables:
## $ ALCHL_I : int 2 2 1 2 2 2 2 2 2 2 ...
## $ PROFIL_I_R: int 0 1 0 0 1 0 0 1 1 0 ...
## $ SUR_COND : int 1 1 1 2 1 1 2 2 1 1 ...
## $ VEH_INVL : int 1 1 1 2 2 1 1 1 1 1 ...
## $ MAX_SEV_IR: int 0 2 0 1 1 0 2 1 1 0 ...
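A quick sanity check on the load (illustrative, not part of the original script) confirms the expected dimensions and flags any missing values before we partition:
dim(accidents.df)         #expect 999 rows and 5 columns
sum(is.na(accidents.df))  #count of missing values in the file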
#Partitioning the data 60/40 by row name:
set.seed(122)
training <- sample(row.names(accidents.df), dim(accidents.df)[1] * 0.6)
validation <- setdiff(row.names(accidents.df), training)
With the seed set, we use a 60/40 split of the data (999 × 0.6 ≈ 599), leaving 599 observations in the training set and 400 in the validation set.
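A quick way to confirm those partition sizes (illustrative check, not in the original script):
length(training)    #599 training row names
length(validation)  #400 validation row names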
#When y has multiple classes you need to dummify:
trainData <- cbind(accidents.df[training, vars],
                   class.ind(accidents.df[training, ]$SUR_COND),
                   class.ind(accidents.df[training, ]$MAX_SEV_IR))
names(trainData) <- c(vars,
                      paste("SUR_COND_", c(1, 2, 3, 4, 9), sep = ""),
                      paste("MAX_SEV_IR_", c(0, 1, 2), sep = ""))
validData <- cbind(accidents.df[validation, vars],
                   class.ind(accidents.df[validation, ]$SUR_COND),
                   class.ind(accidents.df[validation, ]$MAX_SEV_IR))
names(validData) <- c(vars,
                      paste("SUR_COND_", c(1, 2, 3, 4, 9), sep = ""),
                      paste("MAX_SEV_IR_", c(0, 1, 2), sep = ""))
dim(trainData)
## [1] 599 11
dim(validData)
## [1] 400 11
Above, we appended indicator (dummy) columns for surface condition (SUR_COND_1 through SUR_COND_9) and injury severity (MAX_SEV_IR_0 through MAX_SEV_IR_2) to both the training and validation sets, giving each 11 columns. The dummy variables are now in place, and the output classes will be wired up in the next step.
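To make the dummification concrete, here is a minimal, standalone illustration of nnet::class.ind(), which converts a categorical vector into a 0/1 indicator matrix with one column per level (the toy vector here is made up for the example):
#Illustration: class.ind() builds one indicator column per level
class.ind(c(0, 1, 2, 1))
##      0 1 2
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
## [4,] 0 1 0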
#Running nn with 2 hidden nodes:
#Use hidden= with a vector of integers specifying the number of hidden nodes in each layer
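#(Illustrative, not run here: hidden = c(4, 2) would instead fit two hidden
#layers with 4 and 2 nodes, respectively.)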
nn.acc <- neuralnet(MAX_SEV_IR_0 + MAX_SEV_IR_1 + MAX_SEV_IR_2 ~
                      ALCHL_I + PROFIL_I_R + VEH_INVL + SUR_COND_1 + SUR_COND_2 +
                      SUR_COND_3 + SUR_COND_4,
                    data = trainData, hidden = 2, rep = 100, linear.output = FALSE)
#Displaying weights:
#nn.acc$weights
#Displaying predictions:
#prediction(nn.acc)
#Plotting the network (rep = "best" plots the repetition with the lowest error):
plot(nn.acc, rep = "best")
#Scoring the training data (drop SUR_COND_9 and the three MAX_SEV_IR output columns):
training.prediction <- compute(nn.acc, trainData[, -c(8:11)])
#Predicted class = output node with the highest activation (0, 1, or 2):
training.class <- apply(training.prediction$net.result, 1, which.max) - 1
confusionMatrix(as.factor(training.class), as.factor(accidents.df[training, ]$MAX_SEV_IR))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1 2
## 0 326 0 18
## 1 0 162 41
## 2 11 7 34
##
## Overall Statistics
##
## Accuracy : 0.8715
## 95% CI : (0.842, 0.8972)
## No Information Rate : 0.5626
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.7736
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 1 Class: 2
## Sensitivity 0.9674 0.9586 0.36559
## Specificity 0.9313 0.9047 0.96443
## Pos Pred Value 0.9477 0.7980 0.65385
## Neg Pred Value 0.9569 0.9823 0.89214
## Prevalence 0.5626 0.2821 0.15526
## Detection Rate 0.5442 0.2705 0.05676
## Detection Prevalence 0.5743 0.3389 0.08681
## Balanced Accuracy 0.9493 0.9316 0.66501
#Scoring the validation data the same way:
validation.prediction <- compute(nn.acc, validData[, -c(8:11)])
validation.class <- apply(validation.prediction$net.result, 1, which.max) - 1
confusionMatrix(as.factor(validation.class), as.factor(accidents.df[validation, ]$MAX_SEV_IR))
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1 2
## 0 205 0 19
## 1 0 126 22
## 2 9 4 15
##
## Overall Statistics
##
## Accuracy : 0.865
## 95% CI : (0.8276, 0.8969)
## No Information Rate : 0.535
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.7633
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 1 Class: 2
## Sensitivity 0.9579 0.9692 0.2679
## Specificity 0.8978 0.9185 0.9622
## Pos Pred Value 0.9152 0.8514 0.5357
## Neg Pred Value 0.9489 0.9841 0.8898
## Prevalence 0.5350 0.3250 0.1400
## Detection Rate 0.5125 0.3150 0.0375
## Detection Prevalence 0.5600 0.3700 0.0700
## Balanced Accuracy 0.9279 0.9439 0.6150
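Before interpreting the results, a quick check (illustrative, not in the original script) confirms that the reported validation accuracy matches the confusion matrix: the diagonal holds the correct predictions, so accuracy is their sum over all 400 validation cases.
#Illustrative check: overall accuracy recomputed from the validation matrix
cm <- table(Prediction = validation.class,
            Reference = accidents.df[validation, ]$MAX_SEV_IR)
sum(diag(cm)) / sum(cm)
## [1] 0.865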
In this model, the seven predictors form the seven nodes of the input layer, and three neurons form the output layer, one for each class (0, 1, and 2), with a single hidden layer of two nodes and rep = 100 (100 training repetitions). The output nodes correspond to MAX_SEV_IR, the injury/fatality variable, with 0 representing no injuries, 1 representing injuries, and 2 representing fatalities.

Looking at the plot of the best repetition, we can trace the fitted weights from the input nodes through the hidden layer to the output nodes. Each hidden and output neuron takes a weighted sum of its incoming values and applies the logistic activation function, 1 / (1 + e^(-x)), where x is that weighted sum; the weights themselves are displayed numerically on the plot. (A small numeric sketch of this activation follows below.)

Turning to the confusion matrices, overall accuracy is around 87% on the training data and 86.5% on the validation data, with predictions shown in the rows and reference (actual) classes in the columns. The true positives on the diagonal and the by-class statistics show that Class 0 predictions are the most accurate, with a much higher detection rate and prevalence than Class 1 or 2, which matches what I would expect. The least accurate predictions are for the fatal class, which also checks out: its prevalence is quite low (lucky for automobile drivers), and its other statistics, such as balanced accuracy, are correspondingly lower. The model's accuracy would be noticeably higher if the Class 2 neuron were set aside, but overall, this model is an effective predictor of accident outcomes.
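As a minimal numeric sketch of that activation (the weights, bias, and inputs here are made up for illustration, not taken from the fitted network):
#Illustration: logistic activation of one neuron on a weighted sum of inputs
logistic <- function(x) 1 / (1 + exp(-x))
w <- c(0.5, -1.2, 0.8)  #hypothetical weights
b <- 0.1                #hypothetical bias
inputs <- c(1, 0, 2)    #hypothetical input values
logistic(b + sum(w * inputs))
## [1] 0.9002495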