DATA622: Homework 3

Perform an analysis of the dataset used in Homework #2 using the SVM algorithm. Compare the results with those from the previous homework.

For this week’s homework, I’m using the ‘Absenteeism’ data set from the UCI Machine Learning Repository. This data set records employee absences at a Brazilian firm over a three-year period.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.393   4.481   4.534   4.681   4.771   5.949

## 
## Call:
## svm.default(x = dummies_train[, -1], y = dummies_train$V1)
## 
## 
## Parameters:
##    SVM-Type:  eps-regression 
##  SVM-Kernel:  radial 
##        cost:  1 
##       gamma:  0.009433962 
##     epsilon:  0.1 
## 
## 
## Number of Support Vectors:  552
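
The summary above comes from calling `svm()` with predictors and response passed separately and every tuning parameter left at its default. As a minimal sketch of that call shape, assuming the `e1071` package and a small synthetic data frame (`toy_train` and its columns are hypothetical stand-ins, not the homework data), the defaults can be reproduced like this. When no `gamma` is supplied, `e1071` uses 1 / (number of predictor columns), so the reported gamma of 0.009433962 = 1/106 is consistent with 106 predictor columns after dummy encoding.

```r
# Minimal sketch on synthetic data; requires the e1071 package.
library(e1071)

set.seed(622)

# Hypothetical stand-in for a dummy-encoded training frame:
# column V1 is the numeric response, the other columns are predictors.
n <- 200
toy_train <- data.frame(
  V1 = rnorm(n, mean = 7, sd = 3),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rbinom(n, 1, 0.5),
  x4 = rbinom(n, 1, 0.3)
)

# Same call shape as in the report: predictors as x, response as y.
# A numeric y makes svm() default to eps-regression with a radial kernel.
fit <- svm(x = toy_train[, -1], y = toy_train$V1)

# Default gamma is 1 / ncol(x); cost and epsilon default to 1 and 0.1.
default_gamma <- 1 / ncol(toy_train[, -1])
print(c(gamma = fit$gamma, cost = fit$cost, epsilon = fit$epsilon))

# Fitted values on the training predictors, one per row.
train_pred <- predict(fit, toy_train[, -1])
summary(train_pred)
```

Because `cost`, `gamma`, and `epsilon` are all defaults here, a grid search (for example with `e1071::tune()`) would be a natural next step before drawing firm conclusions about the model.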

## [1] 119.0335
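
The output does not say which error metric the value above represents. Assuming it is a squared-error measure computed from actual and predicted values, a minimal base R sketch of mean squared error (MSE) and its square root (RMSE) looks like the following. The helper names `mse` and `rmse` and the toy vectors are hypothetical and only illustrate the arithmetic.

```r
# Mean squared error: average of squared residuals.
mse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean((actual - predicted)^2)
}

# Root mean squared error: same units as the response.
rmse <- function(actual, predicted) {
  sqrt(mse(actual, predicted))
}

# Toy values, not taken from the homework data.
actual    <- c(2, 4, 8, 8, 16)
predicted <- c(3, 4, 6, 9, 12)

toy_mse  <- mse(actual, predicted)   # (1 + 0 + 4 + 1 + 16) / 5 = 4.4
toy_rmse <- rmse(actual, predicted)  # sqrt(4.4)
print(c(mse = toy_mse, rmse = toy_rmse))
```

Models can only be compared fairly when the same metric is computed on the same rows, so it helps to state the metric and the data split alongside each reported number.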

On these training data, the SVM performs better than the random forest model from Homework 2 but slightly worse than the more complicated decision tree model.


by Thomas Hill