In this paper we look at data collected using a Fitbit-type device. In the study, subjects were asked to perform barbell lifts correctly and in four incorrect ways. These five techniques were recorded in the variable classe and labeled "A", "B", "C", "D", and "E". We are going to build machine learning models that use the data collected from the device to predict which way the subject performed the exercise. We will build three different models, then choose the best one to predict classe on a separate set of data.
Here we set the seed for the project to make the analysis reproducible. We then load the libraries that we will use to manipulate, graph, and model the data. The training data is stored in the variable data, and the data that we will use at the end of the paper to test our final model is stored in the variable test_test.
set.seed(628436)
library(dplyr)
library(ggplot2)
library(gridExtra)
library(caret)
data <- read.csv("traindata.csv", stringsAsFactors = FALSE)
test_test <- read.csv("testdata.csv", stringsAsFactors = FALSE)
Here we begin exploratory analysis on the data. First we split the data into two parts, train and test. The train set contains 80% of the observations and the test set contains the other 20%. The models will be built on the train set; we will then check their accuracy by predicting the classe variable in the held-out test set. We check the dimensions of both the train and test data frames.
# Stratified 80/20 split on the outcome variable classe
inTrain <- createDataPartition(y = data$classe, p = 0.8, list = FALSE)
train <- data[inTrain, ]
test <- data[-inTrain, ]
dim(train)
## [1] 15699 160
dim(test)
## [1] 3923 160
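Because createDataPartition samples within each level of the outcome, the split should preserve the class balance. A quick optional check, not part of the original analysis:
# Confirm the stratified split kept the classe proportions similar
round(prop.table(table(train$classe)), 3)
round(prop.table(table(test$classe)), 3)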
We see that this is a large dataset with 160 variables, so we remove variables that have near zero variance. These are variables that are nearly constant and contribute little to a model. Note that we compute the near-zero-variance metrics on the quiz set (test_test), where the many summary-statistic columns are essentially empty, and then drop the flagged columns from both train and test.
# Flag near-zero-variance columns on the quiz set, then drop the
# same columns from both splits
nzv <- nearZeroVar(test_test, saveMetrics = TRUE)
train <- train[, nzv$nzv == FALSE]
test <- test[, nzv$nzv == FALSE]
glimpse(train)
## Observations: 15,699
## Variables: 59
## $ X (int) 1, 2, 3, 5, 6, 7, 10, 12, 13, 14, 16, 17,...
## $ user_name (chr) "carlitos", "carlitos", "carlitos", "carl...
## $ raw_timestamp_part_1 (int) 1323084231, 1323084231, 1323084231, 13230...
## $ raw_timestamp_part_2 (int) 788290, 808298, 820366, 196328, 304277, 3...
## $ cvtd_timestamp (chr) "05/12/2011 11:23", "05/12/2011 11:23", "...
## $ num_window (int) 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 1...
## $ roll_belt (dbl) 1.41, 1.41, 1.42, 1.48, 1.45, 1.42, 1.45,...
## $ pitch_belt (dbl) 8.07, 8.07, 8.07, 8.07, 8.06, 8.09, 8.17,...
## $ yaw_belt (dbl) -94.4, -94.4, -94.4, -94.4, -94.4, -94.4,...
## $ total_accel_belt (int) 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,...
## $ gyros_belt_x (dbl) 0.00, 0.02, 0.00, 0.02, 0.02, 0.02, 0.03,...
## $ gyros_belt_y (dbl) 0.00, 0.00, 0.00, 0.02, 0.00, 0.00, 0.00,...
## $ gyros_belt_z (dbl) -0.02, -0.02, -0.02, -0.02, -0.02, -0.02,...
## $ accel_belt_x (int) -21, -22, -20, -21, -21, -22, -21, -22, -...
## $ accel_belt_y (int) 4, 4, 5, 2, 4, 3, 4, 2, 4, 4, 4, 4, 5, 5,...
## $ accel_belt_z (int) 22, 22, 23, 24, 21, 21, 22, 23, 21, 21, 2...
## $ magnet_belt_x (int) -3, -7, -2, -6, 0, -4, -3, -2, -3, -8, 0,...
## $ magnet_belt_y (int) 599, 608, 600, 600, 603, 599, 609, 602, 6...
## $ magnet_belt_z (int) -313, -311, -305, -302, -312, -311, -308,...
## $ roll_arm (dbl) -128, -128, -128, -128, -128, -128, -128,...
## $ pitch_arm (dbl) 22.5, 22.5, 22.5, 22.1, 22.0, 21.9, 21.6,...
## $ yaw_arm (dbl) -161, -161, -161, -161, -161, -161, -161,...
## $ total_accel_arm (int) 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 3...
## $ gyros_arm_x (dbl) 0.00, 0.02, 0.02, 0.00, 0.02, 0.00, 0.02,...
## $ gyros_arm_y (dbl) 0.00, -0.02, -0.02, -0.03, -0.03, -0.03, ...
## $ gyros_arm_z (dbl) -0.02, -0.02, -0.02, 0.00, 0.00, 0.00, -0...
## $ accel_arm_x (int) -288, -290, -289, -289, -289, -289, -288,...
## $ accel_arm_y (int) 109, 110, 110, 111, 111, 111, 110, 111, 1...
## $ accel_arm_z (int) -123, -125, -126, -123, -122, -125, -124,...
## $ magnet_arm_x (int) -368, -369, -368, -374, -369, -373, -376,...
## $ magnet_arm_y (int) 337, 337, 344, 337, 342, 336, 334, 343, 3...
## $ magnet_arm_z (int) 516, 513, 513, 506, 513, 509, 516, 520, 5...
## $ roll_dumbbell (dbl) 13.05217, 13.13074, 12.85075, 13.37872, 1...
## $ pitch_dumbbell (dbl) -70.49400, -70.63751, -70.27812, -70.4285...
## $ yaw_dumbbell (dbl) -84.87394, -84.71065, -85.14078, -84.8530...
## $ total_accel_dumbbell (int) 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 3...
## $ gyros_dumbbell_x (dbl) 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,...
## $ gyros_dumbbell_y (dbl) -0.02, -0.02, -0.02, -0.02, -0.02, -0.02,...
## $ gyros_dumbbell_z (dbl) 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,...
## $ accel_dumbbell_x (int) -234, -233, -232, -233, -234, -232, -235,...
## $ accel_dumbbell_y (int) 47, 47, 46, 48, 48, 47, 48, 47, 48, 48, 4...
## $ accel_dumbbell_z (int) -271, -269, -270, -270, -269, -270, -270,...
## $ magnet_dumbbell_x (int) -559, -555, -561, -554, -558, -551, -558,...
## $ magnet_dumbbell_y (int) 293, 296, 298, 292, 294, 295, 291, 291, 3...
## $ magnet_dumbbell_z (dbl) -65, -64, -63, -68, -66, -70, -69, -65, -...
## $ roll_forearm (dbl) 28.4, 28.3, 28.3, 28.0, 27.9, 27.9, 27.7,...
## $ pitch_forearm (dbl) -63.9, -63.9, -63.9, -63.9, -63.9, -63.9,...
## $ yaw_forearm (dbl) -153, -153, -152, -152, -152, -152, -152,...
## $ total_accel_forearm (int) 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 3...
## $ gyros_forearm_x (dbl) 0.03, 0.02, 0.03, 0.02, 0.02, 0.02, 0.02,...
## $ gyros_forearm_y (dbl) 0.00, 0.00, -0.02, 0.00, -0.02, 0.00, 0.0...
## $ gyros_forearm_z (dbl) -0.02, -0.02, 0.00, -0.02, -0.03, -0.02, ...
## $ accel_forearm_x (int) 192, 192, 196, 189, 193, 195, 190, 191, 1...
## $ accel_forearm_y (int) 203, 203, 204, 206, 203, 205, 205, 203, 2...
## $ accel_forearm_z (int) -215, -216, -213, -214, -215, -215, -215,...
## $ magnet_forearm_x (int) -17, -18, -18, -17, -9, -18, -22, -11, -1...
## $ magnet_forearm_y (dbl) 654, 661, 658, 655, 660, 659, 656, 657, 6...
## $ magnet_forearm_z (dbl) 476, 473, 469, 473, 478, 470, 473, 478, 4...
## $ classe (chr) "A", "A", "A", "A", "A", "A", "A", "A", "...
After taking a look at the variables we decide to remove the first five columns of the data, which contain ID-type variables. These are X, which simply numbers the observations; user_name, which records the name of the person performing the exercise; and raw_timestamp_part_1, raw_timestamp_part_2, and cvtd_timestamp, which record the time and date the activities were performed. We apply these transformations to both the train and test data frames, because whatever is done to the training set must also be done to the test set.
# Drop the five ID-type columns (X, user_name, and the three timestamps)
train <- select(train, -(1:5))
test <- select(test, -(1:5))
We then change the classe variable from a character variable to a factor variable.
train$classe <- as.factor(train$classe)
test$classe <- as.factor(test$classe)
We check the dimensions of the data frames again.
dim(train)
## [1] 15699 54
dim(test)
## [1] 3923 54
We have now prepared the data so that we can use it to build our models.
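As an aside, these cleaning steps can be wrapped in a single helper so that exactly the same transformation is applied to any future data. A minimal sketch, assuming the same raw column layout (clean_data is an illustrative name, not part of the original analysis):
# Apply the same column drops to any data frame with the raw layout
clean_data <- function(df, nzv_flags) {
  df <- df[, nzv_flags == FALSE]   # drop near-zero-variance columns
  dplyr::select(df, -(1:5))        # drop the five ID-type columns
}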
We build the first model, a classification tree (CART), using the following code. We preprocess with knn imputation to fill in any missing values; in caret this also centers and scales the predictors. We also use ten-fold cross-validation: during modeling the training set is split into ten folds, each fold serves once as a validation set while the model is fit on the remaining nine, and the ten accuracy estimates are averaged. This gives an estimate of the out-of-sample error before we ever touch our test set.
model_rpart <- train(classe ~ ., train, method = "rpart",
                     preProcess = "knnImpute",
                     trControl = trainControl(method = "cv", number = 10))
model_rpart
## CART
##
## 15699 samples
## 53 predictor
## 5 classes: 'A', 'B', 'C', 'D', 'E'
##
## Pre-processing: num_window nearest neighbor imputation (53),
## centered (53), scaled (53)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 14129, 14129, 14128, 14128, 14129, 14129, ...
## Resampling results across tuning parameters:
##
## cp Accuracy Kappa Accuracy SD Kappa SD
## 0.03929684 0.5494083 0.42258991 0.02795134 0.04210466
## 0.05536271 0.4808658 0.31245587 0.06294576 0.10190795
## 0.11526480 0.3313507 0.07157808 0.04035547 0.06164077
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.03929684.
Looking at this model we see that it used ten-fold cross-validation as its resampling tool. The cross-validated accuracy of the selected model (cp = 0.0393) is about 54.9%.
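The per-fold results behind that estimate are stored on the fitted train object; an optional way to inspect the spread of the cross-validated accuracy:
# Per-fold accuracy and kappa from the ten cross-validation folds
model_rpart$resample
summary(model_rpart$resample$Accuracy)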
We then use the predict function to test the model on the test split of the dataset, and use a confusion matrix to compare our predictions with the actual values of classe in the test set.
predictions_rpart <- predict(model_rpart, test)
confusionMatrix(predictions_rpart, test$classe)
## Confusion Matrix and Statistics
##
## Reference
## Prediction A B C D E
## A 1027 320 323 272 73
## B 9 255 20 123 60
## C 77 184 341 221 156
## D 0 0 0 0 0
## E 3 0 0 27 432
##
## Overall Statistics
##
## Accuracy : 0.5238
## 95% CI : (0.5081, 0.5396)
## No Information Rate : 0.2845
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.3781
## Mcnemar's Test P-Value : < 2.2e-16
##
## Statistics by Class:
##
## Class: A Class: B Class: C Class: D Class: E
## Sensitivity 0.9203 0.3360 0.49854 0.0000 0.5992
## Specificity 0.6480 0.9330 0.80303 1.0000 0.9906
## Pos Pred Value 0.5097 0.5460 0.34831 NaN 0.9351
## Neg Pred Value 0.9534 0.8542 0.88349 0.8361 0.9165
## Prevalence 0.2845 0.1935 0.17436 0.1639 0.1838
## Detection Rate 0.2618 0.0650 0.08692 0.0000 0.1101
## Detection Prevalence 0.5136 0.1190 0.24955 0.0000 0.1178
## Balanced Accuracy 0.7841 0.6345 0.65078 0.5000 0.7949
From the confusion matrix we see that the predictions were 52.38% accurate, in line with the cross-validated estimate. According to the 95% confidence interval, we can say with about 95% certainty that this model will predict classe with between 50.81% and 53.96% accuracy. Notably, the model never predicts class D.
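Optionally, the final tree can be plotted to see which sensor variables drive the splits. A minimal base-graphics sketch (rpart objects support plot() and text()):
# Visualize the splits chosen by the final classification tree
plot(model_rpart$finalModel, uniform = TRUE, margin = 0.1)
text(model_rpart$finalModel, cex = 0.7)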
Next we build a random forest model with the following code. Again we use ten-fold cross-validation and knn imputation. Passing verbose = FALSE keeps ranger's progress messages out of the report.
model_rf <- train(classe ~ ., train, method = "ranger", verbose = FALSE,
                  preProcess = "knnImpute",
                  trControl = trainControl(method = "cv", number = 10))
model_rf
## Random Forest
##
## 15699 samples
## 53 predictor
## 5 classes: 'A', 'B', 'C', 'D', 'E'
##
## Pre-processing: num_window nearest neighbor imputation (53),
## centered (53), scaled (53)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 14130, 14128, 14129, 14129, 14128, 14127, ...
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa Accuracy SD Kappa SD
## 2 0.9957316 0.9946005 0.001902188 0.002406826
## 27 0.9978336 0.9972597 0.001651312 0.002088905
## 53 0.9958593 0.9947620 0.001783387 0.002256155
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 27.
This model also used ten-fold cross-validation as its resampling tool. The cross-validated accuracy of the selected model (mtry = 27) is about 99.78%.
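As an optional check, caret's plot method for train objects shows the cross-validated accuracy profile across the tuning values that were tried:
# Cross-validated accuracy as a function of mtry
plot(model_rf)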
We then test the model with the following code and compare our predicted outcomes with the actual outcomes using a confusion matrix.
predictions_rf <- predict(model_rf, test)
confusionMatrix(predictions_rf, test$classe)
## Confusion Matrix and Statistics
##
## Reference
## Prediction A B C D E
## A 1115 1 0 0 0
## B 0 758 1 0 0
## C 0 0 683 1 0
## D 0 0 0 641 0
## E 1 0 0 1 721
##
## Overall Statistics
##
## Accuracy : 0.9987
## 95% CI : (0.997, 0.9996)
## No Information Rate : 0.2845
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9984
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: A Class: B Class: C Class: D Class: E
## Sensitivity 0.9991 0.9987 0.9985 0.9969 1.0000
## Specificity 0.9996 0.9997 0.9997 1.0000 0.9994
## Pos Pred Value 0.9991 0.9987 0.9985 1.0000 0.9972
## Neg Pred Value 0.9996 0.9997 0.9997 0.9994 1.0000
## Prevalence 0.2845 0.1935 0.1744 0.1639 0.1838
## Detection Rate 0.2842 0.1932 0.1741 0.1634 0.1838
## Detection Prevalence 0.2845 0.1935 0.1744 0.1634 0.1843
## Balanced Accuracy 0.9994 0.9992 0.9991 0.9984 0.9997
The confusion matrix shows that this model predicted with 99.87% accuracy: only five of the 3,923 test observations were misclassified. According to the 95% confidence interval, this model will predict with between 99.70% and 99.96% accuracy about 95% of the time.
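The estimated out-of-sample error is simply one minus this accuracy, and it can be pulled directly from the confusion matrix object:
# Estimated out-of-sample error rate on the held-out test split
1 - confusionMatrix(predictions_rf, test$classe)$overall["Accuracy"]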
We then build a gradient boosted machine (GBM) model with the following code. Again we use ten-fold cross-validation and knn imputation. Passing verbose = FALSE suppresses gbm's long iteration log.
model_gbm <- train(classe ~ ., train, method = "gbm", verbose = FALSE,
                   preProcess = "knnImpute",
                   trControl = trainControl(method = "cv", number = 10))
model_gbm
## Stochastic Gradient Boosting
##
## 15699 samples
## 53 predictor
## 5 classes: 'A', 'B', 'C', 'D', 'E'
##
## Pre-processing: num_window nearest neighbor imputation (53),
## centered (53), scaled (53)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 14128, 14131, 14128, 14130, 14130, 14130, ...
## Resampling results across tuning parameters:
##
##   interaction.depth  n.trees  Accuracy   Kappa      Accuracy SD  Kappa SD
##   1                   50      0.7572414  0.6921248  0.010781740  0.013570491
##   1                  100      0.8310718  0.7861098  0.008251060  0.010341014
##   1                  150      0.8710122  0.8367512  0.007516531  0.009492510
##   2                   50      0.8888440  0.8592505  0.007208724  0.009070638
##   2                  100      0.9420982  0.9267382  0.006847320  0.008664593
##   2                  150      0.9650940  0.9558321  0.004713744  0.005963822
##   3                   50      0.9343269  0.9168797  0.005743863  0.007272041
##   3                  100      0.9721639  0.9647846  0.002881328  0.003639594
##   3                  150      0.9875147  0.9842065  0.001883892  0.002381420
##
## Tuning parameter 'shrinkage' was held constant at a value of 0.1
##
## Tuning parameter 'n.minobsinnode' was held constant at a value of 10
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were n.trees = 150,
## interaction.depth = 3, shrinkage = 0.1 and n.minobsinnode = 10.
This model also used ten-fold cross-validation to resample. The cross-validated accuracy of the selected model (150 trees, interaction depth 3) is about 98.75%.
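Boosted models are hard to interpret directly, but caret exposes gbm's relative-influence variable importance, which at least shows which sensors matter most. An optional sketch:
# Relative-influence variable importance for the boosted model
varImp(model_gbm)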
We test the model with the following code and compare our predicted outcomes with the actual outcomes using a confusion matrix.
predictions_gbm <- predict(model_gbm, test)
confusionMatrix(predictions_gbm, test$classe)
## Confusion Matrix and Statistics
##
## Reference
## Prediction A B C D E
## A 1114 5 0 0 0
## B 2 742 1 3 2
## C 0 12 681 5 2
## D 0 0 1 634 6
## E 0 0 1 1 711
##
## Overall Statistics
##
## Accuracy : 0.9895
## 95% CI : (0.9858, 0.9925)
## No Information Rate : 0.2845
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9868
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: A Class: B Class: C Class: D Class: E
## Sensitivity 0.9982 0.9776 0.9956 0.9860 0.9861
## Specificity 0.9982 0.9975 0.9941 0.9979 0.9994
## Pos Pred Value 0.9955 0.9893 0.9729 0.9891 0.9972
## Neg Pred Value 0.9993 0.9946 0.9991 0.9973 0.9969
## Prevalence 0.2845 0.1935 0.1744 0.1639 0.1838
## Detection Rate 0.2840 0.1891 0.1736 0.1616 0.1812
## Detection Prevalence 0.2852 0.1912 0.1784 0.1634 0.1817
## Balanced Accuracy 0.9982 0.9875 0.9949 0.9919 0.9928
This confusion matrix shows that the GBM model predicted with 98.95% accuracy. According to the 95% confidence interval, this model will predict with between 98.58% and 99.25% accuracy about 95% of the time.
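Before settling on a final model, the three cross-validated fits can be compared on a common footing with caret's resamples(). A minimal sketch:
# Compare per-fold accuracy and kappa across the three models
results <- resamples(list(CART = model_rpart, RF = model_rf, GBM = model_gbm))
summary(results)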
The GBM and the random forest models are both about 99% accurate. We use both to predict the answers to the Coursera quiz, and they yield the same answers; when submitted, those answers scored 100%. Both models are accurate predictors of the classe variable, although, being built from sophisticated ensemble algorithms, they are difficult to interpret.
predictions_test_rf <- predict(model_rf, test_test)
predictions_test_gbm <- predict(model_gbm, test_test)
data.frame("Random Forrest Predictions" = predictions_test_rf,
"GBM Predictions" = predictions_test_gbm)
## Random.Forest.Predictions GBM.Predictions
## 1 B B
## 2 A A
## 3 B B
## 4 A A
## 5 A A
## 6 E E
## 7 D D
## 8 B B
## 9 A A
## 10 A A
## 11 B B
## 12 C C
## 13 B B
## 14 A A
## 15 E E
## 16 E E
## 17 A A
## 18 B B
## 19 B B
## 20 B B
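If the quiz predictions need to be saved as individual files, a minimal sketch along these lines would do it (the file-name pattern is our own assumption, not a requirement of the quiz):
# Write each of the 20 quiz predictions to its own text file
for (i in seq_along(predictions_test_rf)) {
  writeLines(as.character(predictions_test_rf[i]),
             sprintf("problem_id_%02d.txt", i))
}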