Week 4 Course Project: Practical Machine Learning

Assignment Scope

This is the course project of the Practical Machine Learning course. The report describes how the goals of the project are accomplished:

1. The data is cleaned to avoid using variables that contain NA values.
2. The 19622 training observations are split 75/25: one part to build the model and the other to validate the results and measure accuracy.
3. A first model is created using a classification tree, but its accuracy is not high enough.
4. A final model is created using random forest, which reaches about 99% accuracy on the validation set. To improve performance, the model is trained with 5-fold cross-validation and parallel processing.
5. As the accuracy of the final model is about 99%, it is used to predict the 20 test cases with high confidence.

Using devices such as Jawbone Up, Nike FuelBand, and Fitbit it is now possible to collect a large amount of data about personal activity relatively inexpensively. These types of devices are part of the quantified self movement, a group of enthusiasts who take measurements about themselves regularly to improve their health, to find patterns in their behavior, or because they are tech geeks.

One thing that people regularly do is quantify how much of a particular activity they do, but they rarely quantify how well they do it. In this project, the goal is to use data from accelerometers on the belt, forearm, arm, and dumbbell of 6 participants. They were asked to perform barbell lifts correctly and incorrectly in 5 different ways. More information is available from the website here: http://groupware.les.inf.puc-rio.br/har (see the section on the Weight Lifting Exercise Dataset).

Load libraries, read in data

First, the required libraries are loaded and the input data is read.

library(rpart)
library(rattle)

## Rattle: A free graphical interface for data science with R.
## Version 5.2.0 Copyright (c) 2006-2018 Togaware Pty Ltd.
## Type 'rattle()' to shake, rattle, and roll your data.

library(parallel)
library(doParallel)

## Loading required package: foreach

## Loading required package: iterators

library(knitr)
library(caret)

## Loading required package: lattice

## Loading required package: ggplot2

library(rpart.plot)

#library(randomForest)
#library(corrplot)
set.seed(12345)

pml_training = read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv",  na.strings = c("NA", "#DIV/0!", ""), header = TRUE)
pml_testing = read.csv("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv",na.strings = c("NA", "#DIV/0!", ""), header = TRUE)

dim(pml_training)

## [1] 19622   160

dim(pml_testing)

## [1]  20 160

summary(pml_testing)

##        X            user_name raw_timestamp_part_1 raw_timestamp_part_2
##  Min.   : 1.00   adelmo  :1   Min.   :1.322e+09    Min.   : 36553      
##  1st Qu.: 5.75   carlitos:3   1st Qu.:1.323e+09    1st Qu.:268655      
##  Median :10.50   charles :1   Median :1.323e+09    Median :530706      
##  Mean   :10.50   eurico  :4   Mean   :1.323e+09    Mean   :512167      
##  3rd Qu.:15.25   jeremy  :8   3rd Qu.:1.323e+09    3rd Qu.:787738      
##  Max.   :20.00   pedro   :3   Max.   :1.323e+09    Max.   :920315      
##                                                                        
##           cvtd_timestamp new_window   num_window      roll_belt       
##  30/11/2011 17:11:4      no:20      Min.   : 48.0   Min.   : -5.9200  
##  05/12/2011 11:24:3                 1st Qu.:250.0   1st Qu.:  0.9075  
##  30/11/2011 17:12:3                 Median :384.5   Median :  1.1100  
##  05/12/2011 14:23:2                 Mean   :379.6   Mean   : 31.3055  
##  28/11/2011 14:14:2                 3rd Qu.:467.0   3rd Qu.: 32.5050  
##  02/12/2011 13:33:1                 Max.   :859.0   Max.   :129.0000  
##  (Other)         :5                                                   
##    pitch_belt         yaw_belt      total_accel_belt kurtosis_roll_belt
##  Min.   :-41.600   Min.   :-93.70   Min.   : 2.00    Mode:logical      
##  1st Qu.:  3.013   1st Qu.:-88.62   1st Qu.: 3.00    NA's:20           
##  Median :  4.655   Median :-87.85   Median : 4.00                      
##  Mean   :  5.824   Mean   :-59.30   Mean   : 7.55                      
##  3rd Qu.:  6.135   3rd Qu.:-63.50   3rd Qu.: 8.00                      
##  Max.   : 27.800   Max.   :162.00   Max.   :21.00                      
##                                                                        
##  kurtosis_picth_belt kurtosis_yaw_belt skewness_roll_belt
##  Mode:logical        Mode:logical      Mode:logical      
##  NA's:20             NA's:20           NA's:20           
##                                                          
##                                                          
##                                                          
##                                                          
##                                                          
##  skewness_roll_belt.1 skewness_yaw_belt max_roll_belt  max_picth_belt
##  Mode:logical         Mode:logical      Mode:logical   Mode:logical  
##  NA's:20              NA's:20           NA's:20        NA's:20       
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##  max_yaw_belt   min_roll_belt  min_pitch_belt min_yaw_belt  
##  Mode:logical   Mode:logical   Mode:logical   Mode:logical  
##  NA's:20        NA's:20        NA's:20        NA's:20       
##                                                             
##                                                             
##                                                             
##                                                             
##                                                             
##  amplitude_roll_belt amplitude_pitch_belt amplitude_yaw_belt
##  Mode:logical        Mode:logical         Mode:logical      
##  NA's:20             NA's:20              NA's:20           
##                                                             
##                                                             
##                                                             
##                                                             
##                                                             
##  var_total_accel_belt avg_roll_belt  stddev_roll_belt var_roll_belt 
##  Mode:logical         Mode:logical   Mode:logical     Mode:logical  
##  NA's:20              NA's:20        NA's:20          NA's:20       
##                                                                     
##                                                                     
##                                                                     
##                                                                     
##                                                                     
##  avg_pitch_belt stddev_pitch_belt var_pitch_belt avg_yaw_belt  
##  Mode:logical   Mode:logical      Mode:logical   Mode:logical  
##  NA's:20        NA's:20           NA's:20        NA's:20       
##                                                                
##                                                                
##                                                                
##                                                                
##                                                                
##  stddev_yaw_belt var_yaw_belt    gyros_belt_x     gyros_belt_y   
##  Mode:logical    Mode:logical   Min.   :-0.500   Min.   :-0.050  
##  NA's:20         NA's:20        1st Qu.:-0.070   1st Qu.:-0.005  
##                                 Median : 0.020   Median : 0.000  
##                                 Mean   :-0.045   Mean   : 0.010  
##                                 3rd Qu.: 0.070   3rd Qu.: 0.020  
##                                 Max.   : 0.240   Max.   : 0.110  
##                                                                  
##   gyros_belt_z      accel_belt_x     accel_belt_y     accel_belt_z    
##  Min.   :-0.4800   Min.   :-48.00   Min.   :-16.00   Min.   :-187.00  
##  1st Qu.:-0.1375   1st Qu.:-19.00   1st Qu.:  2.00   1st Qu.: -24.00  
##  Median :-0.0250   Median :-13.00   Median :  4.50   Median :  27.00  
##  Mean   :-0.1005   Mean   :-13.50   Mean   : 18.35   Mean   : -17.60  
##  3rd Qu.: 0.0000   3rd Qu.: -8.75   3rd Qu.: 25.50   3rd Qu.:  38.25  
##  Max.   : 0.0500   Max.   : 46.00   Max.   : 72.00   Max.   :  49.00  
##                                                                       
##  magnet_belt_x    magnet_belt_y   magnet_belt_z       roll_arm      
##  Min.   :-13.00   Min.   :566.0   Min.   :-426.0   Min.   :-137.00  
##  1st Qu.:  5.50   1st Qu.:578.5   1st Qu.:-398.5   1st Qu.:   0.00  
##  Median : 33.50   Median :600.5   Median :-313.5   Median :   0.00  
##  Mean   : 35.15   Mean   :601.5   Mean   :-346.9   Mean   :  16.42  
##  3rd Qu.: 46.25   3rd Qu.:631.2   3rd Qu.:-305.0   3rd Qu.:  71.53  
##  Max.   :169.00   Max.   :638.0   Max.   :-291.0   Max.   : 152.00  
##                                                                     
##    pitch_arm          yaw_arm        total_accel_arm var_accel_arm 
##  Min.   :-63.800   Min.   :-167.00   Min.   : 3.00   Mode:logical  
##  1st Qu.: -9.188   1st Qu.: -60.15   1st Qu.:20.25   NA's:20       
##  Median :  0.000   Median :   0.00   Median :29.50                 
##  Mean   : -3.950   Mean   :  -2.80   Mean   :26.40                 
##  3rd Qu.:  3.465   3rd Qu.:  25.50   3rd Qu.:33.25                 
##  Max.   : 55.000   Max.   : 178.00   Max.   :44.00                 
##                                                                    
##  avg_roll_arm   stddev_roll_arm var_roll_arm   avg_pitch_arm 
##  Mode:logical   Mode:logical    Mode:logical   Mode:logical  
##  NA's:20        NA's:20         NA's:20        NA's:20       
##                                                              
##                                                              
##                                                              
##                                                              
##                                                              
##  stddev_pitch_arm var_pitch_arm  avg_yaw_arm    stddev_yaw_arm
##  Mode:logical     Mode:logical   Mode:logical   Mode:logical  
##  NA's:20          NA's:20        NA's:20        NA's:20       
##                                                               
##                                                               
##                                                               
##                                                               
##                                                               
##  var_yaw_arm     gyros_arm_x      gyros_arm_y       gyros_arm_z     
##  Mode:logical   Min.   :-3.710   Min.   :-2.0900   Min.   :-0.6900  
##  NA's:20        1st Qu.:-0.645   1st Qu.:-0.6350   1st Qu.:-0.1800  
##                 Median : 0.020   Median :-0.0400   Median :-0.0250  
##                 Mean   : 0.077   Mean   :-0.1595   Mean   : 0.1205  
##                 3rd Qu.: 1.248   3rd Qu.: 0.2175   3rd Qu.: 0.5650  
##                 Max.   : 3.660   Max.   : 1.8500   Max.   : 1.1300  
##                                                                     
##   accel_arm_x      accel_arm_y      accel_arm_z       magnet_arm_x    
##  Min.   :-341.0   Min.   :-65.00   Min.   :-404.00   Min.   :-428.00  
##  1st Qu.:-277.0   1st Qu.: 52.25   1st Qu.:-128.50   1st Qu.:-373.75  
##  Median :-194.5   Median :112.00   Median : -83.50   Median :-265.00  
##  Mean   :-134.6   Mean   :103.10   Mean   : -87.85   Mean   : -38.95  
##  3rd Qu.:   5.5   3rd Qu.:168.25   3rd Qu.: -27.25   3rd Qu.: 250.50  
##  Max.   : 106.0   Max.   :245.00   Max.   :  93.00   Max.   : 750.00  
##                                                                       
##   magnet_arm_y     magnet_arm_z    kurtosis_roll_arm kurtosis_picth_arm
##  Min.   :-307.0   Min.   :-499.0   Mode:logical      Mode:logical      
##  1st Qu.: 205.2   1st Qu.: 403.0   NA's:20           NA's:20           
##  Median : 291.0   Median : 476.5                                       
##  Mean   : 239.4   Mean   : 369.8                                       
##  3rd Qu.: 358.8   3rd Qu.: 517.0                                       
##  Max.   : 474.0   Max.   : 633.0                                       
##                                                                        
##  kurtosis_yaw_arm skewness_roll_arm skewness_pitch_arm skewness_yaw_arm
##  Mode:logical     Mode:logical      Mode:logical       Mode:logical    
##  NA's:20          NA's:20           NA's:20            NA's:20         
##                                                                        
##                                                                        
##                                                                        
##                                                                        
##                                                                        
##  max_roll_arm   max_picth_arm  max_yaw_arm    min_roll_arm  
##  Mode:logical   Mode:logical   Mode:logical   Mode:logical  
##  NA's:20        NA's:20        NA's:20        NA's:20       
##                                                             
##                                                             
##                                                             
##                                                             
##                                                             
##  min_pitch_arm  min_yaw_arm    amplitude_roll_arm amplitude_pitch_arm
##  Mode:logical   Mode:logical   Mode:logical       Mode:logical       
##  NA's:20        NA's:20        NA's:20            NA's:20            
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##  amplitude_yaw_arm roll_dumbbell      pitch_dumbbell    yaw_dumbbell      
##  Mode:logical      Min.   :-111.118   Min.   :-54.97   Min.   :-103.3200  
##  NA's:20           1st Qu.:   7.494   1st Qu.:-51.89   1st Qu.: -75.2809  
##                    Median :  50.403   Median :-40.81   Median :  -8.2863  
##                    Mean   :  33.760   Mean   :-19.47   Mean   :  -0.9385  
##                    3rd Qu.:  58.129   3rd Qu.: 16.12   3rd Qu.:  55.8335  
##                    Max.   : 123.984   Max.   : 96.87   Max.   : 132.2337  
##                                                                           
##  kurtosis_roll_dumbbell kurtosis_picth_dumbbell kurtosis_yaw_dumbbell
##  Mode:logical           Mode:logical            Mode:logical         
##  NA's:20                NA's:20                 NA's:20              
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##  skewness_roll_dumbbell skewness_pitch_dumbbell skewness_yaw_dumbbell
##  Mode:logical           Mode:logical            Mode:logical         
##  NA's:20                NA's:20                 NA's:20              
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##  max_roll_dumbbell max_picth_dumbbell max_yaw_dumbbell min_roll_dumbbell
##  Mode:logical      Mode:logical       Mode:logical     Mode:logical     
##  NA's:20           NA's:20            NA's:20          NA's:20          
##                                                                         
##                                                                         
##                                                                         
##                                                                         
##                                                                         
##  min_pitch_dumbbell min_yaw_dumbbell amplitude_roll_dumbbell
##  Mode:logical       Mode:logical     Mode:logical           
##  NA's:20            NA's:20          NA's:20                
##                                                             
##                                                             
##                                                             
##                                                             
##                                                             
##  amplitude_pitch_dumbbell amplitude_yaw_dumbbell total_accel_dumbbell
##  Mode:logical             Mode:logical           Min.   : 1.0        
##  NA's:20                  NA's:20                1st Qu.: 7.0        
##                                                  Median :15.5        
##                                                  Mean   :17.2        
##                                                  3rd Qu.:29.0        
##                                                  Max.   :31.0        
##                                                                      
##  var_accel_dumbbell avg_roll_dumbbell stddev_roll_dumbbell
##  Mode:logical       Mode:logical      Mode:logical        
##  NA's:20            NA's:20           NA's:20             
##                                                           
##                                                           
##                                                           
##                                                           
##                                                           
##  var_roll_dumbbell avg_pitch_dumbbell stddev_pitch_dumbbell
##  Mode:logical      Mode:logical       Mode:logical         
##  NA's:20           NA's:20            NA's:20              
##                                                            
##                                                            
##                                                            
##                                                            
##                                                            
##  var_pitch_dumbbell avg_yaw_dumbbell stddev_yaw_dumbbell var_yaw_dumbbell
##  Mode:logical       Mode:logical     Mode:logical        Mode:logical    
##  NA's:20            NA's:20          NA's:20             NA's:20         
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##  gyros_dumbbell_x  gyros_dumbbell_y  gyros_dumbbell_z accel_dumbbell_x 
##  Min.   :-1.0300   Min.   :-1.1100   Min.   :-1.180   Min.   :-159.00  
##  1st Qu.: 0.1600   1st Qu.:-0.2100   1st Qu.:-0.485   1st Qu.:-140.25  
##  Median : 0.3600   Median : 0.0150   Median :-0.280   Median : -19.00  
##  Mean   : 0.2690   Mean   : 0.0605   Mean   :-0.266   Mean   : -47.60  
##  3rd Qu.: 0.4625   3rd Qu.: 0.1450   3rd Qu.:-0.165   3rd Qu.:  15.75  
##  Max.   : 1.0600   Max.   : 1.9100   Max.   : 1.100   Max.   : 185.00  
##                                                                        
##  accel_dumbbell_y accel_dumbbell_z magnet_dumbbell_x magnet_dumbbell_y
##  Min.   :-30.00   Min.   :-221.0   Min.   :-576.0    Min.   :-558.0   
##  1st Qu.:  5.75   1st Qu.:-192.2   1st Qu.:-528.0    1st Qu.: 259.5   
##  Median : 71.50   Median :  -3.0   Median :-508.5    Median : 316.0   
##  Mean   : 70.55   Mean   : -60.0   Mean   :-304.2    Mean   : 189.3   
##  3rd Qu.:151.25   3rd Qu.:  76.5   3rd Qu.:-317.0    3rd Qu.: 348.2   
##  Max.   :166.00   Max.   : 100.0   Max.   : 523.0    Max.   : 403.0   
##                                                                       
##  magnet_dumbbell_z  roll_forearm     pitch_forearm      yaw_forearm      
##  Min.   :-164.00   Min.   :-176.00   Min.   :-63.500   Min.   :-168.000  
##  1st Qu.: -33.00   1st Qu.: -40.25   1st Qu.:-11.457   1st Qu.: -93.375  
##  Median :  49.50   Median :  94.20   Median :  8.830   Median : -19.250  
##  Mean   :  71.40   Mean   :  38.66   Mean   :  7.099   Mean   :   2.195  
##  3rd Qu.:  96.25   3rd Qu.: 143.25   3rd Qu.: 28.500   3rd Qu.: 104.500  
##  Max.   : 368.00   Max.   : 176.00   Max.   : 59.300   Max.   : 159.000  
##                                                                          
##  kurtosis_roll_forearm kurtosis_picth_forearm kurtosis_yaw_forearm
##  Mode:logical          Mode:logical           Mode:logical        
##  NA's:20               NA's:20                NA's:20             
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##  skewness_roll_forearm skewness_pitch_forearm skewness_yaw_forearm
##  Mode:logical          Mode:logical           Mode:logical        
##  NA's:20               NA's:20                NA's:20             
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##  max_roll_forearm max_picth_forearm max_yaw_forearm min_roll_forearm
##  Mode:logical     Mode:logical      Mode:logical    Mode:logical    
##  NA's:20          NA's:20           NA's:20         NA's:20         
##                                                                     
##                                                                     
##                                                                     
##                                                                     
##                                                                     
##  min_pitch_forearm min_yaw_forearm amplitude_roll_forearm
##  Mode:logical      Mode:logical    Mode:logical          
##  NA's:20           NA's:20         NA's:20               
##                                                          
##                                                          
##                                                          
##                                                          
##                                                          
##  amplitude_pitch_forearm amplitude_yaw_forearm total_accel_forearm
##  Mode:logical            Mode:logical          Min.   :21.00      
##  NA's:20                 NA's:20               1st Qu.:24.00      
##                                                Median :32.50      
##                                                Mean   :32.05      
##                                                3rd Qu.:36.75      
##                                                Max.   :47.00      
##                                                                   
##  var_accel_forearm avg_roll_forearm stddev_roll_forearm var_roll_forearm
##  Mode:logical      Mode:logical     Mode:logical        Mode:logical    
##  NA's:20           NA's:20          NA's:20             NA's:20         
##                                                                         
##                                                                         
##                                                                         
##                                                                         
##                                                                         
##  avg_pitch_forearm stddev_pitch_forearm var_pitch_forearm avg_yaw_forearm
##  Mode:logical      Mode:logical         Mode:logical      Mode:logical   
##  NA's:20           NA's:20              NA's:20           NA's:20        
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##  stddev_yaw_forearm var_yaw_forearm gyros_forearm_x   gyros_forearm_y  
##  Mode:logical       Mode:logical    Min.   :-1.0600   Min.   :-5.9700  
##  NA's:20            NA's:20         1st Qu.:-0.5850   1st Qu.:-1.2875  
##                                     Median : 0.0200   Median : 0.0350  
##                                     Mean   :-0.0200   Mean   :-0.0415  
##                                     3rd Qu.: 0.2925   3rd Qu.: 2.0475  
##                                     Max.   : 1.3800   Max.   : 4.2600  
##                                                                        
##  gyros_forearm_z   accel_forearm_x  accel_forearm_y  accel_forearm_z 
##  Min.   :-1.2600   Min.   :-212.0   Min.   :-331.0   Min.   :-282.0  
##  1st Qu.:-0.0975   1st Qu.:-114.8   1st Qu.:   8.5   1st Qu.:-199.0  
##  Median : 0.2300   Median :  86.0   Median : 138.0   Median :-148.5  
##  Mean   : 0.2610   Mean   :  38.8   Mean   : 125.3   Mean   : -93.7  
##  3rd Qu.: 0.7625   3rd Qu.: 166.2   3rd Qu.: 268.0   3rd Qu.: -31.0  
##  Max.   : 1.8000   Max.   : 232.0   Max.   : 406.0   Max.   : 179.0  
##                                                                      
##  magnet_forearm_x magnet_forearm_y magnet_forearm_z   problem_id   
##  Min.   :-714.0   Min.   :-787.0   Min.   :-32.0    Min.   : 1.00  
##  1st Qu.:-427.2   1st Qu.:-328.8   1st Qu.:275.2    1st Qu.: 5.75  
##  Median :-189.5   Median : 487.0   Median :491.5    Median :10.50  
##  Mean   :-159.2   Mean   : 191.8   Mean   :460.2    Mean   :10.50  
##  3rd Qu.:  41.5   3rd Qu.: 720.8   3rd Qu.:661.5    3rd Qu.:15.25  
##  Max.   : 532.0   Max.   : 800.0   Max.   :884.0    Max.   :20.00  
##

Both datasets have 160 variables. Many of these variables contain NA values and are removed by the cleaning procedure below, along with the identifier and timestamp variables.

Cleaning Data

Several variables (columns) contain NA values. These columns are removed by counting the NA entries in each column with colSums(is.na(...)) and keeping only the columns where that count is zero.

training1<- pml_training[,colSums(is.na(pml_training)) == 0]
testing1<- pml_testing[,colSums(is.na(pml_testing)) == 0]
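
The column filter above can be illustrated on a small data frame. This is a minimal sketch in base R; the data frame `toy` and its values are made up for illustration and are not part of the project data.

```r
# Toy data frame (hypothetical values): one fully-NA column, one partially-NA column
toy <- data.frame(
  roll  = c(1.2, 3.4, 5.6),
  kurt  = c(NA, NA, NA),
  pitch = c(0.1, NA, 0.3)
)

# Number of NA entries in each column
na_counts <- colSums(is.na(toy))

# Keep only the columns with no NA at all; drop = FALSE keeps a data frame
# even when a single column survives
clean <- toy[, na_counts == 0, drop = FALSE]
names(clean)
```

Note that the filter is strict: a column with even one NA is dropped, which is why `pitch` is removed together with `kurt`.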

The first seven columns are removed because they contain information about the people who did the test and timestamps, which are not related to the classification we are trying to predict.

training<- training1[,-c(1:7)]
testing<- testing1[,-c(1:7)]
dim(training)

## [1] 19622    53

# how many samples we have for each classe
table(training$classe)

## 
##    A    B    C    D    E 
## 5580 3797 3422 3216 3607

There are 19622 observations with 53 variables (52 predictors plus the outcome classe) for training and validating our models, and 20 rows for testing.

Data Partition

The training set is used for training and for validation, in 75/25 proportion.

inTrain = createDataPartition(training$classe, p = 0.75)[[1]]
training_part = training[ inTrain,]
valid_part = training[-inTrain,]
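
Because createDataPartition is given the outcome, the split is made within each class, so both parts keep roughly the same class proportions. The idea can be sketched in base R as follows. This is an illustration of stratified sampling under assumed toy labels, not caret's actual implementation.

```r
set.seed(12345)

# Hypothetical outcome labels with unequal class sizes
labels <- factor(rep(c("A", "B", "C"), times = c(40, 20, 20)))

# Row indices grouped by class
idx_by_class <- split(seq_along(labels), labels)

# Draw 75% of the rows inside each class, then combine the indices
in_train <- sort(unlist(lapply(idx_by_class, function(idx) {
  idx[sample.int(length(idx), size = round(0.75 * length(idx)))]
}), use.names = FALSE))

train_labels <- labels[in_train]
valid_labels <- labels[-in_train]

table(train_labels)
table(valid_labels)
```

Each class contributes 75% of its rows to the training part, so a rare class cannot end up missing from either part by chance.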

Predictive Model using classification trees

A classification tree model is fit on the training partition, and the resulting tree is plotted.

model_CT <- train(classe~., data=training_part, method="rpart")
fancyRpartPlot(model_CT$finalModel)

We predict values for the validation set and compute the confusion matrix to obtain the accuracy.

predict_validation<- predict(model_CT, newdata = valid_part)
cm_ct<-confusionMatrix(predict_validation,valid_part$classe)
cm_ct$overall['Accuracy']

The accuracy is low: 49%, with a 95% CI of (48%, 50%).
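
The overall accuracy reported by a confusion matrix is the share of observations on its diagonal. A minimal base R sketch with made-up labels (not taken from the project data) shows the calculation:

```r
# Hypothetical reference and predicted labels, for illustration only
reference <- factor(c("A", "A", "B", "B", "C", "C", "C", "A"),
                    levels = c("A", "B", "C"))
predicted <- factor(c("A", "B", "B", "B", "C", "A", "C", "A"),
                    levels = c("A", "B", "C"))

# Cross-tabulate predictions against the reference classes
cm <- table(Prediction = predicted, Reference = reference)

# Accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```

Here 6 of the 8 cases fall on the diagonal, so the accuracy is 0.75.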

Predictive Model using Random Forest

We create a new model using random forest. Because training would otherwise be very slow, I follow the instructions at https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-randomForestPerformance.md. A cluster is created and the resampling method is changed to k-fold cross-validation with number = 5.

#use k_fold=5  in cross_validation to improve the performance
cluster <- makeCluster(detectCores() - 1) # convention to leave 1 core for OS
registerDoParallel(cluster)
trainControl_function <-trainControl(method = "cv",number = 5, allowParallel = TRUE) 
model_rf <- train(classe~., data=training_part, method="rf",trControl = trainControl_function)
print(model_rf$finalModel)

Stop the parallel computing cluster.

stopCluster(cluster)  
registerDoSEQ()
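
The 5-fold cross-validation requested through trainControl can be pictured as follows: every row is assigned to one of five folds, and each fold is held out once while the model is fit on the other four. This is a base R sketch of the fold bookkeeping only, with a hypothetical row count; it is not how caret builds its folds internally.

```r
set.seed(12345)

n_rows <- 23   # hypothetical number of training rows
k <- 5         # number of folds

# Assign each row to a fold at random, keeping fold sizes as equal as possible
fold_id <- sample(rep(seq_len(k), length.out = n_rows))

# Count how often each row is held out across the k rounds
times_held_out <- integer(n_rows)
for (fold in seq_len(k)) {
  holdout_rows <- which(fold_id == fold)   # rows used to assess this round
  fit_rows     <- which(fold_id != fold)   # rows used to fit this round
  times_held_out[holdout_rows] <- times_held_out[holdout_rows] + 1L
}

table(fold_id)
```

Every row is held out exactly once, so each observation contributes to the resampled accuracy estimate without being used to fit the model that scores it.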

I predict values for the validation set and compute the confusion matrix to obtain the accuracy.

predict_validation_rf<- predict(model_rf, newdata = valid_part)
cm_rf<-confusionMatrix(predict_validation_rf,valid_part$classe)
cm_rf$overall['Accuracy']

##  Accuracy 
## 0.9946982

The accuracy on the validation set is about 99.5%, which is high enough to predict the 20 test cases. The following note discusses the model accuracy needed to predict all 20 cases correctly with 95% confidence: https://github.com/lgreski/datasciencectacontent/blob/master/markdown/pml-requiredModelAccuracy.md
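
The relation between per-case accuracy and the chance of getting all 20 test cases right can be checked with a short calculation. This is a sketch that assumes the 20 predictions are independent and that the validation accuracy carries over to the test cases; the variable names are illustrative.

```r
accuracy <- 0.9946982   # validation accuracy reported above
n_cases  <- 20          # number of test cases to predict

# Probability that all n_cases predictions are correct,
# assuming independent predictions with the same accuracy
p_all_correct <- accuracy^n_cases

# Per-case accuracy needed for a 95% chance that all n_cases are correct
required_accuracy <- 0.95^(1 / n_cases)

round(p_all_correct, 3)
round(required_accuracy, 4)
```

Under these assumptions, the validation accuracy corresponds to roughly a 90% chance of predicting all 20 cases correctly, and a per-case accuracy of about 99.74% would be needed for a 95% chance.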

The 20 most important variables of the model (out of 52) are listed below.

varImp(model_rf)

## rf variable importance
## 
##   only 20 most important variables shown (out of 52)
## 
##                      Overall
## roll_belt             100.00
## pitch_forearm          61.39
## yaw_belt               56.38
## magnet_dumbbell_y      45.55
## pitch_belt             43.89
## roll_forearm           43.45
## magnet_dumbbell_z      42.75
## accel_dumbbell_y       22.32
## accel_forearm_x        17.75
## roll_dumbbell          16.94
## magnet_dumbbell_x      16.24
## magnet_belt_z          15.04
## accel_belt_z           14.81
## total_accel_dumbbell   13.89
## magnet_forearm_z       13.86
## accel_dumbbell_z       13.10
## magnet_belt_y          11.84
## yaw_arm                11.13
## gyros_belt_z           10.86
## magnet_belt_x          10.69

Predicting using the test set

The random forest model is now used to predict the manner in which the participants performed the exercise in the 20 test cases. The final results are saved to a file.

predict_test<- predict(model_rf, testing)
predict_test

##  [1] B A B A A E D B A A B C B A E E A B B B
## Levels: A B C D E

write.csv(predict_test, file = "predict_test.csv")