Summary

In this work, I analyze a Human Activity Recognition dataset provided by:

Velloso, E.; Bulling, A.; Gellersen, H.; Ugulino, W.; Fuks, H. Qualitative Activity Recognition of Weight Lifting Exercises. Proceedings of 4th International Conference in Cooperation with SIGCHI (Augmented Human ’13) . Stuttgart, Germany: ACM SIGCHI, 2013.

with the aim to determine how well a weight lifting exercise is being performed. To this aim, I use data from sensors located in the belt, arm, forearm and dumbbell used by the test subject. The data will be analyzed, summarized and used to train a random forest model which will allow me to classify new data points in one of five available categories.

Analysis

For simplicity, all necessary libraries will be loaded now. If you wish to reproduce this work, make sure you have installed these libraries using the install.packages command.

library(caret)
library(dplyr)
library(ggplot2)
library(reshape2) # used for its melt function for the confusion matrix plot

Loading the data

I will load the data and choose only a handful of columns. The reason for this is that many columns are full of NAs, empty or do not represent variables important to the problem of determining how well the activity is being performed (for example raw_timestamp_part1, raw_timestamp_part2, new_window, etc). I made this decision after looking at head(mydf). I do not print this command here to save space.

mydf <- read.csv("./data/pml-training.csv")

mydf <- mydf %>% select(user_name, roll_belt, pitch_belt, yaw_belt, 
                        total_accel_belt, gyros_belt_x, gyros_belt_y,
                        gyros_belt_z,
                        accel_belt_x, accel_belt_y, accel_belt_z,
                        magnet_belt_x, magnet_belt_y, magnet_belt_z,
                        roll_arm, pitch_arm, yaw_arm, 
                        total_accel_arm, gyros_arm_x, gyros_arm_y, gyros_arm_z,
                        accel_arm_x, accel_arm_y, accel_arm_z,
                        magnet_arm_x, magnet_arm_y, magnet_arm_z,
                        roll_dumbbell, pitch_dumbbell, yaw_dumbbell, 
                        total_accel_dumbbell, gyros_dumbbell_x, gyros_dumbbell_y, 
                        gyros_dumbbell_z,
                        accel_dumbbell_x, accel_dumbbell_y, accel_dumbbell_z,
                        magnet_dumbbell_x, magnet_dumbbell_y, magnet_dumbbell_z,
                        roll_forearm, pitch_forearm, yaw_forearm, 
                        total_accel_forearm, gyros_forearm_x, gyros_forearm_y, 
                        gyros_forearm_z,
                        accel_forearm_x, accel_forearm_y, accel_forearm_z,
                        magnet_forearm_x, magnet_forearm_y, magnet_forearm_z,
                        classe)

Data Partition

Next, I will separate the data into training and testing sets using the createDataPartition command from the caret package. I will train my model on the former and test its accuracy on the later.

set.seed(123)
aux <- createDataPartition(mydf$classe, p = 0.75, list = FALSE)
dtrain <- mydf[aux, ]
dtest <- mydf[-aux, ]

Pre-processing and modeling

The dimensions of our dtrain dataset are 14718 rows and 54 columns. As you can imagine, training a prediction algorithm with 52 numerical variables (user_name and the outcome, classe, are factor variables) would require a lot of computational time. For that reason, I will use Principal Component Analysis (PCA) to summarize the information contained in my dtrain data frame.

# We find the pre-processing model using the preProcess function from caret
ppm <- preProcess(dtrain[ , 2:53], method = "pca")
trainPC <- predict(ppm, dtrain[ , 2:53])

I will now train a random forest using my new data frame trainPC. I will use cross validation within the train command from the caret package. From the ?trainControl command, I learnt that method is the resampling method chosen, which in my case is cross-validation (coded as "cv"). The number of folds or resampling iterations is set by default as 10.

mdl <- train(dtrain$classe ~ ., data = trainPC, method = "rf", 
             trControl = trainControl(method = "cv"))
## Loading required package: randomForest
## Warning: package 'randomForest' was built under R version 3.2.2
## randomForest 4.6-12
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
mdl
## Random Forest 
## 
## 14718 samples
##    24 predictor
##     5 classes: 'A', 'B', 'C', 'D', 'E' 
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 13247, 13245, 13247, 13248, 13247, 13245, ... 
## Resampling results across tuning parameters:
## 
##   mtry  Accuracy   Kappa      Accuracy SD  Kappa SD   
##    2    0.9749972  0.9683676  0.004454494  0.005637988
##   13    0.9702422  0.9623589  0.006007687  0.007604784
##   25    0.9668457  0.9580620  0.006119953  0.007739255
## 
## Accuracy was used to select the optimal model using  the largest value.
## The final value used for the model was mtry = 2.
mdl$finalModel
## 
## Call:
##  randomForest(x = x, y = y, mtry = param$mtry) 
##                Type of random forest: classification
##                      Number of trees: 500
## No. of variables tried at each split: 2
## 
##         OOB estimate of  error rate: 2.3%
## Confusion matrix:
##      A    B    C    D    E class.error
## A 4146   13   17    6    3 0.009318996
## B   56 2752   35    1    4 0.033707865
## C    4   33 2507   20    3 0.023373588
## D    3    1   97 2306    5 0.043946932
## E    2    7   14   15 2668 0.014042868

Testing the model and accuracy

Now that I have my model, I will use it to predict the classe variable in the test data set. I will then use the confusionMatrix command from caret to determine the out-of-sample error.

testPC <- predict(ppm, dtest[ , 2:53])
prediction <- predict(mdl, newdata = testPC)
confusionMatrix(prediction, dtest$classe)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    A    B    C    D    E
##          A 1392    8    1    1    0
##          B    2  926   10    0    1
##          C    0   13  837   39   11
##          D    0    0    5  764    5
##          E    1    2    2    0  884
## 
## Overall Statistics
##                                          
##                Accuracy : 0.9794         
##                  95% CI : (0.975, 0.9832)
##     No Information Rate : 0.2845         
##     P-Value [Acc > NIR] : < 2.2e-16      
##                                          
##                   Kappa : 0.9739         
##  Mcnemar's Test P-Value : NA             
## 
## Statistics by Class:
## 
##                      Class: A Class: B Class: C Class: D Class: E
## Sensitivity            0.9978   0.9758   0.9789   0.9502   0.9811
## Specificity            0.9972   0.9967   0.9844   0.9976   0.9988
## Pos Pred Value         0.9929   0.9862   0.9300   0.9871   0.9944
## Neg Pred Value         0.9991   0.9942   0.9955   0.9903   0.9958
## Prevalence             0.2845   0.1935   0.1743   0.1639   0.1837
## Detection Rate         0.2838   0.1888   0.1707   0.1558   0.1803
## Detection Prevalence   0.2859   0.1915   0.1835   0.1578   0.1813
## Balanced Accuracy      0.9975   0.9862   0.9817   0.9739   0.9899

As it can be seen in the confusion matrix above (and in the Figure below), the model has an out-of-sample overall accuracy of 97.94%. In other words, the out-of-sample error is 2.06%. It is also remarkable that the sensitivity, i.e. the probability of positively identify the classe given that a data point belongs to it, is above 95% for all classes. The specificity, i.e. the probability that the method determines that a point does not belong to a classe when it does not, is above 98% in all cases.

confmatrix <- as.matrix(confusionMatrix(prediction, dtest$classe))
confmatrix <- apply(confmatrix, 2, function(x) round(x/sum(x), 3))
p <- ggplot(melt(confmatrix), aes(x=Var1, y=Var2, fill=value)) + 
        geom_tile()
p + xlab("Prediction") + ylab("Reference") + ggtitle("Confusion Matrix")

Apendix

Exploratory Data Analysis

str(mydf)
## 'data.frame':    19622 obs. of  54 variables:
##  $ user_name           : Factor w/ 6 levels "adelmo","carlitos",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ roll_belt           : num  1.41 1.41 1.42 1.48 1.48 1.45 1.42 1.42 1.43 1.45 ...
##  $ pitch_belt          : num  8.07 8.07 8.07 8.05 8.07 8.06 8.09 8.13 8.16 8.17 ...
##  $ yaw_belt            : num  -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
##  $ total_accel_belt    : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ gyros_belt_x        : num  0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 0.03 ...
##  $ gyros_belt_y        : num  0 0 0 0 0.02 0 0 0 0 0 ...
##  $ gyros_belt_z        : num  -0.02 -0.02 -0.02 -0.03 -0.02 -0.02 -0.02 -0.02 -0.02 0 ...
##  $ accel_belt_x        : int  -21 -22 -20 -22 -21 -21 -22 -22 -20 -21 ...
##  $ accel_belt_y        : int  4 4 5 3 2 4 3 4 2 4 ...
##  $ accel_belt_z        : int  22 22 23 21 24 21 21 21 24 22 ...
##  $ magnet_belt_x       : int  -3 -7 -2 -6 -6 0 -4 -2 1 -3 ...
##  $ magnet_belt_y       : int  599 608 600 604 600 603 599 603 602 609 ...
##  $ magnet_belt_z       : int  -313 -311 -305 -310 -302 -312 -311 -313 -312 -308 ...
##  $ roll_arm            : num  -128 -128 -128 -128 -128 -128 -128 -128 -128 -128 ...
##  $ pitch_arm           : num  22.5 22.5 22.5 22.1 22.1 22 21.9 21.8 21.7 21.6 ...
##  $ yaw_arm             : num  -161 -161 -161 -161 -161 -161 -161 -161 -161 -161 ...
##  $ total_accel_arm     : int  34 34 34 34 34 34 34 34 34 34 ...
##  $ gyros_arm_x         : num  0 0.02 0.02 0.02 0 0.02 0 0.02 0.02 0.02 ...
##  $ gyros_arm_y         : num  0 -0.02 -0.02 -0.03 -0.03 -0.03 -0.03 -0.02 -0.03 -0.03 ...
##  $ gyros_arm_z         : num  -0.02 -0.02 -0.02 0.02 0 0 0 0 -0.02 -0.02 ...
##  $ accel_arm_x         : int  -288 -290 -289 -289 -289 -289 -289 -289 -288 -288 ...
##  $ accel_arm_y         : int  109 110 110 111 111 111 111 111 109 110 ...
##  $ accel_arm_z         : int  -123 -125 -126 -123 -123 -122 -125 -124 -122 -124 ...
##  $ magnet_arm_x        : int  -368 -369 -368 -372 -374 -369 -373 -372 -369 -376 ...
##  $ magnet_arm_y        : int  337 337 344 344 337 342 336 338 341 334 ...
##  $ magnet_arm_z        : int  516 513 513 512 506 513 509 510 518 516 ...
##  $ roll_dumbbell       : num  13.1 13.1 12.9 13.4 13.4 ...
##  $ pitch_dumbbell      : num  -70.5 -70.6 -70.3 -70.4 -70.4 ...
##  $ yaw_dumbbell        : num  -84.9 -84.7 -85.1 -84.9 -84.9 ...
##  $ total_accel_dumbbell: int  37 37 37 37 37 37 37 37 37 37 ...
##  $ gyros_dumbbell_x    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ gyros_dumbbell_y    : num  -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ...
##  $ gyros_dumbbell_z    : num  0 0 0 -0.02 0 0 0 0 0 0 ...
##  $ accel_dumbbell_x    : int  -234 -233 -232 -232 -233 -234 -232 -234 -232 -235 ...
##  $ accel_dumbbell_y    : int  47 47 46 48 48 48 47 46 47 48 ...
##  $ accel_dumbbell_z    : int  -271 -269 -270 -269 -270 -269 -270 -272 -269 -270 ...
##  $ magnet_dumbbell_x   : int  -559 -555 -561 -552 -554 -558 -551 -555 -549 -558 ...
##  $ magnet_dumbbell_y   : int  293 296 298 303 292 294 295 300 292 291 ...
##  $ magnet_dumbbell_z   : num  -65 -64 -63 -60 -68 -66 -70 -74 -65 -69 ...
##  $ roll_forearm        : num  28.4 28.3 28.3 28.1 28 27.9 27.9 27.8 27.7 27.7 ...
##  $ pitch_forearm       : num  -63.9 -63.9 -63.9 -63.9 -63.9 -63.9 -63.9 -63.8 -63.8 -63.8 ...
##  $ yaw_forearm         : num  -153 -153 -152 -152 -152 -152 -152 -152 -152 -152 ...
##  $ total_accel_forearm : int  36 36 36 36 36 36 36 36 36 36 ...
##  $ gyros_forearm_x     : num  0.03 0.02 0.03 0.02 0.02 0.02 0.02 0.02 0.03 0.02 ...
##  $ gyros_forearm_y     : num  0 0 -0.02 -0.02 0 -0.02 0 -0.02 0 0 ...
##  $ gyros_forearm_z     : num  -0.02 -0.02 0 0 -0.02 -0.03 -0.02 0 -0.02 -0.02 ...
##  $ accel_forearm_x     : int  192 192 196 189 189 193 195 193 193 190 ...
##  $ accel_forearm_y     : int  203 203 204 206 206 203 205 205 204 205 ...
##  $ accel_forearm_z     : int  -215 -216 -213 -214 -214 -215 -215 -213 -214 -215 ...
##  $ magnet_forearm_x    : int  -17 -18 -18 -16 -17 -9 -18 -9 -16 -22 ...
##  $ magnet_forearm_y    : num  654 661 658 658 655 660 659 660 653 656 ...
##  $ magnet_forearm_z    : num  476 473 469 469 473 478 470 474 476 473 ...
##  $ classe              : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
summary(mydf)
##     user_name      roll_belt        pitch_belt          yaw_belt      
##  adelmo  :3892   Min.   :-28.90   Min.   :-55.8000   Min.   :-180.00  
##  carlitos:3112   1st Qu.:  1.10   1st Qu.:  1.7600   1st Qu.: -88.30  
##  charles :3536   Median :113.00   Median :  5.2800   Median : -13.00  
##  eurico  :3070   Mean   : 64.41   Mean   :  0.3053   Mean   : -11.21  
##  jeremy  :3402   3rd Qu.:123.00   3rd Qu.: 14.9000   3rd Qu.:  12.90  
##  pedro   :2610   Max.   :162.00   Max.   : 60.3000   Max.   : 179.00  
##  total_accel_belt  gyros_belt_x        gyros_belt_y       gyros_belt_z    
##  Min.   : 0.00    Min.   :-1.040000   Min.   :-0.64000   Min.   :-1.4600  
##  1st Qu.: 3.00    1st Qu.:-0.030000   1st Qu.: 0.00000   1st Qu.:-0.2000  
##  Median :17.00    Median : 0.030000   Median : 0.02000   Median :-0.1000  
##  Mean   :11.31    Mean   :-0.005592   Mean   : 0.03959   Mean   :-0.1305  
##  3rd Qu.:18.00    3rd Qu.: 0.110000   3rd Qu.: 0.11000   3rd Qu.:-0.0200  
##  Max.   :29.00    Max.   : 2.220000   Max.   : 0.64000   Max.   : 1.6200  
##   accel_belt_x       accel_belt_y     accel_belt_z     magnet_belt_x  
##  Min.   :-120.000   Min.   :-69.00   Min.   :-275.00   Min.   :-52.0  
##  1st Qu.: -21.000   1st Qu.:  3.00   1st Qu.:-162.00   1st Qu.:  9.0  
##  Median : -15.000   Median : 35.00   Median :-152.00   Median : 35.0  
##  Mean   :  -5.595   Mean   : 30.15   Mean   : -72.59   Mean   : 55.6  
##  3rd Qu.:  -5.000   3rd Qu.: 61.00   3rd Qu.:  27.00   3rd Qu.: 59.0  
##  Max.   :  85.000   Max.   :164.00   Max.   : 105.00   Max.   :485.0  
##  magnet_belt_y   magnet_belt_z       roll_arm         pitch_arm      
##  Min.   :354.0   Min.   :-623.0   Min.   :-180.00   Min.   :-88.800  
##  1st Qu.:581.0   1st Qu.:-375.0   1st Qu.: -31.77   1st Qu.:-25.900  
##  Median :601.0   Median :-320.0   Median :   0.00   Median :  0.000  
##  Mean   :593.7   Mean   :-345.5   Mean   :  17.83   Mean   : -4.612  
##  3rd Qu.:610.0   3rd Qu.:-306.0   3rd Qu.:  77.30   3rd Qu.: 11.200  
##  Max.   :673.0   Max.   : 293.0   Max.   : 180.00   Max.   : 88.500  
##     yaw_arm          total_accel_arm  gyros_arm_x        gyros_arm_y     
##  Min.   :-180.0000   Min.   : 1.00   Min.   :-6.37000   Min.   :-3.4400  
##  1st Qu.: -43.1000   1st Qu.:17.00   1st Qu.:-1.33000   1st Qu.:-0.8000  
##  Median :   0.0000   Median :27.00   Median : 0.08000   Median :-0.2400  
##  Mean   :  -0.6188   Mean   :25.51   Mean   : 0.04277   Mean   :-0.2571  
##  3rd Qu.:  45.8750   3rd Qu.:33.00   3rd Qu.: 1.57000   3rd Qu.: 0.1400  
##  Max.   : 180.0000   Max.   :66.00   Max.   : 4.87000   Max.   : 2.8400  
##   gyros_arm_z       accel_arm_x       accel_arm_y      accel_arm_z     
##  Min.   :-2.3300   Min.   :-404.00   Min.   :-318.0   Min.   :-636.00  
##  1st Qu.:-0.0700   1st Qu.:-242.00   1st Qu.: -54.0   1st Qu.:-143.00  
##  Median : 0.2300   Median : -44.00   Median :  14.0   Median : -47.00  
##  Mean   : 0.2695   Mean   : -60.24   Mean   :  32.6   Mean   : -71.25  
##  3rd Qu.: 0.7200   3rd Qu.:  84.00   3rd Qu.: 139.0   3rd Qu.:  23.00  
##  Max.   : 3.0200   Max.   : 437.00   Max.   : 308.0   Max.   : 292.00  
##   magnet_arm_x     magnet_arm_y     magnet_arm_z    roll_dumbbell    
##  Min.   :-584.0   Min.   :-392.0   Min.   :-597.0   Min.   :-153.71  
##  1st Qu.:-300.0   1st Qu.:  -9.0   1st Qu.: 131.2   1st Qu.: -18.49  
##  Median : 289.0   Median : 202.0   Median : 444.0   Median :  48.17  
##  Mean   : 191.7   Mean   : 156.6   Mean   : 306.5   Mean   :  23.84  
##  3rd Qu.: 637.0   3rd Qu.: 323.0   3rd Qu.: 545.0   3rd Qu.:  67.61  
##  Max.   : 782.0   Max.   : 583.0   Max.   : 694.0   Max.   : 153.55  
##  pitch_dumbbell     yaw_dumbbell      total_accel_dumbbell
##  Min.   :-149.59   Min.   :-150.871   Min.   : 0.00       
##  1st Qu.: -40.89   1st Qu.: -77.644   1st Qu.: 4.00       
##  Median : -20.96   Median :  -3.324   Median :10.00       
##  Mean   : -10.78   Mean   :   1.674   Mean   :13.72       
##  3rd Qu.:  17.50   3rd Qu.:  79.643   3rd Qu.:19.00       
##  Max.   : 149.40   Max.   : 154.952   Max.   :58.00       
##  gyros_dumbbell_x    gyros_dumbbell_y   gyros_dumbbell_z 
##  Min.   :-204.0000   Min.   :-2.10000   Min.   : -2.380  
##  1st Qu.:  -0.0300   1st Qu.:-0.14000   1st Qu.: -0.310  
##  Median :   0.1300   Median : 0.03000   Median : -0.130  
##  Mean   :   0.1611   Mean   : 0.04606   Mean   : -0.129  
##  3rd Qu.:   0.3500   3rd Qu.: 0.21000   3rd Qu.:  0.030  
##  Max.   :   2.2200   Max.   :52.00000   Max.   :317.000  
##  accel_dumbbell_x  accel_dumbbell_y  accel_dumbbell_z  magnet_dumbbell_x
##  Min.   :-419.00   Min.   :-189.00   Min.   :-334.00   Min.   :-643.0   
##  1st Qu.: -50.00   1st Qu.:  -8.00   1st Qu.:-142.00   1st Qu.:-535.0   
##  Median :  -8.00   Median :  41.50   Median :  -1.00   Median :-479.0   
##  Mean   : -28.62   Mean   :  52.63   Mean   : -38.32   Mean   :-328.5   
##  3rd Qu.:  11.00   3rd Qu.: 111.00   3rd Qu.:  38.00   3rd Qu.:-304.0   
##  Max.   : 235.00   Max.   : 315.00   Max.   : 318.00   Max.   : 592.0   
##  magnet_dumbbell_y magnet_dumbbell_z  roll_forearm       pitch_forearm   
##  Min.   :-3600     Min.   :-262.00   Min.   :-180.0000   Min.   :-72.50  
##  1st Qu.:  231     1st Qu.: -45.00   1st Qu.:  -0.7375   1st Qu.:  0.00  
##  Median :  311     Median :  13.00   Median :  21.7000   Median :  9.24  
##  Mean   :  221     Mean   :  46.05   Mean   :  33.8265   Mean   : 10.71  
##  3rd Qu.:  390     3rd Qu.:  95.00   3rd Qu.: 140.0000   3rd Qu.: 28.40  
##  Max.   :  633     Max.   : 452.00   Max.   : 180.0000   Max.   : 89.80  
##   yaw_forearm      total_accel_forearm gyros_forearm_x  
##  Min.   :-180.00   Min.   :  0.00      Min.   :-22.000  
##  1st Qu.: -68.60   1st Qu.: 29.00      1st Qu.: -0.220  
##  Median :   0.00   Median : 36.00      Median :  0.050  
##  Mean   :  19.21   Mean   : 34.72      Mean   :  0.158  
##  3rd Qu.: 110.00   3rd Qu.: 41.00      3rd Qu.:  0.560  
##  Max.   : 180.00   Max.   :108.00      Max.   :  3.970  
##  gyros_forearm_y     gyros_forearm_z    accel_forearm_x   accel_forearm_y 
##  Min.   : -7.02000   Min.   : -8.0900   Min.   :-498.00   Min.   :-632.0  
##  1st Qu.: -1.46000   1st Qu.: -0.1800   1st Qu.:-178.00   1st Qu.:  57.0  
##  Median :  0.03000   Median :  0.0800   Median : -57.00   Median : 201.0  
##  Mean   :  0.07517   Mean   :  0.1512   Mean   : -61.65   Mean   : 163.7  
##  3rd Qu.:  1.62000   3rd Qu.:  0.4900   3rd Qu.:  76.00   3rd Qu.: 312.0  
##  Max.   :311.00000   Max.   :231.0000   Max.   : 477.00   Max.   : 923.0  
##  accel_forearm_z   magnet_forearm_x  magnet_forearm_y magnet_forearm_z
##  Min.   :-446.00   Min.   :-1280.0   Min.   :-896.0   Min.   :-973.0  
##  1st Qu.:-182.00   1st Qu.: -616.0   1st Qu.:   2.0   1st Qu.: 191.0  
##  Median : -39.00   Median : -378.0   Median : 591.0   Median : 511.0  
##  Mean   : -55.29   Mean   : -312.6   Mean   : 380.1   Mean   : 393.6  
##  3rd Qu.:  26.00   3rd Qu.:  -73.0   3rd Qu.: 737.0   3rd Qu.: 653.0  
##  Max.   : 291.00   Max.   :  672.0   Max.   :1480.0   Max.   :1090.0  
##  classe  
##  A:5580  
##  B:3797  
##  C:3422  
##  D:3216  
##  E:3607  
## 

Solution to the second part of the assignment

I share the code for this part of the asignment. However, I will not make the answers visible.

# Read the data
mytest <- read.csv("./data/pml-testing.csv")

# Select the appropriate variables
mytest2 <- mytest %>% select(user_name, roll_belt, pitch_belt, yaw_belt, 
                        total_accel_belt, gyros_belt_x, gyros_belt_y, 
                        gyros_belt_z,
                        accel_belt_x, accel_belt_y, accel_belt_z,
                        magnet_belt_x, magnet_belt_y, magnet_belt_z,
                        roll_arm, pitch_arm, yaw_arm, 
                        total_accel_arm, gyros_arm_x, gyros_arm_y, gyros_arm_z,
                        accel_arm_x, accel_arm_y, accel_arm_z,
                        magnet_arm_x, magnet_arm_y, magnet_arm_z,
                        roll_dumbbell, pitch_dumbbell, yaw_dumbbell, 
                        total_accel_dumbbell, gyros_dumbbell_x, gyros_dumbbell_y, 
                        gyros_dumbbell_z,
                        accel_dumbbell_x, accel_dumbbell_y, accel_dumbbell_z,
                        magnet_dumbbell_x, magnet_dumbbell_y, magnet_dumbbell_z,
                        roll_forearm, pitch_forearm, yaw_forearm, 
                        total_accel_forearm, gyros_forearm_x, gyros_forearm_y, 
                        gyros_forearm_z,
                        accel_forearm_x, accel_forearm_y, accel_forearm_z,
                        magnet_forearm_x, magnet_forearm_y, magnet_forearm_z)

# Predict the values of the principal components
mytest2PC <- predict(ppm, newdata = mytest2[ , 2:53])

# Save the solution
answer <- data.frame(problem_id = mytest$problem_id)
answer$prediction <- predict(mdl, newdata = mytest2PC)

# Show the solution
answer

And this is the code I used to save the answers in text files. It is a slight variation as that proposed by the instructor of the course.

# We force the prediction to character format
answer$prediction <- as.character(answer$prediction)

# We save the answers
n = length(answer$problem_id)

for(i in 1:n){
    filename = paste0("./answer/problem_id_",i,".txt")
    write.table(answer$prediction[i], file = filename, quote = FALSE, 
                row.names = FALSE, col.names = FALSE)
}

I have submitted it for evaluation. I received 18 out of 20 points. Predictions for problem_id_6 and problem_id_11 were incorrect. From this, the accuracy could be estimated to be 90%.