Predicting Exercise Quality

Johns Hopkins University Practical Machine Learning on Coursera

Logan J Travis 2014-08-24

Executive Summary

Can the growing abundance of small, inexpensive sensors determine exercise quality? Many consumer devices measure quantities such as number of steps, distance traveled, and exercise repetitions. This analysis shows that, with random forest modeling, the same devices can predict exercise quality from instantaneous measurements with very high (>99%) accuracy.

This paper expands upon data and analyses provided by Groupware@LES to explore predictive models for exercise quality. Through exploration and comparison of multiple models, it concludes with a 99.4% accurate prediction algorithm for classifying standing dumbbell curls into five classes: one correct execution and four common mistakes.

Get Data

Download training and test datasets to ./data/ if not present. The data are provided as part of the Johns Hopkins University Practical Machine Learning class on Coursera by Groupware@LES from their Human Activity Recognition research project.

## Loading required package: lattice
## Loading required package: ggplot2
## [1] "File found at ./data/pml-training.csv"
## [1] "File found at ./data/pml-testing.csv"

Explore Training Dataset

The data collected by Groupware@LES include six users, five classes of the standing bicep curl exercise (one correct execution and four common mistakes), and 152 measurements. With over 19,000 observations, an initial exploration of the data provides needed insight into model development.

Explanation for “new” window rows

The paper submitted by Groupware@LES details their sliding window method in section 5.1 Feature Extraction and Selection. They calculated statistical measures for windows ranging from 0.5 to 2.5 seconds with 0.5 seconds of overlap (excluding the 0.5 second window).

However, the test data do not use such windows and include only instantaneous measures. This does not align with the methodology explored in the paper, though it does match their ultimate goal: continuous feedback to users. The models developed in this experiment aim for that goal and therefore do not use the statistical measures as predictors.
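
The windowed statistics share a small set of column-name prefixes (assumed here from the Groupware@LES feature naming), so they can be excluded with a single pattern:

# Pattern matching the windowed statistical columns
# (prefix list assumed from the Groupware@LES feature naming)
statPattern <- "^(kurtosis|skewness|max|min|amplitude|var|avg|stddev)_"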

Reading/Splitting Data

Reading the training data while excluding the statistical measures yields 60 columns, including observation number, time/date/window bookkeeping, and class (“classe” in the dataset). Dropping the bookkeeping columns and splitting 75/25 into training and control sets reduces the clean data to 54 columns. The chunk below is a sketch of this step (bookkeeping column names from the raw file; the NA encodings and seed are assumed), producing the trn and ctl sets used throughout:
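# Read the raw file (mixed NA encodings assumed), drop the statistical
# and bookkeeping columns, then split 75/25 into training (trn)
# and control (ctl) sets
raw <- read.csv("./data/pml-training.csv",
                na.strings = c("NA", "#DIV/0!", ""))
clean <- raw[, !grepl(statPattern, names(raw))]            # 60 columns
bookkeeping <- c("X", "raw_timestamp_part_1", "raw_timestamp_part_2",
                 "cvtd_timestamp", "new_window", "num_window")
clean <- clean[, !(names(clean) %in% bookkeeping)]         # 54 columns

set.seed(8)  # assumed; the original seed is not shown
inTrain <- createDataPartition(clean$classe, p = 0.75, list = FALSE)
trn <- clean[inTrain, ]
ctl <- clean[-inTrain, ]
str(trn)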

## 'data.frame':    14718 obs. of  54 variables:
##  $ user_name           : Factor w/ 6 levels "adelmo","carlitos",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ roll_belt           : num  1.41 1.48 1.45 1.42 1.42 1.43 1.45 1.45 1.43 1.42 ...
##  $ pitch_belt          : num  8.07 8.07 8.06 8.09 8.13 8.16 8.17 8.18 8.18 8.2 ...
##  $ yaw_belt            : num  -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
##  $ total_accel_belt    : num  3 3 3 3 3 3 3 3 3 3 ...
##  $ gyros_belt_x        : num  0.02 0.02 0.02 0.02 0.02 0.02 0.03 0.03 0.02 0.02 ...
##  $ gyros_belt_y        : num  0 0.02 0 0 0 0 0 0 0 0 ...
##  $ gyros_belt_z        : num  -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 0 -0.02 -0.02 0 ...
##  $ accel_belt_x        : num  -22 -21 -21 -22 -22 -20 -21 -21 -22 -22 ...
##  $ accel_belt_y        : num  4 2 4 3 4 2 4 2 2 4 ...
##  $ accel_belt_z        : num  22 24 21 21 21 24 22 23 23 21 ...
##  $ magnet_belt_x       : num  -7 -6 0 -4 -2 1 -3 -5 -2 -3 ...
##  $ magnet_belt_y       : num  608 600 603 599 603 602 609 596 602 606 ...
##  $ magnet_belt_z       : num  -311 -302 -312 -311 -313 -312 -308 -317 -319 -309 ...
##  $ roll_arm            : num  -128 -128 -128 -128 -128 -128 -128 -128 -128 -128 ...
##  $ pitch_arm           : num  22.5 22.1 22 21.9 21.8 21.7 21.6 21.5 21.5 21.4 ...
##  $ yaw_arm             : num  -161 -161 -161 -161 -161 -161 -161 -161 -161 -161 ...
##  $ total_accel_arm     : num  34 34 34 34 34 34 34 34 34 34 ...
##  $ gyros_arm_x         : num  0.02 0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 ...
##  $ gyros_arm_y         : num  -0.02 -0.03 -0.03 -0.03 -0.02 -0.03 -0.03 -0.03 -0.03 -0.02 ...
##  $ gyros_arm_z         : num  -0.02 0 0 0 0 -0.02 -0.02 0 0 -0.02 ...
##  $ accel_arm_x         : num  -290 -289 -289 -289 -289 -288 -288 -290 -288 -287 ...
##  $ accel_arm_y         : num  110 111 111 111 111 109 110 110 111 111 ...
##  $ accel_arm_z         : num  -125 -123 -122 -125 -124 -122 -124 -123 -123 -124 ...
##  $ magnet_arm_x        : num  -369 -374 -369 -373 -372 -369 -376 -366 -363 -372 ...
##  $ magnet_arm_y        : num  337 337 342 336 338 341 334 339 343 338 ...
##  $ magnet_arm_z        : num  513 506 513 509 510 518 516 509 520 509 ...
##  $ roll_dumbbell       : num  13.1 13.4 13.4 13.1 12.8 ...
##  $ pitch_dumbbell      : num  -70.6 -70.4 -70.8 -70.2 -70.3 ...
##  $ yaw_dumbbell        : num  -84.7 -84.9 -84.5 -85.1 -85.1 ...
##  $ total_accel_dumbbell: num  37 37 37 37 37 37 37 37 37 37 ...
##  $ gyros_dumbbell_x    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ gyros_dumbbell_y    : num  -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ...
##  $ gyros_dumbbell_z    : num  0 0 0 0 0 0 0 0 0 -0.02 ...
##  $ accel_dumbbell_x    : num  -233 -233 -234 -232 -234 -232 -235 -233 -233 -234 ...
##  $ accel_dumbbell_y    : num  47 48 48 47 46 47 48 47 47 48 ...
##  $ accel_dumbbell_z    : num  -269 -270 -269 -270 -272 -269 -270 -269 -270 -269 ...
##  $ magnet_dumbbell_x   : num  -555 -554 -558 -551 -555 -549 -558 -564 -554 -552 ...
##  $ magnet_dumbbell_y   : num  296 292 294 295 300 292 291 299 291 302 ...
##  $ magnet_dumbbell_z   : num  -64 -68 -66 -70 -74 -65 -69 -64 -65 -69 ...
##  $ roll_forearm        : num  28.3 28 27.9 27.9 27.8 27.7 27.7 27.6 27.5 27.2 ...
##  $ pitch_forearm       : num  -63.9 -63.9 -63.9 -63.9 -63.8 -63.8 -63.8 -63.8 -63.8 -63.9 ...
##  $ yaw_forearm         : num  -153 -152 -152 -152 -152 -152 -152 -152 -152 -151 ...
##  $ total_accel_forearm : num  36 36 36 36 36 36 36 36 36 36 ...
##  $ gyros_forearm_x     : num  0.02 0.02 0.02 0.02 0.02 0.03 0.02 0.02 0.02 0 ...
##  $ gyros_forearm_y     : num  0 0 -0.02 0 -0.02 0 0 -0.02 0.02 0 ...
##  $ gyros_forearm_z     : num  -0.02 -0.02 -0.03 -0.02 0 -0.02 -0.02 -0.02 -0.03 -0.03 ...
##  $ accel_forearm_x     : num  192 189 193 195 193 193 190 193 191 193 ...
##  $ accel_forearm_y     : num  203 206 203 205 205 204 205 205 203 205 ...
##  $ accel_forearm_z     : num  -216 -214 -215 -215 -213 -214 -215 -214 -215 -215 ...
##  $ magnet_forearm_x    : num  -18 -17 -9 -18 -9 -16 -22 -17 -11 -15 ...
##  $ magnet_forearm_y    : num  661 655 660 659 660 653 656 657 657 655 ...
##  $ magnet_forearm_z    : num  473 473 478 470 474 476 473 465 478 472 ...
##  $ classe              : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...

Plotting Features Grouped by User

The clean training data still includes nearly 15,000 observations. The feature plots below compare the primary features (excluding raw sensor output) for the four sensors on belt, arm, dumbbell, and forearm; a sketch of one panel follows the list. No linear relationships appear, though several categorical splits, especially by user, suggest a tree or random forest model will yield useful predictions.

  • Belt Features
    plot of chunk plotBelt
  • Arm Features
    plot of chunk plotArm
  • Dumbbell Features
    plot of chunk plotDbell
  • Forearm Features
    plot of chunk plotFore
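
The plotting chunks themselves are not shown; below is a sketch of one belt panel with hypothetical aesthetics, using the qplot interface already loaded:

# Sketch of the belt feature plot: orientation measures by class,
# colored by user (exact aesthetics of chunk plotBelt assumed)
qplot(roll_belt, pitch_belt, data = trn, colour = user_name,
      facets = . ~ classe, alpha = I(0.3),
      main = "Belt Features by Class and User")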

Modeling Trees

Evaluating the predictive power of trees shows first that a standard tree suffers from poor accuracy; the high level of noise apparent in the plots produces numerous errors. Bagging improves performance dramatically by averaging multiple trees. Note: both the standard and bagged tree models used four-fold cross-validation (a sketch of both fits follows the results below).

The clear significance of the user_name feature warns of over-fitting; accuracy would drop for new users without calibration. The error is difficult to estimate, though the marked difference between standard and bagged trees indicates it could be large, since an individual user may deviate widely from the “average” tree.

  • Standard Tree (rpart)

    ## Loading required package: rpart
    ##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
    ##         0.4981         0.3445         0.4900         0.5062         0.2843 
    ## AccuracyPValue  McnemarPValue 
    ##         0.0000            NaN
  • Bagged Tree (treebag)

    ## Loading required package: ipred
    ## Loading required package: plyr
    ##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
    ##         0.9995         0.9994         0.9990         0.9998         0.2843 
    ## AccuracyPValue  McnemarPValue 
    ##         0.0000            NaN
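
The training chunks are not shown; below is a minimal sketch of both fits, assuming caret’s rpart and treebag methods with the four-fold cross-validation noted above:

# Fit a standard tree and a bagged tree with four-fold cross-validation
cvCtrl <- trainControl(method = "cv", number = 4)
rpartModel   <- train(classe ~ ., data = trn, method = "rpart",
                      trControl = cvCtrl)
treebagModel <- train(classe ~ ., data = trn, method = "treebag",
                      trControl = cvCtrl)
# Accuracy summaries in the format printed above
print(confusionMatrix(predict(treebagModel, trn), trn$classe)$overall)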

Comparing to Random Forest

Bagged trees create a highly (>99%) accurate model. Yet random forest proves even better, predicting the training data without error. Note: cross-validation used the out-of-bag estimate (four repeats) given its improved performance over K-fold for random forest models.

The model certainly over-fits to the given users. Real-world use would require calibration to individual users, so the added processing time over bagged trees represents a minimal cost.

## Loading required package: randomForest
## Warning: package 'randomForest' was built under R version 3.1.1
## randomForest 4.6-10
## Type rfNews() to see new features/changes/bug fixes.
## $table
##           Reference
## Prediction    A    B    C    D    E
##          A 4185    0    0    0    0
##          B    0 2848    0    0    0
##          C    0    0 2567    0    0
##          D    0    0    0 2412    0
##          E    0    0    0    0 2706
## 
## $overall
##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
##         1.0000         1.0000         0.9997         1.0000         0.2843 
## AccuracyPValue  McnemarPValue 
##         0.0000            NaN
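
One way to quantify the new-user error warned of above is leave-one-subject-out validation; the sketch below is hypothetical and was not part of the original analysis:

# Hypothetical leave-one-subject-out check: train without one user
# (and without user_name as a predictor), then predict that user
heldOut <- "adelmo"  # example user; repeat for each of the six
louoFit <- train(classe ~ . - user_name,
                 data = trn[trn$user_name != heldOut, ],
                 method = "rf", trControl = trainControl(method = "oob"))
louoPred <- predict(louoFit, trn[trn$user_name == heldOut, ])
confusionMatrix(louoPred, trn$classe[trn$user_name == heldOut])$overall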

Determining Next Steps

The final model will use a random forest, given the insights gained from exploration:

  • All features show non-linear or weakly-linear (i.e. high variance) relationships.
  • Tree modeling performed poorly without bagging.
  • Model-based approaches might better fit individual users’ tendencies toward specific mistakes. However, the training dataset lacks prior probabilities as each class of error was created intentionally.
  • The random forest model produced slightly better accuracy than bagged trees with little added cost considering the need to calibrate each user.

Establish Final Model

Note: Out-of-bag cross-validation increased to ten repeats.

# Set control for train function
# Uses out-of-bag cross-validation
tr <- trainControl(method = "oob", number = 10, allowParallel = TRUE)

# Create random forest final model
model <- train(classe ~ ., data = trn, method = "rf", trControl = tr)

# Print final model
print(model)
## Random Forest 
## 
## 14718 samples
##    53 predictors
##     5 classes: 'A', 'B', 'C', 'D', 'E' 
## 
## No pre-processing
## Resampling: Out of Bag Resampling 
## 
## Summary of sample sizes:  
## 
## Resampling results across tuning parameters:
## 
##   mtry  Accuracy  Kappa
##   2     1         1    
##   30    1         1    
##   60    1         1    
## 
## Accuracy was used to select the optimal model using  the largest value.
## The final value used for the model was mtry = 29.

Test Against Control Data

The final model yields 99.4% accuracy when predicting against a previously unused control subset of the overall training dataset; the drop from the training accuracy estimates the out-of-sample error. As noted previously, the real-world error for new users will increase without calibration. The graph below visualizes model fit by counting the results for each reference-prediction pair (log10 scaling compensates for the low error rate). Large counts fall along the diagonal of matched reference-prediction pairs; the off-diagonal points reveal which classes suffer prediction overlap.

# Predict control results and print confusion matrix
ctlPred <- predict(model, ctl)
print(confusionMatrix(ctlPred, ctl$classe))
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    A    B    C    D    E
##          A 1390    5    0    0    0
##          B    3  939    4    0    0
##          C    2    5  849    4    1
##          D    0    0    2  800    1
##          E    0    0    0    0  899
## 
## Overall Statistics
##                                         
##                Accuracy : 0.994         
##                  95% CI : (0.992, 0.996)
##     No Information Rate : 0.284         
##     P-Value [Acc > NIR] : <2e-16        
##                                         
##                   Kappa : 0.993         
##  Mcnemar's Test P-Value : NA            
## 
## Statistics by Class:
## 
##                      Class: A Class: B Class: C Class: D Class: E
## Sensitivity             0.996    0.989    0.993    0.995    0.998
## Specificity             0.999    0.998    0.997    0.999    1.000
## Pos Pred Value          0.996    0.993    0.986    0.996    1.000
## Neg Pred Value          0.999    0.997    0.999    0.999    1.000
## Prevalence              0.284    0.194    0.174    0.164    0.184
## Detection Rate          0.283    0.191    0.173    0.163    0.183
## Detection Prevalence    0.284    0.193    0.176    0.164    0.183
## Balanced Accuracy       0.997    0.994    0.995    0.997    0.999
# Plot accuracy with point size = log10(count)
ctlPlot <- ddply(data.frame(ctl, ctlPred), .(user_name, classe, ctlPred),
                 summarize, count = length(ctlPred))
plot <- qplot(classe, ctlPred, data = ctlPlot, size = log10(count),
              ylab = "prediction", main = "Test Against Control")
plot + geom_abline(slope = 1, intercept = 0, linetype = "dotted")

plot of chunk modelTest

# Cleanup variables
rm(ctlPred, ctlPlot, plot)

Conclusion

The final model predicts the error class for standing dumbbell curls with high (99.4%) accuracy. More impressive, it does so using instantaneous measures rather than windowed statistical summaries, so users could receive immediate feedback on the quality of their exercises.

New users would need to calibrate their measures by performing each error class under testing conditions. Though limiting, as is the processing time for the random forest model, this still provides an opportunity for a sensor-based approach to teaching proper weight-lifting form.