Logan J Travis 2014-08-24
Can the growing abundance of small, inexpensive sensors determine exercise quality? Many consumer devices measure quantities such as number of steps, distance traveled, and exercise repetitions. Using random forest models, the same devices can predict exercise quality from instantaneous measurements with very high (>99%) accuracy.
This paper expands upon data and analyses provided by Groupware@LES to explore predictive models for exercise quality. Through exploration and comparison of multiple models, it concludes with a 99.4% accurate prediction algorithm for classifying standing dumbbell curls into five groups: one perfect execution and four common mistakes.
Download the training and test datasets to ./data/ if not present. The data are provided as part of the Johns Hopkins University Practical Machine Learning class on Coursera by Groupware@LES from their Human Activity Recognition research project.
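The check-and-download step might look like the following sketch (the URLs are assumed from the Coursera assignment; the original code may differ):

# Hypothetical sketch: download each file only if missing
# (URLs assumed from the course assignment page)
if(!file.exists("./data")) dir.create("./data")
urlBase <- "https://d396qusza40orc.cloudfront.net/predmachlearn/"
for(f in c("pml-training.csv", "pml-testing.csv")) {
    path <- paste0("./data/", f)
    if(!file.exists(path)) download.file(paste0(urlBase, f), path)
    print(paste("File found at", path))
}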
## Loading required package: lattice
## Loading required package: ggplot2
## [1] "File found at ./data/pml-training.csv"
## [1] "File found at ./data/pml-testing.csv"
The data collected by Groupware@LES includes six users, five classes (one perfect execution of the standing bicep curl exercise plus four common mistakes), and 152 measures. With over 19,000 observations, an initial exploration of the data provides needed insight into model development.
The paper submitted by Groupware@LES details their sliding window method in section 5.1 Feature Extraction and Selection. They calculated statistical measures for windows ranging from 0.5 to 2.5 seconds with 0.5 seconds of overlap (excluding the 0.5 second window).
However, the test data does not utilize such windows and includes only instantaneous measures. This does not align with the methodology explored in the paper, though it does match their ultimate goal: continuous feedback to users. The models developed in this experiment attempt to achieve that goal and therefore do not use the statistical measures as predictors.
Reading the training data while excluding the statistical measures yields 60 columns, including observation number, time/date/window, and class ("classe" in the dataset). After splitting 75/25 into training and control sets, the clean data structure reduces to 54 columns (a sketch of this step follows the listing below):
## 'data.frame': 14718 obs. of 54 variables:
## $ user_name : Factor w/ 6 levels "adelmo","carlitos",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ roll_belt : num 1.41 1.48 1.45 1.42 1.42 1.43 1.45 1.45 1.43 1.42 ...
## $ pitch_belt : num 8.07 8.07 8.06 8.09 8.13 8.16 8.17 8.18 8.18 8.2 ...
## $ yaw_belt : num -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
## $ total_accel_belt : num 3 3 3 3 3 3 3 3 3 3 ...
## $ gyros_belt_x : num 0.02 0.02 0.02 0.02 0.02 0.02 0.03 0.03 0.02 0.02 ...
## $ gyros_belt_y : num 0 0.02 0 0 0 0 0 0 0 0 ...
## $ gyros_belt_z : num -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 0 -0.02 -0.02 0 ...
## $ accel_belt_x : num -22 -21 -21 -22 -22 -20 -21 -21 -22 -22 ...
## $ accel_belt_y : num 4 2 4 3 4 2 4 2 2 4 ...
## $ accel_belt_z : num 22 24 21 21 21 24 22 23 23 21 ...
## $ magnet_belt_x : num -7 -6 0 -4 -2 1 -3 -5 -2 -3 ...
## $ magnet_belt_y : num 608 600 603 599 603 602 609 596 602 606 ...
## $ magnet_belt_z : num -311 -302 -312 -311 -313 -312 -308 -317 -319 -309 ...
## $ roll_arm : num -128 -128 -128 -128 -128 -128 -128 -128 -128 -128 ...
## $ pitch_arm : num 22.5 22.1 22 21.9 21.8 21.7 21.6 21.5 21.5 21.4 ...
## $ yaw_arm : num -161 -161 -161 -161 -161 -161 -161 -161 -161 -161 ...
## $ total_accel_arm : num 34 34 34 34 34 34 34 34 34 34 ...
## $ gyros_arm_x : num 0.02 0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 ...
## $ gyros_arm_y : num -0.02 -0.03 -0.03 -0.03 -0.02 -0.03 -0.03 -0.03 -0.03 -0.02 ...
## $ gyros_arm_z : num -0.02 0 0 0 0 -0.02 -0.02 0 0 -0.02 ...
## $ accel_arm_x : num -290 -289 -289 -289 -289 -288 -288 -290 -288 -287 ...
## $ accel_arm_y : num 110 111 111 111 111 109 110 110 111 111 ...
## $ accel_arm_z : num -125 -123 -122 -125 -124 -122 -124 -123 -123 -124 ...
## $ magnet_arm_x : num -369 -374 -369 -373 -372 -369 -376 -366 -363 -372 ...
## $ magnet_arm_y : num 337 337 342 336 338 341 334 339 343 338 ...
## $ magnet_arm_z : num 513 506 513 509 510 518 516 509 520 509 ...
## $ roll_dumbbell : num 13.1 13.4 13.4 13.1 12.8 ...
## $ pitch_dumbbell : num -70.6 -70.4 -70.8 -70.2 -70.3 ...
## $ yaw_dumbbell : num -84.7 -84.9 -84.5 -85.1 -85.1 ...
## $ total_accel_dumbbell: num 37 37 37 37 37 37 37 37 37 37 ...
## $ gyros_dumbbell_x : num 0 0 0 0 0 0 0 0 0 0 ...
## $ gyros_dumbbell_y : num -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ...
## $ gyros_dumbbell_z : num 0 0 0 0 0 0 0 0 0 -0.02 ...
## $ accel_dumbbell_x : num -233 -233 -234 -232 -234 -232 -235 -233 -233 -234 ...
## $ accel_dumbbell_y : num 47 48 48 47 46 47 48 47 47 48 ...
## $ accel_dumbbell_z : num -269 -270 -269 -270 -272 -269 -270 -269 -270 -269 ...
## $ magnet_dumbbell_x : num -555 -554 -558 -551 -555 -549 -558 -564 -554 -552 ...
## $ magnet_dumbbell_y : num 296 292 294 295 300 292 291 299 291 302 ...
## $ magnet_dumbbell_z : num -64 -68 -66 -70 -74 -65 -69 -64 -65 -69 ...
## $ roll_forearm : num 28.3 28 27.9 27.9 27.8 27.7 27.7 27.6 27.5 27.2 ...
## $ pitch_forearm : num -63.9 -63.9 -63.9 -63.9 -63.8 -63.8 -63.8 -63.8 -63.8 -63.9 ...
## $ yaw_forearm : num -153 -152 -152 -152 -152 -152 -152 -152 -152 -151 ...
## $ total_accel_forearm : num 36 36 36 36 36 36 36 36 36 36 ...
## $ gyros_forearm_x : num 0.02 0.02 0.02 0.02 0.02 0.03 0.02 0.02 0.02 0 ...
## $ gyros_forearm_y : num 0 0 -0.02 0 -0.02 0 0 -0.02 0.02 0 ...
## $ gyros_forearm_z : num -0.02 -0.02 -0.03 -0.02 0 -0.02 -0.02 -0.02 -0.03 -0.03 ...
## $ accel_forearm_x : num 192 189 193 195 193 193 190 193 191 193 ...
## $ accel_forearm_y : num 203 206 203 205 205 204 205 205 203 205 ...
## $ accel_forearm_z : num -216 -214 -215 -215 -213 -214 -215 -214 -215 -215 ...
## $ magnet_forearm_x : num -18 -17 -9 -18 -9 -16 -22 -17 -11 -15 ...
## $ magnet_forearm_y : num 661 655 660 659 660 653 656 657 657 655 ...
## $ magnet_forearm_z : num 473 473 478 470 474 476 473 465 478 472 ...
## $ classe : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
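The loading and splitting described above might have been implemented along these lines. This is a sketch only: the column-name patterns for the statistical summaries and bookkeeping fields are assumptions, though the trn and ctl variable names match the later chunks.

# Hypothetical sketch of loading and splitting the training data
library(caret)
raw <- read.csv("./data/pml-training.csv", na.strings = c("NA", "", "#DIV/0!"))
# Drop the sliding-window statistical summaries (assumed column prefixes)
stats <- grep("^(kurtosis|skewness|max|min|amplitude|avg|var|stddev)_", names(raw))
# Drop bookkeeping columns: observation number, timestamps, and windows
book <- grep("^(X$|raw_timestamp|cvtd_timestamp|new_window|num_window)", names(raw))
clean <- raw[, -c(stats, book)]
# Split 75/25 into training (trn) and control (ctl) sets
inTrn <- createDataPartition(clean$classe, p = 0.75, list = FALSE)
trn <- clean[inTrn, ]
ctl <- clean[-inTrn, ]
str(trn)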
The clean training data still includes nearly 15,000 observations. The feature plots below compare the primary features (excluding raw sensor output) for the four sensors on belt, arm, dumbbell, and forearm. No linear relationships appear, though several categorical splits - especially by user - suggest a tree or random forest model will yield useful predictions.
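One such plot might be produced with caret's featurePlot; a sketch for the belt sensor follows (the feature selection is an assumption):

# Hypothetical sketch: pairs plot of the primary belt features by class
featurePlot(x = trn[, c("roll_belt", "pitch_belt", "yaw_belt", "total_accel_belt")],
            y = trn$classe, plot = "pairs")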
Evaluating the predictive power of trees shows first that a standard tree suffers from poor accuracy: the high level of noise apparent in the graphs results in numerous errors. Bagging improves performance dramatically by averaging multiple trees. Note: both the standard and bagged tree models used four-fold cross-validation.
The clear significance of the user_name feature warns of over-fitting; accuracy would drop for new users without calibration. That error is difficult to estimate, though the marked difference between standard and bagged trees - since an individual user may deviate widely from the "average" tree - indicates it could be large.
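The two exploratory fits reported below might have been produced along these lines - a sketch under the assumption that both models were trained with caret against the trn set:

# Four-fold cross-validation for both tree models
tc <- trainControl(method = "cv", number = 4)
# Standard classification tree
fitRpart <- train(classe ~ ., data = trn, method = "rpart", trControl = tc)
print(confusionMatrix(predict(fitRpart, trn), trn$classe)$overall)
# Bagged tree
fitBag <- train(classe ~ ., data = trn, method = "treebag", trControl = tc)
print(confusionMatrix(predict(fitBag, trn), trn$classe)$overall)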
Standard Tree (rpart)
## Loading required package: rpart
## Accuracy Kappa AccuracyLower AccuracyUpper AccuracyNull
## 0.4981 0.3445 0.4900 0.5062 0.2843
## AccuracyPValue McnemarPValue
## 0.0000 NaN
Bagged Tree (treebag)
## Loading required package: ipred
## Loading required package: plyr
## Accuracy Kappa AccuracyLower AccuracyUpper AccuracyNull
## 0.9995 0.9994 0.9990 0.9998 0.2843
## AccuracyPValue McnemarPValue
## 0.0000 NaN
Bagged trees create a highly (>99%) accurate model. Yet random forest proves even better, yielding zero errors when predicting back against the training data. Note: cross-validation used out-of-bag resampling with four repeats, given its improved performance over K-fold for random forest models.
The model certainly over-fits for the given users; real-world use would require calibration to individual users regardless. Given that, the added processing time over bagged trees represents a minimal cost.
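A sketch of this exploratory random forest fit, again assuming the trn variable from the earlier chunks (the original code may differ):

# Out-of-bag resampling with four repeats for the exploratory random forest
tc <- trainControl(method = "oob", number = 4)
fitRf <- train(classe ~ ., data = trn, method = "rf", trControl = tc)
# In-sample confusion matrix table and overall statistics
cm <- confusionMatrix(predict(fitRf, trn), trn$classe)
print(cm[c("table", "overall")])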
## Loading required package: randomForest
## Warning: package 'randomForest' was built under R version 3.1.1
## randomForest 4.6-10
## Type rfNews() to see new features/changes/bug fixes.
## $table
## Reference
## Prediction A B C D E
## A 4185 0 0 0 0
## B 0 2848 0 0 0
## C 0 0 2567 0 0
## D 0 0 0 2412 0
## E 0 0 0 0 2706
##
## $overall
## Accuracy Kappa AccuracyLower AccuracyUpper AccuracyNull
## 1.0000 1.0000 0.9997 1.0000 0.2843
## AccuracyPValue McnemarPValue
## 0.0000 NaN
Given the insight gained from exploration, the final model runs a random forest:
Note: out-of-bag cross-validation increased to ten repeats.
# Set control for train function
# Uses out-of-bag cross-validation
tr <- trainControl(method = "oob", number = 10, allowParallel = TRUE)
# Create random forest final model
model <- train(classe ~ ., data = trn, method = "rf", trControl = tr)
# Print final model
print(model)
## Random Forest
##
## 14718 samples
## 53 predictors
## 5 classes: 'A', 'B', 'C', 'D', 'E'
##
## No pre-processing
## Resampling: Out of Bag Resampling
##
## Summary of sample sizes:
##
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa
## 2 1 1
## 30 1 1
## 60 1 1
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 29.
The final model yields a 99.4% accuracy when predicting against a previously unused control subset of the overall training dataset; the drop from the perfect in-sample fit estimates the out-of-sample error at roughly 0.6%. As noted previously, the real-world error when applied to new users will increase without calibration. The graph visualizes model fit by counting the results for each reference-prediction pair (log10 scaling compensates for the low error rate). Large counts fall along the diagonal of matched reference-prediction pairs; off-diagonal points reveal which classes suffer prediction overlap.
# Predict control results and print confusion matrix
ctlPred <- predict(model, ctl)
print(confusionMatrix(ctlPred, ctl$classe))
## Confusion Matrix and Statistics
##
## Reference
## Prediction A B C D E
## A 1390 5 0 0 0
## B 3 939 4 0 0
## C 2 5 849 4 1
## D 0 0 2 800 1
## E 0 0 0 0 899
##
## Overall Statistics
##
## Accuracy : 0.994
## 95% CI : (0.992, 0.996)
## No Information Rate : 0.284
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.993
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: A Class: B Class: C Class: D Class: E
## Sensitivity 0.996 0.989 0.993 0.995 0.998
## Specificity 0.999 0.998 0.997 0.999 1.000
## Pos Pred Value 0.996 0.993 0.986 0.996 1.000
## Neg Pred Value 0.999 0.997 0.999 0.999 1.000
## Prevalence 0.284 0.194 0.174 0.164 0.184
## Detection Rate 0.283 0.191 0.173 0.163 0.183
## Detection Prevalence 0.284 0.193 0.176 0.164 0.183
## Balanced Accuracy 0.997 0.994 0.995 0.997 0.999
# Plot accuracy with point size = log10(count)
ctlPlot <- ddply(data.frame(ctl, ctlPred), .(user_name, classe, ctlPred),
summarize, count = length(ctlPred))
plot <- qplot(classe, ctlPred, data = ctlPlot, size = log10(count),
ylab = "prediction", main = "Test Against Control")
plot + geom_abline(slope = 1, intercept = 0, linetype = "dotted")
# Cleanup variables
rm(ctlPred, ctlPlot, plot)
The final model predicts the error class for standing dumbbell curls with remarkable accuracy. Even more impressive, it does so using instantaneous measures rather than statistical summaries. Users could receive immediate feedback on the quality of their exercises.
New users would need to calibrate their measures by performing each error under testing conditions. Though limiting - as is the processing time for the random forest model - this still provides an opportunity for a sensor-based approach to teaching proper weight-lifting form.