Human activity recognition (HAR) has gained popularity in recent years with the introduction of wearable devices, such as the Jawbone, that produce large amounts of machine-generated data. Prediction algorithms commonly attempt to determine what type of activity is being performed, e.g. standing, walking, or running. In this analysis, however, we looked at a single activity carried out in five different ways: one correct and four incorrect. To do so, we used a published study and its underlying data set, which can be found here: (http://groupware.les.inf.puc-rio.br/har). The data collection process is summarized in the quote below.
“Six young health participants were asked to perform one set of 10 repetitions of the unilateral dumbbell biceps curl in five different fashions: exactly according to the specification (Class A), throwing the elbows to the front (Class B), lifting the dumbbell only halfway (Class C), lowering the dumbbell only halfway (Class D) and throwing the hips to the front (Class E).” [1]
These ‘fashions’ are the target of the prediction algorithm; put simply, based on the movements recorded by the accelerometers used in the study, in which of the five ways was the participant executing the biceps curl?
The final model uses principal component analysis (PCA) and a random forest algorithm to predict each outcome. It achieves an out-of-sample accuracy of 96.2% on a held-out portion of the training data.
To carry out this analysis, we need to load the caret, rattle, and randomForest packages (they have been installed previously, along with their dependencies). Additionally, two external R scripts are used to work with physical files; more information on these can be found in the Appendix.
suppressMessages(library(caret))
suppressMessages(library(rattle))
suppressMessages(library(randomForest))
source('download data set.R')
source('submit files.R')
Using the scripts sourced above, we load the data into data frames and replace errant data points, in particular empty strings and divide-by-zero errors, with NA values. Additionally, the first several attributes appear to be for informational purposes only; the code below drops the first six columns from both the training and testing sets and keeps column 7 (num_window) onward.
training.set <- read.csv('source data/training set.csv',na.strings=c('NA','DIV/0!',''))[,7:160]
testing.set <- read.csv('source data/testing set.csv',na.strings=c('NA','DIV/0!',''))[,7:160]
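The effect of the na.strings argument can be seen on a small inline example. This is only an illustrative sketch: the CSV text and the column names (id, reading, ratio) are made up and are not taken from the study's data.

```r
# Toy CSV text standing in for the real file (values are made up).
csv.text <- "id,reading,ratio
1,0.5,DIV/0!
2,,1.2
3,NA,0.7"

# Same na.strings as above: 'NA', 'DIV/0!' and the empty string all become NA.
toy <- read.csv(text = csv.text, na.strings = c('NA', 'DIV/0!', ''))

is.na(toy)          # logical matrix marking the missing cells
sapply(toy, class)  # the columns are read as numbers, not as text
```

Because the markers are converted on read, the affected columns are parsed as numbers rather than as character data.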
To estimate out-of-sample error in later stages, we split the training set further into two working sets: a functional training set and a held-out set on which we will test our model fit, which we will call the probe set. The former receives 75% of the observations and the latter 25%.
set.seed(19801104)
inTrain = createDataPartition(training.set$classe,p=0.75,list = FALSE)
Once we set the partition and apply it to each set, you can see the dimensional breakdown of the observations and variables below.
probe.set = training.set [-inTrain, ]
dim(probe.set)
## [1] 4904 154
training.set = training.set [inTrain, ]
dim(training.set)
## [1] 14718 154
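The idea behind a class-stratified 75/25 split can be sketched in base R. This is an approximation for illustration, not caret's implementation of createDataPartition; the built-in iris data stands in for the training set and Species plays the role of classe.

```r
set.seed(19801104)

# Species stands in for the classe outcome.
labels <- iris$Species

# Within each class, sample 75% of the row indices so that both parts
# keep roughly the same class proportions as the full set.
in.train <- unlist(lapply(split(seq_along(labels), labels),
                          function(idx) sample(idx, size = floor(0.75 * length(idx)))))

train.part <- iris[in.train, ]
probe.part <- iris[-in.train, ]

table(train.part$Species)  # 37 rows per class
table(probe.part$Species)  # 13 rows per class
```

Sampling within each class, rather than across the whole set, keeps the class balance of the held-out part close to that of the training part.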
A cursory review of the training set shows that many attributes have missing values. Since missing values do not work well with prediction models, we have the option of imputing the data or removing the affected variables entirely. Since there are over 150 variables in the set, we will start by removing them. This leaves 53 predictors, plus the classe outcome, to work with during the modeling process.
complete.vector <- apply(!is.na(training.set),2,sum) >= nrow(training.set)
training.set <- training.set[,complete.vector]
probe.set <- probe.set[,complete.vector]
testing.set <- testing.set[,complete.vector]
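The column filter can be checked on a toy data frame. The sketch below uses colSums(), which should select the same columns as the apply() form above; the data frame and its column names are made up for illustration.

```r
toy <- data.frame(a = c(1, 2, 3),
                  b = c(NA, 5, 6),    # one missing value
                  c = c(7, 8, 9),
                  d = c(NA, NA, NA))  # entirely missing

# TRUE for columns with no missing values.
complete.cols <- colSums(is.na(toy)) == 0

toy.clean <- toy[, complete.cols, drop = FALSE]
names(toy.clean)  # "a" "c"
```

The mask is computed once and can then be applied to any other data frame with the same columns, which is how the probe and testing sets are kept aligned with the training set.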
Two other cleaning exercises that we conducted were removing variables with little to no variance across the set, using the nearZeroVar() function, and paring down the variables that shared a very high (95%+) correlation. Neither of these methods changed the training set in any significant way.
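Since the code for these two checks is not shown, here is a rough base R sketch of the same ideas on made-up data. It is a crude stand-in, not the caret implementation: it drops only exactly constant columns, and for each highly correlated pair it drops the later column. The variable names (x1 to x4) are hypothetical.

```r
set.seed(1)
x1 <- rnorm(100)
toy <- data.frame(x1 = x1,
                  x2 = x1 + rnorm(100, sd = 0.01),  # nearly a copy of x1
                  x3 = rnorm(100),
                  x4 = rep(1, 100))                 # constant column

# Step 1: drop columns with zero variance.
keep <- sapply(toy, var) > 0
toy <- toy[, keep, drop = FALSE]

# Step 2: flag any column whose absolute correlation with an earlier
# column exceeds 0.95, then drop the flagged columns.
cors <- abs(cor(toy))
cors[!upper.tri(cors)] <- 0
redundant <- apply(cors > 0.95, 2, any)

toy.reduced <- toy[, !redundant, drop = FALSE]
names(toy.reduced)  # "x1" "x3"
```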
We can take a quick look at the data using the common str function. The set contains both integer and double precision numbers. There are also three-dimensional data representations, i.e. x, y, and z axis data, with both positive and negative values.
str(training.set, list.len=20)
## 'data.frame': 14718 obs. of 54 variables:
## $ num_window : int 11 11 11 12 12 12 12 12 12 12 ...
## $ roll_belt : num 1.41 1.41 1.42 1.48 1.48 1.45 1.42 1.42 1.43 1.45 ...
## $ pitch_belt : num 8.07 8.07 8.07 8.05 8.07 8.06 8.09 8.13 8.16 8.17 ...
## $ yaw_belt : num -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
## $ total_accel_belt : int 3 3 3 3 3 3 3 3 3 3 ...
## $ gyros_belt_x : num 0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 0.03 ...
## $ gyros_belt_y : num 0 0 0 0 0.02 0 0 0 0 0 ...
## $ gyros_belt_z : num -0.02 -0.02 -0.02 -0.03 -0.02 -0.02 -0.02 -0.02 -0.02 0 ...
## $ accel_belt_x : int -21 -22 -20 -22 -21 -21 -22 -22 -20 -21 ...
## $ accel_belt_y : int 4 4 5 3 2 4 3 4 2 4 ...
## $ accel_belt_z : int 22 22 23 21 24 21 21 21 24 22 ...
## $ magnet_belt_x : int -3 -7 -2 -6 -6 0 -4 -2 1 -3 ...
## $ magnet_belt_y : int 599 608 600 604 600 603 599 603 602 609 ...
## $ magnet_belt_z : int -313 -311 -305 -310 -302 -312 -311 -313 -312 -308 ...
## $ roll_arm : num -128 -128 -128 -128 -128 -128 -128 -128 -128 -128 ...
## $ pitch_arm : num 22.5 22.5 22.5 22.1 22.1 22 21.9 21.8 21.7 21.6 ...
## $ yaw_arm : num -161 -161 -161 -161 -161 -161 -161 -161 -161 -161 ...
## $ total_accel_arm : int 34 34 34 34 34 34 34 34 34 34 ...
## $ gyros_arm_x : num 0 0.02 0.02 0.02 0 0.02 0 0.02 0.02 0.02 ...
## $ gyros_arm_y : num 0 -0.02 -0.02 -0.03 -0.03 -0.03 -0.03 -0.02 -0.03 -0.03 ...
## [list output truncated]
To take a quick look at the importance of certain variables, we can fit a model using the rpart method and we see the roll_belt attribute seems to play a large part in classifying the movement. Other variables and their thresholds can be seen in the breakdown below.
ex.fit <- train(classe ~ .,method='rpart',data=training.set)
ex.fit$finalModel
## n= 14718
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 14718 10533 A (0.28 0.19 0.17 0.16 0.18)
## 2) roll_belt< 130.5 13475 9300 A (0.31 0.21 0.19 0.18 0.11)
## 4) pitch_forearm< -34.15 1165 7 A (0.99 0.006 0 0 0) *
## 5) pitch_forearm>=-34.15 12310 9293 A (0.25 0.23 0.21 0.2 0.12)
## 10) magnet_dumbbell_y< 438.5 10346 7391 A (0.29 0.18 0.24 0.19 0.11)
## 20) roll_forearm< 122.5 6433 3800 A (0.41 0.18 0.18 0.17 0.06) *
## 21) roll_forearm>=122.5 3913 2615 C (0.082 0.18 0.33 0.23 0.18) *
## 11) magnet_dumbbell_y>=438.5 1964 984 B (0.032 0.5 0.047 0.23 0.19) *
## 3) roll_belt>=130.5 1243 10 E (0.008 0 0 0 0.99) *
Since this tree uses only 4 of the 53 predictors (roll_belt, pitch_forearm, magnet_dumbbell_y, and roll_forearm) and no decision path reaches the D activity type, we will not use it for the prediction, but the decision paths can be seen in the following plot.
fancyRpartPlot(ex.fit$finalModel, main='', sub='')
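Which predictors a tree splits on, and which classes its terminal nodes can predict, can also be read from the fitted rpart object rather than from the printout. The sketch below fits a small tree on the built-in iris data as a stand-in; the same inspection should apply to any rpart classification tree, though that is an assumption for the caret-fitted model above.

```r
library(rpart)

tree <- rpart(Species ~ ., data = iris, method = 'class')

frame   <- tree$frame
is.leaf <- frame$var == '<leaf>'

# Predictors that appear in at least one split.
split.vars <- unique(as.character(frame$var[!is.leaf]))

# Class predicted at each terminal node; a class missing from this list
# can never be predicted by the tree.
leaf.classes <- attr(tree, 'ylevels')[frame$yval[is.leaf]]
unreachable  <- setdiff(attr(tree, 'ylevels'), leaf.classes)

split.vars
unreachable
```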
To trim the variables down to a more manageable number for a prediction algorithm, we conduct a principal component analysis to isolate the components, or usable features, that capture 80% of the variance of the training set.
pc <- preProcess(training.set[,-54],method='pca',thresh=0.80)
The rotation code below is not evaluated at this point in the publication, but its output can be viewed in the Appendix. The principal component analysis trimmed the features down to 13 from 53.
pc$rotation
At this point we are content with 13 components to use for predicting the classe outcome of the HAR weight lifting movement. Since we conducted the PCA on the training set, we now need to apply the same transformation to the probe and testing sets, as seen below.
training.pc <- predict(pc,training.set[,-54])
probe.pc <- predict(pc,probe.set[,-54])
testing.pc <- predict(pc,testing.set[,-54])
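The two steps above, choosing the number of components from a variance threshold and then projecting new data with the rotation learned on the training rows, can be sketched with base R's prcomp. This is an illustration of the technique, not of caret's internals; the iris measurements and the odd/even row split are arbitrary stand-ins.

```r
# iris measurements stand in for the predictors; the split is arbitrary.
train.x <- iris[seq(1, 150, by = 2), 1:4]
new.x   <- iris[seq(2, 150, by = 2), 1:4]

# Fit PCA on the training rows only, with centering and scaling.
pca <- prcomp(train.x, center = TRUE, scale. = TRUE)

# Smallest number of components whose cumulative variance reaches 80%.
cum.var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
n.comp  <- which(cum.var >= 0.80)[1]

# Project both sets with the centering, scaling and rotation learned
# from the training rows.
train.pc <- predict(pca, train.x)[, 1:n.comp, drop = FALSE]
new.pc   <- predict(pca, new.x)[, 1:n.comp, drop = FALSE]

dim(train.pc)
dim(new.pc)
```

Fitting the transformation on the training rows alone and reusing it elsewhere keeps information from the held-out rows out of the feature construction.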
We can now train the model on our PCA components using a random forest algorithm. We use the trainControl function in the caret package to set the resampling method to 3-fold cross validation across the training set. Then we fit the classe variable against all of the components.
# 3-fold cross validation; allowParallel is an option of trainControl
tc <- trainControl(method = 'cv', number = 3, allowParallel = TRUE)
fit <- train(training.set$classe ~ ., method='rf', data=training.pc, prox=TRUE, trControl=tc)
Fitting the model took more than 15 minutes on a 64-bit machine with 16 GB of RAM running R version 3.2.1 (2015-06-18).
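For readers unfamiliar with k-fold cross validation, the fold assignment can be sketched in base R. This is a simplified illustration with a made-up number of observations; it does not reproduce how caret builds its folds.

```r
set.seed(19801104)
n <- 12   # toy number of observations
k <- 3    # number of folds

# Randomly assign each observation to one of k folds of equal size.
fold.id <- sample(rep(1:k, length.out = n))

for (f in 1:k) {
  held.out <- which(fold.id == f)   # rows used to evaluate in this round
  fit.rows <- which(fold.id != f)   # rows used to fit in this round
  cat('fold', f, ': fit on', length(fit.rows),
      'rows, evaluate on', length(held.out), 'rows\n')
}
```

Each observation is held out exactly once, so every round evaluates the model on rows it was not fitted on.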
Finally, we get to predict based on our fitting. We can use the predict function with the probe set as an argument to conduct this process.
prediction <- predict(fit, newdata=probe.pc)
Now we can compare the actual classe values of the probe set with our predictions. As seen below, the model is very effective, with an accuracy of 96.2% and a Kappa of 0.95. The algorithm had the most difficulty distinguishing between lowering the dumbbell only halfway (D) and lifting the dumbbell only halfway (C): 32 class D observations were predicted as C, and 17 class C observations were predicted as D. The estimated out-of-sample error rate is approximately 3.8%. The confusion matrix below gives a complete breakdown of the predictions.
confusionMatrix(prediction,probe.set$classe)
## Confusion Matrix and Statistics
##
## Reference
## Prediction A B C D E
## A 1364 23 2 8 2
## B 8 889 7 1 9
## C 12 30 829 32 11
## D 11 4 17 761 6
## E 0 3 0 2 873
##
## Overall Statistics
##
## Accuracy : 0.9617
## 95% CI : (0.9559, 0.9669)
## No Information Rate : 0.2845
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9515
## Mcnemar's Test P-Value : 5.85e-08
##
## Statistics by Class:
##
## Class: A Class: B Class: C Class: D Class: E
## Sensitivity 0.9778 0.9368 0.9696 0.9465 0.9689
## Specificity 0.9900 0.9937 0.9790 0.9907 0.9988
## Pos Pred Value 0.9750 0.9726 0.9070 0.9524 0.9943
## Neg Pred Value 0.9912 0.9850 0.9935 0.9895 0.9930
## Prevalence 0.2845 0.1935 0.1743 0.1639 0.1837
## Detection Rate 0.2781 0.1813 0.1690 0.1552 0.1780
## Detection Prevalence 0.2853 0.1864 0.1864 0.1629 0.1790
## Balanced Accuracy 0.9839 0.9652 0.9743 0.9686 0.9838
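The headline statistics can be recomputed directly from the counts in the confusion matrix above, which is a useful sanity check on how accuracy, Kappa, and sensitivity are defined. The sketch below re-enters the printed counts by hand; the variable names are illustrative.

```r
# Counts copied from the confusion matrix above (rows: prediction, columns: reference).
cm <- matrix(c(1364,  23,   2,   8,   2,
                  8, 889,   7,   1,   9,
                 12,  30, 829,  32,  11,
                 11,   4,  17, 761,   6,
                  0,   3,   0,   2, 873),
             nrow = 5, byrow = TRUE,
             dimnames = list(Prediction = LETTERS[1:5], Reference = LETTERS[1:5]))

n        <- sum(cm)
accuracy <- sum(diag(cm)) / n                       # share of correct predictions

# Agreement expected by chance, from the row and column totals.
expected   <- sum(rowSums(cm) * colSums(cm)) / n^2
kappa.stat <- (accuracy - expected) / (1 - expected)

# Per-class sensitivity: correct predictions divided by actual class count.
sensitivity <- diag(cm) / colSums(cm)

round(c(accuracy = accuracy, kappa = kappa.stat), 4)
round(sensitivity, 4)
```

The results agree with the printed Accuracy (0.9617), Kappa (0.9515), and the Sensitivity row.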
Since the model achieved a high degree of accuracy on the probe set, we can apply it to the testing set. The model predicted each of the 20 testing observations correctly.
submission <- predict(fit, newdata=testing.pc)
submission
## [1] B A B A A E D B A A B C B A E E A B B B
## Levels: A B C D E
write.files(submission)
[1] Velloso, E.; Bulling, A.; Gellersen, H.; Ugulino, W.; Fuks, H. Qualitative Activity Recognition of Weight Lifting Exercises. Proceedings of 4th International Conference in Cooperation with SIGCHI (Augmented Human ’13). Stuttgart, Germany: ACM SIGCHI, 2013.
The write.files function used for the submission is below.
write.files
## function (x)
## {
## if (!file.exists("submission")) {
## dir.create("submission")
## }
## n = length(x)
## for (i in 1:n) {
## filename = paste0("submission/problem_id_", i, ".txt")
## write.table(x[i], file = filename, quote = FALSE, row.names = FALSE,
## col.names = FALSE)
## }
## }
The download.data function downloads the source files and places them in the "source data" directory for future use; files that already exist are not downloaded again.
download.data
## function ()
## {
## if (!file.exists("source data")) {
## dir.create("source data")
## }
## files.to.download <- c("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv",
## "https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv")
## downloaded.names <- c("source data/training set.csv", "source data/testing set.csv")
## if (!file.exists(downloaded.names[1])) {
## download.file(files.to.download[1], downloaded.names[1])
## }
## if (!file.exists(downloaded.names[2])) {
## download.file(files.to.download[2], downloaded.names[2])
## }
## }
The rotation (loadings) matrix from the PCA is shown below.
pc$rotation
## PC1 PC2 PC3 PC4
## num_window -0.001515166 0.086317921 -0.036561264 0.0049134648
## roll_belt 0.309932532 0.123942248 0.073588096 0.0300701270
## pitch_belt 0.018809255 -0.290227640 0.067703633 0.0390786792
## yaw_belt 0.205504620 0.247600035 0.023369826 0.0012374190
## total_accel_belt 0.306137622 0.104602974 0.094907773 0.0344491112
## gyros_belt_x -0.090213224 0.194894626 -0.194757477 -0.0506853014
## gyros_belt_y 0.108422834 0.207266083 -0.081822431 -0.0326313087
## gyros_belt_z -0.180011701 0.050109439 -0.105636514 -0.0434435507
## accel_belt_x -0.003821971 0.292576354 -0.084911708 -0.0420619917
## accel_belt_y 0.317527084 0.033564433 0.100341963 0.0421890276
## accel_belt_z -0.318458626 -0.099498416 -0.069094989 -0.0301119034
## magnet_belt_x 0.021666686 0.287205834 -0.041668093 -0.0265398846
## magnet_belt_y -0.114942862 0.092142106 0.077087803 0.0174579563
## magnet_belt_z -0.058278819 0.121963825 0.064728995 0.0161372265
## roll_arm -0.064470385 -0.178173624 -0.061913217 -0.0338815517
## pitch_arm -0.035095795 0.067724393 0.227982009 0.0750860013
## yaw_arm -0.051202735 -0.115138016 -0.006113475 -0.0137723350
## total_accel_arm -0.114617079 -0.035185401 -0.056671778 -0.0006619857
## gyros_arm_x 0.012543019 0.050455877 -0.004717400 -0.0026798463
## gyros_arm_y -0.077011776 -0.078852675 0.009125065 0.0021565267
## gyros_arm_z 0.161428947 0.181301810 -0.069976101 -0.0131299725
## accel_arm_x 0.158786776 -0.111097885 -0.170344753 -0.0599335586
## accel_arm_y -0.269738536 -0.120631233 0.130518299 0.0245398851
## accel_arm_z 0.127961638 -0.007458285 0.271389922 0.0731778569
## magnet_arm_x 0.090117564 -0.011571803 -0.262971370 -0.0884078615
## magnet_arm_y -0.063424057 0.026635045 0.364951986 0.1060749172
## magnet_arm_z -0.030299313 0.026543964 0.303710637 0.0898413362
## roll_dumbbell -0.083524723 0.128537990 -0.054020436 -0.0286185710
## pitch_dumbbell 0.106499480 -0.151971277 -0.091862729 -0.0412797396
## yaw_dumbbell 0.119670160 -0.267409179 -0.006868518 0.0128361416
## total_accel_dumbbell -0.165605819 0.149702695 0.120853616 0.0444714458
## gyros_dumbbell_x 0.006562720 -0.010302427 0.145820838 -0.4385066042
## gyros_dumbbell_y -0.001579318 0.040753116 -0.103248304 0.3567188293
## gyros_dumbbell_z -0.003407479 0.005450333 -0.129445937 0.4409974201
## accel_dumbbell_x 0.167226818 -0.142816697 -0.130226047 -0.0593193612
## accel_dumbbell_y -0.178461141 0.183060863 0.005869431 -0.0031800766
## accel_dumbbell_z 0.150221546 -0.249543441 -0.071988020 0.0036454690
## magnet_dumbbell_x 0.164472522 -0.199028870 0.142418476 0.0425225454
## magnet_dumbbell_y -0.142043876 0.174690956 -0.203350765 -0.0557548340
## magnet_dumbbell_z -0.170114564 -0.019619599 -0.183430357 -0.0775241643
## roll_forearm -0.066148227 -0.043533128 0.146150319 0.0318182014
## pitch_forearm 0.142821365 -0.105723553 -0.098357017 -0.0259962971
## yaw_forearm -0.111585458 -0.036035204 0.125760761 0.0310787233
## total_accel_forearm 0.005024961 0.094358703 -0.002329106 0.0216630970
## gyros_forearm_x 0.075570477 0.187836129 0.110524291 -0.1737474430
## gyros_forearm_y -0.001053306 0.019261106 -0.118282146 0.4064005602
## gyros_forearm_z -0.005199553 0.022682774 -0.133503491 0.4416806697
## accel_forearm_x -0.195058409 -0.086016290 0.124321245 0.0165625886
## accel_forearm_y -0.033326405 0.091654811 0.112690337 0.0397596657
## accel_forearm_z 0.034182659 0.039489929 -0.211827108 -0.0737363625
## magnet_forearm_x -0.105975154 -0.008363535 -0.005320828 -0.0093667109
## magnet_forearm_y -0.023724659 0.052583768 0.132309480 0.0506625765
## magnet_forearm_z 0.039027136 0.110233405 0.193557919 0.0678966086
## PC5 PC6 PC7 PC8
## num_window 0.0203727077 -0.111631920 0.094142872 -0.036216711
## roll_belt -0.0096571428 0.018352353 -0.039432646 0.083884658
## pitch_belt 0.1023817114 -0.181637213 0.123313900 0.036131281
## yaw_belt -0.0536365304 0.121814723 -0.097654999 0.039805335
## total_accel_belt 0.0127879293 0.017608199 -0.039381588 0.092225374
## gyros_belt_x -0.1020815060 -0.168962973 0.072591515 -0.049125061
## gyros_belt_y -0.0645398857 -0.150578879 0.071446782 0.038519648
## gyros_belt_z -0.0520777026 -0.117984320 0.145080907 -0.011210653
## accel_belt_x -0.1237945333 0.166819610 -0.101745216 -0.018293861
## accel_belt_y 0.0235715151 -0.033770063 -0.006648498 0.086540813
## accel_belt_z 0.0262097234 -0.020615578 0.033593667 -0.088877324
## magnet_belt_x -0.1179160845 0.191263556 -0.109418740 0.018925116
## magnet_belt_y 0.2659673279 -0.157425684 0.009977979 -0.144792038
## magnet_belt_z 0.2375174958 -0.177698997 0.042682116 -0.191912918
## roll_arm -0.0875672694 0.215690508 -0.013494982 -0.100094603
## pitch_arm -0.2001290062 -0.037119024 -0.051756989 -0.024437807
## yaw_arm -0.1097355910 0.131566044 -0.010710117 -0.054054665
## total_accel_arm 0.0717616434 0.022630257 0.042508204 0.333408812
## gyros_arm_x -0.0224070857 0.041680838 0.507725871 -0.058733211
## gyros_arm_y 0.0003956997 -0.053138206 -0.482489895 0.048913838
## gyros_arm_z -0.0197942318 -0.081808478 0.218380411 0.043127994
## accel_arm_x 0.2693864932 0.211798877 -0.055982256 -0.017944204
## accel_arm_y -0.1234551066 0.002827910 -0.024926729 -0.135574477
## accel_arm_z -0.1749333524 -0.055531781 -0.101420509 -0.264883998
## magnet_arm_x 0.2456289505 0.101623568 -0.061049826 -0.142715396
## magnet_arm_y -0.2037442849 -0.105513824 0.001293123 -0.036743809
## magnet_arm_z -0.2775440250 -0.154634804 -0.007013398 -0.197559026
## roll_dumbbell 0.0775007887 0.060457456 -0.148788014 -0.303258105
## pitch_dumbbell -0.0790745421 0.077165855 -0.058185140 -0.279255194
## yaw_dumbbell -0.0185444153 -0.027352421 0.072516412 -0.059551016
## total_accel_dumbbell 0.1555338434 0.123263984 -0.140148832 0.007406071
## gyros_dumbbell_x 0.0317940933 -0.009179094 -0.001889065 0.027637815
## gyros_dumbbell_y 0.0222653056 -0.015222921 -0.007315763 0.044766867
## gyros_dumbbell_z -0.0222212490 0.020611718 -0.017093001 -0.026215624
## accel_dumbbell_x -0.1651455418 0.046481009 0.003868674 -0.185493647
## accel_dumbbell_y 0.1377367022 0.091151979 -0.143932066 -0.188031620
## accel_dumbbell_z -0.1442639138 0.009399519 0.078652073 -0.052954295
## magnet_dumbbell_x 0.0579061261 0.203250696 -0.109550934 -0.043557365
## magnet_dumbbell_y -0.0473040824 -0.212194231 0.065570730 -0.124703098
## magnet_dumbbell_z -0.2459563587 0.223431544 -0.047404306 -0.025250937
## roll_forearm 0.1845583189 0.139922107 0.062096680 -0.002335470
## pitch_forearm 0.0735519537 -0.096299922 -0.075168578 -0.268677968
## yaw_forearm 0.0522901073 0.276555611 0.149271055 -0.065799570
## total_accel_forearm -0.0276796441 0.193692458 0.014183708 -0.241263065
## gyros_forearm_x -0.0321999646 0.137028174 -0.074368773 -0.032806631
## gyros_forearm_y 0.0108029738 0.026915146 -0.027862165 -0.002082380
## gyros_forearm_z -0.0210675418 0.065919052 -0.027704478 -0.019809536
## accel_forearm_x 0.0175674978 0.182813601 -0.013448421 0.274186173
## accel_forearm_y -0.0048261112 0.396092561 0.204119740 -0.048982660
## accel_forearm_z -0.3321988373 0.069757459 0.076876529 0.051761309
## magnet_forearm_x -0.1029667615 -0.003138387 -0.163385857 0.367933698
## magnet_forearm_y 0.0224134066 0.215021648 0.366165392 -0.010914197
## magnet_forearm_z 0.3311417216 -0.007477590 0.041748117 -0.026986884
## PC9 PC10 PC11 PC12
## num_window 0.1161545695 -0.001164462 0.1626110597 -0.297892172
## roll_belt -0.0141255521 -0.006496791 0.0074239124 -0.043830004
## pitch_belt 0.0158630597 -0.038235330 -0.0156771030 -0.158769527
## yaw_belt -0.0191961189 0.014075700 0.0119659251 0.114682924
## total_accel_belt -0.0178043404 0.001932436 0.0059901418 -0.078548739
## gyros_belt_x 0.0473841913 -0.119854754 0.0621661617 -0.053379175
## gyros_belt_y 0.0773306132 -0.062455189 0.1716700924 0.048899921
## gyros_belt_z 0.1432438368 0.042058289 0.1369597997 0.094666938
## accel_belt_x 0.0004421643 0.033512169 0.0229281280 0.097135836
## accel_belt_y -0.0217913443 -0.014884285 -0.0002520096 -0.062374465
## accel_belt_z 0.0067861469 0.012712114 -0.0094395979 0.065546998
## magnet_belt_x -0.0105783776 0.066345398 -0.0340877339 0.035059362
## magnet_belt_y -0.0848906163 -0.032219059 0.0476531089 0.312783862
## magnet_belt_z -0.0851412040 -0.128820029 0.0291447524 0.414219770
## roll_arm 0.0366661626 0.153787229 -0.0218567245 0.241437898
## pitch_arm -0.0538201340 -0.003909172 -0.2743107117 0.033327819
## yaw_arm 0.0658939420 0.232984468 0.1146903873 0.029711371
## total_accel_arm -0.1685809786 -0.253590881 -0.4075203757 0.006972761
## gyros_arm_x -0.3117720923 0.313904562 -0.0758334786 0.014435276
## gyros_arm_y 0.3155274546 -0.304850641 0.0446780829 -0.018634162
## gyros_arm_z -0.1371858545 0.029910410 -0.0299960892 -0.018402075
## accel_arm_x 0.0319289688 0.175803565 0.0365423041 0.018969676
## accel_arm_y 0.0089416509 0.134737935 0.0214743192 -0.050168936
## accel_arm_z 0.0799224803 0.171613831 0.2026963699 0.008864144
## magnet_arm_x 0.1086337352 0.214567101 0.1950016534 0.006626609
## magnet_arm_y -0.1019123074 0.016007202 -0.0468486362 -0.041715067
## magnet_arm_z 0.0343354902 0.057505236 0.1260097124 -0.002079011
## roll_dumbbell -0.3464481309 -0.161969795 0.0177273783 -0.232850747
## pitch_dumbbell -0.3452715131 -0.277528695 0.0913022488 -0.029892963
## yaw_dumbbell -0.0738550024 -0.160081598 0.0620498318 0.013436839
## total_accel_dumbbell -0.0907999217 0.167275401 -0.1316459181 -0.259450244
## gyros_dumbbell_x 0.0073714710 -0.004555360 -0.0348300167 -0.050155389
## gyros_dumbbell_y -0.0007061687 -0.031374727 0.0103924097 -0.134962928
## gyros_dumbbell_z -0.0087841135 0.020826797 0.0342535959 0.065018602
## accel_dumbbell_x -0.2103419263 -0.239825838 0.1282220783 0.109235617
## accel_dumbbell_y -0.2208640625 -0.005968677 -0.0214571531 -0.272730878
## accel_dumbbell_z -0.0448242035 -0.142147093 0.0561108552 0.040081024
## magnet_dumbbell_x -0.1837457267 0.008335362 -0.0785644675 -0.109628779
## magnet_dumbbell_y -0.0422292988 -0.147094989 0.0996190080 -0.152939503
## magnet_dumbbell_z -0.0192511522 0.066473498 -0.0209922997 -0.111596832
## roll_forearm 0.0307497965 0.019004342 0.2179700573 -0.155730073
## pitch_forearm 0.1215378227 0.168536529 -0.1120613219 -0.122697053
## yaw_forearm 0.1384429502 -0.064450026 -0.0654792082 0.041336358
## total_accel_forearm 0.0938707130 -0.142594204 -0.2935395835 0.231787151
## gyros_forearm_x -0.0077172174 -0.021954566 0.0545556429 0.155768506
## gyros_forearm_y -0.0121097874 0.015063529 -0.0041239422 0.035395100
## gyros_forearm_z -0.0076471226 0.024331231 0.0145594157 0.057378555
## accel_forearm_x -0.1795227959 -0.047673469 0.3244636357 0.055583374
## accel_forearm_y 0.1595441057 -0.237022389 0.1568346787 0.025108522
## accel_forearm_z 0.0666748826 -0.024181581 0.0459917673 -0.182749674
## magnet_forearm_x -0.3380435407 0.062916178 0.3809554314 0.156510383
## magnet_forearm_y 0.2315197128 -0.324826071 0.0971905170 -0.152208677
## magnet_forearm_z -0.0109772928 -0.082070982 0.2104930610 -0.068496464
## PC13
## num_window 0.514473353
## roll_belt 0.038224697
## pitch_belt -0.040475917
## yaw_belt 0.074527271
## total_accel_belt 0.044235882
## gyros_belt_x 0.013712116
## gyros_belt_y 0.005043368
## gyros_belt_z -0.144660767
## accel_belt_x 0.039069510
## accel_belt_y 0.058952290
## accel_belt_z -0.013951092
## magnet_belt_x 0.086307600
## magnet_belt_y 0.259921928
## magnet_belt_z 0.236394809
## roll_arm 0.282088530
## pitch_arm 0.092496361
## yaw_arm 0.431038970
## total_accel_arm 0.223311645
## gyros_arm_x -0.068157609
## gyros_arm_y 0.054202698
## gyros_arm_z 0.071577167
## accel_arm_x -0.068866312
## accel_arm_y -0.052025198
## accel_arm_z -0.113587810
## magnet_arm_x -0.150272531
## magnet_arm_y 0.056922408
## magnet_arm_z -0.067365940
## roll_dumbbell -0.099897061
## pitch_dumbbell 0.007358848
## yaw_dumbbell 0.052848904
## total_accel_dumbbell 0.033853800
## gyros_dumbbell_x 0.016992299
## gyros_dumbbell_y -0.061459035
## gyros_dumbbell_z -0.003764849
## accel_dumbbell_x 0.089016556
## accel_dumbbell_y 0.047474381
## accel_dumbbell_z 0.068455080
## magnet_dumbbell_x 0.025437795
## magnet_dumbbell_y -0.019034726
## magnet_dumbbell_z 0.074173441
## roll_forearm 0.145467023
## pitch_forearm 0.127475805
## yaw_forearm 0.101967211
## total_accel_forearm -0.253534411
## gyros_forearm_x 0.002256876
## gyros_forearm_y 0.053324367
## gyros_forearm_z 0.026745996
## accel_forearm_x -0.079963119
## accel_forearm_y -0.022462339
## accel_forearm_z 0.119998166
## magnet_forearm_x -0.041083258
## magnet_forearm_y -0.044545968
## magnet_forearm_z 0.014792678
A summary of the tidy training set, produced with the summary function, is shown below.
summary(training.set)
## num_window roll_belt pitch_belt yaw_belt
## Min. : 1 Min. :-28.90 Min. :-55.8000 Min. :-180.00
## 1st Qu.:222 1st Qu.: 1.10 1st Qu.: 1.7900 1st Qu.: -88.30
## Median :424 Median :114.00 Median : 5.2900 Median : -12.60
## Mean :431 Mean : 64.45 Mean : 0.3205 Mean : -11.14
## 3rd Qu.:645 3rd Qu.:123.00 3rd Qu.: 15.0000 3rd Qu.: 13.07
## Max. :864 Max. :162.00 Max. : 60.3000 Max. : 179.00
## total_accel_belt gyros_belt_x gyros_belt_y gyros_belt_z
## Min. : 0.00 Min. :-1.04000 Min. :-0.64000 Min. :-1.4600
## 1st Qu.: 3.00 1st Qu.:-0.03000 1st Qu.: 0.00000 1st Qu.:-0.2000
## Median :17.00 Median : 0.03000 Median : 0.02000 Median :-0.1000
## Mean :11.31 Mean :-0.00506 Mean : 0.03962 Mean :-0.1308
## 3rd Qu.:18.00 3rd Qu.: 0.11000 3rd Qu.: 0.11000 3rd Qu.:-0.0200
## Max. :29.00 Max. : 2.22000 Max. : 0.64000 Max. : 1.6100
## accel_belt_x accel_belt_y accel_belt_z magnet_belt_x
## Min. :-120.000 Min. :-69.00 Min. :-275.00 Min. :-49.00
## 1st Qu.: -21.000 1st Qu.: 3.00 1st Qu.:-162.00 1st Qu.: 9.00
## Median : -15.000 Median : 35.00 Median :-152.00 Median : 34.00
## Mean : -5.649 Mean : 30.21 Mean : -72.67 Mean : 55.43
## 3rd Qu.: -5.000 3rd Qu.: 61.00 3rd Qu.: 27.00 3rd Qu.: 59.00
## Max. : 83.000 Max. :164.00 Max. : 105.00 Max. :485.00
## magnet_belt_y magnet_belt_z roll_arm pitch_arm
## Min. :359.0 Min. :-623.0 Min. :-180.0 Min. :-88.800
## 1st Qu.:581.0 1st Qu.:-375.0 1st Qu.: -32.3 1st Qu.:-25.800
## Median :601.0 Median :-320.0 Median : 0.0 Median : 0.000
## Mean :593.7 Mean :-345.4 Mean : 17.7 Mean : -4.673
## 3rd Qu.:610.0 3rd Qu.:-306.0 3rd Qu.: 77.2 3rd Qu.: 11.100
## Max. :673.0 Max. : 293.0 Max. : 180.0 Max. : 88.500
## yaw_arm total_accel_arm gyros_arm_x gyros_arm_y
## Min. :-180.000 Min. : 1.00 Min. :-6.37000 Min. :-3.3700
## 1st Qu.: -43.000 1st Qu.:17.00 1st Qu.:-1.35000 1st Qu.:-0.8000
## Median : 0.000 Median :27.00 Median : 0.08000 Median :-0.2400
## Mean : -0.883 Mean :25.42 Mean : 0.04471 Mean :-0.2585
## 3rd Qu.: 45.600 3rd Qu.:33.00 3rd Qu.: 1.59000 3rd Qu.: 0.1400
## Max. : 180.000 Max. :66.00 Max. : 4.87000 Max. : 2.8400
## gyros_arm_z accel_arm_x accel_arm_y accel_arm_z
## Min. :-2.3300 Min. :-371.00 Min. :-318.00 Min. :-636.00
## 1st Qu.:-0.0700 1st Qu.:-240.00 1st Qu.: -55.00 1st Qu.:-144.00
## Median : 0.2300 Median : -40.00 Median : 14.00 Median : -47.00
## Mean : 0.2703 Mean : -59.12 Mean : 32.12 Mean : -71.45
## 3rd Qu.: 0.7200 3rd Qu.: 84.00 3rd Qu.: 138.00 3rd Qu.: 23.00
## Max. : 3.0200 Max. : 435.00 Max. : 303.00 Max. : 292.00
## magnet_arm_x magnet_arm_y magnet_arm_z roll_dumbbell
## Min. :-584.0 Min. :-392.0 Min. :-597 Min. :-153.71
## 1st Qu.:-296.0 1st Qu.: -11.0 1st Qu.: 126 1st Qu.: -17.55
## Median : 297.0 Median : 199.0 Median : 442 Median : 48.12
## Mean : 195.8 Mean : 155.3 Mean : 305 Mean : 24.13
## 3rd Qu.: 639.0 3rd Qu.: 322.0 3rd Qu.: 545 3rd Qu.: 68.11
## Max. : 782.0 Max. : 583.0 Max. : 694 Max. : 153.55
## pitch_dumbbell yaw_dumbbell total_accel_dumbbell
## Min. :-149.59 Min. :-150.871 Min. : 0.00
## 1st Qu.: -40.49 1st Qu.: -77.495 1st Qu.: 4.00
## Median : -20.65 Median : -3.588 Median :10.00
## Mean : -10.55 Mean : 1.721 Mean :13.68
## 3rd Qu.: 17.73 3rd Qu.: 79.337 3rd Qu.:19.00
## Max. : 149.40 Max. : 154.952 Max. :58.00
## gyros_dumbbell_x gyros_dumbbell_y gyros_dumbbell_z
## Min. :-204.0000 Min. :-2.1000 Min. : -2.3800
## 1st Qu.: -0.0300 1st Qu.:-0.1400 1st Qu.: -0.3100
## Median : 0.1300 Median : 0.0500 Median : -0.1300
## Mean : 0.1555 Mean : 0.0503 Mean : -0.1221
## 3rd Qu.: 0.3500 3rd Qu.: 0.2100 3rd Qu.: 0.0300
## Max. : 2.2200 Max. :52.0000 Max. :317.0000
## accel_dumbbell_x accel_dumbbell_y accel_dumbbell_z magnet_dumbbell_x
## Min. :-419.00 Min. :-189.00 Min. :-334.00 Min. :-639
## 1st Qu.: -50.00 1st Qu.: -8.00 1st Qu.:-142.00 1st Qu.:-535
## Median : -8.00 Median : 41.00 Median : -1.00 Median :-479
## Mean : -28.28 Mean : 52.66 Mean : -38.27 Mean :-328
## 3rd Qu.: 11.00 3rd Qu.: 110.00 3rd Qu.: 38.00 3rd Qu.:-301
## Max. : 234.00 Max. : 315.00 Max. : 318.00 Max. : 584
## magnet_dumbbell_y magnet_dumbbell_z roll_forearm pitch_forearm
## Min. :-3600.0 Min. :-262.00 Min. :-180.000 Min. :-72.500
## 1st Qu.: 232.0 1st Qu.: -45.00 1st Qu.: -1.018 1st Qu.: 0.000
## Median : 312.0 Median : 13.00 Median : 21.500 Median : 9.475
## Mean : 222.4 Mean : 45.98 Mean : 33.542 Mean : 10.829
## 3rd Qu.: 392.0 3rd Qu.: 95.75 3rd Qu.: 140.000 3rd Qu.: 28.400
## Max. : 633.0 Max. : 452.00 Max. : 180.000 Max. : 89.800
## yaw_forearm total_accel_forearm gyros_forearm_x
## Min. :-180.00 Min. : 0.00 Min. :-22.0000
## 1st Qu.: -69.80 1st Qu.: 29.00 1st Qu.: -0.2200
## Median : 0.00 Median : 36.00 Median : 0.0500
## Mean : 18.67 Mean : 34.72 Mean : 0.1572
## 3rd Qu.: 110.00 3rd Qu.: 41.00 3rd Qu.: 0.5800
## Max. : 180.00 Max. :108.00 Max. : 3.9700
## gyros_forearm_y gyros_forearm_z accel_forearm_x accel_forearm_y
## Min. : -7.02000 Min. : -8.0900 Min. :-498.00 Min. :-595.0
## 1st Qu.: -1.48000 1st Qu.: -0.1800 1st Qu.:-179.00 1st Qu.: 55.0
## Median : 0.03000 Median : 0.0800 Median : -57.00 Median : 199.0
## Mean : 0.08483 Mean : 0.1523 Mean : -62.36 Mean : 162.2
## 3rd Qu.: 1.65000 3rd Qu.: 0.4900 3rd Qu.: 76.00 3rd Qu.: 312.0
## Max. :311.00000 Max. :231.0000 Max. : 477.00 Max. : 923.0
## accel_forearm_z magnet_forearm_x magnet_forearm_y magnet_forearm_z
## Min. :-446.00 Min. :-1280.0 Min. :-896.0 Min. :-973.0
## 1st Qu.:-182.00 1st Qu.: -616.0 1st Qu.: -5.0 1st Qu.: 183.0
## Median : -39.00 Median : -376.0 Median : 587.0 Median : 509.0
## Mean : -55.65 Mean : -312.8 Mean : 376.5 Mean : 392.7
## 3rd Qu.: 26.00 3rd Qu.: -77.0 3rd Qu.: 736.0 3rd Qu.: 653.0
## Max. : 291.00 Max. : 672.0 Max. :1480.0 Max. :1090.0
## classe
## A:4185
## B:2848
## C:2567
## D:2412
## E:2706
##