The objective of this assignment is to compare two different classification algorithms using accuracy and performance metrics. I’m going to use KNN and Naive Bayes to predict the age of an abalone from its physical measurements. Abalone are shellfish that are popular to eat in many countries, especially raw in a sashimi spread. The rings attribute corresponds to an abalone’s age in years (after the abalone reaches 1 to 1.5 years of age). Determining an abalone’s age by counting rings is tedious and time consuming, so a classification model could be a useful substitute.
The data comes from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Abalone
Here are the attribute descriptions:
Sex / nominal / – / M, F, and I (infant)
Length / continuous / mm / longest shell measurement
Diameter / continuous / mm / perpendicular to length
Height / continuous / mm / with meat in shell
Whole weight / continuous / grams / whole abalone
Shucked weight / continuous / grams / weight of meat
Viscera weight / continuous / grams / gut weight (after bleeding)
Shell weight / continuous / grams / after being dried
Rings / integer / – / +1.5 gives the age in years
abalone <- read.csv(url("https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data"), header = FALSE, sep = ",")
colnames(abalone) <- c("sex", "length", "diameter", "height", "whole_weight", "shucked_weight", "viscera_weight", "shell_weight", "rings")
summary(abalone)
sex length diameter height
F:1307 Min. :0.075 Min. :0.0550 Min. :0.0000
I:1342 1st Qu.:0.450 1st Qu.:0.3500 1st Qu.:0.1150
M:1528 Median :0.545 Median :0.4250 Median :0.1400
Mean :0.524 Mean :0.4079 Mean :0.1395
3rd Qu.:0.615 3rd Qu.:0.4800 3rd Qu.:0.1650
Max. :0.815 Max. :0.6500 Max. :1.1300
whole_weight shucked_weight viscera_weight
Min. :0.0020 Min. :0.0010 Min. :0.0005
1st Qu.:0.4415 1st Qu.:0.1860 1st Qu.:0.0935
Median :0.7995 Median :0.3360 Median :0.1710
Mean :0.8287 Mean :0.3594 Mean :0.1806
3rd Qu.:1.1530 3rd Qu.:0.5020 3rd Qu.:0.2530
Max. :2.8255 Max. :1.4880 Max. :0.7600
shell_weight rings
Min. :0.0015 Min. : 1.000
1st Qu.:0.1300 1st Qu.: 8.000
Median :0.2340 Median : 9.000
Mean :0.2388 Mean : 9.934
3rd Qu.:0.3290 3rd Qu.:11.000
Max. :1.0050 Max. :29.000
str(abalone)
'data.frame': 4177 obs. of 9 variables:
$ sex : Factor w/ 3 levels "F","I","M": 3 3 1 3 2 2 1 1 3 1 ...
$ length : num 0.455 0.35 0.53 0.44 0.33 0.425 0.53 0.545 0.475 0.55 ...
$ diameter : num 0.365 0.265 0.42 0.365 0.255 0.3 0.415 0.425 0.37 0.44 ...
$ height : num 0.095 0.09 0.135 0.125 0.08 0.095 0.15 0.125 0.125 0.15 ...
$ whole_weight : num 0.514 0.226 0.677 0.516 0.205 ...
$ shucked_weight: num 0.2245 0.0995 0.2565 0.2155 0.0895 ...
$ viscera_weight: num 0.101 0.0485 0.1415 0.114 0.0395 ...
$ shell_weight : num 0.15 0.07 0.21 0.155 0.055 0.12 0.33 0.26 0.165 0.32 ...
$ rings : int 15 7 9 10 7 8 20 16 9 19 ...
summary(abalone$rings)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 8.000 9.000 9.934 11.000 29.000
As shown above, the “rings” variable ranges from 1 to 29. This is the variable we want to predict, and predicting that many levels might not give us the insight we’re looking for. I suspect that there’s an optimal age range for harvesting abalones for consumption. While I don’t know this age range, the project could be adjusted once a sought-after range is known. For now, we’ll break the rings variable into 3 levels: “young” for abalones with 8 rings or fewer, “adult” for 9–11 rings, and “old” for more than 11 rings.
abalone$rings <- as.numeric(abalone$rings)
abalone$rings <- cut(abalone$rings, br=c(-1,8,11,35), labels = c("young", 'adult', 'old'))
abalone$rings <- as.factor(abalone$rings)
summary(abalone$rings)
young adult old
1407 1810 960
I’m going to create a couple of different classification models, and then compare them using accuracy and performance metrics. I’ll start with a KNN classification algorithm. Because KNN requires all numeric variables for prediction, I’m going to remove the “sex” variable.
z <- abalone
z$sex <- NULL
I’ll now normalize the data using min-max normalization.
normalize <- function(x) {
return ((x - min(x)) / (max(x) - min(x)))
}
z[1:7] <- as.data.frame(lapply(z[1:7], normalize))
summary(z$shucked_weight)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000 0.1244 0.2253 0.2410 0.3369 1.0000
Now each variable has a min of 0 and a max of 1, which a quick base-R check can confirm:
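sapply(z[1:7], range)  # each column should now span exactly 0 to 1
We’ll now split the data into training and testing sets.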
ind <- sample(2, nrow(z), replace=TRUE, prob=c(0.7, 0.3))
KNNtrain <- z[ind==1,]
KNNtest <- z[ind==2,]
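One caveat: sample() makes this split random, so the exact train/test counts (and every downstream number) will vary slightly from run to run. Setting a seed beforehand would make the analysis reproducible; a minimal sketch (the value 123 is arbitrary):
set.seed(123)  # any fixed seed makes the split reproducible
ind <- sample(2, nrow(z), replace=TRUE, prob=c(0.7, 0.3))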
Now we run the model. I’m going to make k equal to the square root of the number of observations in the training set (sqrt(2918) ≈ 54).
library(class)
KNNpred <- knn(train = KNNtrain[1:7], test = KNNtest[1:7], cl = KNNtrain$rings, k = 54)
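Rather than hard-coding 54, the same rule of thumb can be computed from the training set itself; a minimal sketch, equivalent to the call above:
k_guess <- round(sqrt(nrow(KNNtrain)))  # square-root-of-n rule of thumb
k_guess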
Let’s see how the model does on the test data.
library("gmodels")
CrossTable(x = KNNtest$rings, y = KNNpred, prop.chisq = FALSE)
Cell Contents
|-------------------------|
| N |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 1276
| KNNpred
KNNtest$rings | young | adult | old | Row Total |
--------------|-----------|-----------|-----------|-----------|
young | 325 | 93 | 0 | 418 |
| 0.778 | 0.222 | 0.000 | 0.328 |
| 0.747 | 0.135 | 0.000 | |
| 0.255 | 0.073 | 0.000 | |
--------------|-----------|-----------|-----------|-----------|
adult | 90 | 418 | 35 | 543 |
| 0.166 | 0.770 | 0.064 | 0.426 |
| 0.207 | 0.607 | 0.230 | |
| 0.071 | 0.328 | 0.027 | |
--------------|-----------|-----------|-----------|-----------|
old | 20 | 178 | 117 | 315 |
| 0.063 | 0.565 | 0.371 | 0.247 |
| 0.046 | 0.258 | 0.770 | |
| 0.016 | 0.139 | 0.092 | |
--------------|-----------|-----------|-----------|-----------|
Column Total | 435 | 689 | 152 | 1276 |
| 0.341 | 0.540 | 0.119 | |
--------------|-----------|-----------|-----------|-----------|
(325+418+117)/1276
[1] 0.6739812
This KNN classifier predicted the abalone age group with about 67% accuracy, likely not accurate enough for an abalone harvester to trust. Before moving on to more specific accuracy and performance tests, I’m going to try a smaller k value and see if it improves the accuracy.
library(class)
KNNpred <- knn(train = KNNtrain[1:7], test = KNNtest[1:7], cl = KNNtrain$rings, k = 10)
library("gmodels")
CrossTable(x = KNNtest$rings, y = KNNpred, prop.chisq = FALSE)
Cell Contents
|-------------------------|
| N |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 1276
| KNNpred
KNNtest$rings | young | adult | old | Row Total |
--------------|-----------|-----------|-----------|-----------|
young | 318 | 96 | 4 | 418 |
| 0.761 | 0.230 | 0.010 | 0.328 |
| 0.741 | 0.150 | 0.019 | |
| 0.249 | 0.075 | 0.003 | |
--------------|-----------|-----------|-----------|-----------|
adult | 94 | 382 | 67 | 543 |
| 0.173 | 0.703 | 0.123 | 0.426 |
| 0.219 | 0.597 | 0.324 | |
| 0.074 | 0.299 | 0.053 | |
--------------|-----------|-----------|-----------|-----------|
old | 17 | 162 | 136 | 315 |
| 0.054 | 0.514 | 0.432 | 0.247 |
| 0.040 | 0.253 | 0.657 | |
| 0.013 | 0.127 | 0.107 | |
--------------|-----------|-----------|-----------|-----------|
Column Total | 429 | 640 | 207 | 1276 |
| 0.336 | 0.502 | 0.162 | |
--------------|-----------|-----------|-----------|-----------|
(318+382+136)/1276
[1] 0.6551724
This model has just about the same predictive power on the test set (65.5% vs. 67.4% accuracy). This can also be shown in the confusion matrix below:
library(caret)
confusionMatrix(KNNpred, KNNtest$rings)
Confusion Matrix and Statistics
Reference
Prediction young adult old
young 318 94 17
adult 96 382 162
old 4 67 136
Overall Statistics
Accuracy : 0.6552
95% CI : (0.6284, 0.6813)
No Information Rate : 0.4255
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.4581
Mcnemar's Test P-Value : 2.749e-10
Statistics by Class:
Class: young Class: adult Class: old
Sensitivity 0.7608 0.7035 0.4317
Specificity 0.8706 0.6480 0.9261
Pos Pred Value 0.7413 0.5969 0.6570
Neg Pred Value 0.8819 0.7469 0.8326
Prevalence 0.3276 0.4255 0.2469
Detection Rate 0.2492 0.2994 0.1066
Detection Prevalence 0.3362 0.5016 0.1622
Balanced Accuracy 0.8157 0.6758 0.6789
The misclassification rate is 1 minus the accuracy, shown below.
1-0.6552
[1] 0.3448
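As an aside before switching algorithms: rather than trying k values one at a time, we could sweep a range and compare test-set accuracy directly. A minimal sketch using the objects already defined (the range 1–59 is an arbitrary choice; note that tuning k against the test set leaks information, and the cross-validation further below does this more rigorously):
library(class)
ks <- seq(1, 59, by = 2)
acc <- sapply(ks, function(k) {
  pred <- knn(train = KNNtrain[1:7], test = KNNtest[1:7], cl = KNNtrain$rings, k = k)
  mean(pred == KNNtest$rings)  # proportion of correct test predictions
})
plot(ks, acc, type = "b", xlab = "k", ylab = "test accuracy")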
Let’s now create a Naive Bayes classifier for the same data.
NBtrain <- KNNtrain
NBtest <- KNNtest
library(e1071)
model <- naiveBayes(rings ~., data = NBtrain)
model
Naive Bayes Classifier for Discrete Predictors
Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)
A-priori probabilities:
Y
young adult old
0.3409169 0.4367459 0.2223371
Conditional probabilities:
length
Y [,1] [,2]
young 0.4716203 0.1490067
adult 0.6712654 0.1177351
old 0.6912424 0.1092433
diameter
Y [,1] [,2]
young 0.4507906 0.1502746
adult 0.6580688 0.1193139
old 0.6863657 0.1123703
height
Y [,1] [,2]
young 0.09493365 0.04008618
adult 0.13448953 0.02663204
old 0.14618234 0.02554316
whole_weight
Y [,1] [,2]
young 0.1544802 0.1083561
adult 0.3494055 0.1509557
old 0.3924864 0.1602648
shucked_weight
Y [,1] [,2]
young 0.1337806 0.09876195
adult 0.2970106 0.14060814
old 0.2962226 0.13836111
viscera_weight
Y [,1] [,2]
young 0.1241098 0.09026025
adult 0.2866922 0.12776408
old 0.3145439 0.13646942
shell_weight
Y [,1] [,2]
young 0.1206878 0.08019425
adult 0.2733141 0.10873391
old 0.3365036 0.13167884
pred <- predict(model, NBtest)
print(confusionMatrix(pred,NBtest$rings))
Confusion Matrix and Statistics
Reference
Prediction young adult old
young 336 126 47
adult 76 285 139
old 6 132 129
Overall Statistics
Accuracy : 0.5878
95% CI : (0.5602, 0.6149)
No Information Rate : 0.4255
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3667
Mcnemar's Test P-Value : 1.32e-09
Statistics by Class:
Class: young Class: adult Class: old
Sensitivity 0.8038 0.5249 0.4095
Specificity 0.7984 0.7067 0.8564
Pos Pred Value 0.6601 0.5700 0.4831
Neg Pred Value 0.8931 0.6675 0.8157
Prevalence 0.3276 0.4255 0.2469
Detection Rate 0.2633 0.2234 0.1011
Detection Prevalence 0.3989 0.3918 0.2092
Balanced Accuracy 0.8011 0.6158 0.6330
The accuracy of the Naive Bayes model on the test set is only about 59%, which makes the misclassification rate approximately 41%.
While it’s likely that neither algorithm is adequate for predicting the abalone age, the KNN model is more accurate so far.
Let’s try a bootstrapping method for further model evaluation.
library(caret)
train_control <- trainControl(method='boot', number = 100)
trModel <- train(rings~., data = z, trControl=train_control, method="nb")
Loading required package: MASS
(Output condensed: the fit produced many repeated warnings of the form “Numerical 0 probability for all classes with observation N”, which occur when a test point’s likelihood underflows to zero for every class.)
print(trModel)
Naive Bayes
4177 samples
7 predictor
3 classes: 'young', 'adult', 'old'
No pre-processing
Resampling: Bootstrapped (100 reps)
Summary of sample sizes: 4177, 4177, 4177, 4177, 4177, 4177, ...
Resampling results across tuning parameters:
usekernel Accuracy Kappa
FALSE 0.5810706 0.3529160
TRUE 0.6069231 0.3799894
Tuning parameter 'fL' was held constant at a value of 0
Tuning parameter 'adjust' was held constant at a value of 1
Accuracy was used to select the optimal model using the
largest value.
The final values used for the model were fL = 0, usekernel =
TRUE and adjust = 1.
trModel2 <- train(rings~., data = z, trControl=train_control, method="knn")
print(trModel2)
k-Nearest Neighbors
4177 samples
7 predictor
3 classes: 'young', 'adult', 'old'
No pre-processing
Resampling: Bootstrapped (100 reps)
Summary of sample sizes: 4177, 4177, 4177, 4177, 4177, 4177, ...
Resampling results across tuning parameters:
k Accuracy Kappa
5 0.6222626 0.4091962
7 0.6347705 0.4266619
9 0.6447592 0.4406755
Accuracy was used to select the optimal model using the
largest value.
The final value used for the model was k = 9.
The bootstrapping method indicates that the KNN model (64.5% accuracy at k = 9) might be slightly more accurate than Naive Bayes (60.7% with the kernel estimate) for classifying this data. It also estimated that the most effective k value for the KNN model would be 9; we used 10 in our model.
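Since both models were trained with caret under the same trainControl, their bootstrap resampling results can be compared side by side with resamples(). A sketch (for a strictly fair comparison, the two train() calls should share the same resampling indices, e.g. via trainControl’s index argument):
resamps <- resamples(list(NB = trModel, KNN = trModel2))
summary(resamps)  # accuracy and kappa distributions for both models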
Let’s now do 10-fold cross validation to evaluate the models.
control = trainControl(method="repeatedcv", number=10, repeats=3)
model5 <- train(rings~., data = KNNtrain, method = "knn", preProcess="scale", trControl=control)
model5
k-Nearest Neighbors
2901 samples
7 predictor
3 classes: 'young', 'adult', 'old'
Pre-processing: scaled (7)
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 2612, 2610, 2610, 2610, 2611, 2611, ...
Resampling results across tuning parameters:
k Accuracy Kappa
5 0.6564605 0.4565599
7 0.6689747 0.4743428
9 0.6789763 0.4889676
Accuracy was used to select the optimal model using the
largest value.
The final value used for the model was k = 9.
The 10-fold cross validation method indicates that the optimal model for KNN is one with k = 9 (same as what the bootstrap method predicted).
The 10-fold cross validation method indicates that the optimal model for Naive Bayes uses fL = 0, usekernel = TRUE and adjust = 1 (the same parameters the bootstrap selected).
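The Naive Bayes cross-validation run isn’t shown above; it would look something like this (a sketch mirroring the KNN call, with model6 as a hypothetical name):
model6 <- train(rings~., data = KNNtrain, method = "nb", preProcess="scale", trControl=control)
model6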
The cross-validation method confirms that the KNN method is more effective for this data set than Naive Bayes.
Let’s create the new models with the suggested parameters.
library(class)
KNNpred <- knn(train = KNNtrain[1:7], test = KNNtest[1:7], cl = KNNtrain$rings, k = 9)
library(caret)
confusionMatrix(KNNpred, KNNtest$rings)
Confusion Matrix and Statistics
Reference
Prediction young adult old
young 322 96 18
adult 90 375 164
old 6 72 133
Overall Statistics
Accuracy : 0.6505
95% CI : (0.6236, 0.6767)
No Information Rate : 0.4255
P-Value [Acc > NIR] : < 2e-16
Kappa : 0.4517
Mcnemar's Test P-Value : 3.9e-09
Statistics by Class:
Class: young Class: adult Class: old
Sensitivity 0.7703 0.6906 0.4222
Specificity 0.8671 0.6535 0.9188
Pos Pred Value 0.7385 0.5962 0.6303
Neg Pred Value 0.8857 0.7403 0.8291
Prevalence 0.3276 0.4255 0.2469
Detection Rate 0.2524 0.2939 0.1042
Detection Prevalence 0.3417 0.4929 0.1654
Balanced Accuracy 0.8187 0.6720 0.6705
With k = 9 the model was about 65% accurate in predicting the test data set.
library(e1071)
model <- naiveBayes(rings ~., data = NBtrain, fL = 0, usekernel = TRUE, adjust = 1)
model
Naive Bayes Classifier for Discrete Predictors
Call:
naiveBayes.default(x = X, y = Y, laplace = laplace, fL = 0, usekernel = TRUE,
adjust = 1)
A-priori probabilities:
Y
young adult old
0.3409169 0.4367459 0.2223371
Conditional probabilities:
length
Y [,1] [,2]
young 0.4716203 0.1490067
adult 0.6712654 0.1177351
old 0.6912424 0.1092433
diameter
Y [,1] [,2]
young 0.4507906 0.1502746
adult 0.6580688 0.1193139
old 0.6863657 0.1123703
height
Y [,1] [,2]
young 0.09493365 0.04008618
adult 0.13448953 0.02663204
old 0.14618234 0.02554316
whole_weight
Y [,1] [,2]
young 0.1544802 0.1083561
adult 0.3494055 0.1509557
old 0.3924864 0.1602648
shucked_weight
Y [,1] [,2]
young 0.1337806 0.09876195
adult 0.2970106 0.14060814
old 0.2962226 0.13836111
viscera_weight
Y [,1] [,2]
young 0.1241098 0.09026025
adult 0.2866922 0.12776408
old 0.3145439 0.13646942
shell_weight
Y [,1] [,2]
young 0.1206878 0.08019425
adult 0.2733141 0.10873391
old 0.3365036 0.13167884
pred <- predict(model, NBtest)
print(confusionMatrix(pred,NBtest$rings))
Confusion Matrix and Statistics
Reference
Prediction young adult old
young 336 126 47
adult 76 285 139
old 6 132 129
Overall Statistics
Accuracy : 0.5878
95% CI : (0.5602, 0.6149)
No Information Rate : 0.4255
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 0.3667
Mcnemar's Test P-Value : 1.32e-09
Statistics by Class:
Class: young Class: adult Class: old
Sensitivity 0.8038 0.5249 0.4095
Specificity 0.7984 0.7067 0.8564
Pos Pred Value 0.6601 0.5700 0.4831
Neg Pred Value 0.8931 0.6675 0.8157
Prevalence 0.3276 0.4255 0.2469
Detection Rate 0.2633 0.2234 0.1011
Detection Prevalence 0.3989 0.3918 0.2092
Balanced Accuracy 0.8011 0.6158 0.6330
With the parameters suggested by the 10-fold validation, the Naive Bayes model is again about 59% accurate. In fact, the output is identical to the first Naive Bayes model: e1071’s naiveBayes() only accepts a laplace smoothing argument, so fL, usekernel, and adjust (tuning parameters for caret’s "nb" method, implemented in the klaR package) are silently ignored here.
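To actually fit the kernel-density Naive Bayes that caret selected, we would need the klaR package (which caret’s "nb" method wraps) rather than e1071. A minimal sketch:
library(klaR)
nb_kernel <- NaiveBayes(rings ~ ., data = NBtrain, usekernel = TRUE, fL = 0)
nb_pred <- predict(nb_kernel, NBtest)$class  # predict() returns class labels and posteriors
mean(nb_pred == NBtest$rings)                # test-set accuracy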
The models trained with the 10-fold-validated parameters have almost exactly the same accuracy as the models I originally created when testing on the test data set. My concern with this comparison is that the parameters I originally used barely differed from the suggested ones in the first place (k = 10 vs. k = 9 for KNN).
control14 = trainControl(method="repeatedcv", number=10, repeats=3)
model7 <- train(rings~., data = KNNtrain, method = "rf", preProcess="scale", trControl=control14)
model7
Random Forest
2901 samples
7 predictor
3 classes: 'young', 'adult', 'old'
Pre-processing: scaled (7)
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 2611, 2611, 2611, 2612, 2611, 2610, ...
Resampling results across tuning parameters:
mtry Accuracy Kappa
2 0.6886180 0.5076597
4 0.6826480 0.4994915
7 0.6813857 0.4980003
Accuracy was used to select the optimal model using the
largest value.
The final value used for the model was mtry = 2.
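Before drawing conclusions, it’s worth a glance at which predictors the forest leans on; caret’s varImp() works directly on the fitted train object (a sketch):
varImp(model7)  # variable importance from the underlying randomForest fit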
As shown above, the optimal random forest model is expected to perform at about 69% accuracy. It seems the machine learning algorithms are having a difficult time learning enough from the abalone features to accurately predict the age of the abalone. I suspect that an abalone harvester would want a more accurate model before he or she could trust it in a commercial setting. Therefore, better data might be necessary.