Bayes' Theorem is a theorem of probability theory originally stated by the Reverend Thomas Bayes. It can be seen as a way of understanding how the probability that a theory is true is affected by a new piece of evidence.
Consider a classic cancer-screening example: 1% of the patients being screened actually have the cancer, the test correctly returns a positive result for 80% of patients who have it, and it also returns a (false) positive result for 9.6% of patients who do not. We can plug these values into Bayes' theorem to find the probability that a patient who tests positive really has cancer.
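With the hypothesis "the patient has cancer" and the evidence "a positive test result", the theorem reads:

$$P(\text{cancer} \mid +) = \frac{P(+ \mid \text{cancer})\,P(\text{cancer})}{P(+ \mid \text{cancer})\,P(\text{cancer}) + P(+ \mid \text{no cancer})\,P(\text{no cancer})} = \frac{0.80 \times 0.01}{0.80 \times 0.01 + 0.096 \times 0.99} \approx 0.078$$

The R code below carries out the same calculation: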
PRIOR_incidence <- .01  # P(cancer): 1% of patients have cancer
NO_cancer <- .99        # P(no cancer): the remaining 99%
TRUE_positive <- .80    # P(positive | cancer): the test's sensitivity
FALSE_positive <- .096  # P(positive | no cancer): the false positive rate
numerator <- TRUE_positive * PRIOR_incidence
denominator <- (TRUE_positive * PRIOR_incidence) + (FALSE_positive * NO_cancer)
POSTERIOR <- (numerator / denominator) * 100  # posterior probability, as a percentage
So the probability that a given patient actually has cancer given a positive test result is approximately 7.8%:
a <- signif(POSTERIOR, digits = 2)
sprintf("There is a %s percent chance that you have cancer given that your test result was positive.", a)
## [1] "There is a 7.8 percent chance that you have cancer given that your test result was positive."
NOTE: Some statisticians are uneasy about the widespread use of Naive Bayes classifiers, which they dub “Idiot’s Bayes”, because the “naive” assumption that all features are conditionally independent given the class is almost never true in the real world. However, the method has been shown to perform surprisingly well in a wide variety of contexts.
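To make that assumption explicit: for a class $C$ and features $x_1, \dots, x_n$, Naive Bayes treats each feature as conditionally independent of the others given the class, so the posterior factorizes as

$$P(C \mid x_1, \dots, x_n) \propto P(C)\,\prod_{i=1}^{n} P(x_i \mid C).$$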
We're using the well-known “iris” dataset in this example:
library("klaR")
library("caret")
library(dplyr)
#splitting independent and dependent variables
attach(iris)
x <- select(iris, -Species)
y <- Species
#(x = attributes, y = labels, 'nb' = tells trainer to use Naive Bayes, trainControl = tells train method to use cross-validation with 10 folds)
model = train(x, y, 'nb', trControl = trainControl(method = 'cv', number=10))
model
## Naive Bayes
##
## 150 samples
## 4 predictor
## 3 classes: 'setosa', 'versicolor', 'virginica'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
##
## Summary of sample sizes: 135, 135, 135, 135, 135, 135, ...
##
## Resampling results across tuning parameters:
##
## usekernel Accuracy Kappa Accuracy SD Kappa SD
## FALSE 0.9533333 0.93 0.05488484 0.08232726
## TRUE 0.9600000 0.94 0.04661373 0.06992059
##
## Tuning parameter 'fL' was held constant at a value of 0
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were fL = 0 and usekernel = TRUE.
Kappa is often a more informative measure of a classifier's performance than raw accuracy, because it corrects the observed accuracy for the agreement that would be expected by random chance.
A more thorough explanation is given here: http://stats.stackexchange.com/questions/82162/kappa-statistic-in-plain-english
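In symbols, writing $p_o$ for the observed accuracy and $p_e$ for the accuracy expected by chance, Cohen's Kappa is

$$\kappa = \frac{p_o - p_e}{1 - p_e}.$$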
# generate class predictions for the training data and compare them against the true labels
predictions <- predict(model$finalModel, x)
confusionMatrix(predictions$class, y)
## Confusion Matrix and Statistics
##
## Reference
## Prediction setosa versicolor virginica
## setosa 50 0 0
## versicolor 0 47 3
## virginica 0 3 47
##
## Overall Statistics
##
## Accuracy : 0.96
## 95% CI : (0.915, 0.9852)
## No Information Rate : 0.3333
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.94
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: setosa Class: versicolor Class: virginica
## Sensitivity 1.0000 0.9400 0.9400
## Specificity 1.0000 0.9700 0.9700
## Pos Pred Value 1.0000 0.9400 0.9400
## Neg Pred Value 1.0000 0.9700 0.9700
## Prevalence 0.3333 0.3333 0.3333
## Detection Rate 0.3333 0.3133 0.3133
## Detection Prevalence 0.3333 0.3333 0.3333
## Balanced Accuracy 1.0000 0.9550 0.9550
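As a quick check on the reported Kappa, we can recompute it from the confusion matrix above. This is a minimal sketch of Cohen's Kappa: the observed accuracy $p_o$ is corrected by the agreement $p_e$ expected from the row and column marginals alone.

# verify the reported Kappa (0.94) from the confusion matrix above
cm <- matrix(c(50,  0,  0,
                0, 47,  3,
                0,  3, 47),
             nrow = 3, byrow = TRUE)          # rows = predictions, columns = reference
n   <- sum(cm)
p_o <- sum(diag(cm)) / n                      # observed accuracy: 144/150 = 0.96
p_e <- sum(rowSums(cm) * colSums(cm)) / n^2   # chance agreement from the marginals: 1/3
(p_o - p_e) / (1 - p_e)                       # Kappa = 0.94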