Week 2 Discussion

Problem 2.21

Using Bayes’ Theorem, P(Ai|B) = P(B|Ai)P(Ai) / Σj P(B|Aj)P(Aj), we can calculate the probability that a student is able to construct a box plot, given that the student passed.

- Let B = the event that the student passed
- Let A1 = the event that the student is able to construct a box plot, and A2 = the event that the student is NOT able
- P(B|A1) = probability the student passed, given that the student is able to construct a box plot = 86% pass = .86
- P(A1) = 80% able to construct a box plot = .80
- P(B|A2) = probability the student passed, given that the student is NOT able to construct a box plot = 65% pass = .65
- P(A2) = 20% not able to construct a box plot = .20

P(A1|B) = P(B|A1)P(A1) / [P(B|A1)P(A1) + P(B|A2)P(A2)] = (.86)(.80) / [(.86)(.80) + (.65)(.20)] = .688 / .818 = .8411

So, the probability that a student is able to construct a box plot, given that the student passed, is .8411 or 84.11%.
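
The arithmetic above can be checked with a few lines of base R. This is only a minimal sketch: the helper name `bayes_posterior` and its argument names are made up for illustration and are not part of the original solution.

```r
# Hypothetical helper (not from the original post): posterior P(A_i | B)
# for a partition A_1..A_k, given priors P(A_i) and likelihoods P(B | A_i).
bayes_posterior <- function(prior, likelihood) {
  stopifnot(length(prior) == length(likelihood),
            isTRUE(all.equal(sum(prior), 1)))
  joint <- prior * likelihood   # P(A_i and B) for each i
  joint / sum(joint)            # divide by P(B), the sum of the joint terms
}

prior      <- c(able = 0.80, not_able = 0.20)   # P(A1), P(A2)
likelihood <- c(able = 0.86, not_able = 0.65)   # P(B | A1), P(B | A2)

posterior <- bayes_posterior(prior, likelihood)
round(posterior, 4)
```

With the values from the problem, the `able` entry should come out to .8411 and the `not_able` entry to .1589, and the two entries sum to 1.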

Tree Diagram

source("https://bioconductor.org/biocLite.R")
## Bioconductor version 3.6 (BiocInstaller 1.28.0), ?biocLite for help
biocLite("Rgraphviz")
## BioC_mirror: https://bioconductor.org
## Using Bioconductor 3.6 (BiocInstaller 1.28.0), R 3.4.3 (2017-11-30).
## Installing package(s) 'Rgraphviz'
## 
## The downloaded binary packages are in
##  /var/folders/4h/rxhrwjqx1hbftlm3jjpzr5780000gn/T//Rtmpz2DuXu/downloaded_packages
# R Conditional Probability Tree Diagram
 
# The Rgraphviz graphing package must be installed to do this
require("Rgraphviz")
## Loading required package: Rgraphviz
## Loading required package: graph
## Loading required package: BiocGenerics
## Loading required package: parallel
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, append, as.data.frame, cbind, colMeans,
##     colnames, colSums, do.call, duplicated, eval, evalq, Filter,
##     Find, get, grep, grepl, intersect, is.unsorted, lapply,
##     lengths, Map, mapply, match, mget, order, paste, pmax,
##     pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce,
##     rowMeans, rownames, rowSums, sapply, setdiff, sort, table,
##     tapply, union, unique, unsplit, which, which.max, which.min
## Loading required package: grid
# Change the three variables below to match your actual values
# These are the values that you can change for your own probability tree
# From these three values, other probabilities (e.g. prob(b)) will be calculated 
 
# Probability of a
a<-.80
 
# Probability (b | a)
bGivena<-.86
 
# Probability (b | not a)
bGivenNota<-.65
 
###################### Everything below here will be calculated
 
# Calculate the rest of the values based upon the 3 variables above
notbGivena<-1-bGivena
notA<-1-a
notbGivenNota<-1-bGivenNota
 
#Joint Probabilities of a and B, a and notb, nota and b, nota and notb
aANDb<-a*bGivena
aANDnotb<-a*notbGivena
notaANDb <- notA*bGivenNota
notaANDnotb <- notA*notbGivenNota
 
# Probability of B
b<- aANDb + notaANDb
notB <- 1-b
 
# Bayes' theorem - probability of A | B
# (a | b) = Prob (a AND b) / prob (b)
aGivenb <- aANDb / b
 
# These are the labels of the nodes on the graph
# To signify "Not A" - we use A' or A prime 
 
node1<-"P"
node2<-"A"
node3<-"A'"
node4<-"A&B"
node5<-"A&B'"
node6<-"A'&B"
node7<-"A'&B'"
nodeNames<-c(node1,node2,node3,node4, node5,node6, node7)
 
rEG <- new("graphNEL", nodes=nodeNames, edgemode="directed")
#Erase any existing plots

 
# Draw the "lines" or "branches" of the probability Tree
rEG <- addEdge(nodeNames[1], nodeNames[2], rEG, 1)
rEG <- addEdge(nodeNames[1], nodeNames[3], rEG, 1)
rEG <- addEdge(nodeNames[2], nodeNames[4], rEG, 1)
rEG <- addEdge(nodeNames[2], nodeNames[5], rEG, 1)
rEG <- addEdge(nodeNames[3], nodeNames[6], rEG, 1)
rEG <- addEdge(nodeNames[3], nodeNames[7], rEG, 10)
 
eAttrs <- list()
 
q<-edgeNames(rEG)
 
# Add the probability values to the branch lines
 
eAttrs$label <- c(toString(a),toString(notA),
 toString(bGivena), toString(notbGivena),
 toString(bGivenNota), toString(notbGivenNota))
names(eAttrs$label) <- c(q[1],q[2], q[3], q[4], q[5], q[6])
edgeAttrs<-eAttrs
 
# Set the color, etc, of the tree
attributes<-list(node=list(label="foo", fillcolor="lightgreen", fontsize="15"),
 edge=list(color="red"),graph=list(rankdir="LR"))
 
#Plot the probability tree using Rgraphviz
plot(rEG, edgeAttrs=eAttrs, attrs=attributes)
nodes(rEG)
## [1] "P"     "A"     "A'"    "A&B"   "A&B'"  "A'&B"  "A'&B'"
edges(rEG)
## $P
## [1] "A"  "A'"
## 
## $A
## [1] "A&B"  "A&B'"
## 
## $`A'`
## [1] "A'&B"  "A'&B'"
## 
## $`A&B`
## character(0)
## 
## $`A&B'`
## character(0)
## 
## $`A'&B`
## character(0)
## 
## $`A'&B'`
## character(0)
text(500,420,aANDb, cex=.8)
 
text(500,280,aANDnotb,cex=.8)
 
text(500,160,notaANDb,cex=.8)
 
text(500,30,notaANDnotb,cex=.8)
 
text(340,440,"(B | A)",cex=.8)
 
text(340,230,"(B | A')",cex=.8)
 
#Write a table in the lower left of the probabilities of A and B
text(80,50,paste("P(A):",a),cex=.9, col="darkgreen")
text(80,20,paste("P(A'):",notA),cex=.9, col="darkgreen")
 
text(160,50,paste("P(B):",round(b,digits=2)),cex=.9)
text(160,20,paste("P(B'):",round(notB, 2)),cex=.9)
 
text(80,420,paste("P(A|B): ",round(aGivenb,digits=2)),cex=.9,col="blue")

The tree diagram displays the same numbers needed to answer Problem 2.21 in an easy-to-navigate form. Dividing P(A&B) = .688 by P(B) = P(A&B) + P(A’&B) = .688 + .13 = .818 gives the same answer of .8411 or 84.11%.
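
The four leaves of the tree can also be laid out as a 2x2 table of joint probabilities without any plotting package. The sketch below uses base R only. The variable names (`p_a`, `p_b_given_a`, `p_b_given_na`, `joint`) are chosen here for illustration and are not taken from the script above.

```r
# Values taken from the problem statement
p_a          <- 0.80   # P(A): able to construct a box plot
p_b_given_a  <- 0.86   # P(B | A)
p_b_given_na <- 0.65   # P(B | A')

# 2x2 table of joint probabilities: rows are A / A', columns are B / B'
joint <- rbind(
  "A"  = p_a       * c(p_b_given_a,  1 - p_b_given_a),
  "A'" = (1 - p_a) * c(p_b_given_na, 1 - p_b_given_na)
)
colnames(joint) <- c("B", "B'")

# The four leaves of the tree must sum to 1
stopifnot(isTRUE(all.equal(sum(joint), 1)))

p_b         <- sum(joint[, "B"])        # P(B) = P(A&B) + P(A'&B)
p_a_given_b <- joint["A", "B"] / p_b    # P(A | B)
round(c(p_b = p_b, p_a_given_b = p_a_given_b), 4)
```

The column sum for B reproduces P(B) = .818, and the ratio of the top-left cell to that column sum is the same .8411 obtained from the formula.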