A tree diagram is set up to understand the probabilities better.
We are asked to find P(Bus | On time)? According to Bayes’ Theorm of inverting probabilities, we get:
\({P(Bus | On time)}\) = \(\frac{P(On time | Bus) * P(Bus)}{P(On time | Bus) * P(Bus) + P(On time | Train) * P(Train)}\)
# probability bus is taken
p_bus <- 10/30
# probability train is taken
p_train <- 20/30
# probability of on time if bus is taken
p_ontime_given_bus <- 0.5
# probability of on time if train is taken
p_ontime_given_train <- 0.9
# probability of taking a bus if arrived on time
p_bus_given_ontime <- (p_ontime_given_bus * p_bus)/((p_ontime_given_bus * p_bus) + (p_ontime_given_train * p_train))
p_bus_given_ontime
## [1] 0.2173913
\({P(Bus | On time)}\) = 21.74%
library("gRain")
## Loading required package: gRbase
# setting up conditional probability tables with values given
ny <- c("N","Y")
lh <- c("L","H")
d <- cptable(~ difficulty, values = c(0.3, 0.7), levels = ny)
i <- cptable(~ intelligence, values = c(0.8, 0.2), levels = lh)
g.di <- cptable(~ grade + difficulty + intelligence, values = c(0.6, 0.4, 0.8, 0.2, 0.01, 0.99, 0.1, 0.9), levels = lh )
l.g <- cptable(~ letter + grade, values= c(0.9, 0.1, 0.05, 0.95), levels = lh)
s.i <- cptable(~ sat + intelligence, values = c(0.9, 0.1, 0.2, 0.8), levels = lh)
# create list of conditional probability tables
plist <- compileCPT(list(d,i,g.di,l.g,s.i))
plist$difficulty
## difficulty
## N Y
## 0.3 0.7
plist$intelligence
## intelligence
## L H
## 0.8 0.2
plist$grade
## , , intelligence = L
##
## difficulty
## grade N Y
## L 0.6 0.8
## H 0.4 0.2
##
## , , intelligence = H
##
## difficulty
## grade N Y
## L 0.01 0.1
## H 0.99 0.9
plist$letter
## grade
## letter L H
## L 0.9 0.05
## H 0.1 0.95
plist$sat
## intelligence
## sat L H
## L 0.9 0.2
## H 0.1 0.8
# create network from list
network <- grain(plist)
(I) What happens to the probability of Difficulty of Course when you present the evidence that the received recommendation letter was good?
# Set findings and the query
network.find <- setFinding(network, nodes = "letter", state = "H")
diff.course <- querygrain(network.find, nodes = "difficulty")
diff.course
## $difficulty
## difficulty
## N Y
## 0.3731439 0.6268561
The probability of difficulty of courses decreases by 7%.
(II) In addition, now present the evidence that both SAT scores were good and the letter of recommendation was good, What is the probability of the Difficulty of Course now?
# Set findings and the query
network.find <- setFinding(network, nodes = c("letter", "sat"), state = c("H","H"))
diff.course <- querygrain(network.find, nodes = "difficulty")
diff.course
## $difficulty
## difficulty
## N Y
## 0.3323478 0.6676522
With good SAT scores, the probability of difficulty of course increases by 4% than just having a good recommendation letter.