Your colleague either commutes by train or by the bus. 20 days of the month, she takes the train and the remaining 10 days she takes the bus. If she takes the train, she reaches work on time with a probability of 0.9. If she takes the bus, she frequently gets stuck in traffic and reaches work on time with a probability of 0.5.
Given that she was on time today, what is the probability that she took the bus to work today?
p.T <- 20/30
p.B <- 10/30
p.O.T <- 0.9
p.L.T <- 0.1
p.O.B <- 0.5
p.O.B <- 0.5
p.O.B * p.B / (p.O.B * p.B + p.O.T * p.T)
## [1] 0.2173913
Answer: the probability that she took the bus to work today is 0.2173913
In the Grade Network that we looked at in the notes, what happens to the probability of Difficulty of Course when you present the evidence that the received recommendation letter was good? In addition, now present the evidence that both SAT scores were good and the letter of recommendation was good, What is the probability of the Difficulty of Course now? You should use the gRain package in R to build your network and perform these calculations.
You may need to install RBGL package from BioConductor in R to get gRain working. See http://www.bioconductor.org/packages/release/bioc/html/RBGL.html for instructions on RBGL.
library(gRain)
## Loading required package: gRbase
# levels
ny <- c("no", "yes")
lh <- c("low", "high")
# nodes
d <- cptable(~d, values=c(0.3, 0.7), levels=ny)
i <- cptable(~i, values=c(0.8, 0.2), levels=lh)
s <- cptable(~s | i, values = c(0.9, 0.1, 0.2, 0.8), levels=lh)
g <- cptable(~g | d:i, values=c(0.6, 0.4, 0.8, 0.2, 0.01, 0.99, 0.1, 0.9), levels=lh)
l <- cptable(~l | g, values=c(0.9, 0.1, 0.05, 0.95), levels=lh)
# make a list
cpt.list <- compileCPT(list(d, i, s, g, l))
cpt.list$d
## d
## no yes
## 0.3 0.7
## attr(,"class")
## [1] "parray" "array"
cpt.list$i
## i
## low high
## 0.8 0.2
## attr(,"class")
## [1] "parray" "array"
cpt.list$s
## i
## s low high
## low 0.9 0.2
## high 0.1 0.8
## attr(,"class")
## [1] "parray" "array"
cpt.list$g
## , , i = low
##
## d
## g no yes
## low 0.6 0.8
## high 0.4 0.2
##
## , , i = high
##
## d
## g no yes
## low 0.01 0.1
## high 0.99 0.9
##
## attr(,"class")
## [1] "parray" "array"
cpt.list$l
## g
## l low high
## low 0.9 0.05
## high 0.1 0.95
## attr(,"class")
## [1] "parray" "array"
gn <- grain(cpt.list)
gn
## Independence network: Compiled: FALSE Propagated: FALSE
## Nodes: chr [1:5] "d" "i" "s" "g" "l"
plot(gn)
# query network to findd prob of diffifulty of course
querygrain(gn, nodes = "d")
## $d
## d
## no yes
## 0.3 0.7
# query network to find prob of difficulty of course given the recommendation letter was good
d.ev <- setEvidence(gn, nodes = "l", states = "high")
querygrain(d.ev, nodes = "d")
## $d
## d
## no yes
## 0.3731439 0.6268561
# query network to find prob of difficulty of course given both SAT scores and the letter of recommendation were good
d.ev2 <- setEvidence(gn, c("l", "s"), c("high", "high"))
querygrain(d.ev2, nodes = "d")
## $d
## d
## no yes
## 0.3323478 0.6676522
Conclusion: The probability of diffifulty of course is .7, given the good recommendation letter, the difficulty of course lowers to 0.6268561, and add on good SAT scores, the difficulty of the course increases to 0.6676522. It shows that the difficulty of the course has a negative relationship with the good recommendation letter, and the SAT scores create a positive relationship with a good recommendation letter.
Reference:
https://rpubs.com/HoneyBerk/IS605_assign8
http://people.math.aau.dk/~sorenh/misc/2014-useR-GMBN/bayesnet-slides.pdf