DATA 605 - Assignment 8

PROBLEM SET 1

Your colleague either commutes by train or by the bus. 20 days of the month, she takes the train and the remaining 10 days she takes the bus. If she takes the train, she reaches work on time with a probability of 0.9. If she takes the bus, she frequently gets stuck in traffic and reaches work on time with a probability of 0.5. Given that she was on time today, what is the probability that she took the bus to work today?

#Define variables and known probabilities
nBus<-10
nTrain<-20
total<-30
busOnTime<-0.5
busLate<-0.5
trainOnTime<-0.9
trainLate<-0.1

#Calculate possibilities
oddsBusOnTime<-(nBus/total)*busOnTime
oddsBusLate<-(nBus/total)*busLate
oddsTrainOnTime<-(nTrain/total)*trainOnTime
oddsTrainLate<-(nTrain/total)*trainLate

#respective odds
round(oddsBusOnTime,4)

## [1] 0.1667

round(oddsBusLate,4)

## [1] 0.1667

round(oddsTrainOnTime,4)

## [1] 0.6

round(oddsTrainLate,4)

## [1] 0.0667

The answer:

#Reduce to just On Time and determine probability it was by bus
oddsBusOnTime/(oddsBusOnTime+oddsTrainOnTime)

## [1] 0.2173913

PROBLEM SET 2

In the Grade Network that we looked at in the notes, what happens to the probability of Difficulty of Course when you present the evidence that the received recommendation letter was good? In addition, now present the evidence that both SAT scores were good and the letter of recommendation was good, What is the probability of the Difficulty of Course now?

Specify conditional probability tables (with values as given in Lauritzen and Spiegelhalter (1988)):

yn <- c("yes","no")
lh <- c("low","high")

diffNode <- cptable(~difficulty, values=c(0.3, 0.7), levels=yn)

intNode <- cptable(~intelligence, values=c(0.8, 0.2), levels=lh)

satNode <- cptable(~SAT | intelligence,values=c(0.9,0.1,0.2,0.8),levels=lh)

gradeNode <- cptable(~grade | difficulty:intelligence,values=c(0.6, 0.4, 0.8, 0.2, 0.01, 0.99, 0.1, 0.9),levels=lh)

letterNode <- cptable(~letter | grade, values=c(0.9,0.1,0.05,0.95), levels=lh)

Compile list of conditional probability tables and create the network:

#compile
plist <- compileCPT(list(intNode,diffNode,satNode,gradeNode,letterNode))
plist

## CPTspec with probabilities:
##  P( intelligence )
##  P( difficulty )
##  P( SAT | intelligence )
##  P( grade | difficulty intelligence )
##  P( letter | grade )

Plot the network

plot(grain(plist))

Add evidence that the recommendation letter was good and check the difficulty.

#check originals
plist$difficulty

## difficulty
## yes  no 
## 0.3 0.7

#inject evidence
update<-querygrain(setEvidence(grain(plist), evidence=list(letter="high"),nodes="difficulty"))
update$difficulty

## difficulty
##       yes        no 
## 0.3731439 0.6268561

It is evident that the difficulty now increases.

Add both SAT scores as good.

What is the probability of the Difficulty of Course now?

#inject evidence
update2<-querygrain(setEvidence(grain(plist), evidence=list(letter="high",SAT="high"),nodes="difficulty",))
update2$difficulty

## difficulty
##       yes        no 
## 0.3323478 0.6676522

DATA 605 - Assignment 8

Robert Sellers

October 16, 2016

PROBLEM SET 1

PROBLEM SET 2