MATI - Intelligent Systems Architecture Homework #2
1.1 What is the sufficient number of numbers (conditional and prior probabilities) an analyst needs to estimate from data (or to obtain by interviewing domain experts) in order to fully determine the joint probability distribution spanning attributes A, B, C, and D, using a Bayesian belief network of the structure depicted below? Assume all attributes are binary.
(A)
* *
* *
* *
* *
|_ _|
(B) *************>(C)
* *
* *
* *
* *
* *
_|_
(D)
A. 9
B. 11 (X) This would be the answer
C. 13
D. None of the above.
Answer: Assuming that all attributes are binary:
For (A), which has no parents, there are two possible values (1, 0). If the probability of one of them is known, the other can be calculated, so 1 number is needed.
For (B), which has one parent (A), one conditional probability is needed for each value of (A): P(B|A) and P(B|~A), so 2 numbers are needed.
For (C), where (A) and (B) are parents, there are 4 possible parent combinations, so 4 numbers are needed. The same holds for (D), whose parents are (B) and (C).
In summary, the information needed from the experts would be 1 + 2 + 4 + 4 = 11 numbers to calculate the joint probability.
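The counting rule used above (a binary node needs one number per configuration of its parents) can be checked mechanically. The following is a minimal sketch, assuming the structure read from the diagram is A->B, A->C, B->C, B->D, C->D; the names `parents`, `params_per_node` and `total_params` are illustrative only.

```r
# Parent sets read off the diagram (assumed structure, see the text above).
parents <- list(
  A = character(0),
  B = c("A"),
  C = c("A", "B"),
  D = c("B", "C")
)

# A binary node needs one free probability per configuration of its parents,
# i.e. 2^(number of parents); the complementary value follows as 1 - p.
params_per_node <- sapply(parents, function(p) 2^length(p))
total_params <- sum(params_per_node)

print(params_per_node)
print(total_params)
```

Changing a parent set in `parents` shows how the total depends on the structure of the network.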
1.2 Let us assume that all attributes in the Bayesian network shown below are binary and the following information is provided:
P(Y) = 0.6
P(X|Y) = 0.3
P(X|~Y) = 0.8
P(Z|X,Y) = 0.7
P(Z|X,~Y) = 0.2
P(Z|~X,Y) = 0.1
P(Z|~X,~Y) = 0.8
(X) <*************(Y)
* *
* *
* *
* *
* *
_|_
(Z)
What is the probability of the event that [Z = FALSE and X=TRUE] given the available information?
# The following link was taken as a reference guide for this point:
# http://sujitpal.blogspot.com.co/2013/07/bayesian-network-inference-with-r-and.html
library(bnlearn)
set.seed(42)
# a BN using expert knowledge
net <- model2network("[Y][X|Y][Z|X:Y]")
yn <- c("yes", "no")
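cptY <- matrix(c(0.6, 0.4), ncol=2, dimnames=list(NULL, yn))
# columns are Y = yes / no, rows are X = yes / no
cptX <- matrix(c(0.3, 0.7, 0.8, 0.2),
               ncol=2, dimnames=list("X"=yn, "Y"=yn))
# the array is filled with Z varying fastest, then X, then Y, so the pairs are
# (X=yes,Y=yes), (X=no,Y=yes), (X=yes,Y=no), (X=no,Y=no)
cptZ <- c(0.7, 0.3, 0.1, 0.9, 0.2, 0.8, 0.8, 0.2)
dim(cptZ) <- c(2, 2, 2)
dimnames(cptZ) <- list("Z"=yn, "X"=yn, "Y"=yn)
net.disc <- custom.fit(net, dist=list(X=cptX, Y=cptY, Z=cptZ))
# cpquery estimates P(Z = no, X = yes) by sampling, so the value is approximate
cpquery(net.disc, (Z=="no" & X=="yes"), TRUE)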
The conditional probability tables are as follows:
Y:
P(Y=TRUE)  P(Y=FALSE)
0.6        0.4
X given Y:
Y      P(X=TRUE)  P(X=FALSE)
TRUE   0.3        0.7
FALSE  0.8        0.2
Z given X and Y:
X      Y      P(Z=TRUE)  P(Z=FALSE)
TRUE   TRUE   0.7        0.3
TRUE   FALSE  0.2        0.8
FALSE  TRUE   0.1        0.9
FALSE  FALSE  0.8        0.2
What is the probability of Z=false and X=true?
Y is not observed, so it is summed out of the joint distribution:
P(x=t,z=f) = P(z=f|x=t,y=t)*P(x=t|y=t)*P(y=t) + P(z=f|x=t,y=f)*P(x=t|y=f)*P(y=f)
= 0.3*0.3*0.6 + 0.8*0.8*0.4 = 0.054 + 0.256 = 0.31
For comparison, the conditional probability is P(z=f|x=t) = P(x=t,z=f)/P(x=t), with P(x=t) = 0.3*0.6 + 0.8*0.4 = 0.5, which gives 0.31/0.5 = 0.62.
The manual calculation shows that the probability that Z is false and X is true is 31%.
Answer:
A. It is equal or greater than 20%.
B. It is lower than 20%.
C. It is equal or greater than 30%. (X) This would be the answer
D. It is lower than 30%.
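The manual calculation can be cross-checked exactly in base R, without sampling. This is a small sketch that uses only the probabilities given in the problem statement; the variable names are illustrative and not part of bnlearn.

```r
# Probabilities given in the problem statement, indexed by the value of Y.
p_y          <- c(yes = 0.6, no = 0.4)   # P(Y = y)
p_x_given_y  <- c(yes = 0.3, no = 0.8)   # P(X = TRUE | Y = y)
p_z_given_xy <- c(yes = 0.7, no = 0.2)   # P(Z = TRUE | X = TRUE, Y = y)

# Sum Y out of the joint: sum_y P(Z = FALSE | X = TRUE, y) * P(X = TRUE | y) * P(y).
p_x_and_not_z <- sum((1 - p_z_given_xy) * p_x_given_y * p_y)

# P(X = TRUE), needed only for the conditional P(Z = FALSE | X = TRUE).
p_x <- sum(p_x_given_y * p_y)
p_not_z_given_x <- p_x_and_not_z / p_x

print(p_x_and_not_z)    # 0.31
print(p_not_z_given_x)  # 0.62
```

The first value is the joint probability asked for in the question; the second is the conditional probability, shown only to make the difference between the two explicit.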
| .8 .3 | | .6 .35 |
T= | | O= | |
| .2 .7 | | .4 .65 |
2.1. Write code to generate 3 chains of length 10 from this HMM. Report the three chains you obtain.
#The following link was taken as a reference guide, for this point:
# https://cran.r-project.org/web/packages/HMM/HMM.pdf
#Usage
#initHMM(States, Symbols, startProbs=NULL, transProbs=NULL, emissionProbs=NULL)
library (HMM)
# Initialise HMM
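# T and O above have columns that sum to 1 (column = current state), whereas
# initHMM expects rows that sum to 1 (row = current state), so both are filled by row
hmm = initHMM(c("1","2"), c("A","B"),
              transProbs=matrix(c(.8,.2,.3,.7), 2, byrow=TRUE),
              emissionProbs=matrix(c(.6,.4,.35,.65), 2, byrow=TRUE))
print(hmm)
## $States
## [1] "1" "2"
##
## $Symbols
## [1] "A" "B"
##
## $startProbs
## 1 2
## 0.5 0.5
##
## $transProbs
## to
## from 1 2
## 1 0.8 0.2
## 2 0.3 0.7
##
## $emissionProbs
## symbols
## states A B
## 1 0.60 0.40
## 2 0.35 0.65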
simHMM(hmm, 10)
## $states
## [1] "1" "2" "2" "2" "2" "2" "2" "2" "1" "1"
##
## $observation
## [1] "A" "B" "B" "B" "A" "B" "A" "B" "A" "A"
simHMM(hmm, 10)
## $states
## [1] "2" "2" "2" "2" "2" "2" "1" "2" "2" "2"
##
## $observation
## [1] "A" "B" "B" "B" "A" "B" "B" "B" "A" "B"
simHMM(hmm, 10)
## $states
## [1] "2" "2" "1" "1" "1" "1" "2" "1" "2" "2"
##
## $observation
## [1] "B" "A" "B" "A" "A" "A" "B" "A" "B" "B"
print (simHMM)
## function (hmm, length)
## {
## hmm$transProbs[is.na(hmm$transProbs)] = 0
## hmm$emissionProbs[is.na(hmm$emissionProbs)] = 0
## states = c()
## emission = c()
## states = c(states, sample(hmm$States, 1, prob = hmm$startProbs))
## for (i in 2:length) {
## state = sample(hmm$States, 1, prob = hmm$transProbs[states[i -
## 1], ])
## states = c(states, state)
## }
## for (i in 1:length) {
## emi = sample(hmm$Symbols, 1, prob = hmm$emissionProbs[states[i],
## ])
## emission = c(emission, emi)
## }
## return(list(states = states, observation = emission))
## }
## <bytecode: 0x00000000171e8cd0>
## <environment: namespace:HMM>
2.2 Find the most probable path of hidden states for the sequence of observations below. Report your answer.
# Sequence of observations
observations = c("A" , "A" , "A" , "B" , "A" , "B" , "A" , "B" , "A" , "A" , "B" , "B" , "A" , "A" , "B")
v=viterbi(hmm, observations)
print (v)
## [1] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1"
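To show what the Viterbi step computes, here is a minimal base-R sketch of the algorithm in log space. It is not the HMM package's implementation. It assumes that the columns of T and O index the current state, so the matrices are written here with one row per current state, and that the start distribution is uniform (the initHMM default). The names `viterbi_path`, `trans`, `emit` and `start` are illustrative.

```r
states  <- c("1", "2")
symbols <- c("A", "B")
start   <- c(0.5, 0.5)
# Assumption: row = current state, so these are the transposes of T and O.
trans <- matrix(c(0.8, 0.2,
                  0.3, 0.7), nrow = 2, byrow = TRUE,
                dimnames = list(states, states))
emit  <- matrix(c(0.60, 0.40,
                  0.35, 0.65), nrow = 2, byrow = TRUE,
                dimnames = list(states, symbols))

viterbi_path <- function(obs, start, trans, emit) {
  n <- length(obs)
  k <- nrow(trans)
  logv <- matrix(-Inf, nrow = k, ncol = n)  # best log-probability of a path ending in each state
  back <- matrix(0L, nrow = k, ncol = n)    # best predecessor state, used for backtracking
  logv[, 1] <- log(start) + log(emit[, obs[1]])
  for (t in seq_len(n)[-1]) {
    for (j in seq_len(k)) {
      cand <- logv[, t - 1] + log(trans[, j])
      back[j, t] <- which.max(cand)
      logv[j, t] <- max(cand) + log(emit[j, obs[t]])
    }
  }
  # Start from the best final state and follow the stored predecessors backwards.
  path <- integer(n)
  path[n] <- which.max(logv[, n])
  for (t in rev(seq_len(n - 1))) {
    path[t] <- back[path[t + 1], t + 1]
  }
  rownames(trans)[path]
}

observations <- c("A", "A", "A", "B", "A", "B", "A", "B",
                  "A", "A", "B", "B", "A", "A", "B")
path <- viterbi_path(observations, start, trans, emit)
print(path)
```

Under these assumptions the sketch returns state "1" at every position, which agrees with the viterbi() output reported above: staying in state 1 (probability 0.8) outweighs the better fit of state 2 to the isolated "B" observations.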