Bayesian Network

Represent a probability distribution as a probabilistic directed acyclic graph (DAG).

Graph = nodes and edges (arcs) denote variables and dependencies, respectively.

Directed = arrows represent the directions of relationships between nodes.

Acyclic = if you trace arrows with a pencil, you cannot traverse back to the same node without picking up your pencil.

Probabilistic = each node has an associated probability that can be influenced by values other nodes assume based on the structure of the graph.

The node at the tail of a connection is called the parent, and the node at the head of the connection is called its child.

Ex. A → B: A is the parent, B is its child.
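The definitions above can be made concrete with a small base R sketch. The three-node graph and the helper names (parents_of, children_of, is_acyclic) are hypothetical and only serve as an illustration; they are not part of any package used in this document.

```r
# Toy DAG (hypothetical): A -> B, A -> C, B -> C.
nodes <- c("A", "B", "C")
adj <- matrix(0L, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))
adj["A", "B"] <- 1L
adj["A", "C"] <- 1L
adj["B", "C"] <- 1L

# Parents of a node are the rows with a 1 in its column; children are the
# columns with a 1 in its row.
parents_of  <- function(adj, node) rownames(adj)[adj[, node] == 1L]
children_of <- function(adj, node) colnames(adj)[adj[node, ] == 1L]

# Kahn's algorithm: repeatedly remove the nodes that have no incoming arcs.
# The graph is acyclic exactly when every node can be removed this way.
is_acyclic <- function(adj) {
  remaining <- rownames(adj)
  while (length(remaining) > 0) {
    sub <- adj[remaining, remaining, drop = FALSE]
    no_parents <- remaining[colSums(sub) == 0]
    if (length(no_parents) == 0) return(FALSE)  # every node has a parent: cycle
    remaining <- setdiff(remaining, no_parents)
  }
  TRUE
}

parents_of(adj, "C")    # "A" "B"
children_of(adj, "A")   # "B" "C"
is_acyclic(adj)         # TRUE

# Adding C -> A closes the loop A -> B -> C -> A.
cyclic <- adj
cyclic["C", "A"] <- 1L
is_acyclic(cyclic)      # FALSE
```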

Load the required packages.

# Note: the packages below are not on CRAN (they are installed from Bioconductor), but the packages loaded later depend on them.

#source("http://bioconductor.org/biocLite.R")
#biocLite(c("graph","Rgraphviz"))

#biocLite(c("graph","Rgraphviz","RBGL"))

#install.packages("gRain")

#install.packages(c("pcalg","catnet","abn"))

library("gRain")

## Warning: package 'gRain' was built under R version 3.5.1

## Loading required package: gRbase

## Warning: package 'gRbase' was built under R version 3.5.1

library("pcalg")

## Warning: package 'pcalg' was built under R version 3.5.1

library("catnet")

## 
## Attaching package: 'catnet'

## The following object is masked from 'package:pcalg':
## 
##     dag2cpdag

#library("abn")


#install.packages("bnlearn")
library(bnlearn)

## Warning: package 'bnlearn' was built under R version 3.5.1

## 
## Attaching package: 'bnlearn'

## The following objects are masked from 'package:pcalg':
## 
##     dsep, pdag2dag, shd, skeleton

## The following objects are masked from 'package:gRbase':
## 
##     ancestors, children, parents

## The following object is masked from 'package:stats':
## 
##     sigma

data("coronary")
#?coronary

bn_df <- data.frame(coronary)

# Learn the network structure with the hill-climbing (HC) algorithm
res <- hc(bn_df)

res

## 
##   Bayesian network learned via Score-based methods
## 
##   model:
##    [Smoking][P..Work|Smoking][Pressure|Smoking]
##    [M..Work|Smoking:P..Work:Pressure][Proteins|Smoking:M..Work]
##    [Family|M..Work]
##   nodes:                                 6 
##   arcs:                                  8 
##     undirected arcs:                     0 
##     directed arcs:                       8 
##   average markov blanket size:           3.00 
##   average neighbourhood size:            2.67 
##   average branching factor:              1.33 
## 
##   learning algorithm:                    Hill-Climbing 
##   score:                                 BIC (disc.) 
##   penalization coefficient:              3.759032 
##   tests used in the learning procedure:  65 
##   optimized:                             TRUE
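The penalization coefficient reported above matches log(n)/2 for n = 1841. The sketch below checks that arithmetic and then computes a BIC-style score (log-likelihood minus the penalty times the number of free parameters) for a single node. It assumes the coronary data has 1841 rows; the table of counts is made up for illustration and is not taken from the data.

```r
# Assumption: the coronary data has 1841 rows.
n <- 1841
k <- log(n) / 2
round(k, 6)   # 3.759032

# BIC-style score of one binary node with one binary parent, from a
# hypothetical table of counts (rows: child level, columns: parent level).
counts <- matrix(c(30, 10,
                   20, 40), nrow = 2, byrow = TRUE,
                 dimnames = list(child = c("no", "yes"),
                                 parent = c("no", "yes")))
cpt <- prop.table(counts, margin = 2)           # P(child | parent)
loglik <- sum(counts * log(cpt))                # multinomial log-likelihood
n_params <- (nrow(counts) - 1) * ncol(counts)   # free parameters in the table
bic <- loglik - (log(sum(counts)) / 2) * n_params
bic
```

A score-based search such as hill climbing compares values like this across candidate structures, so an extra arc has to improve the log-likelihood by more than the penalty it adds.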

plot(res)

Removing a particular arc: if some connections do not make sense, we can remove them.

For example, a family anamnesis (history) of coronary heart disease (Family) should not depend on M. Work (strenuous mental work).

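# Caution: editing res$arcs directly does not appear to update the per-node parent/child
# information, so the outputs below still treat M..Work as a parent of Family.
res$arcs <- res$arcs[-which(res$arcs[,'from']=="M..Work" & res$arcs[,'to']=="Family"),]

# Plotting the Bayesian network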
plot(res)
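bnlearn also provides drop.arc() for removing an arc. It returns a new network object, and the parents and children of the affected nodes are updated with it. The sketch below uses a hypothetical three-node chain (A, B, C) rather than the coronary network.

```r
library(bnlearn)

# Hypothetical three-node chain: A -> B -> C.
dag <- model2network("[A][B|A][C|B]")

# drop.arc() returns a new network object without the given arc.
pruned <- drop.arc(dag, from = "B", to = "C")

arcs(pruned)            # only A -> B remains
parents(pruned, "C")    # character(0): C no longer has a parent
root.nodes(pruned)      # C is now a root alongside A
```

For the network in this document the equivalent call would presumably be drop.arc(res, from = "M..Work", to = "Family"). The outputs shown in this document were not produced that way and would differ.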

Fit the parameters of a Bayesian network

fittedbn <- bn.fit(res, data = bn_df)
print(fittedbn$Family)

## 
##   Parameters of node Family (multinomial distribution)
## 
## Conditional probability table:
##  
##       M..Work
## Family        no       yes
##    neg 0.8814159 0.8227848
##    pos 0.1185841 0.1772152
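For discrete data, bn.fit() by default estimates each table by maximum likelihood, which amounts to relative frequencies within each parent configuration. The following base R sketch reproduces that calculation on a made-up data frame. The column names parent and child and the eight rows are hypothetical.

```r
# Hypothetical data: two binary factors, a parent and its child.
toy <- data.frame(
  parent = factor(c("no", "no", "no", "no", "yes", "yes", "yes", "yes")),
  child  = factor(c("neg", "neg", "neg", "pos", "neg", "pos", "pos", "pos"))
)

# Count co-occurrences, then normalise each parent column so it sums to 1.
counts <- table(child = toy$child, parent = toy$parent)
cpt <- prop.table(counts, margin = 2)
cpt
```

Each column of the result is a distribution over the child given one parent value, which is how the table for Family above should be read.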

Perform conditional probability queries

cpquery(fittedbn, event = (Proteins=="<3"), evidence = ((Smoking=="no") & (Pressure == ">140")))

## [1] 0.6260097
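cpquery() estimates the probability by Monte Carlo sampling, so the value changes slightly between runs unless the random seed is fixed. The base R sketch below shows the idea behind a sampling-based query on a hypothetical two-node network A -> B with made-up probabilities. It is an illustration of the principle, not bnlearn's implementation.

```r
set.seed(42)  # fix the seed so the estimate is reproducible

# Hypothetical network A -> B with made-up probabilities.
p_a <- 0.3                              # P(A = "yes")
p_b_given_a <- c(no = 0.2, yes = 0.7)   # P(B = "yes" | A)

# 1. Sample every node in topological order (parents before children).
n <- 100000
a <- ifelse(runif(n) < p_a, "yes", "no")
b <- ifelse(runif(n) < p_b_given_a[a], "yes", "no")

# 2. Keep only the samples that match the evidence B = "yes".
keep <- b == "yes"

# 3. Estimate P(A = "yes" | B = "yes") as the fraction of kept samples
#    in which the event holds.
estimate <- mean(a[keep] == "yes")

# Exact answer by Bayes' rule, for comparison: 0.21 / (0.21 + 0.14) = 0.6.
exact <- (p_a * 0.7) / (p_a * 0.7 + (1 - p_a) * 0.2)
c(estimate = estimate, exact = exact)
```

The estimate gets closer to the exact value as n grows. Rare evidence leaves few samples to average over, so such queries need more samples.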

# Skeleton - the undirected graph obtained by dropping the direction of every arc.
skeleton(res)

## 
##   Bayesian network learned via Score-based methods
## 
##   model:
##     [undirected graph]
##   nodes:                                 6 
##   arcs:                                  8 
##     undirected arcs:                     8 
##     directed arcs:                       0 
##   average markov blanket size:           2.67 
##   average neighbourhood size:            2.67 
##   average branching factor:              0.00 
## 
##   learning algorithm:                    Hill-Climbing 
##   score:                                 BIC (disc.) 
##   penalization coefficient:              3.759032 
##   tests used in the learning procedure:  65 
##   optimized:                             TRUE

# Moral graph - the skeleton plus an edge between every pair of parents that share a child.
moral(res)

## 
##   Bayesian network learned via Score-based methods
## 
##   model:
##     [undirected graph]
##   nodes:                                 6 
##   arcs:                                  9 
##     undirected arcs:                     9 
##     directed arcs:                       0 
##   average markov blanket size:           3.00 
##   average neighbourhood size:            3.00 
##   average branching factor:              0.00 
## 
##   learning algorithm:                    Hill-Climbing 
##   score:                                 BIC (disc.) 
##   penalization coefficient:              3.759032 
##   tests used in the learning procedure:  65 
##   optimized:                             TRUE
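The moral graph above has 9 arcs against 8 in the skeleton. That is consistent with one extra edge joining P..Work and Pressure, which are both parents of M..Work in the learned model. A minimal base R sketch of moralization follows, using a hypothetical four-node DAG and a hypothetical helper named moralize.

```r
# Hypothetical DAG: A -> C, B -> C, C -> D (A and B share the child C).
nodes <- c("A", "B", "C", "D")
adj <- matrix(0L, 4, 4, dimnames = list(nodes, nodes))
adj["A", "C"] <- 1L
adj["B", "C"] <- 1L
adj["C", "D"] <- 1L

moralize <- function(adj) {
  # Start from the skeleton: keep every arc but forget its direction.
  und <- ((adj + t(adj)) > 0) * 1L
  # "Marry" the parents: connect every pair of nodes that share a child.
  for (child in colnames(adj)) {
    pa <- rownames(adj)[adj[, child] == 1L]
    if (length(pa) > 1) und[pa, pa] <- 1L
  }
  diag(und) <- 0L   # no self-loops
  und
}

moral_adj <- moralize(adj)
moral_adj
sum(moral_adj) / 2   # 4 undirected edges: A-C, B-C, C-D plus the new A-B
```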

# subgraph() needs a set of nodes to extract, so it is not run here.
#subgraph(res)

# Root nodes - no arrows point into these nodes.
root.nodes(res)

## [1] "Smoking"


# Leaf nodes - no arrows leave these nodes.
leaf.nodes(res)

## [1] "Proteins" "Family"


# Directed arcs, listed as from/to pairs.
directed.arcs(res)

##      from       to        
## [1,] "M..Work"  "Proteins"
## [2,] "Smoking"  "M..Work" 
## [3,] "Smoking"  "Proteins"
## [4,] "Smoking"  "P..Work" 
## [5,] "Pressure" "M..Work" 
## [6,] "P..Work"  "M..Work" 
## [7,] "Smoking"  "Pressure"

# There are no undirected arcs.
undirected.arcs(res)

##      from to

# All arcs, listed as from/to pairs.
arcs(res)

##      from       to        
## [1,] "M..Work"  "Proteins"
## [2,] "Smoking"  "M..Work" 
## [3,] "Smoking"  "Proteins"
## [4,] "Smoking"  "P..Work" 
## [5,] "Pressure" "M..Work" 
## [6,] "P..Work"  "M..Work" 
## [7,] "Smoking"  "Pressure"

# Adjacency matrix - rows are the "from" nodes, columns are the "to" nodes.
amat(res)

##          Smoking M..Work P..Work Pressure Proteins Family
## Smoking        0       1       1        1        1      0
## M..Work        0       0       0        0        1      0
## P..Work        0       1       0        0        0      0
## Pressure       0       1       0        0        0      0
## Proteins       0       0       0        0        0      0
## Family         0       0       0        0        0      0
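The matrix above can be read directly: a column of zeros means no incoming arcs, and a row of zeros means no outgoing arcs. The sketch below retypes the printed matrix by hand in base R and derives the arc count, root nodes and leaf nodes from it. Family has an all-zero row and column here, so this sketch lists it as both a root and a leaf.

```r
# The adjacency matrix printed above, retyped by hand
# (rows = "from" nodes, columns = "to" nodes).
nodes <- c("Smoking", "M..Work", "P..Work", "Pressure", "Proteins", "Family")
adj <- matrix(c(0, 1, 1, 1, 1, 0,
                0, 0, 0, 0, 1, 0,
                0, 1, 0, 0, 0, 0,
                0, 1, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0),
              nrow = 6, byrow = TRUE, dimnames = list(nodes, nodes))

n_arcs <- sum(adj)                   # 7
roots  <- nodes[colSums(adj) == 0]   # "Smoking" "Family"
leaves <- nodes[rowSums(adj) == 0]   # "Proteins" "Family"

n_arcs
roots
leaves
```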

Bayesian Networks model

Rahul Saha

23 September 2018