Jose visits campus every Thursday evening. However, some days the parking garage is full, often due to college events. There are academic events on 35% of evenings, sporting events on 20% of evenings, and no events on 45% of evenings. When there is an academic event, the garage fills up about 25% of the time, and it fills up 70% of evenings with sporting events. On evenings when there are no events, it only fills up about 5% of the time. If Jose comes to campus and finds the garage full, what is the probability that there is a sporting event? Use a tree diagram to solve this problem.

##Install packages

library(BiocManager)

BiocManager::install("Rgraphviz")
## Bioconductor version 3.18 (BiocManager 1.30.22), R 4.3.2 (2023-10-31 ucrt)
## Warning: package(s) not installed when version(s) same as or greater than current; use
##   `force = TRUE` to re-install: 'Rgraphviz'
## Installation paths not writeable, unable to update packages
##   path: C:/Program Files/R/R-4.3.2/library
##   packages:
##     cluster, foreign, lattice, MASS, Matrix, mgcv, nlme, rpart
## Old packages: 'DataExplorer', 'tidyr'
library(Rgraphviz)
## Loading required package: graph
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
##     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
##     get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
##     match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
##     table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: grid

First, define each quantity as it appears in Bayes' theorem. From the problem statement:

\(P(A) = P(\text{academic event}) = 0.35\)
\(P(S) = P(\text{sporting event}) = 0.20\)
\(P(N) = P(\text{no event}) = 0.45\)
\(P(F \mid A) = P(\text{garage full} \mid \text{academic event}) = 0.25\)
\(P(F \mid S) = P(\text{garage full} \mid \text{sporting event}) = 0.70\)
\(P(F \mid N) = P(\text{garage full} \mid \text{no event}) = 0.05\)
\(P(S \mid F) = \text{unknown}\)

The conditional probability that there is a sporting event, given that the garage is full, follows by plugging these values into Bayes' theorem. The denominator is the total probability that the garage is full, summed over the three event types.

\(P(S \mid F) = \frac{P(F \mid S) \cdot P(S)}{P(F \mid S) \cdot P(S) + P(F \mid A) \cdot P(A) + P(F \mid N) \cdot P(N)}\)
\(P(S \mid F) = \frac{0.70 \cdot 0.20}{0.70 \cdot 0.20 + 0.25 \cdot 0.35 + 0.05 \cdot 0.45}\)
\(P(S \mid F) = \frac{0.14}{0.14 + 0.0875 + 0.0225}\)
\(P(S \mid F) = \frac{0.14}{0.25}\)
\(P(S \mid F) = 0.56\)

The conditional probability that there is a sporting event, given that the garage is full, is therefore 0.56.

The same calculation can be wrapped in an R function that returns \(P(S \mid F)\):

##First set all variables that will be used in Bayes' theorem
pA <- 0.35 ##P(academic event)
pS <- 0.20 ##P(sporting event)
pN <- 0.45 ##P(no event)
pFA <- 0.25 ##P(garage full | academic event)
pFS <- 0.70 ##P(garage full | sporting event)
pFN <- 0.05 ##P(garage full | no event)

##Create complements: P(garage NOT full | event) for each event type
pNotA <- 1-pFA
pNotS <- 1-pFS
pNotN <- 1-pFN

##Create Bayes' theorem function; returns P(sporting event | garage full)
BayesTheorem <- function(pA, pS, pN, pFA, pFS, pFN) {
  pSF <- (pFS * pS) / ((pFA * pA) + (pFS * pS) + (pFN * pN))
  return(pSF)
}


##Calculate P(sporting event | garage full)
BayesTheorem(pA,pS,pN,pFA,pFS,pFN)
## [1] 0.56

The output shows that the probability of a sporting event, given that the garage is full, is 0.56, which matches the hand calculation.
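As a rough sanity check on the 0.56 figure, the scenario can also be simulated. The sketch below is not part of the original solution: the names `nEvenings`, `event`, `pFullGiven`, `garageFull` and `simPSF`, the sample size, and the seed are all assumptions chosen for illustration. It draws many evenings, assigns each an event type with the stated probabilities, decides whether the garage fills, and then takes the share of full-garage evenings that had a sporting event.

```r
## Monte Carlo sanity check (illustrative sketch; names, seed and size are assumptions)
set.seed(123)
nEvenings <- 100000

## Draw an event type for each simulated evening
event <- sample(c("A", "S", "N"), size = nEvenings, replace = TRUE,
                prob = c(0.35, 0.20, 0.45))

## Probability the garage fills, looked up by event type
pFullGiven <- c(A = 0.25, S = 0.70, N = 0.05)
garageFull <- runif(nEvenings) < unname(pFullGiven[event])

## Share of full-garage evenings that had a sporting event
simPSF <- mean(event[garageFull] == "S")
simPSF
```

The simulated share should land close to 0.56, with small differences from run to run because it is an estimate from random draws rather than an exact calculation.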

Next, the same information can be represented visually with a conditional probability tree:

##define nodes of the probability tree
n1 <- "P"
n2 <- "Academic Event"
n3 <- "Sporting Event"
n4 <- "No Event"
n5 <- "A - Garage Full"
n6 <- "A - Garage Available"
n7 <- "S - Garage Full"
n8 <- "S - Garage Available"
n9 <- "N - Garage Full"
n10 <- "N - Garage Available"

##Define labels of nodes
n_labels <- c(n1,n2,n3,n4,n5,n6,n7,n8,n9,n10)

##create directed graph
rEG <- new("graphNEL", nodes=n_labels, edgemode="directed")

##Add branches to the Probability Tree
rEG <- addEdge(n_labels[1], n_labels[2], rEG, 1) 
rEG <- addEdge(n_labels[1], n_labels[3], rEG, 1)
rEG <- addEdge(n_labels[1], n_labels[4], rEG, 1)
rEG <- addEdge(n_labels[2], n_labels[5], rEG, 1)
rEG <- addEdge(n_labels[2], n_labels[6], rEG, 1)
rEG <- addEdge(n_labels[3], n_labels[7], rEG, 1)
rEG <- addEdge(n_labels[3], n_labels[8], rEG, 1)
rEG <- addEdge(n_labels[4], n_labels[9], rEG, 1)
rEG <- addEdge(n_labels[4], n_labels[10], rEG, 1)

##store edge attributes to list
eAttrs <- list()
  
##Create variable to hold edge names
e <- edgeNames(rEG)

##add probability values to branches 
eAttrs$label <- c(toString(pA),toString(pS), toString(pN), toString(pFA), toString(pNotA), toString(pFS), toString(pNotS), toString(pFN), toString(pNotN))

##assign edge names (edgeNames returns the nine edges in the order they were added)
names(eAttrs$label) <- e

##assign edge attributes
edgeAttrs <- eAttrs

##Format the probability tree (named treeAttrs to avoid masking base::attributes)
treeAttrs <- list(node=list(fillcolor="red", fontsize="15"), edge=list(color="black", fontsize="25"), graph=list(rankdir="LR"))

##plot
probability.tree <- plot(rEG, edgeAttrs=eAttrs, attrs=treeAttrs)

nodes(rEG)
##  [1] "P"                    "Academic Event"       "Sporting Event"      
##  [4] "No Event"             "A - Garage Full"      "A - Garage Available"
##  [7] "S - Garage Full"      "S - Garage Available" "N - Garage Full"     
## [10] "N - Garage Available"
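
To read the answer off the tree itself, multiply the probabilities along each path from the root to a leaf, then compare the "Garage Full" leaves. The following is a minimal sketch of that step and is not taken from the original code; `prior`, `pFull`, `leafFull`, `leafAvailable` and `posterior` are illustrative names.

```r
## Path products for the probability tree (illustrative sketch; names are assumptions)
prior <- c(Academic = 0.35, Sporting = 0.20, None = 0.45)  ## first-level branches
pFull <- c(Academic = 0.25, Sporting = 0.70, None = 0.05)  ## P(garage full | event)

## Joint probability at each leaf = product of the branch probabilities on its path
leafFull      <- prior * pFull
leafAvailable <- prior * (1 - pFull)

## The six leaves cover every outcome, so they should sum to 1
sum(leafFull) + sum(leafAvailable)

## P(event | garage full): each "full" leaf divided by the total of the "full" leaves
posterior <- leafFull / sum(leafFull)
round(posterior, 4)
```

With these inputs the three "Garage Full" leaves are 0.0875, 0.14 and 0.0225, which total 0.25, and the "Sporting" entry of `posterior` is 0.14 / 0.25 = 0.56, in agreement with the Bayes' theorem result.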