Week 5 Discussion

2023-10-02

Haiding Luo

Part I

I. Please explain Bayes' Theorem in your own words and give an example, in fewer than 10 sentences. Also, write out the formula. Pick up on how to type equations in R Markdown using LaTeX terminology.


Bayes’ Theorem is a mathematical formula based on conditional probability, used to estimate the probability of one event occurring given the occurrence of another related event. The formula for Bayes’ Theorem is:

\[ P(A \mid B) = \frac{P(A)*P(B \mid A)}{P(B)} \]

where P(A) and P(B) represent the probabilities of events A and B occurring, respectively. P(B|A) represents the conditional probability of event B occurring given that event A has occurred, and P(A|B) represents the conditional probability of event A occurring given that event B has occurred.
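
When P(B) is not given directly, it can be expanded with the law of total probability. As a sketch, writing \(A^c\) for the complement of A (this notation is an assumption introduced here and is not used elsewhere in this write-up), the formula becomes:

\[ P(A \mid B) = \frac{P(A)*P(B \mid A)}{P(A)*P(B \mid A) + P(A^c)*P(B \mid A^c)} \]

The example below uses this form, with the two buckets as the two cases.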

Example:

There are two buckets. Bucket 1 contains 40 balls, including 30 white balls and 10 black balls. Bucket 2 also contains 40 balls, with 20 white balls and 20 black balls.

Event A: Drawing 1 ball from bucket 1. Event B: Drawing a white ball. Event C: Drawing 1 ball from bucket 2.

Suppose one of the two buckets is chosen at random, so that P(A) = P(C) = 50%, and one ball is drawn from it. If the ball is white, what is the probability P(A|B) that it came from bucket 1?

\[ P(B \mid A)=30/(30+10) = 75\% \]

\[ P(B \mid C) = 20/(20+20)= 50\% \]

\[ P(B) = P(B \mid A) P(A)+P(B \mid C)P(C)=75\%*50\% + 50\%* 50\% = 62.5\% \]

\[ \frac{P(B \mid A)}{P(B)}= 75\% / 62.5 \% = 1.2 \]

\[ P(A \mid B) = \frac{P(A)*P(B \mid A)}{P(B)} = \frac{1}{2}* 1.2 = \frac{3}{5} = 60\% \]
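
The arithmetic above can be checked in R. This is only a minimal sketch: the variable names (`prior`, `lik_white`, `p_white`, `posterior`) are made up for illustration, and the 50/50 prior is the same assumption used in the P(B) calculation above.

```r
# Prior probability of drawing from each bucket (assumed equal, as in the example)
prior <- c(bucket1 = 0.5, bucket2 = 0.5)

# Probability of a white ball given each bucket
lik_white <- c(bucket1 = 30 / 40, bucket2 = 20 / 40)

# Law of total probability: P(B)
p_white <- sum(prior * lik_white)

# Bayes' Theorem: P(bucket | white)
posterior <- prior * lik_white / p_white

p_white               # 0.625
posterior["bucket1"]  # 0.6
```

The two posterior values sum to 1, which is a quick sanity check on the calculation.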

Part II

options(repos = c(CRAN = "https://cran.rstudio.com/"))
install.packages("BiocManager")
## Installing package into 'C:/Users/pokem/AppData/Local/R/win-library/4.3'
## (as 'lib' is unspecified)
## package 'BiocManager' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\pokem\AppData\Local\Temp\RtmpohcMCT\downloaded_packages
library(BiocManager)

BiocManager::install("Rgraphviz")
## 'getOption("repos")' replaces Bioconductor standard repositories, see
## 'help("repositories", package = "BiocManager")' for details.
## Replacement repositories:
##     CRAN: https://cran.rstudio.com/
## Bioconductor version 3.17 (BiocManager 1.30.22), R 4.3.1 (2023-06-16 ucrt)
## Warning: package(s) not installed when version(s) same as or greater than current; use
##   `force = TRUE` to re-install: 'Rgraphviz'
## Installation paths not writeable, unable to update packages
##   path: C:/Program Files/R/R-4.3.1/library
##   packages:
##     foreign, KernSmooth, lattice, Matrix, mgcv, nlme, spatial, survival
## Old packages: 'curl', 'evaluate', 'Hmisc', 'knitr', 'lubridate', 'markdown',
##   'openssl', 'plyr', 'psych', 'renv', 'rmarkdown', 'tinytex', 'withr'
library(Rgraphviz)
## Loading required package: graph
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
##     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
##     get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
##     match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
##     table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: grid

Guided Practice 3.43: Jose visits campus every Thursday evening. However, some days the parking garage is full, often due to college events. There are academic events on 35% of evenings, sporting events on 20% of evenings, and no events on 45% of evenings. When there is an academic event, the garage fills up about 25% of the time, and it fills up 70% of evenings with sporting events. On evenings when there are no events, it only fills up about 5% of the time. If Jose comes to campus and finds the garage full, what is the probability that there is a sporting event? Use a tree diagram to solve this problem. 

Let:

A be the event of an academic event.

S be the event of a sporting event.

N be the event of no event.

F be the event of the garage being full.

P(A) = 0.35

P(S) = 0.20

P(N) = 0.45

P(F|A) = 0.25

P(F|S) = 0.70

P(F|N) = 0.05

\[ P(S \mid F) =\frac{P(S)*P(F\mid S)}{P(A)*P(F\mid A)+P(S)*P(F\mid S)+P(N)*P(F\mid N)} \]

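# Prior probability of each type of evening
P_A <- 0.35  # academic event
P_S <- 0.20  # sporting event
P_N <- 0.45  # no event
# Probability that the garage is full, given each type of evening
P_F_A <- 0.25
P_F_S <- 0.70
P_F_N <- 0.05

# Bayes' Theorem: P(S | F)
Prob <- (P_S * P_F_S) / (P_A * P_F_A + P_S * P_F_S + P_N * P_F_N)
Prob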
## [1] 0.56
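
The same calculation can be done for all three types of evening at once with named vectors. This is a sketch only: the names `prior`, `p_full_given`, `joint`, `p_full`, and `posterior` are illustrative and do not appear in the rest of this write-up.

```r
# Prior probability of each type of evening
prior <- c(academic = 0.35, sporting = 0.20, none = 0.45)

# P(F | event): chance the garage is full on each type of evening
p_full_given <- c(academic = 0.25, sporting = 0.70, none = 0.05)

# Joint probabilities P(event and F), one per branch of the tree
joint <- prior * p_full_given

# Total probability that the garage is full, P(F)
p_full <- sum(joint)

# Posterior P(event | F) for every event
posterior <- joint / p_full
round(posterior, 3)
```

The `sporting` entry should match the 0.56 computed above, and the three posteriors should add up to 1.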
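# First-level branches of the tree: type of evening
a1 <- .35   # P(A): academic event
a2 <- .20   # P(S): sporting event
a3 <- .45   # P(N): no event
# Second-level branches: P(F | event)
fa1 <- .25
fa2 <- .7
fa3 <- .05

# Complements: P(not F | event)
notfa1 <- 1 - fa1
notfa2 <- 1 - fa2
notfa3 <- 1 - fa3
# Joint probabilities: P(event and F)
aANDfa1 <- a1 * fa1
bANDfa2 <- a2 * fa2
cANDfa3 <- a3 * fa3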

node1     <-  "P"
node2     <-  "a1"
node3     <-  "a2"
node4     <-  "a3"
node5     <-  "a1ANDfa1"
node6     <-  "notfa1"
node7     <-  "a2ANDfa2"
node8     <-  "notfa2"
node9     <-  "a3ANDfa3"
node10    <-  "notfa3"
nodeNames <- c(node1, node2, node3, node4, node5, node6, node7, node8, node9, node10)

rEG   <- new("graphNEL", 
             nodes = nodeNames, 
             edgemode="directed"
             )


rEG <- addEdge (nodeNames[1], nodeNames[2], rEG, 1)
rEG <- addEdge (nodeNames[1], nodeNames[3], rEG, 1)
rEG <- addEdge (nodeNames[1], nodeNames[4], rEG, 1)
rEG <- addEdge (nodeNames[2], nodeNames[5], rEG, 1)
rEG <- addEdge (nodeNames[2], nodeNames[6], rEG, 1)
rEG <- addEdge (nodeNames[3], nodeNames[7], rEG, 1)
rEG <- addEdge (nodeNames[3], nodeNames[8], rEG, 1)
rEG <- addEdge (nodeNames[4], nodeNames[9], rEG, 1)
rEG <- addEdge (nodeNames[4], nodeNames[10], rEG, 10)
eAttrs <- list()
q <- edgeNames(rEG)

# Edge labels; their order must match the order of the edge names in q
eAttrs$label <- c(toString(a1), toString(a2),
                  toString(a3), toString(fa1),
                  toString(notfa1), toString(fa2),
                  toString(notfa2), toString(fa3),
                  toString(notfa3)
                  )

names(eAttrs$label) <- q[1:9]
edgeAttrs <- eAttrs
attributes <- list(node  = list(label    = "foo", 
                              fillcolor = "green", 
                              fontsize  = "15"
                              ),
                   edge  = list(color   = "red"),
                   graph = list(rankdir = "LR")
                   )

plot(rEG, edgeAttrs = eAttrs, attrs = attributes)

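# Annotate the leaves: P(event and F) next to the "full" nodes and
# P(not F | event) next to the "not full" nodes.
# The coordinates are hard-coded for this particular plot layout.
text(578, 410, aANDfa1, cex = .8)
text(570, 320, notfa1, cex = .8)
text(578, 230, bANDfa2, cex = .8)
text(570, 170, notfa2, cex = .8)
text(578, 95, cANDfa3, cex = .8)
text(570, 30, notfa3, cex = .8)
# Summary: S is the sporting event, F is the garage being full
text(160, 55, paste('P(S):', a2), cex = 1.1)
text(160, 35, paste('P(F):', aANDfa1 + bANDfa2 + cANDfa3), cex = 1.1)
text(160, 15, paste('P(S|F):', fa2 * a2 / (aANDfa1 + bANDfa2 + cANDfa3)), cex = 1.1)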
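
As an optional cross-check, P(S|F) can also be approximated by simulation. The sketch below is not part of the exercise. It assumes each evening is drawn independently, and the seed, the sample size, and the names `n`, `event`, `p_full_given`, `full`, and `est` are arbitrary choices made for illustration.

```r
set.seed(42)   # arbitrary seed, only for reproducibility
n <- 1e5       # number of simulated evenings (arbitrary)

# Draw the type of each evening from the stated probabilities
event <- sample(c("A", "S", "N"), n, replace = TRUE,
                prob = c(0.35, 0.20, 0.45))

# P(F | event), looked up by event name for each simulated evening
p_full_given <- c(A = 0.25, S = 0.70, N = 0.05)
full <- runif(n) < p_full_given[event]

# Among evenings with a full garage, the share that had a sporting event
est <- mean(event[full] == "S")
est
```

With this many simulated evenings, the estimate should land close to the exact answer of 0.56.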