2023-10-02
Haiding Luo
I. Please explain Bayes' Theorem in your own words, and give an example. Fewer than 10 sentences. Also, write out the formula. Pick up on how to type equations in R Markdown using LaTeX notation.
Bayes’ Theorem is a mathematical formula based on conditional
probability, used to estimate the probability of one event occurring
given the occurrence of another related event. The formula for Bayes’
Theorem is:
\[ P(A \mid B) = \frac{P(A)*P(B \mid A)}{P(B)} \]
where P(A) and P(B) represent the probabilities of events A and B occurring, respectively. P(B|A) represents the conditional probability of event B occurring given that event A has occurred, and P(A|B) represents the conditional probability of event A occurring given that event B has occurred.
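The denominator P(B) is often not given directly. Assuming the events A_1, ..., A_n are mutually exclusive and together cover every possibility (the indexed symbols are notation introduced here for illustration), P(B) can be expanded with the law of total probability, which gives the form of Bayes' Theorem used in the calculations that follow:

\[ P(A_k \mid B) = \frac{P(A_k)*P(B \mid A_k)}{\sum_{i=1}^{n} P(A_i)*P(B \mid A_i)} \]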
Example:
There are two buckets. Bucket 1 contains 40 balls, including 30 white balls and 10 black balls. Bucket 2 also contains 40 balls, with 20 white balls and 20 black balls.
Event A: Drawing 1 ball from bucket 1. Event B: Drawing a white ball. Event C: Drawing 1 ball from bucket 2. The bucket is chosen at random, so P(A) = P(C) = 50%.
If I grab a white ball, what is the probability P(A|B) that the white ball came from bucket 1?
\[ P(B \mid A)=30/(30+10) = 75\% \]
\[ P(B \mid C) = 20/(20+20)= 50\% \]
\[ P(B) = P(B \mid A) P(A)+P(B \mid C)P(C)=75\%*50\% + 50\%* 50\% = 62.5\% \]
\[ \frac{P(B \mid A)}{P(B)}= 75\% / 62.5 \% = 1.2 \]
\[ P(A \mid B) = \frac{P(A)*P(B \mid A)}{P(B)} = \frac{1}{2}* 1.2 = \frac{3}{5} = 60\% \]
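The same arithmetic can be checked in R. This is a minimal sketch, assuming each bucket is equally likely to be chosen; the variable names are made up for this illustration.

```r
# Priors: assumed 50/50 chance of picking either bucket
p_A <- 0.5   # P(A): ball comes from bucket 1
p_C <- 0.5   # P(C): ball comes from bucket 2

# Likelihood of a white ball in each bucket
p_B_given_A <- 30 / (30 + 10)
p_B_given_C <- 20 / (20 + 20)

# Law of total probability: overall chance of drawing a white ball
p_B <- p_B_given_A * p_A + p_B_given_C * p_C

# Bayes' Theorem: probability the white ball came from bucket 1
p_A_given_B <- p_A * p_B_given_A / p_B
p_A_given_B
```

The result should agree with the 3/5 obtained by hand.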
options(repos = c(CRAN = "https://cran.rstudio.com/"))
install.packages("BiocManager")
## Installing package into 'C:/Users/pokem/AppData/Local/R/win-library/4.3'
## (as 'lib' is unspecified)
## package 'BiocManager' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\pokem\AppData\Local\Temp\RtmpohcMCT\downloaded_packages
library(BiocManager)
BiocManager::install("Rgraphviz")
## 'getOption("repos")' replaces Bioconductor standard repositories, see
## 'help("repositories", package = "BiocManager")' for details.
## Replacement repositories:
## CRAN: https://cran.rstudio.com/
## Bioconductor version 3.17 (BiocManager 1.30.22), R 4.3.1 (2023-06-16 ucrt)
## Warning: package(s) not installed when version(s) same as or greater than current; use
## `force = TRUE` to re-install: 'Rgraphviz'
## Installation paths not writeable, unable to update packages
## path: C:/Program Files/R/R-4.3.1/library
## packages:
## foreign, KernSmooth, lattice, Matrix, mgcv, nlme, spatial, survival
## Old packages: 'curl', 'evaluate', 'Hmisc', 'knitr', 'lubridate', 'markdown',
## 'openssl', 'plyr', 'psych', 'renv', 'rmarkdown', 'tinytex', 'withr'
library(Rgraphviz)
## Loading required package: graph
## Loading required package: BiocGenerics
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
## table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: grid
Guided Practice 3.43: Jose visits campus every Thursday evening. However, some days the parking garage is full, often due to college events. There are academic events on 35% of evenings, sporting events on 20% of evenings, and no events on 45% of evenings. When there is an academic event, the garage fills up about 25% of the time, and it fills up 70% of evenings with sporting events. On evenings when there are no events, it only fills up about 5% of the time. If Jose comes to campus and finds the garage full, what is the probability that there is a sporting event? Use a tree diagram to solve this problem.
Let:
A be the event of an academic event.
S be the event of a sporting event.
N be the event of no event.
F be the event of the garage being full.
P(A) = 0.35
P(S) = 0.20
P(N) = 0.45
P(F|A) = 0.25
P(F|S) = 0.70
P(F|N) = 0.05
\[ P(S \mid F) =\frac{P(S)*P(F\mid S)}{P(A)*P(F\mid A)+P(S)*P(F\mid S)+P(N)*P(F\mid N)} \]
P_A <- 0.35
P_S <- 0.20
P_N <- 0.45
P_F_A <- 0.25
P_F_S <- 0.70
P_F_N <- 0.05
Prob <- (P_S * P_F_S) / (P_A * P_F_A + P_S * P_F_S + P_N * P_F_N)
Prob
## [1] 0.56
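As a cross-check, the posterior for all three kinds of evening can be computed at once with named vectors. This is only a sketch; the names `prior`, `likelihood`, `joint`, and `posterior` are chosen for this illustration.

```r
# Prior probability of each kind of evening
prior <- c(academic = 0.35, sporting = 0.20, none = 0.45)

# P(garage full | kind of evening)
likelihood <- c(academic = 0.25, sporting = 0.70, none = 0.05)

# Joint probabilities P(event and full), one per branch of the tree
joint <- prior * likelihood

# P(full) by the law of total probability
p_full <- sum(joint)

# Bayes' Theorem applied to every event at once
posterior <- joint / p_full
round(posterior, 4)
```

The three posteriors should sum to 1, and the `sporting` entry should match the value of `Prob` computed above.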
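# Priors: a1 = P(academic event), a2 = P(sporting event), a3 = P(no event)
a1 <- .35
a2 <- .20
a3 <- .45
# Conditional probabilities that the garage is full given each kind of evening
fa1 <- .25
fa2 <- .7
fa3 <- .05
# Complements: P(garage not full | kind of evening)
notfa1 <- 1 - fa1
notfa2 <- 1 - fa2
notfa3 <- 1 - fa3
# Joint probabilities P(kind of evening AND garage full)
aANDfa1 <- a1 * fa1
bANDfa2 <- a2 * fa2
cANDfa3 <- a3 * fa3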
node1 <- "P"
node2 <- "a1"
node3 <- "a2"
node4 <- "a3"
node5 <- "a1ANDfa1"
node6 <- "notfa1"
node7 <- "a2ANDfa2"
node8 <- "notfa2"
node9 <- "a3ANDfa3"
node10 <- "notfa3"
nodeNames <- c(node1, node2, node3, node4, node5, node6, node7, node8, node9, node10)
rEG <- new("graphNEL",
nodes = nodeNames,
edgemode="directed"
)
rEG <- addEdge (nodeNames[1], nodeNames[2], rEG, 1)
rEG <- addEdge (nodeNames[1], nodeNames[3], rEG, 1)
rEG <- addEdge (nodeNames[1], nodeNames[4], rEG, 1)
rEG <- addEdge (nodeNames[2], nodeNames[5], rEG, 1)
rEG <- addEdge (nodeNames[2], nodeNames[6], rEG, 1)
rEG <- addEdge (nodeNames[3], nodeNames[7], rEG, 1)
rEG <- addEdge (nodeNames[3], nodeNames[8], rEG, 1)
rEG <- addEdge (nodeNames[4], nodeNames[9], rEG, 1)
rEG <- addEdge (nodeNames[4], nodeNames[10], rEG, 10)
eAttrs <- list()
q <- edgeNames(rEG)
eAttrs$label <- c(toString(a1), toString(a2),
toString(a3), toString(fa1),
toString(notfa1), toString(fa2),
toString(notfa2), toString(fa3),
toString(notfa3)
)
names(eAttrs$label) <- c( q[1], q[2], q[3], q[4], q[5], q[6], q[7], q[8], q[9])
edgeAttrs <- eAttrs
attributes <- list(node = list(label = "foo",
fillcolor = "green",
fontsize = "15"
),
edge = list(color = "red"),
graph = list(rankdir = "LR")
)
plot (rEG, edgeAttrs = eAttrs,attrs=attributes)
text(578,410, aANDfa1, cex = .8)
text(570,320,notfa1,cex=.8)
text(578,230, bANDfa2, cex = .8)
text(570,170,notfa2,cex=.8)
text(578,95, cANDfa3, cex = .8)
text(570,30,notfa3,cex=.8)
text(160,55, paste('P(S):', a2), cex = 1.1)
text(160,35, paste('P(F):', aANDfa1+bANDfa2+cANDfa3), cex = 1.1)
text(160,15, paste('P(S|F):', fa2*a2/(aANDfa1+bANDfa2+cANDfa3)), cex = 1.1)
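The value read off the tree can also be sanity-checked by simulation. The sketch below is not part of the exercise: it draws many hypothetical Thursday evenings with the stated probabilities and looks at how often a full garage coincides with a sporting event. The seed, the sample size, and all variable names are arbitrary choices made for this illustration.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 100000   # number of simulated evenings (arbitrary)

# Draw the kind of evening according to the stated priors
event <- sample(c("academic", "sporting", "none"), n,
                replace = TRUE, prob = c(0.35, 0.20, 0.45))

# P(garage full | kind of evening), looked up by name for each evening
p_full_given <- c(academic = 0.25, sporting = 0.70, none = 0.05)
full <- runif(n) < p_full_given[event]

# Among the evenings with a full garage, the share that had a sporting event
est <- mean(event[full] == "sporting")
est
```

With this many draws the estimate should land close to the exact answer of 0.56, though it will not match it digit for digit.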