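R provides the following functions for the binomial distribution: rbinom (random draws), dbinom (probability of an exact value), and pbinom (cumulative probability):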
rbinom(n = 10, size = 1, prob = 0.5)
## [1] 1 0 0 1 1 0 0 1 1 1
rbinom(1, 100, 0.3)
## [1] 33
A<-rbinom(10000,2,0.08)
head(A)
## [1] 0 0 0 0 0 0
table(A)
## A
## 0 1 2
## 8505 1424 71
table(A)/10000
## A
## 0 1 2
## 0.8505 0.1424 0.0071
dbinom(0,2,0.08)#three different lines
## [1] 0.8464
dbinom(1,2,0.08)
## [1] 0.1472
dbinom(2,2,0.08)
## [1] 0.0064
round(dbinom(0:2,2,0.08),4) #all 3 probabilities in one line
## [1] 0.8464 0.1472 0.0064
# Axis titles can be added via the xlab and ylab arguments.
barplot(table(A)/10000,main="PDF",xlab="X",
ylab="Probability",col=2)
cumsum(table(A))/10000
## 0 1 2
## 0.8505 0.9929 1.0000
round(pbinom(0:2,2,0.08),4)
## [1] 0.8464 0.9936 1.0000
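The relationship between dbinom and pbinom used above can be checked directly. The following is a minimal sketch; the variable names (pmf, cdf_from_pmf, cdf_direct) are illustrative and not part of the original example.

```r
# Exact probabilities for X = 0, 1, 2 with size = 2 and prob = 0.08
pmf <- dbinom(0:2, size = 2, prob = 0.08)

# Accumulating the exact probabilities should reproduce the CDF
cdf_from_pmf <- cumsum(pmf)
cdf_direct <- pbinom(0:2, size = 2, prob = 0.08)

# TRUE when both routes agree up to floating-point tolerance
all.equal(cdf_from_pmf, cdf_direct)
## [1] TRUE
```

The simulated values from cumsum(table(A))/10000 differ slightly from these because they come from a random sample.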
plot(0:2,cumsum(table(A))/10000,type='b',
xlab="X",ylab="Probability", main="CDF")
#change in shape when you increase n to 10
barplot(round(dbinom(0:10,10,0.08),4),main="PDF",xlab="X",
names.arg = 0:10, ylab="Probability",col=2,ylim=c(0,1))
A fair coin is tossed 60 times. Treating heads as a success, answer the following:
A<-rbinom(100000,60,0.5)
head(A)
## [1] 27 24 30 33 34 26
table(A)
## A
## 15 16 17 18 19 20 21 22 23 24 25 26
## 5 12 41 72 157 396 693 1251 2118 3210 4547 6110
## 27 28 29 30 31 32 33 34 35 36 37 38
## 7626 9011 9900 10279 9773 8816 7648 6120 4491 3100 2019 1222
## 39 40 41 42 43 44 45
## 715 349 184 84 34 10 7
table(A)/100000 #get prob
## A
## 15 16 17 18 19 20 21 22 23
## 0.00005 0.00012 0.00041 0.00072 0.00157 0.00396 0.00693 0.01251 0.02118
## 24 25 26 27 28 29 30 31 32
## 0.03210 0.04547 0.06110 0.07626 0.09011 0.09900 0.10279 0.09773 0.08816
## 33 34 35 36 37 38 39 40 41
## 0.07648 0.06120 0.04491 0.03100 0.02019 0.01222 0.00715 0.00349 0.00184
## 42 43 44 45
## 0.00084 0.00034 0.00010 0.00007
barplot(table(A)/100000,main="PDF", xlab="X",
ylab="Probability",col=2)
p<-round(dbinom(0:60,60,0.5),8)
barplot(p,main="PDF",names.arg = 0:60, xlab="X",
ylab="Probability",col=2)
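To see how closely the simulated proportions track the exact probabilities, the two can be compared value by value. This is only a sketch: the seed and the names sims, emp and theo are assumptions added for reproducibility and are not in the original.

```r
set.seed(123)  # assumed seed, only so the sketch is reproducible
sims <- rbinom(100000, size = 60, prob = 0.5)

# Simulated proportion for every possible count 0..60 (zeros where nothing was observed)
emp <- as.numeric(table(factor(sims, levels = 0:60))) / length(sims)

# Exact binomial probabilities for the same counts
theo <- dbinom(0:60, size = 60, prob = 0.5)

# Largest gap between the simulated and the exact probabilities
max(abs(emp - theo))
```

With 100000 draws the largest gap should be small, which is why the two bar plots look almost identical.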
b. What is the probability that heads comes up exactly 20 times?
dbinom(20,60,0.5)
## [1] 0.003635846
# prob 20 or 25 or 30 heads out of 60
dbinom(20,60,0.5)+
dbinom(25,60,0.5)+
dbinom(30,60,0.5)
## [1] 0.1512435
pbinom(19,60,0.5) #gives P(X<=19)
## [1] 0.003108801
pbinom(30,60,0.5) #gives P(X<=30)
## [1] 0.5512891
#p(20<=X<=30)
pbinom(30,60,0.5) - pbinom(19,60,0.5)
## [1] 0.5481803
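The same interval probability can be obtained by summing the exact probabilities for 20 through 30 heads. A minimal sketch, with illustrative variable names:

```r
# P(20 <= X <= 30) as a sum of exact probabilities
p_sum <- sum(dbinom(20:30, size = 60, prob = 0.5))

# The same probability as a difference of two CDF values
p_diff <- pbinom(30, size = 60, prob = 0.5) - pbinom(19, size = 60, prob = 0.5)

all.equal(p_sum, p_diff)
## [1] TRUE
```

Note that the lower CDF value is taken at 19, not 20, so that X = 20 stays inside the interval.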
cum_p<-round(pbinom(0:60,60,0.5),8)
plot(0:60,cum_p,type='l',
xlab="X",ylab="Probability", main="CDF",ylim=c(0,1))
\(P(x) = \binom{n}{x} p^{x}(1-p)^{n-x}\)
To find the probability of exactly 10 computers being infected:
\(P(10) = \binom{20}{10}(0.4)^{10}(0.6)^{10} = 184756 \times (0.4)^{10}(0.6)^{10} \approx 0.11714155\)
dbinom(10, size=20, prob=0.4)
## [1] 0.1171416
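The formula can also be evaluated by hand in R with choose(), which returns the binomial coefficient. A small sketch, assuming n = 20, x = 10 and p = 0.4 as in the dbinom call above:

```r
n <- 20   # number of computers
x <- 10   # number infected
p <- 0.4  # probability that a single computer is infected

# Binomial formula written out term by term
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)
by_hand

# Should agree with the built-in function
all.equal(by_hand, dbinom(x, size = n, prob = p))
## [1] TRUE
```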
Now find the probability that at least 10 computers are infected:
dbinom(10, size=20, prob=0.4) +
dbinom(11, size=20, prob=0.4) +
dbinom(12, size=20, prob=0.4) +
dbinom(13, size=20, prob=0.4) +
dbinom(14, size=20, prob=0.4) +
dbinom(15, size=20, prob=0.4) +
dbinom(16, size=20, prob=0.4) +
dbinom(17, size=20, prob=0.4) +
dbinom(18, size=20, prob=0.4) +
dbinom(19, size=20, prob=0.4) +
dbinom(20, size=20, prob=0.4)
## [1] 0.2446628
Alternatively, we can use the cumulative probability function for the binomial distribution, pbinom:
1 - pbinom(9, size=20, prob=0.4)
## [1] 0.2446628
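Because dbinom accepts a vector of values, the long sum can be written in one line, and pbinom can return the upper tail directly. A minimal sketch of both shortcuts (variable names are illustrative):

```r
# Sum of P(10), P(11), ..., P(20) in a single vectorized call
p_sum <- sum(dbinom(10:20, size = 20, prob = 0.4))

# Upper tail P(X > 9), which is the same event as P(X >= 10)
p_tail <- pbinom(9, size = 20, prob = 0.4, lower.tail = FALSE)

c(p_sum, p_tail)
```

Both values should match the 0.2446628 obtained above.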
\(P(X > 3) = 1 - P(X \le 3)\)
1 - pbinom(3, size=10, prob=0.1)
## [1] 0.0127952
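As a cross-check on the complement rule, the same probability can be computed as a direct sum over 4 through 10, or with the lower.tail argument of pbinom. A sketch with illustrative names:

```r
# Complement of the CDF, as above
p_complement <- 1 - pbinom(3, size = 10, prob = 0.1)

# Direct sum of P(4), ..., P(10)
p_sum <- sum(dbinom(4:10, size = 10, prob = 0.1))

# Upper tail requested directly from pbinom
p_upper <- pbinom(3, size = 10, prob = 0.1, lower.tail = FALSE)

c(p_complement, p_sum, p_upper)
```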
\(P(x) = p(1-p)^{x-1}\)
The geometric distribution in R is zero-based (it counts the failures before the first success), so to get P(6):
dgeom(5, prob=0.4)
## [1] 0.031104
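The shift by one can be confirmed by evaluating the formula by hand. A minimal sketch, where x is the trial on which the first success occurs and x - 1 is the number of failures that dgeom expects:

```r
p <- 0.4  # probability of success on each trial
x <- 6    # trial on which the first success occurs

# Formula: p * (1 - p)^(x - 1)
by_hand <- p * (1 - p)^(x - 1)
by_hand

# dgeom takes the number of failures before the first success, x - 1
all.equal(by_hand, dgeom(x - 1, prob = p))
## [1] TRUE
```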
But the question asks for “at least”, so we would need to sum P(6), P(7), and so on. Alternatively, use the complement:
\(P(x \ge 6) = 1 - P(X < 6)\)
1 - (dgeom(0, prob=0.4) +
dgeom(1, prob=0.4) +
dgeom(2, prob=0.4) +
dgeom(3, prob=0.4) +
dgeom(4, prob=0.4))
## [1] 0.07776
1 - pgeom(4,0.4)
## [1] 0.07776
The lower.tail argument defaults to TRUE, which returns P[X <= x]; with lower.tail = FALSE the function returns P[X > x]. So pgeom(4, 0.4, lower.tail = FALSE) gives the probability of more than 4 failures, that is, the first success occurring after trial 5 (at trial 6 or later):
pgeom(4,0.4,lower.tail = FALSE)
## [1] 0.07776
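For the geometric distribution this tail also has a simple closed form: the first success falls on trial k or later exactly when the first k - 1 trials all fail, so the probability is (1-p)^(k-1). A sketch under that reasoning, with illustrative names:

```r
p <- 0.4  # probability of success on each trial
k <- 6    # "at least" this trial number

# All of the first k - 1 trials must be failures
tail_closed_form <- (1 - p)^(k - 1)

# Same event through pgeom: more than k - 2 failures before the first success
tail_pgeom <- pgeom(k - 2, prob = p, lower.tail = FALSE)

c(tail_closed_form, tail_pgeom)
```

Both values should equal the 0.07776 found above.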
#var is 25
#parameters that describe a normal dist are mean and sd
mu<-22
sigma<-sqrt(25)
sigma
## [1] 5
X<-rnorm(100000,mean=mu,sd=sigma)
hist(X)
plot(density(X),main="Normal PDF",xlab="X")
b. Display the cumulative distribution function on a graph.
x<-seq(0,44,length=1000) #create a sequence of numbers
x[1:10]
## [1] 0.00000000 0.04404404 0.08808809 0.13213213 0.17617618 0.22022022
## [7] 0.26426426 0.30830831 0.35235235 0.39639640
x[990:1000]
## [1] 43.55956 43.60360 43.64765 43.69169 43.73574 43.77978 43.82382
## [8] 43.86787 43.91191 43.95596 44.00000
dnorm(x[500:510],mu,sigma)
## [1] 0.07978768 0.07978768 0.07978149 0.07976911 0.07975054 0.07972579
## [7] 0.07969487 0.07965777 0.07961452 0.07956511 0.07950957
plot(x,dnorm(x,mu,sigma), type='l',
main="Normal PDF")
plot(x,pnorm(x,mu,sigma),type='l',
main="Normal CDF",xlab="X",
ylab="Cumulative Probability")
pnorm(27.5,mu,sigma)-pnorm(16.2,mu,sigma)
## [1] 0.7413095
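The same interval probability can be found by standardizing the endpoints and using the standard normal distribution, which is how it would be done with a z-table. A minimal sketch, assuming mean 22 and standard deviation 5 as defined earlier (z_low, z_high and p_z are illustrative names):

```r
mu <- 22
sigma <- 5

# Convert the endpoints to z-scores: z = (x - mu) / sigma
z_low <- (16.2 - mu) / sigma   # -1.16
z_high <- (27.5 - mu) / sigma  #  1.10

# pnorm with no mean/sd arguments uses the standard normal (mean 0, sd 1)
p_z <- pnorm(z_high) - pnorm(z_low)

all.equal(p_z, pnorm(27.5, mu, sigma) - pnorm(16.2, mu, sigma))
## [1] TRUE
```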
pnorm(29,mu,sigma) #p(x<29)
## [1] 0.9192433
1- pnorm(29,mu,sigma)#p(x>29)
## [1] 0.08075666
pnorm(17,mu,sigma) #P(X<=17)=P(X<17)
## [1] 0.1586553
1-pnorm(25,mu,sigma) #P(x>25)
## [1] 0.2742531
pnorm(15,mu,sigma) #P(x<15)
## [1] 0.08075666
1-pnorm(25,mu,sigma)+pnorm(15,mu,sigma)
## [1] 0.3550098
#P(X<15 or X>25) = P(X<15)+P(X>25)
pnorm(15,mu,sigma)+(1-pnorm(25,mu,sigma))
## [1] 0.3550098
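Since the two tails are disjoint, the same answer follows from the complement of the middle interval, or from requesting the upper tail directly. A sketch, again assuming mean 22 and standard deviation 5 (variable names are illustrative):

```r
mu <- 22
sigma <- 5

# Complement of the middle: 1 - P(15 <= X <= 25)
p_complement <- 1 - (pnorm(25, mu, sigma) - pnorm(15, mu, sigma))

# Lower tail plus upper tail, with the upper tail taken via lower.tail = FALSE
p_tails <- pnorm(15, mu, sigma) + pnorm(25, mu, sigma, lower.tail = FALSE)

c(p_complement, p_tails)
```

Both values should match the 0.3550098 computed above.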