# Compute 3! and 10!
prod(3:1)
## [1] 6
prod(10:1)
## [1] 3628800
# Compute 10*9*8*7*6*5*4
prod(10:4)
## [1] 604800
# Compute (10*9*8*7*6*5*4)/(40*39*38*37*36)
prod(10:4)/prod(40:36)
## [1] 0.007659481
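A combination of k elements chosen from n is any subset of k elements of a set of n elements. choose(n, k) counts these subsets, so 1/choose(n, k) is the probability of one particular subset when all subsets are equally likely.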
choose(5,2)
## [1] 10
1/choose(5,2)
## [1] 0.1
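The count returned by choose() can be cross-checked against the factorial and product calculations used earlier. The sketch below is only an illustration; the variable names are our own, and combn() is used to list the subsets themselves.

```r
# Number of ways to choose k = 2 elements from n = 5, computed three ways
n <- 5
k <- 2
by_choose    <- choose(n, k)
by_factorial <- factorial(n) / (factorial(k) * factorial(n - k))
by_product   <- prod(n:(n - k + 1)) / prod(k:1)   # (5*4) / (2*1)

# combn() enumerates the subsets, one per column
subsets <- combn(n, k)

print(c(by_choose, by_factorial, by_product))
print(ncol(subsets))   # number of subsets listed by combn()
```

All three values should agree, and the number of columns of `subsets` should match them.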
Example 1: Binomial probability mass function: dbinom(k, n, p)
A class has 10 people, 6 of whom are female. If 3 people are chosen at random, what is the probability that exactly 2 of them are female?
# We know that 6/10 are female --> the probability of picking a female is 0.6
dbinom(2,3,0.6)
## [1] 0.432
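The dbinom() value can be reproduced from the binomial formula C(n, k) p^k (1-p)^(n-k). The binomial model treats each pick as independent with p = 0.6. As a comparison only, the sketch below also evaluates the case where the 3 people are drawn without replacement from the 10, which is our assumption about the sampling scheme and not something the example states; that case is handled by dhyper().

```r
# Binomial formula written out: C(3, 2) * 0.6^2 * 0.4^1
p_binom_manual <- choose(3, 2) * 0.6^2 * (1 - 0.6)^(3 - 2)
p_binom        <- dbinom(2, size = 3, prob = 0.6)

# Assumption for comparison: 3 people drawn WITHOUT replacement
# from 10 (6 female, 4 male), i.e. the hypergeometric setting
p_hyper <- dhyper(2, m = 6, n = 4, k = 3)

print(c(binomial = p_binom, manual = p_binom_manual, hypergeometric = p_hyper))
```

With a class this small the two models give noticeably different answers, so it is worth deciding which sampling scheme is intended.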
Example 2: Cumulative binomial probability: pbinom(k, n, p)
The probability that an anti-osteoporosis drug is effective is about 70% (p = 0.70). If we treat 10 patients, what is the probability that at least 8 of them respond positively? We need P(X >= 8). The function pbinom(k, n, p) gives P(X <= k). Therefore P(X >= 8) = 1 - P(X <= 7), so the answer is
1-pbinom(7,10,0.7)
## [1] 0.3827828
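The same upper-tail probability can be obtained in two other ways, which is a useful check on the "1 - P(X <= 7)" step: with the lower.tail = FALSE argument of pbinom(), or by summing dbinom() over 8, 9 and 10. A minimal sketch (variable names are ours):

```r
# P(X >= 8) for X ~ Binomial(10, 0.7), three equivalent calculations
p_complement <- 1 - pbinom(7, size = 10, prob = 0.7)
p_upper_tail <- pbinom(7, size = 10, prob = 0.7, lower.tail = FALSE)
p_summed     <- sum(dbinom(8:10, size = 10, prob = 0.7))

print(c(p_complement, p_upper_tail, p_summed))
```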
Example 3: Simulating a binomial variable: rbinom(n, size, prob). Suppose about 20% of a population has hypertension. If we draw 1000 random samples of 20 people each from this population, what does the distribution of the number of hypertensive patients per sample look like?
b=rbinom(1000,20,0.20)
table(b)
## b
##   0   1   2   3   4   5   6   7   8   9  11
##  12  53 131 234 221 176  93  42  31   6   1
hist(b,main="number of hypertensive patients")
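To judge whether a simulation like this behaves as expected, the simulated counts can be set beside the theoretical values of a Binomial(20, 0.2) variable (mean n*p = 4, variance n*p*(1-p) = 3.2). The sketch below fixes a seed so the run is reproducible; the seed value is arbitrary and is our addition, so the numbers will differ from the table above.

```r
set.seed(123)  # arbitrary seed, used only to make the run reproducible
b <- rbinom(1000, size = 20, prob = 0.20)

# Theoretical versus simulated mean and variance
theory   <- c(mean = 20 * 0.20, var = 20 * 0.20 * 0.80)
observed <- c(mean = mean(b), var = var(b))
print(rbind(theory, observed))

# Observed relative frequencies next to dbinom() probabilities for 0..10
k <- 0:10
obs_freq <- as.vector(table(factor(b, levels = k))) / length(b)
print(round(cbind(k, obs_freq, expected = dbinom(k, 20, 0.20)), 3))
```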
Example 4: Applying the binomial distribution. 20 customers are invited to taste two beers, A and B, and are asked which one they prefer. Result: 16 people prefer beer A. Is this result enough to conclude that beer A is preferred over beer B, or could it be due to chance alone? Answer: under the hypothesis of no difference, the probability of preferring beer A is p = 0.50 and of preferring beer B is q = 0.50. If this hypothesis is true, what is the probability of observing at least 16 of the 20 people preferring beer A?
1-pbinom(15,20,0.5)
## [1] 0.005908966
The answer is a probability of about 0.006 (0.6%). In other words, if the two beers were really equally liked, the probability that 16 or more of 20 people prefer beer A would be only about 0.6%. So we have evidence that beer A is truly preferred by more people than beer B, rather than the result being due to chance.
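The tail probability computed with pbinom() is the p-value of a one-sided exact binomial test. As a cross-check, and assuming a one-sided alternative ("beer A is preferred") is what we want, the built-in binom.test() should return the same number:

```r
# Tail probability P(X >= 16) under p = 0.5
p_tail <- 1 - pbinom(15, size = 20, prob = 0.5)

# One-sided exact binomial test of H0: p = 0.5 against p > 0.5
test <- binom.test(16, 20, p = 0.5, alternative = "greater")

print(c(p_tail = p_tail, p_value = test$p.value))
```

A two-sided test (alternative = "two.sided") would give a larger p-value, so the choice of alternative should be made before looking at the data.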
The Poisson distribution is, broadly speaking, very similar to the binomial, except that the parameter p is usually very small and n is usually very large. For this reason the Poisson distribution is often used to describe counts of rare events (for example, the number of cancer cases in a population).
Example 5: Poisson probability mass function: dpois(k, lambda). From many months of observation, a typist's misspelling rate is known: on average, 1 misspelled word per 2,000 words typed. What is the probability that the typist misspells exactly 2 words, or more than 2 words, in a text of 2,000 words (so that lambda = 1)?
dpois(2,1)
## [1] 0.1839397
# We can also compute the probability of exactly 1 error
dpois(1,1)
## [1] 0.3678794
# And the probability of no errors at all
dpois(0,1)
## [1] 0.3678794
The value above is the probability that the typist makes exactly 2 errors. The probability of more than 2 errors (that is, 3, 4, 5, ... errors) can be computed as: P(X > 2) = 1 - P(X <= 2) = 1 - 0.3679 - 0.3679 - 0.1839 = 0.0803
# In R we can compute this as follows:
# P(x<=2)
ppois(2,1)
## [1] 0.9196986
# 1-P(x<=2)
1-ppois(2,1)
## [1] 0.0803014
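The claim that the Poisson resembles a binomial with large n and small p can be checked on the typist example. The sketch assumes a text of 2,000 words with an error probability of 1/2000 per word (the text length is our assumption; it is what makes lambda = n*p = 1):

```r
n_words <- 2000        # assumed length of the text, in words
p_error <- 1 / 2000    # error probability per word, from the example
lambda  <- n_words * p_error

# Probabilities of 0, 1, 2, 3 errors under both models
k <- 0:3
comparison <- cbind(k,
                    binomial = dbinom(k, n_words, p_error),
                    poisson  = dpois(k, lambda))
print(round(comparison, 5))
```

The two columns should agree to several decimal places, which is why the simpler Poisson form is used here.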
For continuous variables there are several other suitable distributions, the most important of which is the normal distribution. The normal distribution is the most important foundation of statistical analysis. The normal density has 2 parameters: the mean μ (mu) and the variance σ² (or the standard deviation σ).
Example 6: Normal probability density function. The mean height of Vietnamese women is 156 cm, with a standard deviation of 4.6 cm. We also know that height follows a normal distribution. With the 2 parameters μ = 156 and σ = 4.6, we can build a height distribution for the whole population of Vietnamese women, and it has the following shape:
height=seq(130, 200, 1)
plot(height, dnorm(height, 156, 4.6),
type="l",
ylab="f(height)",
xlab="Height",
main="Probability distribution of height in Vietnamese women")
With these 2 parameters (and the plot), we can evaluate the density at any height. For example, the density at a height of 160 cm (this is the height of the curve at that point, not a probability, because height is continuous):
dnorm(160, mean=156, sd=4.6)
## [1] 0.05942343
Because height is a continuous variable, in practice we rarely want the probability of one specific value x. We usually want the probability of a range of values from a to b, for example 150 to 160 cm (that is, P(150 <= X <= 160)), or the probability that height is below 145 cm (P(X < 145)).
# The probability that a Vietnamese woman's height is <= 150 cm is 9.6%
pnorm(150, 156, 4.6)
## [1] 0.09605751
# The probability that a Vietnamese woman's height is >= 164 cm is 4.1%
1-pnorm(164,156, 4.6)
## [1] 0.04100591
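The two probabilities mentioned in the text, P(150 <= X <= 160) and P(X < 145), are not computed above. A minimal sketch: an interval probability is the difference of two pnorm() values, which we also check against numerical integration of the density with integrate() (the variable names are ours).

```r
mu    <- 156   # mean height in cm
sigma <- 4.6   # standard deviation in cm

# P(150 <= X <= 160) as a difference of cumulative probabilities
p_150_160 <- pnorm(160, mu, sigma) - pnorm(150, mu, sigma)

# Same area obtained by integrating the density curve numerically
area <- integrate(dnorm, lower = 150, upper = 160, mean = mu, sd = sigma)$value

# P(X < 145)
p_below_145 <- pnorm(145, mu, sigma)

print(c(p_150_160 = p_150_160, area = area, p_below_145 = p_below_145))
```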
Example 7: Applying the normal distribution. In a population we know that mean blood pressure is 100 mmHg with a standard deviation of 13 mmHg. What proportion of people in this population have a blood pressure of 120 mmHg or higher?
1-pnorm(120, mean=100, sd=13)
## [1] 0.0619679
That is, about 6.2% of people in this population have a blood pressure >= 120 mmHg.
Standardization puts variables on a common scale so that they can be compared. We standardize X so that its mean is 0 and its variance is 1, using z = (x - μ)/σ.
z=seq(-4, 4, 0.1)
plot(z, dnorm(z, 0, 1),
type="l",
ylab="f(z)",
xlab="z",
main="Standard normal distribution")
The standardized distribution has one convenience: we can use it to describe and compare the distribution of any variable, because all of them are converted to z-scores.
We can use R to compute the probability that z is less than some constant. Example: we want to find P(z <= -1.96) for a distribution with mean 0 and standard deviation 1.
pnorm(-1.96, mean=0, sd=1)
## [1] 0.0249979
Or P(z <= 1.96):
pnorm(1.96, mean=0, sd=1)
## [1] 0.9750021
Therefore, P(-1.96 < z < 1.96) is
pnorm(1.96) - pnorm(-1.96)
## [1] 0.9500042
In other words, there is a 95% probability that z lies between -1.96 and 1.96. (Note that in the command above we did not supply mean=0, sd=1, because the default values of the mean and sd parameters of pnorm are 0 and 1.)
Example 6 (continued): A woman who is 170 cm tall corresponds to z = (170-156)/4.6 = 3.04 standard deviations above the mean, and the proportion of Vietnamese women taller than 170 cm is very low, only about 0.1%.
1-pnorm(3.04)
## [1] 0.001182891
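Standardizing by hand and then calling pnorm() with its defaults should give the same answer as passing the original mean and standard deviation directly. A short sketch of that equivalence, using the height example (variable names are ours):

```r
mu    <- 156
sigma <- 4.6
x     <- 170

# Standardize: number of standard deviations above the mean
z <- (x - mu) / sigma

p_from_z <- 1 - pnorm(z)                        # standard normal scale
p_direct <- 1 - pnorm(x, mean = mu, sd = sigma) # original scale

print(c(z = z, p_from_z = p_from_z, p_direct = p_direct))
```

The small difference from the value printed above comes only from rounding z to 3.04 before calling pnorm().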
Sometimes we need to do the calculation in reverse. Suppose we want to know: if the probability that Z is less than some constant z equals a given p, what is z? That is, P(Z < z) = p. To answer this question we use the function qnorm(p, mean=, sd=).
Example 8: Given that Z ~ N(0,1) and P(Z < z) = 0.95, we want to find z.
qnorm(0.95, mean=0, sd=1)
## [1] 1.644854
Or P(Z < z) = 0.975 for a normal distribution with mean 0 and standard deviation 1:
qnorm(0.975, mean=0, sd=1)
## [1] 1.959964
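Since qnorm() is the inverse of pnorm(), feeding its result back into pnorm() should recover the original probability. The sketch below checks this and, as an illustration of our own, uses qnorm() with the height parameters from Example 6 to get the range that contains the central 95% of heights.

```r
p <- c(0.95, 0.975)
z <- qnorm(p)          # quantiles of the standard normal
print(z)
print(pnorm(z))        # applying pnorm() recovers p

# Central 95% range of height for N(mean = 156, sd = 4.6)
height_range <- qnorm(c(0.025, 0.975), mean = 156, sd = 4.6)
print(height_range)
```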
The t, F and chi-squared (χ²) distributions are, in practice, functions of the normal distribution.
The chi-squared distribution arises from the sum of squares of independent standard normal variables (mean = 0, variance = 1).
Example 9: Evaluating a chi-squared variable therefore needs only 2 inputs: the value u and the degrees of freedom df. For example, to evaluate the density at u = 21 with df = 13, we simply use dchisq(u, df):
dchisq(21, 13)
## [1] 0.01977879
Find the probability that a variable u is at most 21 with 13 degrees of freedom, that is, P(u <= 21 | df = 13):
pchisq(21, 13)
## [1] 0.9270714
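We can also say that the result above gives P(χ² < 21) = 0.927. Going the other way, qchisq(p, df) returns the quantile for a given probability, and pchisq() recovers that probability: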
qchisq(0.90, 15)
## [1] 22.30713
pchisq(22.30713, 15)
## [1] 0.9
qchisq(0.95, 15)
## [1] 24.99579
pchisq(24.99579, 15)
## [1] 0.95
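The statement that a chi-squared variable is a sum of squared standard normal variables can be checked by simulation. In the sketch below, the seed and the number of simulated values are arbitrary choices of ours; each row of the matrix holds 13 standard normal draws, and the row sum of squares should behave like a chi-squared variable with 13 degrees of freedom.

```r
set.seed(1)        # arbitrary seed for reproducibility
n_sim <- 20000     # number of simulated chi-squared values (our choice)
df    <- 13

# Each row: df independent N(0, 1) draws; square and sum across the row
z <- matrix(rnorm(n_sim * df), nrow = n_sim, ncol = df)
u <- rowSums(z^2)

simulated <- mean(u <= 21)    # simulated P(u <= 21)
exact     <- pchisq(21, df)   # value from the chi-squared distribution

print(c(simulated = simulated, exact = exact))
```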
If the underlying normal variables have a mean different from 0 (with variance still 1), the sum of squares follows a noncentral chi-squared distribution.
# Find the probability that u <= 21, with 13 degrees of freedom and a non-centrality parameter of 5.4
pchisq(21, 13, 5.4)
## [1] 0.6837649
# Find the 50% quantile (median) of a chi-squared distribution with 7 degrees of freedom and a non-centrality parameter of 3
qchisq(0.5, 7, 3)
## [1] 9.180148
Example 10: Find the probability that x is greater than 1.1, where x follows a t distribution with 6 degrees of freedom
1-pt(1.1, 6)
## [1] 0.1567481
That is, P(t6 > 1.1) = 1 - P(t6 < 1.1) = 0.157
# Find the 95% quantile of a t distribution with 15 degrees of freedom
qt(0.95, 15)
## [1] 1.75305
In other words, P(t15 < 1.75305) = 0.95
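A few properties of pt() and qt() are worth checking alongside these results: the upper tail can be requested directly with lower.tail = FALSE, the t distribution is symmetric around 0, and its quantiles approach those of the standard normal as the degrees of freedom grow. The sketch below illustrates this; the value df = 1000 is just an example of "large".

```r
# Upper tail P(t6 > 1.1), three equivalent ways
p_upper        <- 1 - pt(1.1, df = 6)
p_upper_direct <- pt(1.1, df = 6, lower.tail = FALSE)
p_mirror       <- pt(-1.1, df = 6)   # symmetry: P(t6 < -1.1)
print(c(p_upper, p_upper_direct, p_mirror))

# qt() and pt() are inverses of each other
q95 <- qt(0.95, df = 15)
print(pt(q95, df = 15))

# With many degrees of freedom the t quantile is close to the normal one
print(c(t_df15 = q95, t_df1000 = qt(0.95, df = 1000), normal = qnorm(0.95)))
```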
The ratio of two independent chi-squared variables, each divided by its degrees of freedom, can be shown to follow an F distribution.
Example 11: Find the probability that F > 3.24, given that the variable follows an F distribution with 3 and 15 degrees of freedom and a non-centrality parameter of 5
1-pf(3.24, 3, 15, 5)
## [1] 0.3558721
Therefore, P(F3,15,5 > 3.24) = 1 - P(F3,15,5 <= 3.24) = 0.3558721
# With 3 and 15 degrees of freedom, find C such that P(F3,15 > C) = 0.05
qf(1-0.05, 3, 15)
## [1] 3.287382
In other words, P(F3,15 > 3.287382) = 1 - P(F3,15 <= 3.287382) = 1 - 0.95 = 0.05
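The link between the F and chi-squared distributions can be illustrated by simulation for the central case (no non-centrality parameter). In this sketch, whose seed and simulation size are arbitrary choices of ours, each chi-squared draw is divided by its degrees of freedom before taking the ratio, and about 5% of the ratios should exceed the critical value from qf().

```r
set.seed(2)        # arbitrary seed for reproducibility
n_sim <- 20000     # number of simulated ratios (our choice)

# Ratio of two independent chi-squared variables, each scaled by its df
f <- (rchisq(n_sim, df = 3) / 3) / (rchisq(n_sim, df = 15) / 15)

crit      <- qf(0.95, 3, 15)      # critical value C with P(F > C) = 0.05
simulated <- mean(f > crit)       # share of simulated ratios above C
exact     <- 1 - pf(crit, 3, 15)

print(c(simulated = simulated, exact = exact))
```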
Simulation is useful when a limited sample size makes it hard to estimate parameters accurately. We simulate in order to learn how much one or more parameters vary. Simulation usually relies on probability distributions.
Example 12: Simulation to show that the variance of the mean equals the variance divided by n
# Now we use the 2 parameters "mean" and "variance" to simulate 500 draws
# Command 1: create the 3 values of x
# Command 2: enter the probability of each value of x
# Command 3: sample() asks R to generate 500 random draws and store them in the object draws
x=c(1,3,5)
px=c(0.6, 0.3, 0.1)
draws=sample(x, size=500, replace=T, prob=px)
hist(draws, breaks=seq(1,5, by=0.25), main="500 draws")
var(draws)
## [1] 1.956168
From the probability distribution we know that, on average, 60% of draws take the value "1", 30% take the value "3", and 10% take the value "5". We therefore expect to observe about 300, 150 and 50 draws of each value, respectively.
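The code above draws 500 values and reports their variance, but it does not yet show the claim in the example title, that the variance of a sample mean equals the population variance divided by n. One way to check it, sketched here with a sample size (n = 25), a number of repetitions (2000) and a seed that are our own assumptions, is to draw many samples, take the mean of each, and compare the variance of those means with σ²/n.

```r
set.seed(42)                 # arbitrary seed for reproducibility
x  <- c(1, 3, 5)
px <- c(0.6, 0.3, 0.1)

# Population mean and variance implied by the distribution
mu     <- sum(x * px)
sigma2 <- sum((x - mu)^2 * px)

n <- 25                      # assumed sample size per replicate
means <- replicate(2000, mean(sample(x, size = n, replace = TRUE, prob = px)))

print(c(mu = mu, sigma2 = sigma2))
print(c(var_of_means = var(means), sigma2_over_n = sigma2 / n))
print(500 * px)              # expected counts in 500 draws
```

The two variances in the second line of output should be close, and they get closer as the number of repetitions increases.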
# Test for the whooping cough study
x=c(1, 3, 5)
px=c(0.1, 0.833, 0.067)
draws=sample(x, size=600, replace=T, prob=px)
hist(draws, breaks=seq(1,5, by=0.5), main="600 samples")
Example: We have a population of 40 people (IDs 1, 2, 3, ..., 40). If we want to choose 5 subjects from this population, who will be chosen? We can use the sample() command:
sample(1:40, 5)
## [1] 26 17 31 32 36
Each time this command is run, R will generally select a different sample.
sample(1:40, 5)
## [1] 38 7 36 23 5
The above is random sampling without replacement: once a subject has been chosen, it is not put back into the population. In sampling with replacement, each chosen subject is returned to the population before the next draw.
Example: We want to choose 10 people from a population of 50 by random sampling with replacement. We only need to add the argument replace = TRUE:
sample(1:50, 10, replace=TRUE)
## [1] 23 45 26 21 6 2 26 21 7 49
Or toss a coin 10 times. Each toss has, of course, 2 possible outcomes, H and T, and the result of 10 tosses might be:
sample(c("H", "T"), 10, replace=T)
## [1] "H" "T" "H" "H" "T" "T" "H" "T" "T" "H"
Example: We have 5 blue balls (X) and 5 red balls (D) in a bag. We draw one ball, note its color and put it back in the bag, then draw another ball, note its color and put it back again. If we repeat this 20 times, the result might be:
sample(c("X", "D"), 20, replace=T)
## [1] "X" "D" "X" "X" "X" "X" "D" "X" "D" "D" "X" "X" "X" "X" "X" "D" "X" "X" "D"
## [20] "X"
In addition, we can sample with prespecified probabilities. In the call below we draw 10 items from the numbers 1 to 5, but with unequal probabilities:
sample(5,10, prob=c(0.3, 0.4, 0.1, 0.1, 0.1), replace=T)
## [1] 4 2 1 2 1 3 5 1 3 3
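Because sample() gives a different result on every call, it helps to fix the random seed with set.seed() when a result has to be reproduced. The sketch below shows this, and also checks that with many draws the observed proportions approach the probabilities passed in prob. The seed value and the number of draws (10000) are arbitrary choices of ours.

```r
# Same seed, same sample
set.seed(2024)
first <- sample(1:40, 5)
set.seed(2024)
second <- sample(1:40, 5)
print(identical(first, second))

# Weighted sampling: observed proportions versus target probabilities
probs <- c(0.3, 0.4, 0.1, 0.1, 0.1)
draws <- sample(5, 10000, prob = probs, replace = TRUE)
observed <- as.vector(table(factor(draws, levels = 1:5))) / length(draws)
print(round(rbind(target = probs, observed = observed), 3))
```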