Weak Law of Large Numbers
Let \(X_1, X_2, ..., X_n\) be a sequence of independent and identically distributed random variables having mean \(\mu\). Then, for any \(\epsilon > 0\), we have:
\[P\left( \left|\frac{X_1 + \dots + X_n}{n} - \mu \right| > \epsilon \right) \stackrel{\text{n} \rightarrow \infty}{\longrightarrow} 0 \ \ or \ \ \lim_{n \to \infty} P\left( \left| \bar{X_n} - \mu \right| > \epsilon \right) = 0\]
弱大數法則同時也是機率收斂的特殊案例,故放在一起講。最簡單的理解就是,樣本平均會機率收斂至母體平均。依循這樣的極限概念,我利用隨機生成 \(\Gamma(\alpha = 1, \beta = 1)\) 變數的方式來計算不同樣本數 \(n\) 下的平均值。下圖為這個部分的示意圖,可以從中清楚知道在樣本數不夠大的時候,平均值不太貼近母體期望值。但當樣本數逐漸擴大,。但當樣本數逐漸擴大,樣本估計值會逐漸收斂至目標母體參數。這順便可以連結到統計上所謂的一致性,也就是當實務上我們有能力去搜集非常大樣本的時候,我們確實是可以改善估計精確度的。所謂世界上能用錢解決的都是小事,一致性的重要程度在於保證這項道理,若少了一致性澤不管花再多代價搜集資料,最終都不會是目標參數的估計,等於徒勞。
set.seed(1234)
n<- 10000
fin<- c()
# par(mfrow = c(2, 1))
# plot(1:n, seq(0, 2, length.out = n), type = "n", main = expression(paste("Sequence of ", bar(X), sep = "")), xlab = "", ylab = "")
# abline(h = 1, lwd = 2)
for (j in 1:n){
x<- rgamma(j, shape = 1, scale = 1) # E(X)=1 Var(X)=1
fin[j]<- mean(x)
}
plot(1:n, fin, type = "b", main = expression(paste("Sequence of ", bar(X), sep = "")), xlab = "n", ylab = expression(bar(X)), cex = 0.6, col = 2)
abline(h = 1, lwd = 2)
簡單來說,我的想法是將標準均勻分配隨機亂數樣本的最大順序統計量 \(X_{(n)}\) 一一取出來,隨著我抽取的樣本越來越多,直覺上來理解,我每次出現的 \(X_{(n)}\) 就會逼近1。這個實驗的目的就是要看說隨著實驗次數的增加,有沒有辦法讓這個 \(X_{(n)}\) 能夠幾乎確信的收斂至1。不過得不好意思地說,我這邊沒有搞很懂,我只能用證據來說確實是有這跡象的,如下圖所示。
n<-2000
B<-50
set.seed(123)
xn.max<-matrix(NA, B, n)
plot(1, 1, type = "n", xlim = c(1, n), ylim=c(0.3,1), main = "", xlab = "n", ylab = "max(x1,...,xn)")
for (j in 1:n) {
xn<-matrix(runif(B*j), B, j)
xn.max[,j]<-apply(xn, 1, max)
points(rep(j, B), xn.max[,j])
}
The CLT
Let \(X_1, X_2, ..., X_n\) be a sequence of independent and identically distributed random variables having mean \(\mu < \infty\) and variance \(0 < \sigma^2 < \infty\). Then:
\[Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\]
converges in distribution to standard normal random variable as \(n \rightarrow \infty\), that is:
\[\lim_{n \to \infty} P\left( Z_n < x \right) = \Phi(x), ~~ \forall x \in \mathbb{R}\]
例子挑最經典的講,那肯定得是中央極限定理。關鍵在於中央極限定理是不需要常態分配假設的,這次也是使用 \(\Gamma(\alpha = 1, \beta = 1)\) 生成隨機變數來模擬分配。利用下面例圖,可以看到上下兩排分別是不同樣本數下的樣本平均機率密度分配和樣本平均累積分配。當樣本數越大,越接近中央極限定理理論上的機率分配函數,符合分配收斂。要不是因為硬體上的限制我很想試試樣本數一百萬,一定會跟理論值更貼合,但作業一定會遲交,下次再說。
set.seed(1234)
# n= 10
n<-10
B<-1000
xbar<-apply( matrix(rgamma(B*n, shape=1, rate=1), B, n), 1, mean)
mu<-1;sigma<-1
par(mfrow = c(2, 2))
plot(density(xbar), xlab=expression(bar(X)), main=paste("pdf, n = ", n))
curve(dnorm(x, mean = mu, sd = sigma/sqrt(n)), col = 2, add = T)
legend("topright", c("Empirical", "CLT"), col = c(1, 2), lty = 1)
plot(ecdf(xbar), main=paste("cdf, n = ", n)) # Empirical Cumulative Distribution Function
curve(pnorm(x, mean = mu, sd = sigma/sqrt(n) ), add = T, col = 2 )
legend("bottomright", c("Empirical", "CLT"), col = c(1, 2), lty = 1)
# n = 10000
n<-10000
B<-1000
xbar<-apply( matrix(rgamma(B*n, shape=1, rate=1), B, n), 1, mean)
mu<-1;sigma<-1
plot(density(xbar), xlab=expression(bar(X)), main=paste("pdf, n = ", n))
curve(dnorm(x, mean = mu, sd = sigma/sqrt(n)), col = 2, add = T)
legend("topright", c("Empirical", "CLT"), col = c(1, 2), lty = 1)
plot(ecdf(xbar), main=paste("cdf, n = ", n)) # Empirical Cumulative Distribution Function
curve(pnorm(x, mean = mu, sd = sigma/sqrt(n) ), add = T, col = 2 )
legend("bottomright", c("Empirical", "CLT"), col = c(1, 2), lty = 1)
事實上如果深究,我這壓根和這學期的內容扯不上邊,頂多和標準布朗運動可能有些關聯,畢竟就是每一個時間片斷會做出往前或往後一單位的行為。若每一步之間獨立,則長此以往會有期望值為原點、變異數和時間成正比的性質。不過我自己是覺得蠻好玩,因為統計上最喜歡玩的就是重複實驗,我先秀出三種可能的情形,分別是失足落水、還在路上、成功到家。然後假設醉漢一開始在線段上的百分之三十處,到家表示到達終點、跌落表示回歸原點、還在路上表示依舊在線段中徘徊。我本來的想法是重複實驗很多次,最後回家機率會接近30%,但我的硬體設備跟時間算得不準,不然可以看到跟上面WLLN很類似的圖,只是最後收斂到0.3,稍微美中不足。
par(mfrow = c(1, 3))
set.seed(1234)
# 1st trial
B <- 10000
#2 plot a empty plot
plot(1, 1, type = "n", main = "Path: Drunkard's Walk",
ylim = c(0, 100), xlim = c(0, B), ylab = "Distance",
xlab = "Step", yaxp = c(0, 100, 10), xaxp = c(0, B, 10),
col = "blue", xaxs = "i", yaxs = "i")
grid(nx = NA, ny = 10)
#3 initialize an m numeric vector
x <- rep(NA, B)
#4 initial value
x[1] <- 30
points(1, 30, pch = 20, cex = 0.8, col = "aquamarine")
#5 random walk
for (i in 2:B) {
x[i] <- x[i-1] + sample(c(-1, 1), size = 1, prob = c(0.5, 0.5))
#Sys.sleep(0.05)
points(i, x[i], pch = 20, cex = 0.8, col = "aquamarine")
if (x[i] == 100) {
legend("center", legend = "Congratulations!! You got Home :)", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (x[i] == 0) {
legend("center", legend = "Oops!! You fell down the Cliff :(", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (i == B) {
legend("center", legend = "Still on the way Home!!", bty="n",
text.col = "purple", cex = 1.5)
}
}
# 2nd trial
B <- 10000
#2 plot a empty plot
plot(1, 1, type = "n", main = "Path: Drunkard's Walk",
ylim = c(0, 100), xlim = c(0, B), ylab = "Distance",
xlab = "Step", yaxp = c(0, 100, 10), xaxp = c(0, B, 10),
col = "blue", xaxs = "i", yaxs = "i")
grid(nx = NA, ny = 10)
#3 initialize an m numeric vector
x <- rep(NA, B)
#4 initial value
x[1] <- 30
points(1, 30, pch = 20, cex = 0.8, col = "aquamarine")
#5 random walk
for (i in 2:B) {
x[i] <- x[i-1] + sample(c(-1, 1), size = 1, prob = c(0.5, 0.5))
#Sys.sleep(0.05)
points(i, x[i], pch = 20, cex = 0.8, col = "aquamarine")
if (x[i] == 100) {
legend("center", legend = "Congratulations!! You got Home :)", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (x[i] == 0) {
legend("center", legend = "Oops!! You fell down the Cliff :(", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (i == B) {
legend("center", legend = "Still on the way Home!!", bty="n",
text.col = "purple", cex = 1.5)
}
}
# 3rd trial
set.seed(123)
B <- 10000
#2 plot a empty plot
plot(1, 1, type = "n", main = "Path: Drunkard's Walk",
ylim = c(0, 100), xlim = c(0, B), ylab = "Distance",
xlab = "Step", yaxp = c(0, 100, 10), xaxp = c(0, B, 10),
col = "blue", xaxs = "i", yaxs = "i")
grid(nx = NA, ny = 10)
#3 initialize an m numeric vector
x <- rep(NA, B)
#4 initial value
x[1] <- 30
points(1, 30, pch = 20, cex = 0.8, col = "aquamarine")
#5 random walk
for (i in 2:B) {
x[i] <- x[i-1] + sample(c(-1, 1), size = 1, prob = c(0.5, 0.5))
#Sys.sleep(0.05)
points(i, x[i], pch = 20, cex = 0.8, col = "aquamarine")
if (x[i] == 100) {
legend("center", legend = "Congratulations!! You got Home :)", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (x[i] == 0) {
legend("center", legend = "Oops!! You fell down the Cliff :(", bty="n",
text.col = "purple", cex = 1.5)
break
} else if (i == B) {
legend("center", legend = "Still on the way Home!!", bty="n",
text.col = "purple", cex = 1.5)
}
}
下圖顯示的是不斷增加實驗次數後算出的成功率走勢,可以看到抵達終點率隨著重複次數增加而漸漸收斂至0.3,其實這樣看下來,還挺有均勻分配的味兒,但是至於要怎麼解釋,我再想一下。
set.seed(1234)
B<- 10000
result_vec<- c()
for (k in 1:1000){ # rep
count<- 0
for (j in 1:k){
x <- 30
for (i in 2:B) {
x<- x + sample(c(-1, 1), size = 1, prob = c(0.5, 0.5))
if (x == 100) {
result<- 1
break
} else if (x == 0) {
result<- 0
break
} else{
result<- 0
}
}
count<- count + result
}
result_vec<- c(result_vec, count/k)
}
plot(1:(k), result_vec, type = "l", main = "Repetition",
ylim = c(0, 1), xlim = c(0, k), ylab = "probability",
xlab = "reps", col = 2)
abline(h = 0.3)