More statistical applications at:

https://rpubs.com/orlandoan

Central limit theorem.

Let \(X_{1}, X_{2},\ldots, X_{n}\) be a random sample of size \(n\) from a distribution with mean \(\mu\) and finite variance \(\sigma^{2}\). Then, for sufficiently large \(n\), the sample mean \(\bar{X}\) is approximately normal, with mean \(\mu_{\bar{X}}=\mu\) and variance \(\sigma^{2}_{\bar{X}}=\dfrac{\sigma^{2}}{n}\). Equivalently, for sufficiently large \(n\), \(Z=\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}\) has approximately a standard normal distribution.
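The mean and variance of \(\bar{X}\) stated in the theorem do not depend on the normal approximation; they follow from linearity of expectation and the independence of the observations. A short derivation (a standard argument, not part of the original text):

```latex
\begin{aligned}
E[\bar{X}] &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_{i}\right]
            = \frac{1}{n}\sum_{i=1}^{n} E[X_{i}]
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_{i})
            = \frac{1}{n^{2}}\, n\sigma^{2} = \frac{\sigma^{2}}{n}.
\end{aligned}
```

The theorem adds the statement about the shape of the distribution: only the approximate normality requires \(n\) to be large.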

Illustration.

Consider the following population:

y<-c(42.8,25.3,48.0,23.5,44.9,42.3,47.2,29.8,29.5,41.5,41.4,43.9,34.7,28.9,37.6,37.9,31.3,42.2,26.9,41.8,49.3,33.5,21.3,27.3,39.6,38.7,27.6,28.3,24.5,43.3,43.5,21.9,29.2,22.0,33.4,46.6,24.6,45.9,35.2,46.2,27.9,21.7,30.4,29.1,33.7,25.8,31.3,43.2,37.3,47.9,48.8,41.6,22.1,49.4,28.2,28.4,35.9,29.2,29.5,37.1,40.3,24.4,42.4,33.5,30.8,41.7,27.4,31.5,20.9,24.0,34.0,37.7,20.1,33.8,38.0,23.6,23.7,40.0,48.3,46.3,40.5,43.2,38.3,32.6,36.3,41.2,34.9,21.9,48.6,42.6,38.5,47.5,33.8,46.7,21.3,48.0,48.1,23.1,42.1,27.6,28.0,22.7,20.0,28.7,31.8,20.5,41.6,41.7,28.6,30.4,43.7,34.5,28.0,40.0,45.0,35.5,45.3,46.8,39.1,41.7,42.9,47.1,28.1,46.5,29.9,24.7,28.1,36.2,30.1,30.8,22.1,41.5,26.7,23.8,40.6,38.4,36.7,36.0,31.3,33.6,39.2,43.9,48.3,43.6,48.6,35.3,30.1,26.3,49.6,38.0,39.2,45.0,29.7,42.3,33.5,49.7,21.0,28.3,46.4,32.5,28.7,31.2,43.2,30.3,26.9,28.5,32.1,23.0,47.1,25.1,31.2,39.8,27.1,22.9,44.0,48.3,35.1,31.9,40.8,21.9,34.3,38.8,47.7,44.3,27.5,36.5,27.0,24.1,41.8,28.1,26.4,42.6,30.0,36.2,46.2,24.8,44.0,47.5,39.5,30.8)

hist(y,prob=TRUE,main="Histogram of the population")

The histogram shows that the population is not normally distributed.
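A histogram is a visual check only. As a complement, a formal normality test could be run. The sketch below uses `shapiro.test()` from base R on a hypothetical stand-in population, `y_demo`, drawn from a uniform distribution on 20 to 50; this stand-in is an assumption made to keep the example self-contained and is not the article's data.

```r
# Hypothetical stand-in for the population `y`: 200 draws from Uniform(20, 50).
# This is an assumption for illustration, not the article's data.
set.seed(1)
y_demo <- runif(200, min = 20, max = 50)

# Shapiro-Wilk test: the null hypothesis is that the data come from a normal
# distribution, so a small p-value is evidence against normality.
sw <- shapiro.test(y_demo)
sw$statistic
sw$p.value
```

Replacing `y_demo` with the vector `y` would apply the same test to the actual population.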

Population size.

N<-length(y);N
## [1] 200

Population mean.

mepo<-mean(y);mepo
## [1] 35.187

Population variance.

vapo<-var(y)*(N-1)/N;vapo # var() divides by N-1; rescale to divide by N
## [1] 70.83323

Population standard deviation.

depo<-sd(y)*sqrt((N-1)/N);depo # sd() divides by N-1; rescale to divide by N
## [1] 8.416248
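The factor `(N-1)/N` is needed because R's `var()` and `sd()` compute the sample versions, which divide by `N-1`, while the vector here is treated as the entire population. A minimal sketch on a small made-up data set; `pop_var` and `pop_sd` are hypothetical helper names, not base R functions:

```r
# Population variance: divide the sum of squared deviations by N, not N - 1.
# `pop_var` and `pop_sd` are hypothetical helper names, not base R functions.
pop_var <- function(x) mean((x - mean(x))^2)
pop_sd  <- function(x) sqrt(pop_var(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # small made-up data set with mean 5
n <- length(x)

pop_var(x)              # sum of squared deviations is 32, so 32 / 8 = 4
var(x) * (n - 1) / n    # same value, obtained by rescaling var()
pop_sd(x)               # square root of 4, i.e. 2
```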

Draw random samples with replacement from the population: 20000 samples of each size. Sampling with replacement is what makes the sample sizes 500 and 1000 possible, since they exceed the population size (N = 200).

k <- 20000 # number of samples per sample size
m10 <- numeric(k) # means of samples of size 10
m20 <- numeric(k) # means of samples of size 20
m50 <- numeric(k) # means of samples of size 50
m100 <- numeric(k) # means of samples of size 100
m500 <- numeric(k) # means of samples of size 500
m1000 <- numeric(k) # means of samples of size 1000
for(i in 1:k){
m10[i] <- mean(sample(y,10,replace=TRUE))
m20[i] <- mean(sample(y,20,replace=TRUE))
m50[i] <- mean(sample(y,50,replace=TRUE))
m100[i] <- mean(sample(y,100,replace=TRUE))
m500[i] <- mean(sample(y,500,replace=TRUE))
m1000[i] <- mean(sample(y,1000,replace=TRUE))
}
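The same simulation can be written more compactly with `replicate()`, and a call to `set.seed()` makes the run reproducible. This is a sketch under stated assumptions rather than the author's code: `y_demo` is a hypothetical uniform stand-in for the population, `sample_means` is an illustrative helper name, and `k` is reduced to 2000 to keep the example fast.

```r
# Hypothetical stand-in population (the article uses its own vector `y`).
set.seed(123)
y_demo <- runif(200, min = 20, max = 50)

# Draw `k` samples of size `n` with replacement and return their means.
# `sample_means` is an illustrative helper name.
sample_means <- function(pop, n, k) {
  replicate(k, mean(sample(pop, n, replace = TRUE)))
}

k <- 2000                                 # fewer replicates than the article
sizes <- c(10, 20, 50, 100, 500, 1000)
means_by_size <- lapply(sizes, function(n) sample_means(y_demo, n, k))
names(means_by_size) <- paste0("n", sizes)

# Empirical standard deviation of the sample means for each sample size;
# it should shrink roughly like 1 / sqrt(n).
sapply(means_by_size, sd)
```

Keeping the means in a named list avoids one variable per sample size, at the cost of differing from the variable names used in the rest of this page.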

Samples of size 10.

m1<-mean(m10);m1   # mean of the sample means (n = 10)
## [1] 35.18515
de10<-sd(m10);de10
## [1] 2.671652
de1<-depo/sqrt(10);de1 # theoretical standard deviation of the sample means, sigma/sqrt(n)
## [1] 2.661451
hist(m10,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m1,de1),col="red",add=TRUE,lwd=2)

Standard normal.

Z1<-(m10-mepo)/de1
mean(Z1)
## [1] -0.0006937944
sd(Z1)
## [1] 1.003833
hist(Z1,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")
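Besides overlaying the density curve, the normal approximation can be checked numerically by comparing a tail proportion of the standardized means with the value the standard normal gives. A minimal sketch, assuming a hypothetical uniform stand-in population `y_demo` in place of `y`:

```r
set.seed(42)
y_demo <- runif(200, min = 20, max = 50)   # hypothetical stand-in population

mu    <- mean(y_demo)
sigma <- sqrt(mean((y_demo - mu)^2))       # population standard deviation (divisor N)

n <- 10
k <- 20000
xbar <- replicate(k, mean(sample(y_demo, n, replace = TRUE)))
z <- (xbar - mu) / (sigma / sqrt(n))       # standardized sample means

# Share of standardized means outside +/- 1.96
tail_share <- mean(abs(z) > 1.96)
tail_share
2 * pnorm(-1.96)                           # standard normal value, about 0.05
```

If the approximation is good, the two numbers should be close; a larger gap at small `n` would indicate that the sample size is not yet large enough for that population.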

Samples of size 20.

m2<-mean(m20);m2
## [1] 35.19437
de20<-sd(m20);de20
## [1] 1.883687
de2<-depo/sqrt(20);de2
## [1] 1.88193
hist(m20,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m2,de2),col="red",add=TRUE,lwd=2)

Standard normal.

Z2<-(m20-mepo)/(depo/sqrt(20))
mean(Z2)
## [1] 0.003914731
sd(Z2)
## [1] 1.000933
hist(Z2,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")

Samples of size 50.

m3<-mean(m50);m3
## [1] 35.18527
de50<-sd(m50);de50
## [1] 1.191863
de3<-depo/sqrt(50);de3
## [1] 1.190237
hist(m50,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m3,de3),col="red",add=TRUE,lwd=2)

Standard normal.

Z3<-(m50-mepo)/(depo/sqrt(50))
mean(Z3)
## [1] -0.001450635
sd(Z3)
## [1] 1.001366
hist(Z3,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")

Samples of size 100.

m4<-mean(m100);m4
## [1] 35.18749
de100<-sd(m100);de100
## [1] 0.8416791
de4<-depo/sqrt(100);de4
## [1] 0.8416248
hist(m100,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m4,de4),col="red",add=TRUE,lwd=2)

Standard normal.

Z4<-(m100-mepo)/(depo/sqrt(100))
mean(Z4)
## [1] 0.0005773951
sd(Z4)
## [1] 1.000064
hist(Z4,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")

Samples of size 500.

m5<-mean(m500);m5
## [1] 35.18822
de500<-sd(m500);de500
## [1] 0.3765323
de5<-depo/sqrt(500);de5
## [1] 0.3763861
hist(m500,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m5,de5),col="red",add=TRUE,lwd=2)

Standard normal.

Z5<-(m500-mepo)/(depo/sqrt(500))
mean(Z5)
## [1] 0.003238191
sd(Z5)
## [1] 1.000389
hist(Z5,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")

Samples of size 1000.

m6<-mean(m1000);m6
## [1] 35.18853
de1000<-sd(m1000);de1000
## [1] 0.2635524
de6<-depo/sqrt(1000);de6
## [1] 0.2661451
hist(m1000,col="green",border="yellow",main="",prob=TRUE)
curve(dnorm(x,m6,de6),col="red",add=TRUE,lwd=2)

Standard normal.

Z6<-(m1000-mepo)/(depo/sqrt(1000))
mean(Z6)
## [1] 0.005749044
sd(Z6)
## [1] 0.9902582
hist(Z6,prob=TRUE,col="steelblue",border="green")
curve(dnorm(x,0,1),add=TRUE,lwd=2,col="red")

Comparison of the means.

a1<-cbind(mepo,m1,m2,m3,m4,m5,m6);a1
##        mepo       m1       m2       m3       m4       m5       m6
## [1,] 35.187 35.18515 35.19437 35.18527 35.18749 35.18822 35.18853
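The same kind of comparison can be made for the spread. The sketch below copies the empirical standard deviations of the sample means and the population standard deviation from the output printed above, and sets them against the theoretical value \(\sigma/\sqrt{n}\); the data frame name `comparison` is only illustrative.

```r
n <- c(10, 20, 50, 100, 500, 1000)
sigma <- 8.416248     # population standard deviation reported above

# Empirical standard deviations of the sample means, copied from the output above
empirical   <- c(2.671652, 1.883687, 1.191863, 0.8416791, 0.3765323, 0.2635524)
theoretical <- sigma / sqrt(n)

# A ratio close to 1 means the simulation agrees with sigma / sqrt(n)
comparison <- data.frame(n, empirical, theoretical,
                         ratio = empirical / theoretical)
comparison
```

For every sample size the ratio stays within about one percent of 1, which matches the variance \(\sigma^{2}/n\) stated in the theorem.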


O.M.F.
