Generating normal random variables

x1<-rnorm(10,3,1)
hist(x1)

x2<-rnorm(100,3,1)
hist(x2)

x3<-rnorm(1000,3,1)
hist(x3)

x<-seq(0,7,length=500)
f<-dnorm(x,3,1)
plot(x,f,type = "l")

Generating binomial random variables

We check that the binomial distribution approaches the normal distribution when the number of trials n is large.

x1<-rbinom(10,4,1/4)
barplot(x1)

x2<-rbinom(50,4,1/4)
plot(density(x2))

x3<-rbinom(200,4,1/4)
plot(density(x3))

x3<-rbinom(300,10,1/6)
plot(density(x3))

x4<-rbinom(500,10,1/2)
plot(density(x4))

hist(x4)
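The convergence of the binomial to the normal can also be checked numerically, without relying on the shape of a simulated histogram. The following is a minimal sketch, not part of the original analysis: it compares the exact binomial probabilities from `dbinom` with the density of a normal whose mean is \(np\) and whose standard deviation is \(\sqrt{np(1-p)}\). The helper name `max_gap` and the chosen values of n are assumptions made for illustration.

```r
# Largest absolute gap between the Binomial(n, p) probabilities and the
# density of a normal with the same mean and standard deviation.
# `max_gap` is a hypothetical helper introduced only for this illustration.
max_gap <- function(n, p) {
  k <- 0:n
  exact  <- dbinom(k, size = n, prob = p)
  approx <- dnorm(k, mean = n * p, sd = sqrt(n * p * (1 - p)))
  max(abs(exact - approx))
}

gap_small <- max_gap(4, 1/4)     # few trials, as in rbinom(10, 4, 1/4)
gap_large <- max_gap(500, 1/2)   # many trials (value assumed for illustration)

print(c(small_n = gap_small, large_n = gap_large))
```

In `rbinom(n, size, prob)` the first argument is the number of simulated values and `size` is the number of trials, so it is `size`, not the number of draws, that has to grow for the normal shape to appear.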

x1<-rpois(10,3)
x1

##  [1] 3 5 2 0 1 4 3 2 1 1

hist(x1)

x2<-rpois(50,3)
hist(x2)

plot(density(x2))

x3<-rpois(200,3)
hist(x3)

plot(density(x3))

Continuous distributions.

x1=rgamma(10,1,1)
hist(x1)

plot(density(x1))

x1=rgamma(30,1,1)
hist(x1)

plot(density(x1))

x1=rgamma(100,1,1)
hist(x1)

plot(density(x1))

Goodness-of-fit tests

x=rnorm(100,0,1)
shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.98045, p-value = 0.1441

Confidence interval under normality using Student's t distribution.

\[ H_0: \mu=0\\ H_a: \mu \neq 0 \]

t.test(x)

## 
##  One Sample t-test
## 
## data:  x
## t = -1.5354, df = 99, p-value = 0.1279
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.34144049  0.04353934
## sample estimates:
##  mean of x 
## -0.1489506
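The interval reported by `t.test` is \(\bar{x} \pm t_{0.975,\,n-1}\, s/\sqrt{n}\). As a sketch, the same numbers can be rebuilt by hand and compared with the output of the function. The seed is an arbitrary assumption added for reproducibility, so the values will not match the output shown above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
x <- rnorm(100, 0, 1)

n  <- length(x)
se <- sd(x) / sqrt(n)                                   # standard error of the mean
t_stat <- (mean(x) - 0) / se                            # statistic for H_0: mu = 0
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se   # 95% confidence interval

res <- t.test(x)
print(rbind(manual = ci, t.test = as.numeric(res$conf.int)))
```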

Two independent normal populations.

x=rnorm(40,0,1)
y=rnorm(40,0,1)

\[ H_0: \mu_X = \mu_Y\\ H_a: \mu_X \neq \mu_Y \]

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.98345, p-value = 0.8142

shapiro.test(y)

## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.94426, p-value = 0.0482

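Normality is not rejected for x (p = 0.8142). For y the Shapiro-Wilk p-value (0.0482) falls just below 0.05, so normality would formally be rejected at the 5% level; since y was simulated with rnorm, this is a false rejection (type I error), and inference on the difference in means is carried out anyway.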

t.test(x,y)

## 
##  Welch Two Sample t-test
## 
## data:  x and y
## t = 0.16449, df = 74.249, p-value = 0.8698
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4137013  0.4881565
## sample estimates:
##    mean of x    mean of y 
## -0.002128128 -0.039355715
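The non-integer degrees of freedom in the output come from the Welch-Satterthwaite approximation, which `t.test` uses by default (`var.equal = FALSE`). A minimal sketch that recomputes the statistic and the degrees of freedom by hand is shown below. The seed is an assumption, so the values differ from the output above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
x <- rnorm(40, 0, 1)
y <- rnorm(40, 0, 1)

vx <- var(x) / length(x)   # estimated variance of mean(x)
vy <- var(y) / length(y)   # estimated variance of mean(y)

# Welch statistic and Welch-Satterthwaite degrees of freedom
t_welch  <- (mean(x) - mean(y)) / sqrt(vx + vy)
df_welch <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

res <- t.test(x, y)        # Welch test is the default
print(c(t = t_welch, df = df_welch))
```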

The means differ: larger sample size and a difference of 1

x=rnorm(40,0,1)
y=rnorm(50,1,1)

\[ H_0: \mu_X = \mu_Y\\ H_a: \mu_X \neq \mu_Y \]

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.98542, p-value = 0.877

The means differ

x=rnorm(40,0,1)
y=rnorm(45,3,1)

\[ H_0: \mu_X = \mu_Y\\ H_a: \mu_X \neq \mu_Y \]

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.96817, p-value = 0.3144

shapiro.test(y)

## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.98065, p-value = 0.6462

Since the hypothesis of normality is not rejected for either x or y, inference on the difference in means can be carried out.

t.test(x,y)

## 
##  Welch Two Sample t-test
## 
## data:  x and y
## t = -14.022, df = 80.241, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.532364 -2.654368
## sample estimates:
##  mean of x  mean of y 
## -0.1006986  2.9926674

The null hypothesis is rejected, so the means are significantly different.

T.TEST

?t.test

## starting httpd help server ... done

Workflow for carrying out a hypothesis test

x=rnorm(50,35,2)
y=rnorm(50,40,2)

First step: goodness-of-fit tests.

\[ H_0: X \sim Normal\\ H_a: X \not\sim Normal \]

library(nortest)
lillie.test(x)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  x
## D = 0.13261, p-value = 0.02806

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.95346, p-value = 0.0474

qqnorm(x)

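Both tests give p-values below 0.05 for x (0.02806 and 0.0474), a borderline false rejection given that x was drawn with rnorm. Now the normality assumption is checked for the random variable Y.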

\[ H_0: Y \sim Normal\\ H_a: Y \not\sim Normal \]

lillie.test(y)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  y
## D = 0.074156, p-value = 0.704

shapiro.test(y)

## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.98601, p-value = 0.8139

qqnorm(y)

Second step: check that the variances are equal (homoscedasticity).

\[ H_0: \sigma^2_X = \sigma^2_Y\\ H_a: \sigma^2_X \neq \sigma^2_Y \]

var.test(x,y)

## 
##  F test to compare two variances
## 
## data:  x and y
## F = 0.80348, num df = 49, denom df = 49, p-value = 0.4466
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.4559546 1.4158796
## sample estimates:
## ratio of variances 
##          0.8034779

Hence the assumption of equal variances (homoscedasticity) is not rejected (p = 0.4466).
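The F statistic reported by `var.test` is the ratio of the two sample variances, referred to an F distribution with \(n_X - 1\) and \(n_Y - 1\) degrees of freedom. The following sketch rebuilds the statistic and the two-sided p-value by hand. The seed is an assumption, so the numbers differ from the output above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
x <- rnorm(50, 35, 2)
y <- rnorm(50, 40, 2)

f_stat <- var(x) / var(y)          # ratio of sample variances
df1 <- length(x) - 1               # numerator degrees of freedom
df2 <- length(y) - 1               # denominator degrees of freedom

p_lower <- pf(f_stat, df1, df2)            # P(F <= observed ratio)
p_value <- 2 * min(p_lower, 1 - p_lower)   # two-sided p-value

res <- var.test(x, y)
print(c(F = f_stat, p.value = p_value))
```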

Third step: if the normality assumption holds, the t test (t.test) can be used.

\[ H_0: \mu_X = \mu_Y\\ H_a: \mu_X \neq \mu_Y \]

t.test(x,y,var.equal = T)

## 
##  Two Sample t-test
## 
## data:  x and y
## t = -14.064, df = 98, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.235484 -4.693428
## sample estimates:
## mean of x mean of y 
##  34.56064  40.02510

At the 5% significance level, \(H_0\) is rejected in favor of the alternative because the p-value is below 0.05; that is, there is a significant difference between the mean for men and the mean for women.
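The three steps can be chained in a single function. The sketch below is one possible arrangement, not a definitive procedure: the helper name `compare_two_samples`, the use of Shapiro-Wilk alone for step one, the fallback to `wilcox.test` when normality is rejected, and the seed are all assumptions made for illustration.

```r
# Hypothetical helper: chains the three steps for two independent samples.
compare_two_samples <- function(x, y, alpha = 0.05) {
  # Step 1: normality of each sample (Shapiro-Wilk)
  normal <- shapiro.test(x)$p.value > alpha && shapiro.test(y)$p.value > alpha
  if (normal) {
    # Step 2: equality of variances (F test)
    equal_var <- var.test(x, y)$p.value > alpha
    # Step 3: t test, pooled if the variances look equal, Welch otherwise
    test <- t.test(x, y, var.equal = equal_var)
  } else {
    # Normality rejected: fall back to the Wilcoxon rank-sum test
    equal_var <- NA
    test <- wilcox.test(x, y)
  }
  list(normal = normal, equal_var = equal_var,
       method = test$method, p.value = test$p.value)
}

set.seed(123)            # arbitrary seed, assumed here for reproducibility
res <- compare_two_samples(rnorm(50, 35, 2), rnorm(50, 40, 2))
print(res$method)
print(res$p.value)
```

Running preliminary tests before choosing the main test affects the overall error rates, so the function is best read as a summary of the workflow rather than as a recommended automatic rule.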

Suppose the normality assumption is rejected.

x=rgamma(100,1,1)
y=rgamma(100,2,1)

Normality tests

qqnorm(x)

\[ H_0: X \sim Normal\\ H_a: X \not\sim Normal \]

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.69836, p-value = 4.939e-13

lillie.test(x)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  x
## D = 0.21933, p-value = 6.787e-13

The hypothesis of normality is rejected.

Now consider the following hypothesis test.

\[ H_0:Y \sim Normal\\ H_a:Y \not\sim Normal \]

shapiro.test(y)

## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.91249, p-value = 5.747e-06

lillie.test(y)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  y
## D = 0.12826, p-value = 0.0003459

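The normality assumption is rejected. Suppose that the distributions of X and Y are symmetric, so that the location parameters \(\theta_X\) and \(\theta_Y\) can be compared with the Wilcoxon rank-sum test.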

\[ H_0: \theta_X = \theta_Y\\ H_a: \theta_X \neq \theta_Y \]

wilcox.test(x,y)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  x and y
## W = 2483, p-value = 7.808e-10
## alternative hypothesis: true location shift is not equal to 0
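The statistic W reported by `wilcox.test` for two independent samples is the sum of the ranks of x in the pooled sample minus \(n_X(n_X+1)/2\). Without ties this equals the number of pairs \((x_i, y_j)\) with \(x_i > y_j\). The sketch below checks both expressions. The seed is an assumption, so W differs from the output above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
x <- rgamma(100, 1, 1)
y <- rgamma(100, 2, 1)

n_x   <- length(x)
ranks <- rank(c(x, y))                                   # ranks in the pooled sample
w_ranks <- sum(ranks[seq_len(n_x)]) - n_x * (n_x + 1) / 2
w_pairs <- sum(outer(x, y, ">"))                         # pairs with x_i > y_j

res <- wilcox.test(x, y)
print(c(from_ranks = w_ranks, from_pairs = w_pairs))
```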

Suppose the normality assumption is rejected but the means are equal.

x=rgamma(100,1,1)
y=rgamma(100,1,1)

Normality tests

qqnorm(x)

\[ H_0: X \sim Normal\\ H_a: X \not\sim Normal \]

shapiro.test(x)

## 
##  Shapiro-Wilk normality test
## 
## data:  x
## W = 0.89069, p-value = 5.291e-07

lillie.test(x)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  x
## D = 0.17055, p-value = 1.383e-07

The hypothesis of normality is rejected.

Now consider the following hypothesis test.

\[ H_0:Y \sim Normal\\ H_a:Y \not\sim Normal \]

shapiro.test(y)

## 
##  Shapiro-Wilk normality test
## 
## data:  y
## W = 0.82134, p-value = 1.184e-09

lillie.test(y)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  y
## D = 0.16875, p-value = 2.035e-07

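The normality assumption is rejected. Suppose that the distributions of X and Y are symmetric, so that the location parameters \(\theta_X\) and \(\theta_Y\) can be compared with the Wilcoxon rank-sum test.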

\[ H_0: \theta_X = \theta_Y\\ H_a: \theta_X \neq \theta_Y \]

wilcox.test(x,y)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  x and y
## W = 4738, p-value = 0.5229
## alternative hypothesis: true location shift is not equal to 0

The hypothesis of equal medians is not rejected (p = 0.5229), which is consistent with both samples coming from the same gamma model.

Robust statistics

If your data contain outliers, use robust statistics.

Paired methods

\[ H_0: \mu_D = \mu_A\\ H_a: \mu_D \neq \mu_A \]

Under normality

a=rnorm(45,0,1)
d=rnorm(45,1,1)

Normality tests.

qqnorm(a-d)

shapiro.test(a-d)

## 
##  Shapiro-Wilk normality test
## 
## data:  a - d
## W = 0.98491, p-value = 0.8165

Since normality of the difference is not rejected, a t test can be carried out.

\[ H_0: \mu_D =0\\ H_a: \mu_D \neq 0 \]

t.test(a,d,paired = T)

## 
##  Paired t-test
## 
## data:  a and d
## t = -7.6331, df = 44, p-value = 1.365e-09
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -1.742575 -1.014600
## sample estimates:
## mean difference 
##       -1.378588

There is sufficient evidence to conclude that the effect of the medication is significant.
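The paired t test is the one-sample t test applied to the differences, which is why only the normality of `a - d` needs to be checked. A short sketch of that equivalence follows. The seed is an assumption, so the values differ from the output above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
a <- rnorm(45, 0, 1)
d <- rnorm(45, 1, 1)

paired     <- t.test(a, d, paired = TRUE)   # paired test on (a, d)
one_sample <- t.test(a - d)                 # one-sample test on the differences

print(c(paired = unname(paired$statistic),
        one_sample = unname(one_sample$statistic)))
```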

What happens if there is no normality?

a = rpois(45,2)
d = rpois(45,1)

Exploring normality: \[ H_0: A-D \sim Normal\\ H_a: A-D \not\sim Normal \]

qqnorm(a-d)

shapiro.test(a-d)

## 
##  Shapiro-Wilk normality test
## 
## data:  a - d
## W = 0.96053, p-value = 0.128

lillie.test(a-d)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  a - d
## D = 0.17645, p-value = 0.00121

Since the Lilliefors test rejects normality (p = 0.00121), although Shapiro-Wilk does not (p = 0.128), the Wilcoxon signed-rank test for paired data is used.

\[ H_0: \theta_D =0\\ H_a: \theta_D \neq 0 \]

wilcox.test(a,d,paired = T)

## Warning in wilcox.test.default(a, d, paired = T): cannot compute exact p-value
## with ties

## Warning in wilcox.test.default(a, d, paired = T): cannot compute exact p-value
## with zeroes

## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  a and d
## V = 551, p-value = 0.0005629
## alternative hypothesis: true location shift is not equal to 0

Hence the median of the differences is significantly different from zero.
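The statistic V of the signed-rank test is obtained by discarding the zero differences, ranking the absolute values of the remaining ones (ties receive average ranks), and summing the ranks of the positive differences. The zeros and ties are also what trigger the warnings about the exact p-value. A minimal sketch follows. The seed is an assumption, so V differs from the output above.

```r
set.seed(123)            # arbitrary seed, assumed here for reproducibility
a <- rpois(45, 2)
d <- rpois(45, 1)

diffs   <- a - d
nonzero <- diffs[diffs != 0]          # zero differences are discarded
ranks   <- rank(abs(nonzero))         # average ranks are used for ties
v_manual <- sum(ranks[nonzero > 0])   # sum of ranks of positive differences

# Warnings about ties and zeroes are expected with count data
res <- suppressWarnings(wilcox.test(a, d, paired = TRUE))
print(c(manual = v_manual, wilcox.test = unname(res$statistic)))
```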

Principles of inference

Jhonier Rangel

2022-10-09
