Statistics for Political Analysis 1 | Lesson 6: T-Test

Marylia Cruz

Bivariate Statistics

Bivariate description studies the relationship between two variables from the same sample or data set. The analysis aims to determine whether an empirical relationship exists between them. The choice of test for a bivariate analysis depends on the types of the variables involved.

Motivation

Motivation 2

Hypothesis testing /1

Hypothesis testing /2

Hypothesis testing /3

Hypothesis testing /4

Hypothesis testing /5

Hypothesis testing

Where are we? Where are we going?

Comparing groups

T-test

T-test for independent samples

This test compares the mean of a numeric variable between two groups or categories of a nominal or ordinal variable. The groups defined by the nominal/ordinal variable must be independent: each observation must belong to one group or the other, but not to both.

Example: Let us check whether the number of books read differs between men and women.
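To make the comparison concrete, the following R sketch computes the two-sample statistic by hand and checks it against t.test(). It runs on simulated data: the names toy, group and books, the sample sizes, and the Poisson rates are illustrative assumptions, not values from any survey. It uses the Welch form of the test (unequal variances), which is what t.test() applies by default.

```r
# Simulated toy data (an assumption for illustration, not survey values)
set.seed(123)
toy <- data.frame(
  group = factor(rep(c("A", "B"), times = c(40, 60))),
  books = c(rpois(40, lambda = 2), rpois(60, lambda = 3))
)

# Split the numeric variable by group
a <- toy$books[toy$group == "A"]
b <- toy$books[toy$group == "B"]

# Welch statistic: difference in means divided by its standard error
se <- sqrt(var(a) / length(a) + var(b) / length(b))
t_manual <- (mean(a) - mean(b)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- se^4 / ((var(a) / length(a))^2 / (length(a) - 1) +
                     (var(b) / length(b))^2 / (length(b) - 1))

# t.test() uses the Welch version by default (var.equal = FALSE)
res <- t.test(books ~ group, data = toy)
c(manual = t_manual, t.test = as.numeric(res$statistic))
c(manual = df_manual, t.test = as.numeric(res$parameter))
```

Both pairs of numbers should agree, which shows that the test is a standardized difference in group means.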

T-test for independent samples

The Instituto Nacional de Estadística recently published the results of the Encuesta Nacional de Lectura 2022 (National Reading Survey 2022), which aims to measure reading behaviors, attitudes, and consumption. The results on reading practices are in the data file “V_ENL2022_300_400.sav”.

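library(rio)  # import() reads many formats, including SPSS .sav
# Load the reading-practices module of the survey
data <- import("V_ENL2022_300_400.sav")
names(data)   # list the variable names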
  [1] "CONG"            "NSELV"           "HOGAR"           "PERS_NRO"       
  [5] "P201"            "TOT_HOGAR"       "CCDD"            "NOMBREDD"       
  [9] "CCPP"            "NOMBREPP"        "CCDI"            "NOMBREDI"       
 [13] "UBIGEO"          "VIVIENDA"        "DOMINIO"         "ESTRATO"        
 [17] "PER"             "P203"            "P206"            "P207"           
 [21] "P208"            "P209"            "P210_A"          "PERS_300"       
 [25] "INF_300"         "P301"            "P302_1"          "P302_2"         
 [29] "P302_3"          "P302_4"          "P302_5"          "P302_6"         
 [33] "P302_7"          "P302_8"          "P302_9"          "P302_10"        
 [37] "P302_11"         "P302_12"         "P302_13"         "P302_14"        
 [41] "P302_15"         "P303"            "P304"            "P306"           
 [45] "P306_A"          "P306_G"          "P306_C"          "P307_1"         
 [49] "P307_2"          "P308_1"          "P308_2_1"        "P308_2_2"       
 [53] "P308_2_3"        "P308_2_4"        "P308_2_5"        "P308_2_6"       
 [57] "P308_2_7"        "P308_2_8"        "P308_2_9"        "P309"           
 [61] "P310_N"          "P310_A"          "P310_G"          "P310_C"         
 [65] "P311"            "P312_1"          "P312_2"          "P312_3"         
 [69] "P312_4"          "P312_5"          "P312_6"          "P312_7_8"       
 [73] "P313"            "P314"            "P314_1"          "P315_1"         
 [77] "P315_2"          "P315_3"          "P315_4"          "P315_5"         
 [81] "P315_6"          "PERS_400"        "INF_400"         "P401_1"         
 [85] "P401"            "P402"            "P403"            "P404_1"         
 [89] "P404_2"          "P404_3"          "P404_4"          "P404_5"         
 [93] "P404_6"          "P404_7"          "P404_8"          "P404_9"         
 [97] "P405_1"          "P405_2"          "P405_3"          "P405_4"         
[101] "P405_5"          "P405_6"          "P405_7"          "P405_8"         
[105] "P405_9"          "P405_10"         "P405_11"         "P405_12"        
[109] "P405_13"         "P406_1"          "P406_2"          "P406_9"         
[113] "P406_3"          "P406_4"          "P406_5"          "P406_6"         
[117] "P406_7"          "P406_8"          "P407_1"          "P407_1_1"       
[121] "P407_2"          "P407_2_1"        "P407_3"          "P407_3_1"       
[125] "P407_4"          "P407_4_1"        "P408"            "P409"           
[129] "P410_1"          "P410_2"          "P410_3"          "P410_4"         
[133] "P410_5"          "P410_6"          "P410_7"          "P410_8"         
[137] "P410_9"          "P410_10"         "P411_1"          "P411_2"         
[141] "P411_9"          "P411_3"          "P411_4"          "P411_5"         
[145] "P411_6"          "P411_7"          "P411_8"          "P412_1"         
[149] "P412_2"          "P413_1"          "P413_2"          "P413_3"         
[153] "P413_4"          "P413_5"          "P413_6"          "P413_7"         
[157] "P413_8"          "P413_9"          "P413_10"         "P413_11"        
[161] "P413_12"         "P413_13"         "P413_14"         "P413_15"        
[165] "P413_16"         "P413_17"         "P413_18"         "P413_20"        
[169] "P414_1"          "P414_2"          "P414_3"          "P414_4"         
[173] "P414_5"          "P414_6"          "P414_7"          "P414_8"         
[177] "P414_9"          "P414_10"         "P414_11"         "P414_12"        
[181] "P414_13"         "P415"            "P416_1"          "P416_2"         
[185] "P416_3"          "P416_4"          "P416_5"          "P416_6"         
[189] "P416_8"          "P416_9"          "P416_10"         "P417_1"         
[193] "P417_2"          "P417_3"          "P417_4"          "P417_5"         
[197] "P417_6"          "P417_7"          "P417_8"          "P417_9"         
[201] "P417_10"         "P418"            "P418_1"          "P419_1"         
[205] "P419_2"          "P420_1"          "P420_2"          "P420_3"         
[209] "P420_4"          "P420_5"          "P420_6"          "P420_7"         
[213] "P420_8"          "P420_9"          "P420_10"         "P420_11"        
[217] "P421"            "P422_1"          "P422_2"          "P422_3"         
[221] "P422_4"          "P422_5"          "P422_6"          "P422_7"         
[225] "P422_8"          "P422_9"          "P422_10"         "P422_11"        
[229] "P422_12"         "P422_13"         "P422_14"         "P423"           
[233] "P423_1"          "P424_1"          "P424_2"          "P425_1"         
[237] "P425_2"          "P425_3"          "P425_4"          "P425_5"         
[241] "P425_6"          "P425_7"          "P425_8"          "P425_9"         
[245] "P425_10"         "P425_11"         "P426"            "P427_1"         
[249] "P427_2"          "P427_3"          "P427_4"          "P427_5"         
[253] "P427_6"          "P427_7"          "P427_8"          "P427_9"         
[257] "P427_10"         "P427_11"         "P427_12"         "P427_13"        
[261] "P428"            "P428_1"          "P429_1"          "P429_1_1"       
[265] "P429_2"          "P429_2_1"        "P429_3"          "P429_3_1"       
[269] "P429_4"          "P429_4_1"        "P429_5"          "P429_5_1"       
[273] "P429_6"          "P429_6_1"        "P429_7"          "P429_7_1"       
[277] "P429_8"          "P429_8_1"        "P429_9"          "P429_9_1"       
[281] "P429_10"         "P429_10_1"       "P430_1"          "P430_2"         
[285] "P430_3"          "P430_4"          "P430_5"          "P430_6"         
[289] "P430_7"          "P430_8"          "P431"            "P432"           
[293] "P433_1"          "P433_2"          "P433_3"          "P433_4"         
[297] "P433_5"          "P433_6"          "P433_7"          "P434_1"         
[301] "P434"            "P435"            "P436"            "P437_1"         
[305] "P437_2"          "P437_3"          "P437_4"          "P437_5"         
[309] "P437_6"          "P437_7"          "P437_8"          "P438_1"         
[313] "P438_2"          "P438_3"          "P438_4"          "P438_5"         
[317] "P438_6"          "P438_7"          "P439"            "P440"           
[321] "RESFIN"          "RESFIN_V"        "ESTRATOSOCIO"    "FACTOR200_FINAL"

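Formatting categorical variables

# P209 (sex): turn the numeric codes into a labelled factor.
# Labels are assigned in level order, so the first code becomes "Hombre".
data$sexo <- as.factor(data$P209)
data$sexo <- factor(data$sexo, levels = levels(data$sexo), labels = c("Hombre", "Mujer"))
# P311 (worked last week): same recoding, the first code becomes "Si".
data$trabaja <- as.factor(data$P311)
data$trabaja <- factor(data$trabaja, levels = levels(data$trabaja), labels = c("Si", "No"))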
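Because the labels are attached purely by position, it can help to confirm that each code landed on the intended label. The sketch below applies the same recoding pattern to a hypothetical vector of codes (codes is an invented stand-in for a column such as P209, not survey data) and cross-tabulates codes against labels.

```r
# Hypothetical codes standing in for a survey column such as P209
codes <- c(1, 2, 2, 1, 2, NA)

# Same recoding pattern: labels follow the sorted order of the levels
sexo <- as.factor(codes)
sexo <- factor(sexo, levels = levels(sexo), labels = c("Hombre", "Mujer"))

# Cross-tabulate codes against labels; mismatched cells should be zero
check <- table(codes, sexo, useNA = "ifany")
check
```

If the column held a third code, factor() would stop with an error about the length of labels, so inspecting the codes with table() first is a cheap safeguard.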

T-test for independent samples /1

Null hypothesis: there is no difference in the mean number of printed books read between men and women.

t.test(P412_1 ~ sexo, data = data)

    Welch Two Sample t-test

data:  P412_1 by sexo
t = -1.2393, df = 16399, p-value = 0.2153
alternative hypothesis: true difference in means between group Hombre and group Mujer is not equal to 0
95 percent confidence interval:
 -0.14774621  0.03328614
sample estimates:
mean in group Hombre  mean in group Mujer 
            2.435897             2.493127 
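The pieces of this output are linked: the confidence interval is centered on the difference in means, and its half-width is the standard error times a t quantile. As a rough check, the sketch below recovers the implied standard error, the t statistic, and the two-sided p-value from the reported numbers alone. The inputs are copied from the output above and are already rounded, so small discrepancies are expected.

```r
# Values copied from the t.test() output above
mean_h <- 2.435897
mean_m <- 2.493127
ci     <- c(-0.14774621, 0.03328614)
df     <- 16399

diff_means <- mean_h - mean_m             # estimated difference (Hombre - Mujer)
half_width <- diff(ci) / 2                # half-width of the 95% interval
se         <- half_width / qt(0.975, df)  # implied standard error
t_stat     <- diff_means / se             # t statistic
p_value    <- 2 * pt(-abs(t_stat), df)    # two-sided p-value

round(c(diff = diff_means, se = se, t = t_stat, p = p_value), 4)
```

Up to rounding, the result should reproduce the reported t = -1.2393 and p-value = 0.2153. Since the interval contains 0, the data do not provide evidence against the null hypothesis at the 5% level.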

T-test for independent samples /Input for the plot

library(dplyr)
library(lsr)   # provides ciMean()
# Mean and 95% confidence limits of printed books read, by sex
tabla <- data %>%
  group_by(sexo) %>%
  summarise(Media = mean(P412_1, na.rm = TRUE),
            LimiteInferior = ciMean(P412_1, na.rm = TRUE)[1],
            LimiteSuperior = ciMean(P412_1, na.rm = TRUE)[2])
tabla
# A tibble: 2 × 4
  sexo   Media LimiteInferior LimiteSuperior
  <fct>  <dbl>          <dbl>          <dbl>
1 Hombre  2.44           2.36           2.51
2 Mujer   2.49           2.44           2.55
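The limits in this table are confidence intervals for each group mean. As a hedged illustration, the base-R sketch below computes the usual t-based 95% interval by hand on a hypothetical vector x (invented values, not survey data) and compares it with the one-sample interval from t.test(). ciMean() is assumed here to report this same kind of interval.

```r
# Hypothetical counts of books read, with one missing value
x <- c(0, 1, 1, 2, 2, 3, 3, 4, 5, 8, NA)
x <- x[!is.na(x)]                      # same effect as na.rm = TRUE

n      <- length(x)
se     <- sd(x) / sqrt(n)              # standard error of the mean
margin <- qt(0.975, df = n - 1) * se   # 95% margin of error

ci_manual <- c(lower = mean(x) - margin, upper = mean(x) + margin)
ci_ttest  <- as.numeric(t.test(x)$conf.int)  # one-sample interval from base R

ci_manual
ci_ttest
```

The two intervals should match. With samples as large as the survey's, the standard error is small, which is why the intervals in the table are narrow.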

T-test for independent samples /Plot

library(ggplot2)
# Group means with their 95% confidence intervals as error bars
ggplot(tabla, aes(x = sexo, y = Media)) +
  geom_point(size = 5) +  # point for the group mean
  geom_errorbar(aes(ymin = LimiteInferior, ymax = LimiteSuperior), width = 0.2) +
  labs(y = "Mean", x = "") +
  theme_minimal() +
  ylim(1, 3)

T-test for independent samples /2

Null hypothesis: there is no difference in the mean number of digital books read between men and women.

t.test(P412_2 ~ sexo, data = data)

    Welch Two Sample t-test

data:  P412_2 by sexo
t = 4.0239, df = 16660, p-value = 5.749e-05
alternative hypothesis: true difference in means between group Hombre and group Mujer is not equal to 0
95 percent confidence interval:
 0.1036096 0.3004174
sample estimates:
mean in group Hombre  mean in group Mujer 
            1.490189             1.288175 

T-test for independent samples /3

Null hypothesis: there is no difference in the mean number of printed books read between those who worked in the last week and those who did not.

t.test(P412_1 ~ trabaja, data = data)

    Welch Two Sample t-test

data:  P412_1 by trabaja
t = 2.9961, df = 12929, p-value = 0.00274
alternative hypothesis: true difference in means between group Si and group No is not equal to 0
95 percent confidence interval:
 0.04743074 0.22691988
sample estimates:
mean in group Si mean in group No 
        2.509928         2.372752 

T-test for independent samples /4

Null hypothesis: there is no difference in the mean number of digital books read between those who worked in the last week and those who did not.

t.test(P412_2 ~ trabaja, data = data)

    Welch Two Sample t-test

data:  P412_2 by trabaja
t = -1.3175, df = 9879.1, p-value = 0.1877
alternative hypothesis: true difference in means between group Si and group No is not equal to 0
95 percent confidence interval:
 -0.1835981  0.0360016
sample estimates:
mean in group Si mean in group No 
        1.354279         1.428077
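The four tests can be read side by side by comparing each p-value with a significance level. The sketch below collects the p-values reported in the outputs of this lesson and applies a 5% threshold. The threshold alpha = 0.05 is an assumption chosen to match the 95 percent confidence intervals shown, and the data frame results is a hypothetical helper.

```r
# p-values copied from the four t.test() outputs in this lesson
results <- data.frame(
  outcome  = c("P412_1", "P412_2", "P412_1", "P412_2"),
  grouping = c("sexo", "sexo", "trabaja", "trabaja"),
  p_value  = c(0.2153, 5.749e-05, 0.00274, 0.1877)
)

alpha <- 0.05  # assumed significance level
results$reject_null <- results$p_value < alpha
results
```

Under that assumption, the null hypothesis is rejected for digital books by sex and for printed books by work status, and it is not rejected for the other two comparisons.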