In contrast, samples are independent when the data come from entirely different, unrelated groups of subjects, so the measurements in one group have no influence on those in the other.
Study hours: Campus vs Internet
Compare study time (hours) between Campus and Internet students using a reproducible workflow that includes graphical exploration, normality tests, and hypothesis tests (nonparametric and parametric).
campus <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)
length(campus); length(internet)
## [1] 10
## [1] 12
summary(campus); summary(internet)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 16.00 23.50 28.50 30.80 39.25 50.00
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 18.00 28.50 39.50 38.75 43.00 65.00
par(mfrow = c(1,2))
boxplot(list(Campus = campus, Internet = internet),
ylab = "Horas", main = "Boxplots")
stripchart(list(Campus = campus, Internet = internet),
method = "jitter", vertical = TRUE, pch = 19, col = rgb(0,0,0,.5), add = TRUE)
hist(campus, breaks = 6, freq = FALSE, main = "Distribuciones",
  xlab = "Horas", col = rgb(0.2,0.5,0.9,.3),
  xlim = range(c(campus, internet))) # shared x-axis so the Internet bars are not clipped
hist(internet, breaks = 6, freq = FALSE, add = TRUE,
col = rgb(0.9,0.5,0.2,.3))
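lines(density(campus), lwd = 2, col = rgb(0.2,0.5,0.9))   # same hue as the Campus bars
lines(density(internet), lwd = 2, col = rgb(0.9,0.5,0.2)) # same hue as the Internet bars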
legend("topright", c("Campus","Internet"),
fill = c(rgb(0.2,0.5,0.9,.3), rgb(0.9,0.5,0.2,.3)), bty = "n")
par(mfrow = c(1,1))
shapiro.test(campus)
##
## Shapiro-Wilk normality test
##
## data: campus
## W = 0.93675, p-value = 0.5175
shapiro.test(internet)
##
## Shapiro-Wilk normality test
##
## data: internet
## W = 0.95465, p-value = 0.7057
par(mfrow = c(1,2))
qqnorm(campus, main = "QQ-plot Campus"); qqline(campus, col = 2)
qqnorm(internet, main = "QQ-plot Internet"); qqline(internet, col = 2)
par(mfrow = c(1,1))
Criterion: if either sample is not approximately normal (p ≤ 0.05 or a QQ-plot with marked deviations), we prefer a nonparametric test.
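The criterion above can be sketched as a small helper. This is a minimal sketch and not part of the original analysis: `looks_normal` and `choose_test` are hypothetical names, and only the Shapiro-Wilk part of the rule is automated (the QQ-plot check remains a visual judgment).

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

# Hypothetical helper: TRUE when Shapiro-Wilk does not reject normality
looks_normal <- function(x, alpha = 0.05) {
  shapiro.test(x)$p.value > alpha
}

# Hypothetical helper: pick the test family from the normality checks
choose_test <- function(x, y, alpha = 0.05) {
  if (looks_normal(x, alpha) && looks_normal(y, alpha)) {
    "parametric (t-test)"
  } else {
    "nonparametric (Wilcoxon)"
  }
}

chosen <- choose_test(internet, campus)
chosen
```

With the Shapiro-Wilk p-values reported above (0.5175 and 0.7057), neither sample rejects normality, so the helper points to the parametric branch. With samples of 10 and 12 observations the test has little power, so reporting both families, as this report does, is a reasonable safeguard.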
# Fligner-Killeen (robust to non-normality)
fligner.test(list(campus, internet))
##
## Fligner-Killeen test of homogeneity of variances
##
## data: list(campus, internet)
## Fligner-Killeen:med chi-squared = 0.50512, df = 1, p-value = 0.4773
We want to answer: do Internet students study more than Campus students? (one-sided, right-tailed). We also report the two-sided version.
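Let \(\Delta = \text{median}(\text{Internet}) - \text{median}(\text{Campus})\). The one-sided test contrasts \(H_0: \Delta \le 0\) against \(H_1: \Delta > 0\); the two-sided test contrasts \(H_0: \Delta = 0\) against \(H_1: \Delta \ne 0\).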
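As a quick illustration, the sample counterpart of \(\Delta\) can be computed directly from the two vectors. The name `delta_hat` is an assumption introduced here for the example.

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

# Difference of sample medians: 39.5 - 28.5
delta_hat <- median(internet) - median(campus)
delta_hat
```

The result, 11 hours, is only a descriptive point estimate. The Wilcoxon rank-sum test itself is stated in terms of a location shift between the two distributions, which is why its output refers to "location shift" rather than to medians.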
wilcox_greater <- wilcox.test(internet, campus, alternative = "greater")
## Warning in wilcox.test.default(internet, campus, alternative = "greater"):
## cannot compute exact p-value with ties
wilcox_two <- wilcox.test(internet, campus, alternative = "two.sided")
## Warning in wilcox.test.default(internet, campus, alternative = "two.sided"):
## cannot compute exact p-value with ties
wilcox_greater; wilcox_two
##
## Wilcoxon rank sum test with continuity correction
##
## data: internet and campus
## W = 80.5, p-value = 0.09294
## alternative hypothesis: true location shift is greater than 0
##
## Wilcoxon rank sum test with continuity correction
##
## data: internet and campus
## W = 80.5, p-value = 0.1859
## alternative hypothesis: true location shift is not equal to 0
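To see where W comes from, the statistic can be rebuilt from pairwise comparisons: it counts the (internet, campus) pairs in which the Internet value is larger, with each tied pair counting one half. This is a minimal sketch; `n_greater`, `n_tied`, and `W_manual` are illustrative names.

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

# All 12 x 10 = 120 pairwise comparisons
n_greater <- sum(outer(internet, campus, ">"))   # pairs with internet > campus
n_tied    <- sum(outer(internet, campus, "=="))  # pairs with equal values

# Rank-sum statistic as reported by wilcox.test (ties count 1/2)
W_manual <- n_greater + 0.5 * n_tied
W_manual

# Cross-check against R's own computation (warnings about ties silenced)
W_r <- unname(suppressWarnings(wilcox.test(internet, campus)$statistic))
```

Here 78 pairs favor Internet and 5 are tied, which gives W = 80.5. Ties in the data are also the reason R falls back to the normal approximation with continuity correction instead of an exact p-value.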
If the normality assumption is reasonable:
# Welch (unequal variances, the default)
t_welch_greater <- t.test(internet, campus, alternative = "greater")
t_welch_two <- t.test(internet, campus, alternative = "two.sided")
# Equal variances (only if Fligner-Killeen does not reject and normality holds)
t_pooled_greater <- t.test(internet, campus, var.equal = TRUE, alternative = "greater")
list(Welch_greater = t_welch_greater$p.value,
Welch_two = t_welch_two$p.value,
Pooled_greater = t_pooled_greater$p.value)
## $Welch_greater
## [1] 0.0704593
##
## $Welch_two
## [1] 0.1409186
##
## $Pooled_greater
## [1] 0.07485344
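The Welch p-value can be reproduced by hand, which makes the roles of the standard error and the Welch-Satterthwaite degrees of freedom explicit. This is a minimal sketch; the variable names are introduced here for illustration.

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

n1 <- length(internet); n2 <- length(campus)
v1 <- var(internet) / n1   # squared standard error of mean(internet)
v2 <- var(campus) / n2     # squared standard error of mean(campus)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(internet) - mean(campus)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# One-sided (greater) p-value
p_greater <- pt(t_stat, df_welch, lower.tail = FALSE)
c(t = t_stat, df = df_welch, p = p_greater)
```

The mean difference is 7.95 hours, and the resulting one-sided p-value matches the `Welch_greater` value reported above. Because the t distribution is symmetric and the statistic is positive, the two-sided p-value is exactly twice the one-sided one.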
# install.packages("effsize") # if needed
suppressWarnings(suppressMessages(library(effsize)))
cliff.delta(internet, campus) # Cliff's delta: P(internet > campus) - P(internet < campus)
##
## Cliff's Delta
##
## delta estimate: 0.3416667 (medium)
## 95 percent confidence interval:
## lower upper
## -0.1752638 0.7109235
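Cliff's delta can also be obtained without the `effsize` package, since it is the proportion of pairs where Internet exceeds Campus minus the proportion where it falls below. The following is a minimal base-R sketch; `delta_manual` and `delta_from_W` are illustrative names.

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

# Sign of every pairwise difference: +1 (internet larger), -1 (smaller), 0 (tie)
signs <- sign(outer(internet, campus, "-"))

# Cliff's delta = mean of the signs = (78 - 37) / 120
delta_manual <- mean(signs)
delta_manual

# Equivalent route through the rank-sum statistic W = 80.5
delta_from_W <- 2 * 80.5 / (length(internet) * length(campus)) - 1
```

Both routes give 41/120 ≈ 0.3417, the estimate shown above. The confidence interval includes 0, which is consistent with the non-significant Wilcoxon result.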
# Hodges–Lehmann: median of all pairwise differences (internet - campus)
HL <- median(as.vector(outer(internet, campus, `-`)))
HL
## [1] 8
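A point estimate is more informative with an interval around it. One option, sketched here as an assumption and not as part of the original analysis, is to ask `wilcox.test` for a confidence interval of the location shift via `conf.int = TRUE`. Because the data contain ties, R uses a normal approximation, so its estimate may differ slightly from the exact median of pairwise differences.

```r
campus   <- c(28,16,42,29,31,22,50,42,23,25)
internet <- c(26,42,65,38,29,32,59,42,27,41,46,18)

# Exact Hodges-Lehmann estimate: median of the 120 pairwise differences
HL <- median(as.vector(outer(internet, campus, `-`)))

# Approximate 95% confidence interval for the location shift
# (warnings about ties are expected and silenced here)
res <- suppressWarnings(
  wilcox.test(internet, campus, conf.int = TRUE, conf.level = 0.95)
)
ci <- as.numeric(res$conf.int)

HL
ci
```

If the interval covers 0, the conclusion agrees with the two-sided Wilcoxon test at the 5% level.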
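If p > 0.05, there is not enough evidence to claim that Internet > Campus in study hours; if p ≤ 0.05, there is. Here the one-sided p-values (0.093 for Wilcoxon, 0.070 for Welch, 0.075 for the pooled t-test) all exceed 0.05, so the data do not support the claim at the 5% level.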
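The decision rule can be applied mechanically to the one-sided p-values reported earlier. This is a minimal sketch: the p-values are copied from the outputs above, and `alpha`, `p_values`, and `decision` are names introduced for illustration.

```r
alpha <- 0.05

# One-sided p-values as reported in the outputs above
p_values <- c(
  wilcoxon_greater = 0.09294,
  welch_greater    = 0.0704593,
  pooled_greater   = 0.07485344
)

# Apply the rule: evidence only when p <= alpha
decision <- ifelse(p_values <= alpha,
                   "evidence that Internet > Campus",
                   "insufficient evidence")
decision
```

All three tests lead to the same decision at the 5% level, so the conclusion does not depend on the choice between the parametric and nonparametric approach.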