Análisis inferencial Hipotesis

\[H_0:\mu_{g_1}=\mu_{g_2}=\mu_{g_3}=\mu_{g_4}=\mu_{g_5}=\mu_{g_6}\\H_a: H_0\text{ es falsa}\] Modelo \[y_{ij}=\mu_i+\epsilon_{ij}\\ i=1,2,3,1,5,6 ~; j=1,2,3,4,5,6\] \[y_{ij}=\text{Peso seco i-esimo genotipo y j-esima repetición}\] \[\mu_i=\text{La media de cada i-esimo genotipo}\] \[\epsilon_{ij}=\text(Residuales)\] Modelo en forma de efectos \[y_{ij} = \mu + \tau_i+\epsilon_{ij}\] \(\mu\) Media global \(\Tau_i\)

Modelo en forma matricial \[ Y = X\beta+ E\] \(x\)

\(x\) Matriz del diseño 36 filas y 7 columnas (1 columna representa la media y 6 una por genotipo) \(\beta\) Vector de parametros (\(\mu:\tau_1;\tau_2; \tau_3;\tau_4; \tau_5; \tau_6\)) Otra forma de plaNtear la hipotesis \[H_0: \tau_1;\tau_2; \tau_3;\tau_4; \tau_5; \tau_6\]

mod1=aov(PS ~ genotipo, data=datos)
smod1= summary(mod1)
pv1= smod1 [[1]][1,5]

ifelse(pv1 <0.05, 'Rechazo Ho', 'No rechazo Ho')

## [1] "Rechazo Ho"

Como el valor de F es 14.22, esto quiere decir que la variabilidad causada por los genotipos es 14.22 veces más grande que la causada por el error

Interpretando el p-value: Se rechaza la hipotesis nula, lo que sugiere que existen diferencias en al menos uno de los tratamientos en cuanto a peso seco. Los datos proporcionan evidencia en contra de la Hipotesis nula, es decir, a favor de la alterna.

####Estimar los efectos 1) Primero se saca la media global de todoslos datos 2) Luego se saca la media de cada tratamiento 3) Crear variable de cada efecto (Punto1-Punto2)

#Media global
(mu = mean(datos$PS))

## [1] 1379.69

#Media por genotipo
mu_i = tapply(datos$PS, datos$genotipo, mean)
#Efecto por genotipo
tau_i = mu_i - mu
tau_i

##       gen1       gen2       gen3       gen4       gen5       gen6 
## -134.97483 -185.56951  123.96002   82.81483   22.23648   91.53300

var_i = tapply(datos$PS, datos$genotipo, var)

boxplot(PS ~ genotipo, datos)
points (1:6, mu_i,pch=16, col='red')
abline(h=mu, lty=2, col='red')
segments(1:6-.2, mu_i, 1:6-.2, mu,col='blue',lwd=2, lty=2)
text(1:6, rep(1700,6), round(tau_i,2))
text(1:6, rep(1000,6), round(var_i,2))

####REVISIÓN DE SUPUESTOS \[H_0:\sigma^2_{g1}=\sigma^2_{g2}=\sigma^2_{g3}=\sigma^2_{g4}=\sigma^2_{g5}=\sigma^2_{g6}\] \[H_0:\epsilon\sim N(0, \sigma^2_e)\]

hist(mod1$residuals)

var_res = tapply(mod1$residuals, datos$genotipo, var)

#Igualdad de varianzas
bartlett.test(mod1$residuals, datos$genotipo)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  mod1$residuals and datos$genotipo
## Bartlett's K-squared = 5.5895, df = 5, p-value = 0.3482

#Normalidad de residuos
shapiro.test(mod1$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  mod1$residuals
## W = 0.97311, p-value = 0.5164

Como el p-value en la prueba de igualdad de varianzas es mayor al 5% estadisticamente se pueden considerar iguales.

Como el p-value en la prueba de normalidad es 51.64% (>5%) se considera que los residuales siguen una distribución normal

Comparación de medias posterior al análisis de varianza

#Prueba de máxima diferencia de Tukey
par(mar=c(5,6,3,1))
tt = TukeyHSD(mod1, 'genotipo')
plot(tt, las=1)
abline (v=0, lty=2, col='red', lwd = 2)

library(agricolae)
dt = duncan.test(mod1, 'genotipo', console = T)

## 
## Study: mod1 ~ "genotipo"
## 
## Duncan's new multiple range test
## for PS 
## 
## Mean Square Error:  7066.534 
## 
## genotipo,  means
## 
##            PS       std r      Min      Max
## gen1 1244.715  95.50024 6 1143.952 1371.506
## gen2 1194.121  90.44675 6 1073.494 1322.408
## gen3 1503.650  99.77789 6 1342.671 1642.953
## gen4 1462.505  52.69259 6 1414.574 1556.108
## gen5 1401.927 103.29604 6 1268.198 1532.843
## gen6 1471.223  41.18360 6 1393.444 1500.561
## 
## Alpha: 0.05 ; DF Error: 30 
## 
## Critical Range
##         2         3         4         5         6 
##  99.11886 104.16376 107.43409 109.76839 111.53078 
## 
## Means with the same letter are not significantly different.
## 
##            PS groups
## gen3 1503.650      a
## gen6 1471.223      a
## gen4 1462.505      a
## gen5 1401.927      a
## gen1 1244.715      b
## gen2 1194.121      b

plot(dt)

####Diseño 1. Incumplimento de supuestos

#FACTOR
genotipo = gl(n = 6, k = 6, length = 36,
              labels = paste0('gen', 1:6))

#VARIABLE RESPUESTA
set.seed(123)
PS = c(
  rnorm(12,1200,120),
  rnorm(12,1500,100),
  rnorm(12,1420,150)
  )

aleat = sample(36)
datos = data.frame (xy[aleat,],genotipo, PS)
head(datos)

##    x y genotipo       PS
## 15 2 2     gen1 1132.743
## 26 1 4     gen1 1172.379
## 31 0 5     gen1 1387.045
## 16 3 2     gen1 1208.461
## 20 1 3     gen1 1215.515
## 30 5 4     gen1 1405.808

ggplot(datos)+
  aes(genotipo, PS)+
  geom_boxplot()

mod1b= aov(PS ~ genotipo, datos)
smod1b = summary(mod1b)

shapiro.test(mod1b$residuals)

## 
##  Shapiro-Wilk normality test
## 
## data:  mod1b$residuals
## W = 0.98155, p-value = 0.7964

bartlett.test(mod1b$residuals, datos$genotipo)

## 
##  Bartlett test of homogeneity of variances
## 
## data:  mod1b$residuals and datos$genotipo
## Bartlett's K-squared = 5.964, df = 5, p-value = 0.3097

Como se rechaza la hipotesis de igualdad de varianzas, se incumple el supuesto lo cual complica la interpretación

####Análisis de varianza para un diseño de factorial simple en arreglo completamente al azar, en presencia de heterocedasticidad

mod1c = oneway.test(PS~genotipo, datos)
mod1c

## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  PS and genotipo
## F = 8.8156, num df = 5.000, denom df = 13.769, p-value = 0.0006281

Cuando se incumple normalidad e igualdad de varianzas

####Análisis de varianza no parametricos para un diseño en arreglo factorial simple en arreglo completamente al azar.

\[H_0: R_1 = R_2= R_3= R_4= R_5 = R_6\]

mod1d = kruskal.test(PS, genotipo)
mod1d

## 
##  Kruskal-Wallis rank sum test
## 
## data:  PS and genotipo
## Kruskal-Wallis chi-squared = 19.03, df = 5, p-value = 0.001898

####Comparación de rangos promedios posterior a kruskal.wallis

#library(PMCMR)
#posthoc.kruskal.nemenyi.test(PS,genotipo)
library(PMCMRplus)
kwAllPairsNemenyiTest(PS,genotipo)

## 
##  Pairwise comparisons using Tukey-Kramer-Nemenyi all-pairs test with Tukey-Dist approximation

## data: PS and genotipo

##      gen1  gen2  gen3  gen4  gen5 
## gen2 0.996 -     -     -     -    
## gen3 0.090 0.022 -     -     -    
## gen4 0.282 0.096 0.995 -     -    
## gen5 0.660 0.342 0.872 0.991 -    
## gen6 0.058 0.013 1.000 0.984 0.792

## 
## P value adjustment method: single-step

## alternative hypothesis: two.sided

library(FSA)

## ## FSA v0.9.4. See citation('FSA') if used in publication.
## ## Run fishR() for related website and fishR('IFAR') for related book.

dunnTest(PS, genotipo)

## Dunn (1964) Kruskal-Wallis multiple comparison

##   p-values adjusted with the Holm method.

##     Comparison          Z     P.unadj      P.adj
## 1  gen1 - gen2  0.4931970 0.621873424 1.00000000
## 2  gen1 - gen3 -2.6303838 0.008528852 0.10234623
## 3  gen2 - gen3 -3.1235808 0.001786648 0.02501308
## 4  gen1 - gen4 -2.1097870 0.034876706 0.34876706
## 5  gen2 - gen4 -2.6029840 0.009241628 0.10165790
## 6  gen3 - gen4  0.5205968 0.602647684 1.00000000
## 7  gen1 - gen5 -1.5069907 0.131813037 1.00000000
## 8  gen2 - gen5 -2.0001877 0.045480002 0.40932002
## 9  gen3 - gen5  1.1233931 0.261270588 1.00000000
## 10 gen4 - gen5  0.6027963 0.546644217 1.00000000
## 11 gen1 - gen6 -2.7947828 0.005193460 0.06751498
## 12 gen2 - gen6 -3.2879797 0.001009091 0.01513636
## 13 gen3 - gen6 -0.1643990 0.869417061 0.86941706
## 14 gen4 - gen6 -0.6849958 0.493346584 1.00000000
## 15 gen5 - gen6 -1.2877921 0.197818354 1.00000000

rangos = rank (PS, ties.method = 'average')
rangos

##  [1]  3  6 16  7  8 19 11  1  2  4 14  9 28 26 22 36 31 12 34 23 17 24 18 20 13
## [26]  5 30 21 10 35 25 15 33 32 29 27

boxplot(rangos~genotipo)

####Análisis de varianza permutacional

library(RVAideMemoire)

## *** Package RVAideMemoire v 0.9-81-2 ***

## 
## Attaching package: 'RVAideMemoire'

## The following object is masked from 'package:FSA':
## 
##     se

perm.anova(PS~genotipo, data = datos, nperm = 10000, progress = F)

## Permutation Analysis of Variance Table
## 
## Response: PS
## 10000 permutations
##           Sum Sq Df Mean Sq F value    Pr(>F)    
## genotipo  525407  5  105081   7.965 9.999e-05 ***
## Residuals 395789 30   13193                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diseño 1

Natalia Deaquiz

2023-04-18

Diseño 1- Factorial simple en arreglo complementamente al azar

Análisis inferencial Hipotesis

Comparación de medias posterior al análisis de varianza