En esta investigación se desarrollaron métodos alternativos cuando no se cumplen los supuestos básico en diseños factoriales. Esta presentación contiene funciones propias (creadas bajo el enfoque de la literatura), también funciones por defecto que se encuentran en el R, así mismo otras funciones de autores, que si bien es cierto no lo han formalizado en el R, pero se encuentran disponibles en la Web. La descripción de las funciones y métodos se encuentra en el artículo “Métodos alternativos ante la violación de supuestos en diseños de experimentos factoriales”.

CONTENIDO

CARGANDO LIBRERÍAS

library(reshape)
library(stringr)
library(MASS)
library(Rfit)
library(ART)
library(ARTool)

## Warning: package 'ARTool' was built under R version 4.0.4

## Registered S3 methods overwritten by 'lme4':
##   method                          from
##   cooks.distance.influence.merMod car 
##   influence.merMod                car 
##   dfbeta.influence.merMod         car 
##   dfbetas.influence.merMod        car

library(asbio)

## Warning: package 'asbio' was built under R version 4.0.4

## Loading required package: tcltk

library(WRS2)

## Warning: package 'WRS2' was built under R version 4.0.5

library(rankFD)

## Warning: package 'rankFD' was built under R version 4.0.5

## Registered S3 method overwritten by 'coin':
##   method   from 
##   print.ci asbio

library(ez)
library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:MASS':
## 
##     select

## The following object is masked from 'package:stats':
## 
##     filter

NOTA IMPORTANTE

Uno de los aportes importantes de esta investigación son gracias a la funciones creada por Luepsen (2020), el cual desarrolló 4 funciones que se describen en esta investigación (Puri and Send, Van der Waerden, Brow-Forsythe y Wels-James. Las funciones están resumidas en el archivo descargable: anova.lib.

El Link de descarga (Luepsen, 2020) es : http://www.uni-koeln.de/~a0032/R/ Para leer el archivo en R, se conecta con el attach de la siguiente manera:

Guardar el archivo en una carpeta (en mi caso cree la carpeta ARTICULO en el disco E)
Luego con attach conectar a R:

attach("E:ARTICULO/anova.lib")

ELABORACIÓN DEL CONJUNTO DE DATOS DEL DISEÑO FACTORIAL

Factores A: Variedades de piña (Golden, Cayena Lisa y Hawaiana)
Factor B: tipo de manejo del cultivo (convencional y orgánico)
Variable de respuesta: Porcentaje promedio de grados brix

Brix <- c(16.24, 16.14, 15.56, 15.8, 15.36, 15.31, 15.56, 15.65, 14.74, 15.08, 14.39, 15.18, 12.02,     
          12.41, 11.68, 11.43, 12.56, 9.52, 8.3, 10.51, 9.87, 10.8, 11.58, 10.7)

Variedad <- c("V1","V1","V1","V1","V1","V1","V1","V1","V2","V2","V2","V2","V2","V2","V2","V2","V3",
              "V3","V3","V3","V3","V3","V3","V3") 

Manejo <- c("S1","S1","S1","S1","S2","S2","S2","S2","S1","S1","S1","S1","S2","S2","S2","S2","S1",
            "S1","S1","S1","S2","S2","S2","S2") 

dat <- data.frame(Brix,Variedad,Manejo) 
dat$Variedad=as.factor(dat$Variedad)
dat$Manejo=as.factor(dat$Manejo)
dat

##     Brix Variedad Manejo
## 1  16.24       V1     S1
## 2  16.14       V1     S1
## 3  15.56       V1     S1
## 4  15.80       V1     S1
## 5  15.36       V1     S2
## 6  15.31       V1     S2
## 7  15.56       V1     S2
## 8  15.65       V1     S2
## 9  14.74       V2     S1
## 10 15.08       V2     S1
## 11 14.39       V2     S1
## 12 15.18       V2     S1
## 13 12.02       V2     S2
## 14 12.41       V2     S2
## 15 11.68       V2     S2
## 16 11.43       V2     S2
## 17 12.56       V3     S1
## 18  9.52       V3     S1
## 19  8.30       V3     S1
## 20 10.51       V3     S1
## 21  9.87       V3     S2
## 22 10.80       V3     S2
## 23 11.58       V3     S2
## 24 10.70       V3     S2

1. ANOVA CLÁSICO

anova(lm(Brix~Variedad*Manejo,dat))

## Analysis of Variance Table
## 
## Response: Brix
##                 Df  Sum Sq Mean Sq F value    Pr(>F)    
## Variedad         2 109.501  54.751 78.8143 1.248e-09 ***
## Manejo           1   5.655   5.655  8.1406   0.01056 *  
## Variedad:Manejo  2  12.861   6.430  9.2565   0.00172 ** 
## Residuals       18  12.504   0.695                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

2. PRUEBAS BASADAS EN LA MEDIANA

2.1 Método de Wilson

options(warn=-1)
Wilson <- function(modelo,data)
{
  if (mode(modelo)!="call") stop("invalid formula")
  vars <-  names(attr(aov(modelo,data)$terms,"dataClasses"))
  data$Med <- ifelse(data[,vars[1]]>=median(data[,vars[1]]),1,2)
  A <- data[,vars[2]]
  B <- data[,vars[3]]
  Tb <- table(B,A,data$Med)
  Tot <- chisq.test(as.vector(Tb[,,1]))$sta +
    chisq.test(as.vector(Tb[,,2]))$sta
  m1.row <- margin.table(Tb[,,1],2)
  m2.row=margin.table(Tb[,,2],2)
  m1.col <- margin.table(Tb[,,1],1)
  m2.col <- margin.table(Tb[,,2],1)
  AA <- chisq.test(m1.row)$sta+chisq.test(m2.row)$sta
  BB <- chisq.test(m1.col)$sta+chisq.test(m2.col)$sta
  AB <- Tot-AA-BB
  GL <- anova(aov(modelo,data))$Df
  FV <- rstatix:::get_formula_right_hand_side(modelo)
  GL <- GL[-(length(GL))]
  ChiSq <- c(AA,BB,AB)
  Pvalue <- c(round(1-pchisq(c(AA,BB,AB),GL),5))
  cat(" Test de Wilson ","\n")
  return(data.frame(FV,GL,ChiSq,Pvalue))
}
Wilson(Brix~Variedad*Manejo,dat)

##  Test de Wilson

##                FV GL     ChiSq  Pvalue
## 1        Variedad  2 16.000000 0.00034
## 2          Manejo  1  2.666667 0.10247
## 3 Variedad:Manejo  2  5.333333 0.06948

2.2 Test de la mediana extendida (Shoemarker)

Shoemaker_Median <- function(formula,data)
{
  if (mode(formula)!="call") stop("invalid formula")
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  d1 <- aggregate(as.formula(paste0(vars[1],"~",vars[2])),data, median )
  hh1 <- merge(d1, data, by.x = vars[2], by.y=vars[2])
  hh1$dif <- hh1[,paste0(vars[1],".","y")]-hh1[,paste0(vars[1],".","x")]
  d2 <- aggregate(as.formula(paste0("dif","~",vars[3])),hh1, median )
  hh2 <- merge(d2, hh1, by.x = vars[3], by.y=vars[3])
  hh2$cond <- ifelse(hh2$dif.y > hh2$dif.x,"Arriba","Debajo")
  Tb <- table(hh2[,vars[3]],hh2[,vars[2]],hh2$cond)
  m1.row <- margin.table(Tb[,,1],1)
  m2.row <- margin.table(Tb[,,2],1)
  BB <- chisq.test(m1.row)$sta+chisq.test(m2.row)$sta
  m1.col <- margin.table(Tb[,,1],2)
  m2.col <- margin.table(Tb[,,2],2)
  AA <- chisq.test(m1.col)$sta+chisq.test(m2.col)$sta
  Total <- chisq.test(Tb[,,1])$sta+chisq.test(Tb[,,2])$sta
  AB=Total-AA-BB
  Df <- anova(lm(formula,data))$Df[1:3]
  ChiSq <- c(AA,BB,AB)
  Pvalue <- c(round(1-pchisq(c(AA,BB,AB),Df),5))
  FV <- rstatix:::get_formula_right_hand_side(formula)
  res<- data.frame(FV,Df,ChiSq,Pvalue)
  return(res)
}
Shoemaker_Median(Brix~Variedad*Manejo,dat)

##                FV Df ChiSq  Pvalue
## 1        Variedad  2     0 1.00000
## 2          Manejo  1     0 1.00000
## 3 Variedad:Manejo  2    12 0.00248

3. PRUEBAS BASADAS EN RANGOS

3.1 Test de Sheirer

Rank_derived<-function(modelo,data)
{
  if (mode(modelo)!="call") stop("invalid formula")
  vars <- names(attr(aov(modelo,data)$terms,"dataClasses"))
  data[,vars[1]] <- rank(data[,vars[1]])
  mm <- anova(lm(modelo,data=data))
  Df <- mm$Df[-length(mm$Df)]
  N <- dim(data)[1]
  H <- round(mm$`Sum Sq`[-length(mm$Df)]/((N*(N+1))/12),5)
  Pvalue <- round(1-pchisq(H,Df),5)
  FV <- rstatix:::get_formula_right_hand_side(modelo)
  res <- data.frame(FV,Df,H,Pvalue)
  cat(" Test de Scheirer (Mediana extendida) ","\n")
  return(res)
}
Rank_derived(Brix~Variedad*Manejo,dat)

##  Test de Scheirer (Mediana extendida)

##                FV Df        H  Pvalue
## 1        Variedad  2 19.00500 0.00007
## 2          Manejo  1  0.80083 0.37085
## 3 Variedad:Manejo  2  0.74667 0.68843

3.2 Gao y Alvo

Link de function original (Li Qinlong 2015)es: https://rdrr.io/a/cran/StatMethRank/src/R/interaction.test.R

Function adaptada (el original se encuentra en forma matricial)

Gao_Alvo_Test<-function(formula,data)
{
  if (mode(formula)!="call") stop("invalid formula")
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  Y <- data[,vars[1]]
  FA <- data[,vars[2]]
  FB <- data[,vars[3]]
  re <- mean(table(FA,FB)) 
  col <- nlevels(FA)
  fil <- nlevels(FB)
  j <- matrix(0,fil*col,re)
  for(i in 1:re){
    j[,i] <- seq(i, by=re, length.out = fil*col)}
  y <- data.frame(Y)[melt(j)[,3],]
  interaction.test <- function(X) {
    size <- dim(X) ; II = size[1] ; JJ = size[2] ; N = size[3]
    A = -matrix(1, ncol = II, nrow = II) %x% diag(JJ) / II +
      diag(II) %x% diag(JJ)
    B = -diag(II) %x% matrix(1, ncol = JJ, nrow = JJ) / JJ +
      diag(II) %x% diag(JJ)
    Row_rank = array(0, dim = dim(X)) ; Col_rank = array(0, dim = dim(X)
    )
    SN = array(0, dim = c(II, JJ))  ;   TN = array(0, dim = c(II, JJ))
    tempmat1 = apply(X, 1, rank) ;  tempmat2 = apply(X, 2, rank)
    for (i in 1:N){
      head = (i - 1) * JJ + 1 ; tail = head + JJ - 1
      Row_rank[, , i] = t(tempmat1[head:tail, ])
      SN = SN + Row_rank[, , i]
      head = (i - 1) * II + 1 ; tail = head + II - 1
      Col_rank[, , i] = tempmat2[head:tail, ]
      TN = TN + Col_rank[, , i] }
    SN = as.vector(t(SN)) / (N * JJ + 1)
    TN = as.vector(t(TN)) / (N * II + 1)
    Cmat = array(0, dim = c(II, JJ, JJ, N))
    for (i in 1:II){for (j in 1:JJ) {for (b in (1:JJ)[-j]) {
      Cmat[i, j, b, ] = -rowSums(matrix(X[i, b, ] >= rep(X[i, j, ], each
                                                         = N),
                                        nrow = N))}
      Cmat[i, j, j, ] = rowSums(matrix(X[i, j, ] >= rep(X[i, -j, ], each
                                                        = N),
                                       nrow = N))} }
    Cmat = Cmat / (N * JJ) ; Gmat = array(0, dim = c(II, JJ, II, N))
    for (i in 1:II){for (j in 1:JJ) {for (a in (1:II)[-i]){
      Gmat[i, j, a, ] = -rowSums(matrix(X[a, j, ] >= rep(X[i, j, ], each
                                                         = N),
                                        nrow = N))}
      Gmat[i, j, i, ] = rowSums(matrix(X[i, j, ] >= rep(X[-i, j, ], each
                                                        = N),
                                       nrow = N))}}
    Gmat = Gmat / (N * II)
    sigma1 = array(0, dim = c(II * JJ, II * JJ))
    sigma2 = array(0, dim = c(II * JJ, II * JJ))
    sigma12 = array(0, dim = c(II * JJ, II * JJ))
    for (iRow in 1:(II * JJ)){
      i1 = ceiling(iRow / JJ) ; j1 = iRow - (i1 - 1) * JJ
      for (iCol in iRow:(II * JJ)){
        i2 = ceiling(iCol / JJ) ;   j2 = iCol - (i2 - 1) * JJ
        if (i1 == i2) { vec1 = Cmat[i1, j1, , ] ; vec2 = Cmat[i2, j2, , ]
        sigma1[iRow, iCol] = 
          sum((vec1 - rowMeans(vec1)) * (vec2 - rowMeans(vec2))) / N
        sigma1[iCol, iRow] = sigma1[iRow, iCol] }
        if (j1 == j2)   { vec1 = Gmat[i1, j1, , ] ; vec2 = Gmat[i2, j2, ,
        ]
        sigma2[iRow, iCol] =
          sum((vec1 - rowMeans(vec1)) * (vec2 - rowMeans(vec2))) / N
        sigma2[iCol, iRow] = sigma2[iRow, iCol] }
        vec1 = Cmat[i1, j1, j2, ] ; vec2 = Gmat[i2, j2, i1, ]
        sigma12[iRow, iCol] = sum((vec1 - mean(vec1)) * (vec2 - mean(vec2
        ))) / N
        sigma12[iCol, iRow] = sigma12[iRow, iCol]}}
    SigmaAB = A %*% sigma1 %*% t(A) +B %*% sigma2 %*% t(B) +
      2 * A %*% sigma12 %*% t(B)
    SigmaA= A %*% sigma1 %*% t(A) ; SigmaB= B %*% sigma2 %*% t(B)
    tempvecA    = A %*% SN ; tempvecB   = B %*% TN
    tempvecAB = A %*% SN + B %*% TN
    WAB = round(t(tempvecAB) %*% ginv(SigmaAB) %*% tempvecAB / N,5)
    WA  = round(t(tempvecA) %*% ginv(SigmaA) %*% tempvecA / N,5)
    WB  = round(t(tempvecB) %*% ginv(SigmaB) %*% tempvecB / N,5)
    p_valueAB = round(1 - pchisq(WAB, df = (II - 1) * (JJ - 1)),5)
    p_valueA = round(1 - pchisq(WA, df = (II - 1)),5)
    p_valueB = round(1 - pchisq(WB, df = (JJ - 1)),5)
    Res=data.frame(FV=rstatix:::get_formula_right_hand_side(formula)
                   ,GL=c(II-1,JJ-1,(II - 1) * (JJ - 1)),
                   W=c(WA,WB,WAB),Pvalue=c(p_valueA,p_valueB,p_valueAB))
    return(Res)}
  interaction.test(array(y,c(fil,col,re)))
}
Gao_Alvo_Test(Brix~Variedad*Manejo,dat)

##                FV GL       W  Pvalue
## 1        Variedad  1 0.56805 0.45103
## 2          Manejo  2 9.32935 0.00942
## 3 Variedad:Manejo  2 9.80492 0.00743

3.3 Pury and Sen (Estadística L)

Puri_Sen_L <-function(modelo,data){
  if (mode(modelo)!="call") stop("invalid formula")
  vars = names(attr(aov(modelo,data)$terms,"dataClasses"))
  vars = str_replace(vars, paste0(substitute(data),"$*\\$"), "")
  data2 = data.frame(Yrank=rank(data[,vars[1]]),data[,vars[-1]])
  anova = anova(lm(data2$Yrank~.*.*.*.*.*.,data=data2))
  aovs <- drop1(aov(data2$Yrank~.*.*.*.*.*.,data=data2), ~. , test="F")
  mstotal <- sum(anova[,2])/sum(anova[,1])
  chisq <- aovs[,2]/mstotal
  df <- aovs[,1]
  pvalues <- 1-pchisq(chisq,df)
  aov2y <- data.frame(aovs[,1:2],chisq,df,pvalues)
  colnames(aov2y)[2] <- "Sum of Sq"
  aov2y[2:length(df),]
}
Puri_Sen_L(Brix~Variedad*Manejo,dat)

##                 Df Sum of Sq      chisq df     pvalues
## Variedad         2 606.79167 12.1411121  2 0.002309888
## Manejo           1  21.12500  0.4226838  1 0.515600921
## Variedad:Manejo  2  37.33333  0.7469914  2 0.688323930

3.4 Test de van der Waerder

model <- np.anova(Brix~Variedad*Manejo, dat, method=1)
model

## generalized van der Waerden tests
##                 Df  Sum Sq  Chi Sq  Pr(>Chi)  
## Variedad         2 10.8407 13.2838   0.00130  
## Manejo           1  0.6531  0.8003   0.37102  
## Variedad:Manejo  2  0.8714  1.0678   0.58630  
## Residuals       18  2.7022

3.5 R Anova (Hocking)

model<-raov(Brix~Variedad*Manejo,data=dat)
model

## 
## Robust ANOVA Table
##                 DF       RD  Mean RD        F p-value
## Variedad         2 38.92173 19.46086 58.44240 0.00000
## Manejo           1  3.98978  3.98978 11.98160 0.00279
## Variedad:Manejo  2  9.34487  4.67244 14.03167 0.00021

3.6 ATS y WTS

rankFD (Brix~Variedad*Manejo,data=dat)

## 
##  
##                Nonparametric Methods for General Factorial Designs      
##  
##  ---------------------------------------------------------------------------
##  #Hypotheses: Tested in Distribution Functions 
##  #Ranking Method: Pseudo-Ranks 
##  #Confidence Intervals: 95 % with Logit-Transformation 
##  
##  
## ---------------------------------------------------------------------------
## 
## Call:
## Brix ~ Variedad * Manejo
## 
## Descriptive:
##   Variedad Manejo Size Rel.Effect Std.Error  Lower  Upper
## 1       V1     S1    4     0.9010    0.0185 0.8583 0.9319
## 2       V1     S2    4     0.7656    0.0185 0.7274 0.8000
## 3       V2     S1    4     0.5833    0.0000 0.5833 0.5833
## 4       V2     S2    4     0.3646    0.0442 0.2830 0.4547
## 5       V3     S1    4     0.1771    0.0811 0.0674 0.3906
## 6       V3     S2    4     0.2083    0.0442 0.1347 0.3079
## 
## Wald.Type.Statistic:
##                 Statistic df p-Value
## Variedad         156.9374  2  0.0000
## Manejo             5.9138  1  0.0150
## Variedad:Manejo    4.7804  2  0.0916
## 
## ANOVA.Type.Statistic:
##                 Statistic    df1    df2 p-Value
## Variedad          70.1723 1.4941 7.3439  0.0000
## Manejo             5.9138 1.0000 7.3439  0.0437
## Variedad:Manejo    2.7569 1.4941 7.3439  0.1338
## 
## Kruskal-Wallis Test:
## NULL
## 
## MCTP:
## NULL
## 
## Factor.Information:
## NULL

3.7 RT (Rango transformado)

RT <- function(modelo,data)
{ 
  if (mode(modelo)!="call") stop("invalid formula")
  vars <- names(attr(aov(modelo,data)$terms,"dataClasses"))
  data[,vars[1]] <- rank(data[,vars[1]])
  cat(" RT (Conover & Iman) ","\n")
  return(anova(lm(modelo,data=data)))
}

RT(Brix~Variedad*Manejo,dat)

##  RT (Conover & Iman)

## Analysis of Variance Table
## 
## Response: Brix
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## Variedad         2 950.25  475.12 70.1723 3.17e-09 ***
## Manejo           1  40.04   40.04  5.9138  0.02569 *  
## Variedad:Manejo  2  37.33   18.67  2.7569  0.09027 .  
## Residuals       18 121.88    6.77                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3.8 ART (Rango transformado alieneado)

model1 <- anova(art(Brix~Variedad*Manejo,data=dat))
cat(" ART (aligned rank transform) ","\n")

##  ART (aligned rank transform)

model1

## Analysis of Variance of Aligned Rank Transformed Data
## 
## Table Type: Anova Table (Type III tests) 
## Model: No Repeated Measures (lm)
## Response: art(Brix)
## 
##                   Df Df.res F value     Pr(>F)    
## 1 Variedad         2     18  73.434  2.204e-09 ***
## 2 Manejo           1     18  14.008 0.00148989  **
## 3 Variedad:Manejo  2     18  12.929 0.00033031 ***
## ---
## Signif. codes:   0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4. PRUEBAS BASADAS EN VARIANZAS HETEROGÉNEAS

4.1 Test de Brow-Forsythe

model <- bf.f(Brix~Variedad*Manejo, dat)
model

## Analysis of Variance Table
## 
## Response: Brix
##                 Df  Df.err  Sum Sq Mean Sq F value    Pr(>F)    
## Variedad         2 14.0357 109.501  54.751 37.0653 2.508e-06 ***
## Manejo           1 20.7444   5.655   5.655  0.9225   0.34789    
## Variedad:Manejo  2  4.8119  12.861   6.430  9.2565   0.02242 *  
## Residuals       18          12.504   0.695                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

4.2 Test WJ (Welch-James)

wf <-function(modelo,data){
  modelo <- Brix~Variedad*Manejo
  data <- dat
  if (mode(modelo)!="call") stop("invalid formula")
  vars = names(attr(aov(modelo,data)$terms,"dataClasses"))
  cat(" Wels-James Test ","\n")
  wj.anova(data, vars[1],vars[2],vars[3])
}
wf(Brix~Variedad*Manejo,dat)

##  Wels-James Test

##                       Chi Sq   df  P(Chi>value)
## Variedad          288.249855    2    0.00001000
## Manejo              8.709206    1    0.02150785
## Variedad : Manejo  60.560475    2    0.00001000

4.3 Test BDM

BDM.2way (dat$Brix,dat$Variedad,dat$Manejo)

## 
## Two way Brunner-Dette-Munk test 
## 
##            df1    df2        F*    P(F > F*)
## X1    1.555533 7.3439 70.172308 2.432348e-05
## X2    1.000000 7.3439  5.913846 4.370973e-02
## X1:X2 1.555533 7.3439  2.756923 1.330867e-01

BDM<-function(formula,data){
  if (mode(formula)!="call") stop("invalid formula")
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  BDM.2way (data[,vars[1]],data[,vars[2]],data[,vars[3]])
}
BDM(Brix~Variedad*Manejo,dat)

## 
## Two way Brunner-Dette-Munk test 
## 
##            df1    df2        F*    P(F > F*)
## X1    1.555533 7.3439 70.172308 2.432348e-05
## X2    1.000000 7.3439  5.913846 4.370973e-02
## X1:X2 1.555533 7.3439  2.756923 1.330867e-01

5. PRUEBAS BASADAS EN MÉTODOS ROBUSTOS

5.1 Medias recortadas (Trimmed means)

model <- t2way(Brix~Variedad*Manejo,data=dat)
model

## Call:
## t2way(formula = Brix ~ Variedad * Manejo, data = dat)
## 
##                    value p.value
## Variedad        288.2499   0.001
## Manejo            8.1406   0.026
## Variedad:Manejo  60.5605   0.001

5.2 Pruebas de mediana (Median test)

model<-med2way(Brix~Variedad*Manejo, data = dat)
model

## Call:
## med2way(formula = Brix ~ Variedad * Manejo, data = dat)
## 
##      value       df p.value
## 1 173.1340 F(2,Inf)  0.0000
## 2   9.1806 F(1,Inf)  0.0024
## 3 113.0021 Chisq(2)  0.0000

5.3 M-Estimadores (M-estimators)

model3<-pbad2way(Brix~Variedad*Manejo, data = dat)
model3

## Call:
## pbad2way(formula = Brix ~ Variedad * Manejo, data = dat)
## 
##                 p.value
## Variedad          0.000
## Manejo            0.005
## Variedad:Manejo   0.000

6. PRUEBAS BASADAS EN PERMUTACIONES

6.1 Permutación de Lawrence

permute_ez<- function(formula,data){
  FV <- rstatix:::get_formula_right_hand_side(formula)
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  data <-data.frame(resp=data[,vars[1]],fact1=data[,vars[2]],
                    fact2=data[,vars[3]])
  dat.Per <- data.frame(id=seq(1:dim(data)[1]),data)
  res <- ezPerm(data=dat.Per,dv=resp,wid=id, between =fact1*fact2, perms =1e3,
                parallel =T)
  res$Effect <- FV
  return(res)
}
permute_ez(Brix~Variedad*Manejo,dat)

## Progress disabled when using parallel plyr

##            Effect     p p<.05
## 1        Variedad 0.000     *
## 2          Manejo 0.005     *
## 3 Variedad:Manejo 0.006     *

Las siguientes funciones para las pruebas de permutation fueron obtenidas del autor (Howell, 2013)

Link: https://www.uvm.edu/~statdhtx/StatPages/Permutation%20Anova/PermTestsAnova.html

6.1 Permutación bajo el enfoque de Manly

Permute_Manly<-function(formula,data,nreps=1000,seed=120)
{
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  formula <-as.formula(paste0(vars[1],"~",vars[2],"*",vars[3]))
  FV <- rstatix:::get_formula_right_hand_side(formula)
  response<-vars[1]
  var1<-vars[2]
  var2<-vars[3]
  fact1<-data[,var1]
  fact2<-data[,var2]
  resp <-data[,response]
  mod1 <- lm(resp ~ fact1 + fact2 + fact1:fact2)
  Pvalues=function(mod){
    ANOVA <- summary(aov(mod))
    Ffact1 <-   ANOVA[[1]]$"F value"[1]
    Ffact2 <-   ANOVA[[1]]$"F value"[2]
    Finteract <-    ANOVA[[1]]$"F value"[3]
    FS <- numeric(nreps)
    FM <- numeric(nreps)
    FSM <- numeric(nreps)
    FS[1] <- Ffact1
    FM[1] <- Ffact2
    FSM[1] <- Finteract
    for (i in 2:nreps) {
      newresp <- sample(resp, dim(data)[1])
      mod2 <- lm(newresp ~ fact1 + fact2 + fact1:fact2)
      b <- summary(aov(mod2))
      FS[i] <- b[[1]]$"F value"[1]
      FM[i] <- b[[1]]$"F value"[2]
      FSM[i] <- b[[1]]$"F value"[3]
    }
    probS <- length(FS[FS >= Ffact1 + .Machine$double.eps ^0.5])/nreps 
    probM <- length(FM[FM >= Ffact2+ .Machine$double.eps ^0.5])/nreps
    probSM  <-  length(FSM[FSM >= Finteract + .Machine$double.eps ^0.5])/
      nreps
    pvalue=c(probS,probM,probSM)
    return(pvalue)
  }
  set.seed(seed)
  Pvalue.fin=Pvalues(mod1)
  Model_Manly<-data.frame(FV=FV,Pvalue=Pvalue.fin)
  cat(" Permutation method Manly ","\n")
  return(Model_Manly)}

Permute_Manly(Brix~Variedad*Manejo,dat,nreps=1000,200)

##  Permutation method Manly

##                FV Pvalue
## 1        Variedad  0.000
## 2          Manejo  0.017
## 3 Variedad:Manejo  0.002

6.2 Permutación de residuos de Still y White

Permute_Still_White<-function(formula,data,nreps=1000,seed=120)
{
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  formula <-as.formula(paste0(vars[1],"~",vars[2],"*",vars[3]))
  FV <- rstatix:::get_formula_right_hand_side(formula)
  response<-vars[1]
  var1<-vars[2]
  var2<-vars[3]
  fact1<-data[,var1]
  fact2<-data[,var2]
  resp <-data[,response]
  mod1 <- lm(resp ~ fact1 + fact2 + fact1:fact2)
  Pvalues=function(mod){
    ANOVA <- summary(aov(mod))
    Ffact1 <-   ANOVA[[1]]$"F value"[1]
    Ffact2 <-   ANOVA[[1]]$"F value"[2]
    Finteract <-    ANOVA[[1]]$"F value"[3]
    FS <- numeric(nreps)
    FM <- numeric(nreps)
    FSM <- numeric(nreps)
    FS[1] <- Ffact1
    FM[1] <- Ffact2
    FSM[1] <- Finteract
    for (i in 2:nreps) {
      newresp <- sample(resp, dim(data)[1])
      mod2 <- lm(newresp ~ fact1 + fact2 + fact1:fact2)
      b <- summary(aov(mod2)) 
      FS[i] <- b[[1]]$"F value"[1]
      FM[i] <- b[[1]]$"F value"[2]
      FSM[i] <- b[[1]]$"F value"[3]
    }
    probS <- length(FS[FS >= Ffact1 + .Machine$double.eps ^0.5])/nreps
    probM <- length(FM[FM >= Ffact2+ .Machine$double.eps ^0.5])/nreps
    probSM  <-  length(FSM[FSM >= Finteract + .Machine$double.eps ^0.5])/
      nreps
    pvalue=c(probS,probM,probSM)
    return(pvalue)
  }
  set.seed(seed)
  Pvalue.fin=Pvalues(mod1)
  mod2 <- lm(resp ~ fact2 + fact1)
  res <- mod2$residuals
  SHint <- numeric(nreps)
  ANOVA<- summary(aov(mod1))
  Finteract <-  ANOVA[[1]]$"F value"[3]
  SHint[1] <- Finteract
  for (i in 2:nreps) {
    newresp <- sample(res, dim(data)[1], replace = FALSE)
    mod7 <- summary(aov(newresp ~ fact1 + fact2 + fact1:fact2))
    SHint[i] <- mod7[[1]]$"F value"[3]
  }
  probInt <- length(SHint[SHint >= Finteract + .Machine$double.eps ^0.5])/nreps
  cat(" Permutation method Still-White ","\n")
  Model_Still_White<-data.frame(FV=FV,Pvalue=c(Pvalue.fin[1:2],probInt))
  return(Model_Still_White)}

Permute_Still_White(Brix~Variedad*Manejo,dat,nreps=1000,200)

##  Permutation method Still-White

##                FV Pvalue
## 1        Variedad  0.000
## 2          Manejo  0.017
## 3 Variedad:Manejo  0.000

6.3 Permutación con el método de Ter Braak

Permute_ter_Braak<-function(formula,data,nreps=1000,seed=120)
{
  vars <- names(attr(aov(formula,data)$terms,"dataClasses"))
  formula <-as.formula(paste0(vars[1],"~",vars[2],"*",vars[3]))
  FV <- rstatix:::get_formula_right_hand_side(formula)
  response<-vars[1]
  var1<-vars[2]
  var2<-vars[3] 
  fact1<-data[,var1]
  fact2<-data[,var2]
  resp <-data[,response]
  mod1 <- lm(resp ~ fact1 + fact2 + fact1:fact2)
  Pvalues=function(mod){
    ANOVA <- summary(aov(mod))
    Ffact1 <-   ANOVA[[1]]$"F value"[1]
    Ffact2 <-   ANOVA[[1]]$"F value"[2]
    Finteract <-    ANOVA[[1]]$"F value"[3]
    FS <- numeric(nreps)
    FM <- numeric(nreps)
    FSM <- numeric(nreps)
    FS[1] <- Ffact1
    FM[1] <- Ffact2
    FSM[1] <- Finteract
    for (i in 2:nreps) {
      newresp <- sample(resp, dim(data)[1])
      mod2 <- lm(newresp ~ fact1 + fact2 + fact1:fact2)
      b <- summary(aov(mod2))
      FS[i] <- b[[1]]$"F value"[1]
      FM[i] <- b[[1]]$"F value"[2]
      FSM[i] <- b[[1]]$"F value"[3]
    }
    probS <- length(FS[FS >= Ffact1 + .Machine$double.eps ^0.5])/nreps
    probM <- length(FM[FM >= Ffact2+ .Machine$double.eps ^0.5])/nreps
    probSM  <-  length(FSM[FSM >= Finteract + .Machine$double.eps ^0.5])/
      nreps
    pvalue=c(probS,probM,probSM)
    return(pvalue)
  }
  set.seed(seed)
  Pvalue.fin=Pvalues(mod1)
  mod9 <- lm(resp ~ fact2 + fact1 + fact2:fact1)
  res2 <- mod9$residuals
  TBint <- numeric(nreps)
  ANOVA<- summary(aov(mod1))
  Finteract <-  ANOVA[[1]]$"F value"[3]
  TBint[1] <- Finteract
  for (i in 2:nreps) {
    newresp <- sample(res2, dim(data)[1], replace = FALSE)
    mod10 <- summary(aov(newresp ~ fact1 + fact2 + fact1:fact2))
    TBint[i] <- mod10[[1]]$"F value"[3]
  }
  probInt1 <- length(TBint[TBint >= Finteract   + .Machine$double.eps ^0.5
  ])/nreps
  cat(" Permutation method ter Braak ","\n")
  Model_ter_Braak<-data.frame(FV=FV,Pvalue=c(Pvalue.fin[1:2],probInt1))
  return(Model_ter_Braak)}

Permute_ter_Braak(Brix~Variedad*Manejo,dat,nreps=1000,200)

##  Permutation method ter Braak

##                FV Pvalue
## 1        Variedad  0.000
## 2          Manejo  0.017
## 3 Variedad:Manejo  0.000

7. TRANSFORMACIÓN BOX-COX

Box_cox<-function(formula,data){
  vars <- all.vars(formula)
  Cox <- data.frame(boxcox(formula,data=data,lambda= seq(-15,15,0.1),plotit=
                             F))
  lbd <- Cox[with(Cox,order(-Cox$y)),][1,"x"]
  data[,vars[1]]=((data[,vars[1]])^lbd-1)/lbd
  mod <- lm(formula,data)
  res <- summary(aov(mod))
  results <- list(mod=mod,anova=res)
  cat(" Transformation Box-Cox ","\n")
  return(results)
}
Box_cox(Brix~Variedad*Manejo,dat)

##  Transformation Box-Cox

## $mod
## 
## Call:
## lm(formula = formula, data = data)
## 
## Coefficients:
##         (Intercept)           VariedadV2           VariedadV3  
##               95509               -26912               -81133  
##            ManejoS2  VariedadV2:ManejoS2  VariedadV3:ManejoS2  
##              -12559               -31826                13494  
## 
## 
## $anova
##                 Df    Sum Sq   Mean Sq F value   Pr(>F)    
## Variedad         2 2.230e+10 1.115e+10  204.07 4.28e-13 ***
## Manejo           1 2.091e+09 2.091e+09   38.27 7.71e-06 ***
## Variedad:Manejo  2 2.166e+09 1.083e+09   19.82 2.83e-05 ***
## Residuals       18 9.836e+08 5.464e+07                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

8. REFERENCIAS BIBLIOGRÁFICAS

Adam, L., & Bejda, P. (2018). Robust estimators based on generalization of trimmed mean. Communications in Statistics - Simulation and Computation, 47(7), 2139-2151. https://doi.org/10.1080/03610918.2017.1337136

Aho, K. A. (2013). Robust ANOVA. ln Chapman & Hall/CRC (Eds.). Foundational and Applied Statistics for Biologists using R (1st ed., pp. 493). Taylor & Francis Group. URL: https://libgen.is/book/index.php?md5=FC7D4BB1EF0423899FFE8E20DED19E09

Aho, K. A. (2020). asbio: A collection of statistical tools for biologists. R package version 1.6-7. https://rdrr.io/cran/asbio/

Akritas, M. G., Arnold, S. F., & Brunner, E. (1997). Nonparametric hypotheses and rank statistics for unbalanced factorial designs. Journal of the American Statistical Association, 92(437), 258-265. https://doi.org/10.1080/01621459.1997.10473623

Afonso, A., & Pereira, D. G. (2019). Comparação entre métodos não paramétricos para a análise de variância com dois fatores: um estudo de simulação. Instituto Nacional de Estatística, 147-158.

Algina, J., & Olejnik, S. F. (1984). Implementing the Welch-James procedure with factorial designs. Educational and psychological measurement, 44(1), 39-48. https://doi.org/10.1177/0013164484441004

Andriani, S. (2017). Uji Park Dan Uji Breusch Pagan Godfrey Dalam pendeteksian heteroskedastisitas pada analisis regresi. Al-Jabar. Jurnal Pendidikan Matematika, 8(1), 63-72. https://doi.org/10.24042/ajpm.v8i1.1014

Beasley, T. M., & Zumbo, B. D. (2009). Aligned rank tests for interactions in Split- Plot Designs: distributional assumptions and stochastic heterogeneity. Journal of Modern Applied Statistical Methods, 8(1), 16-50. http://dx.doi.org/10.22237/jmasm/1241136180

Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2), 211-252. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x

Brown, M. B., & Forsythe, A. B. (1974). The anova and multiple comparisons for data with heterogeneous variances. Biometrics, 30(4), 719-724. https://doi.org/10.2307/2529238

Brunner, E., Dette, H., & Munk, A. (1997). Box-Type approximations in nonparametric factorial designs. Journal of the American Statistical Association, 92(440), 1494-1502. https://doi.org/10.1080/01621459.1997.10473671

Brunner, E., Konietschke, F., Pauly, M., & Puri, M. (2016). Rank-Based procedures in factorial designs: Hypotheses about nonparametric treatment effects. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(5), 1463-1485. https://doi.org/10.1111/rssb.12222

Brunner, E.; Bathke, A.; Konietschke, F. (2018). Two-Factor Crossed Designs. Rank and Pseudo - Rank Procedures for Independent Observations in Factorial Designs Using R and SAS. (1st ed, pp. 287-292). Editorial Springer, suiza. url: http://library.lol/main/96D22A7D31F896490321AEBE84933872

Conover, W. J., & Iman, R. L. (1981). Rank transformations as a bridge between parametric and nonparametric statistics. American Statistician 35(3), 124-129. https://doi.org/10.2307/2683975

Delgado, J. (1992). Algunos problemas básicos del análisis de varianza. (1era ed.). Editorial Ediciones Universidad de Salamanca, España.

De Neve, J., & Thas, O. (2017). A Mann–Whitney type effect measure of interaction for factorial designs. Communications in Statistics - Theory and Methods 46(22), 11243-11260. https://doi.org/10.1080/03610926.2016.1263739

Feys, J. (2016). Nonparametric tests for the interaction in two-way factorial designs using R. The R Journal 8(1), 367-378. https://doi.org/10.32614/RJ-2016-027 Friedrich, S., Konietschke, F., & Pauly, M. (2017). GFD: An R Package for the analysis of general factorial designs. Journal of Statistical Software, 79(1), 1-18. https://doi.org/10.18637/jss.v079.c01

Frossard, J., & Renaud, O. (2019). permuco: Permutation tests for regression, (Repeated Measures) ANOVA/ANCOVA and Comparison of Signals. R package version 1.1.0. https://cran.r-project.org/web/packages/permuco/index.html

Gao, X., & Alvo, M. (2005). A nonparametric test for interaction in two‐way layouts. Canadian Journal of Statistics, 33(4): 529-543. doi:10.1002/cjs.5550330405

Higgins, J., & Tashtoush, S. (1994). An aligned rank transform test for interaction. Nonlinear World, 1(2), 201-211.

Hocking, R. (1985). The Analysis of Linear Models (1st ed.). Editorial Brooks/Cole, Monterey, California.

Howell, D. (2013). Statistical Methods for Psychology. Wadsworth (9th ed.). https://www.uvm.edu/~statdhtx/StatPages/

Jimenes, C.; Pérez, J. (1989). Diseños experimentales en ciencias de la conducta: un método de análisis de varianza de libre distribución (no paramétrico). Anuario de psicología, 42, 31-48.

Kassambara, A. (2019). Homogeneity of variance (1st ed. pp. 14). Practical Statistics in R for Comparing Groups: Numerical Variables. Editorial Datanovia, Francia.

Kay, M., & Wobbrock, J. (2020). ARTool: Aligned rank transform. R package version 0.10.7. https://cran.r-project.org/web/packages/ARTool/index.html.

Keselman, H. J., Carriere, K. C., & Lix, L. M. (1996). Robust and powerful nonorthogonal analyses. Psychometrika, 60(3), 395-418. https://doi.org/10.1007/BF02294383

Kloke, J., & McKean, J. (2020). Rfit: Rank-Based estimation for linear models. R package version 0.24.2. Disponible en: https://cran.r-project.org/web/packages/Rfit/index.html

Konietschke, F.; Friedrich, S.; Brunner, E.; Pauly, M. (2020). rankFD: Rank-Based Tests for General Factorial Designs. R package version 0.0.5. Diponible en: https://cran.r-project.org/web/packages/rankFD/index.html

Kreutzmann, A., Medina, & Rojas, N. (2018). trafo: Estimation, comparison and selection of transformations. R package version 1.0.1. Disponible en: https://cran.r-project.org/web/packages/trafo/index.html

Lawrence, M. (2016). ez: Easy analysis and visualization of factorial experiments. R package version 4.4.0. Disponible en: https://cran.r-project.org/web/packages/ez/index.html

Lawson, J. (2015). Completely randomized designs with one factor (1st ed. Pp. 31-36). Design and Analysis of Experiments with R. Editorial CHAPMAN & HALL/CRC, Brigham Young University Provo, Utah, USA. http://library.lol/main/BE472FED84FD3B4E5D5D53D6C1719C30

Luepsen, H. (2018). Comparison of nonparametric analysis of variance methods: A vote for van der Waerden. Journal Communications in Statistics - Simulation and Computation, 47(9), 2547-2576. https://doi.org/10.1080/03610918.2017.1353613

Luepsen, H. (2020). R Functions for the analysis of variance. Disponible en: http://www.uni-koeln.de/~a0032/R/

Mair, P., & Wilcox, R. (2020). WRS2: A collection of robust statistical methods. R package version 1.1-0. Disponible en: https://cran.r-project.org/web/packages/WRS2/

Mangiafico, S. (2016). Summary and Analysis of Extension Program Evaluation. R package version 1.18.1. Disponible en: rcompanion.org/documents/RHandbookProgramEvaluation.pdf

Manly, B. (1997). Randomization, bootstrap, and Monte Carlo methods in biology (2nd ed.). Editorial Chapman Hall, London.

Mansouri, H., & Chang, G. (1995). A comparative study of some rank tests for interaction. Computational Statistics & Data Analysis, 19(1), 85-96. https://doi.org/10.1016/0167-9473(93)E0045-6

Melo, O. O., López, L. A, & Melo, S. (2020). Comparaciones múltiples y validación de supuestos (2da ed.). Diseño de experimentos Métodos y aplicaciones. Editorial Coordinación de publicaciones - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, D. C., Colombia.

Mewhort, D. J., Brendan, B. T., & Matthew, k. (2010). Applying the permutation test to factorial designs. Behavior Research Methods, 42(2), 366-372.

Puri, M., & Sen, P. (1985). Nonparametric Methods in General Linear Models. (1st ed. pp. 235). Editorial Wiley, New York.

Pelea, L. (2018). ¿Cómo proceder ante el incumplimiento de las premisas de los métodos paramétricos? o ¿cómo trabajar con variables biológicas no normales? Revista Del Jardín Botánico Nacional, 39, 1-12.

Quinglong, L. (2015). StatMethRank: Statistical methods for ranking data. R package version 1.3. Disponible en: https://cran.r-project.org/web/packages/StatMethRank

Ribeiro-Oliveira, J. P., Garcia, D., Pereira, V. & Machado, C. (2018). Data transformation: an underestimated tool by inappropriate use. Acta Scientiarum-agronomy, 40, 35-300. https://doi.org/10.4025/actasciagron.v40i1.35300

Ripley, B., Venables, B., Bates, D., Hornik, K., Gebhardt, A., & Firth, D. (2020). MASS: Support Functions and Datasets for Venables and Ripley’s MASS. R package version 7.3-53. Disponible en: https://cran.r-project.org/web/packages/MASS/index.html

Saste, S., Sananse, S., & Sonar, C. (2016). On parametric and nonparametric analysis of two factor factorial experiment. International Journal of Applied Research, 2(7), 653-656.

Sawilowsky, S. S (1990). Nonparametric Tests of Interaction in Experimental Design. Review of Educational Research, 60(1), 91-126. https://doi.org/10.2307/1170226

Scheirer, C. J., Ray, W. S., & Hare, N. (1976). The analysis of ranked data derived from completely randomized factorial designs. Biometrics, 32(2), 429-434. https://doi.org/10.2307/2529511

Shoemaker, L. H. (1986). A nonparametric method for analysis of variance. Communications in Statistics - Simulation and Computation, 15(3), 609-632. https://doi.org/10.1080/03610918608812528

Still, A. W., & White, A. P. (1981). The approximate randomization test as an alternative to the F-test in the analysis of variance. Journal of Mathematical and Statistical Psychology, 34(2): 243–252. https://doi.org/10.1111/j.2044-8317.1981.tb00634.x

ter Braak, C. (1992). In Bootstrapping and Related Techniques (k. ed. Pp 79-86). Permutation versus bootstrap significance testss in multiple regression and ANOVA. Jockel. Editorial Springer-Verlag, Berlin.

Thongteeraparp, A. (2019). The comparison of nonparametric statistical tests for interaction effects in factorial design. Decision Science Letters, 8(3), 309-316. doi: 10.5267/j.dsl.2018.11.003

Toothaker, L. E. & Chang, H. (1980). On “The Analysis of Ranked Data Derived from Completely Randomized Factorial Designs”. Journal of Educational Statistics, 5(2), 169-176. https://doi.org/10.2307/1164679

Torchiano, M., & Wheeler, B. (2018). lmPerm:Permutation Tests for Linear Models. R package version 1.6-5. Disponible en: https://cran.r-project.org/web/packages/lmPerm/index.html

Vallejo G., Fernández P., & Livacic P. (2010). Pruebas robustas para modelos ANOVA de dos factores con varianzas heterogéneas. Psicologica, 31(1), 129-148.

van der Waerden, B. (1953). Order tests for the two-sample problem. II, III, Proceedings of the Royal Academy of Arts and Sciences. The Netherlands, Serie A, 564, 303-316.

Wilcox, R. (2017). One-Way and Higher Designs for Independent Group. Introduction to Robust Estimation and Hypothesis Testing (4ta ed. Pp. 396). Editorial Academic Press, Elsevier, United States. http://library.lol/main/579507B7CA6B37C405383FC170F60243

Wijekularathna, D. K., Manage, A. B., & Scariano, S. M. (2019). Power analysis of several normality tests: A Monte Carlo simulation study. Communications in Statistics - Simulation and Computation. https://doi.org/10.1080/03610918.2019.1658780

Wilson, K. (1956). A distribution-free test of analysis of variance hypotheses. Psychological Bulletin, 53(1), 96-101. https://doi.org/10.1037/h0047975

Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only ANOVA Procedures. In Proceedings of the ACM Conference on Human Factors in Computing Systems, Vancouver, British Columbia, New York, 7-12. https://doi.org/10.1145/1978942.1978963

Métodos alternativos ante la violación de supuestos en diseños de experimentos factoriales

Autor: Aldo R. Meza Rodríguez

Fecha: 19/10/2021