###Diseño en medidas repetidas Se utiliza cuando se realizan mediciones en el tiempo a lo largo del desarrollo del cultivo - Factor intrasujeto = Tiempo - Factores entre sujetos = F. simple, completamente al azar, en bloques al azar, completo en bloques completamente al azar, etc. Lo unico nuevo es que se le agrega el Tiempo

Medidas repetidas en una, dos y tres vías.

#install.packages("datarium")

data("selfesteem", package = "datarium")

datos = selfesteem
head(datos, 3)

##   id       t1       t2       t3
## 1  1 4.005027 5.182286 7.107831
## 2  2 2.558124 6.912915 6.308434
## 3  3 3.244241 4.443434 9.778410

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)
# install.packages("datarium")
data("selfesteem", package = "datarium")

datos = selfesteem
datos = datos  %>%
  gather(key = "tiempo",
         value = "rto",
         t1, t2, t3) %>%
  mutate_at(vars(id, tiempo), as.factor)
View(datos)

Se debe convertir el formato en formato largo, esto se hace con la funcion “gather” utilizada anteriormente

# Resumen estadistico

datos %>%
  group_by(tiempo) %>%
  summarise(media = mean(rto),
            desv = sd(rto),
            n = n(),
            cv = 100*desv/media)

## # A tibble: 3 × 5
##   tiempo media  desv     n    cv
##   <fct>  <dbl> <dbl> <int> <dbl>
## 1 t1      3.14 0.552    10  17.6
## 2 t2      4.93 0.863    10  17.5
## 3 t3      7.64 1.14     10  15.0

boxplot(datos$rto ~ datos$tiempo)

Hasta ahora el efecto del tercer corte parece ser el mejor

library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

# Deteccion de datos atipicos
datos %>%
  group_by(tiempo) %>%
  identify_outliers(rto)

## # A tibble: 2 × 5
##   tiempo id      rto is.outlier is.extreme
##   <fct>  <fct> <dbl> <lgl>      <lgl>     
## 1 t1     6      2.05 TRUE       FALSE     
## 2 t2     2      6.91 TRUE       FALSE

# Probando normalidad

datos %>%
group_by(tiempo) %>%
  shapiro_test(rto)

## # A tibble: 3 × 4
##   tiempo variable statistic     p
##   <fct>  <chr>        <dbl> <dbl>
## 1 t1     rto          0.967 0.859
## 2 t2     rto          0.876 0.117
## 3 t3     rto          0.923 0.380

-Los datos son atipicos, pero nos on extremos. Como los cv dieron el 20% no hay preocupacion por tener datos atipicos. cuando los datos son extremos significa que si se tienen datos atipicos. -Los datos parece que son normales, no se tiene problema con los supuestos de normalidad. - supuesto de esfericidad: compara si la variabilidad de los datos entre tiempos sucesivos son similares.

# Supuesto de esfericidad (igualdad de varianzas)

res.aov <- anova_test(data = datos,
                      dv = rto,
                      wid = id,
                      within = tiempo)

# Esfericidad
res.aov$`Mauchly's Test for Sphericity`

##   Effect     W     p p<.05
## 1 tiempo 0.551 0.092

get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##   Effect DFn DFd      F        p p<.05   ges
## 1 tiempo   2  18 55.469 2.01e-08     * 0.829

En caso de que no funcione la prueba de Mauchly, se debe buscar otra metodologia que se pueda usar cuando no se consigue esfericidad.

Es falso que los tres cortes son iguales. El corte 3 es el mejor rendimiento.

# Se usa el p.adj
datos %>%
  pairwise_t_test(
    rto ~ tiempo, paired = TRUE,
    p.adjust.method = 
      "bonferroni")

## # A tibble: 3 × 10
##   .y.   group1 group2    n1    n2 statistic    df           p p.adj p.adj.signif
## * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>       <dbl> <dbl> <chr>       
## 1 rto   t1     t2        10    10     -4.97     9 0.000772     2e-3 **          
## 2 rto   t1     t3        10    10    -13.2      9 0.000000334  1e-6 ****        
## 3 rto   t2     t3        10    10     -4.87     9 0.000886     3e-3 **

Efectivamente todos los tiempos son diferentes, siendo el t3 el de mayor rendimiento.

Medidas repetidas de dos vias

data("selfesteem2", package = "datarium")

datos2 = selfesteem2
datos2$treatment = gl(2,12,24, c('con fert', 'sin fert'))
View(datos2)

Convertir de formato ancho a formato largo

datos2 = datos2 %>% 
  gather(key='tiempo', value = 'rto',
         t1,t2,t3)

datos2 %>%
  group_by(treatment, tiempo) %>%
  summarise(media = mean(rto),
            desv = sd(rto),
            n = n(),
            cv = 100*desv/media)

## `summarise()` has grouped output by 'treatment'. You can override using the
## `.groups` argument.

## # A tibble: 6 × 6
## # Groups:   treatment [2]
##   treatment tiempo media  desv     n    cv
##   <fct>     <chr>  <dbl> <dbl> <int> <dbl>
## 1 con fert  t1      88    8.08    12  9.18
## 2 con fert  t2      83.8 10.2     12 12.2 
## 3 con fert  t3      78.7 10.5     12 13.4 
## 4 sin fert  t1      87.6  7.62    12  8.70
## 5 sin fert  t2      87.8  7.42    12  8.45
## 6 sin fert  t3      87.7  8.14    12  9.28

Los cv son menores al 20%, esto indica una buena homogeneidad de los datos.

library(ggplot2)

ggplot(datos2)+
  aes(tiempo, rto, fill=treatment)+
  geom_boxplot()

El tratamiento con fertilizante tiene mayor variabulidad.

# Supuestos de outliers

datos2 %>%
  group_by(treatment, tiempo) %>%
  identify_outliers(rto)

## [1] treatment  tiempo     id         rto        is.outlier is.extreme
## <0 rows> (or 0-length row.names)

No hay presencia de datos atipicos

# Supuestos de normalidad

datos2 %>%
  group_by(treatment, tiempo) %>%
  shapiro_test(rto)

## # A tibble: 6 × 5
##   treatment tiempo variable statistic      p
##   <fct>     <chr>  <chr>        <dbl>  <dbl>
## 1 con fert  t1     rto          0.828 0.0200
## 2 con fert  t2     rto          0.868 0.0618
## 3 con fert  t3     rto          0.887 0.107 
## 4 sin fert  t1     rto          0.919 0.279 
## 5 sin fert  t2     rto          0.923 0.316 
## 6 sin fert  t3     rto          0.886 0.104

Hay anormalidad con el primer tiempo de fertilizacion.

# ANOVA

res.aov <- anova_test(
  data = datos2,
  dv = rto,
  wid = id,
  within = c(treatment,
             tiempo)
  )
get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##             Effect  DFn   DFd      F        p p<.05   ges
## 1        treatment 1.00 11.00 15.541 2.00e-03     * 0.059
## 2           tiempo 1.31 14.37 27.369 5.03e-05     * 0.049
## 3 treatment:tiempo 2.00 22.00 30.424 4.63e-07     * 0.050

Si existe una interaccion, por ende se realiza un grafico de interaccion, pero no se realizan comparaciones de bontferroni.

# Grafico de interaccion

interaction.plot(datos2$tiempo,
                datos2$treatment,
                datos2$rto)

datos2 %>%
  group_by(tiempo, treatment) %>%
  summarise(mean_rto = mean(rto)) %>%
  ggplot()+
  aes(tiempo, mean_rto,
      color=treatment,
      group=treatment)+
  geom_point(size=5)+
  geom_line(linewidth=3)

## `summarise()` has grouped output by 'tiempo'. You can override using the
## `.groups` argument.

Hay una interaccion. En el t1 un patron diferente al de t3. si no hubiera diferencias, no habria interaccion.

Medidas repetidas 3 vias

library(stats)
library (rstatix)
library(datarium)

set.seed(123)
data("weightloss", package = "datarium")

datos3 = weightloss %>% 
  rename(FQ = diet,
         FO = exercises) %>% 
  pivot_longer (c(t1,t2,t3),
               names_to = 'tiempo',
               values_to = 'rto')

datos3

## # A tibble: 144 × 5
##    id    FQ    FO    tiempo   rto
##    <fct> <fct> <fct> <chr>  <dbl>
##  1 1     no    no    t1      10.4
##  2 1     no    no    t2      13.2
##  3 1     no    no    t3      11.6
##  4 2     no    no    t1      11.6
##  5 2     no    no    t2      10.7
##  6 2     no    no    t3      13.2
##  7 3     no    no    t1      11.4
##  8 3     no    no    t2      11.1
##  9 3     no    no    t3      11.4
## 10 4     no    no    t1      11.1
## # ℹ 134 more rows

table(datos3$FQ, datos3$FO, datos3$tiempo)

## , ,  = t1
## 
##      
##       no yes
##   no  12  12
##   yes 12  12
## 
## , ,  = t2
## 
##      
##       no yes
##   no  12  12
##   yes 12  12
## 
## , ,  = t3
## 
##      
##       no yes
##   no  12  12
##   yes 12  12

datos3 %>%
  group_by(FQ, FO, tiempo) %>%
  get_summary_stats(rto, type = "mean_sd")

## # A tibble: 12 × 7
##    FQ    FO    tiempo variable     n  mean    sd
##    <fct> <fct> <chr>  <fct>    <dbl> <dbl> <dbl>
##  1 no    no    t1     rto         12  10.9 0.868
##  2 no    no    t2     rto         12  11.6 1.30 
##  3 no    no    t3     rto         12  11.4 0.935
##  4 no    yes   t1     rto         12  10.8 1.27 
##  5 no    yes   t2     rto         12  13.4 1.01 
##  6 no    yes   t3     rto         12  16.8 1.53 
##  7 yes   no    t1     rto         12  11.7 0.938
##  8 yes   no    t2     rto         12  12.4 1.42 
##  9 yes   no    t3     rto         12  13.8 1.43 
## 10 yes   yes   t1     rto         12  11.4 1.09 
## 11 yes   yes   t2     rto         12  13.2 1.22 
## 12 yes   yes   t3     rto         12  14.7 0.625

Descritivo grafico

library(ggpubr)
ggboxplot(
  datos3, x = "FQ", y = "rto",
  color = "tiempo", palette = "jco",
  facet.by = "FO", short.panel.labs = FALSE
  )

datos3 %>%
  group_by(FQ, FO, tiempo) %>%
  identify_outliers(rto)

## # A tibble: 5 × 7
##   FQ    FO    tiempo id      rto is.outlier is.extreme
##   <fct> <fct> <chr>  <fct> <dbl> <lgl>      <lgl>     
## 1 no    no    t3     2      13.2 TRUE       FALSE     
## 2 yes   no    t1     1      10.2 TRUE       FALSE     
## 3 yes   no    t1     3      13.2 TRUE       FALSE     
## 4 yes   no    t1     4      10.2 TRUE       FALSE     
## 5 yes   no    t2     10     15.3 TRUE       FALSE

ggplot(datos3)+
  aes(rto)+
  geom_density()

ggplot(datos3)+
  aes(rto, fill=tiempo)+
  geom_density(alpha=0.5)

library(nortest)
shapiro.test(datos3$rto)

## 
##  Shapiro-Wilk normality test
## 
## data:  datos3$rto
## W = 0.96377, p-value = 0.0007474

lillie.test(datos3$rto)

## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  datos3$rto
## D = 0.10013, p-value = 0.001229

cvm.test(datos3$rto)

## 
##  Cramer-von Mises normality test
## 
## data:  datos3$rto
## W = 0.23031, p-value = 0.002205

sf.test(datos3$rto)

## 
##  Shapiro-Francia normality test
## 
## data:  datos3$rto
## W = 0.96406, p-value = 0.001295

ad.test(datos3$rto)

## 
##  Anderson-Darling normality test
## 
## data:  datos3$rto
## A = 1.432, p-value = 0.001021

# install.packages("lawstat")
library(lawstat)
symmetry.test(datos3$rto)

## 
##  m-out-of-n bootstrap symmetry test by Miao, Gel, and Gastwirth (2006)
## 
## data:  datos3$rto
## Test statistic = 2.2399, p-value = 0.032
## alternative hypothesis: the distribution is asymmetric.
## sample estimates:
## bootstrap optimal m 
##                  19

res.aov <- anova_test(
  data = datos3, dv = rto, wid = id,
  within = c(FQ, FO, tiempo)
  )
get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##         Effect  DFn   DFd       F        p p<.05   ges
## 1           FQ 1.00 11.00   6.021 3.20e-02     * 0.028
## 2           FO 1.00 11.00  58.928 9.65e-06     * 0.284
## 3       tiempo 2.00 22.00 110.942 3.22e-12     * 0.541
## 4        FQ:FO 1.00 11.00  75.356 2.98e-06     * 0.157
## 5    FQ:tiempo 1.38 15.17   0.603 5.01e-01       0.013
## 6    FO:tiempo 2.00 22.00  20.826 8.41e-06     * 0.274
## 7 FQ:FO:tiempo 2.00 22.00  14.246 1.07e-04     * 0.147

interaction.plot(datos3$tiempo,
                 datos3$FQ,
                 datos3$rto)

interaction.plot(datos3$tiempo,
                 datos3$FO,
                 datos3$rto)

TABLA CRUZADA

tbl = tapply(datos3$rto, list(datos3$FQ,
                              datos3$FO,
                              datos3$tiempo),
             mean)
tbl

## , , t1
## 
##           no      yes
## no  10.90917 10.79417
## yes 11.74250 11.39333
## 
## , , t2
## 
##           no      yes
## no  11.56583 13.42083
## yes 12.41583 13.22500
## 
## , , t3
## 
##           no     yes
## no  11.45000 16.8175
## yes 13.78667 14.6550

Diseño con medidas repetidas

Luisa Perea

2023-06-02

Medidas repetidas de dos vias

Medidas repetidas 3 vias