Repeated measures design

Repeated measures designs can be one-way, two-way, or three-way. They introduce a within-subjects factor: time. Between-subjects factors can also be present: FSCA, FSBA, FCCA, FCBA, etc.

#Code to convert the data from wide to long format

##One-way

library(dplyr)

## Warning: package 'dplyr' was built under R version 4.2.2

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)

## Warning: package 'tidyr' was built under R version 4.2.2

# install.packages("datarium")
data("selfesteem", package = "datarium")

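# Copy the data set and reshape it from wide to long format
datos <- selfesteem
datos <- datos %>%
  gather(key = "tiempo",   # new column holding the time label
         value = "rto",    # new column holding the measured value
         t1, t2, t3) %>%
  # id and tiempo are grouping variables, so store them as factors
  mutate_at(vars(id, tiempo), as.factor)
View(datos)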
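gather() still works, but tidyr now marks it as superseded by pivot_longer(). As a minimal sketch, the same wide-to-long step could look like the following on a small made-up data frame (the object `ancho` and its values are hypothetical, not the selfesteem data):

```r
library(tidyr)

# Hypothetical wide data: one row per subject, one column per time point
ancho <- data.frame(
  id = 1:2,
  t1 = c(3.1, 2.9),
  t2 = c(4.8, 5.2),
  t3 = c(7.5, 7.9)
)

# Long format: one row per subject and time point
largo <- pivot_longer(
  ancho,
  cols = c(t1, t2, t3),
  names_to = "tiempo",
  values_to = "rto"
)

# Store the grouping variables as factors, as in the step above
largo$id <- as.factor(largo$id)
largo$tiempo <- as.factor(largo$tiempo)
largo
```

The column names `tiempo` and `rto` are kept so that the rest of the analysis would run unchanged.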

#Statistical summary

datos %>%
  group_by(tiempo) %>%
  summarise(media = mean(rto),
            desv = sd(rto),
            n = n(),
            cv = 100*desv/media)

## # A tibble: 3 × 5
##   tiempo media  desv     n    cv
##   <fct>  <dbl> <dbl> <int> <dbl>
## 1 t1      3.14 0.552    10  17.6
## 2 t2      4.93 0.863    10  17.5
## 3 t3      7.64 1.14     10  15.0

All coefficients of variation are below 20%. A value above that threshold may point to errors in the data.
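The coefficient of variation is the standard deviation expressed as a percentage of the mean. A small base R sketch of the calculation and of the 20% check, using the rounded values from the summary table (the helper `cv()` is only illustrative):

```r
# Coefficient of variation, in percent
cv <- function(x) 100 * sd(x) / mean(x)

# Check against the t1 row of the summary table: desv = 0.552, media = 3.14
cv_t1 <- 100 * 0.552 / 3.14
round(cv_t1, 1)

# Flag any group above the 20% threshold used in the text
cvs <- c(t1 = 17.6, t2 = 17.5, t3 = 15.0)
names(cvs)[cvs > 20]
```

With these values no group is flagged, which matches the reading of the table.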

boxplot(datos$rto~datos$tiempo)

#Outlier detection (atypical values)

library(rstatix)

## Warning: package 'rstatix' was built under R version 4.2.3

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

datos %>%
  group_by(tiempo) %>%
  identify_outliers(rto)

## # A tibble: 2 × 5
##   tiempo id      rto is.outlier is.extreme
##   <fct>  <fct> <dbl> <lgl>      <lgl>     
## 1 t1     6      2.05 TRUE       FALSE     
## 2 t2     2      6.91 TRUE       FALSE

is.outlier is TRUE for subject 6 at t1 and subject 2 at t2, but neither is flagged as extreme (is.extreme is FALSE), so there are no extreme outliers that need to be handled.
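identify_outliers() follows the boxplot rule: a value is an outlier when it lies more than 1.5 × IQR beyond the quartiles, and extreme when it lies more than 3 × IQR beyond them. A rough base R sketch of that rule, assuming R's default quantile definition (the function `flag_outliers()` and the vector `x` are hypothetical):

```r
# Boxplot-rule flags for a numeric vector
flag_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  data.frame(
    value = x,
    is.outlier = x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr,
    is.extreme = x < q[1] - 3 * iqr | x > q[2] + 3 * iqr
  )
}

x <- c(1, 2, 3, 4, 5, 100)  # hypothetical values
res <- flag_outliers(x)
res[res$is.outlier, ]
```

Every extreme point is also an outlier, but not the other way round, which is the situation seen in the table above.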

#Normality assumption

datos %>%
  group_by(tiempo) %>%
  shapiro_test(rto)

## # A tibble: 3 × 4
##   tiempo variable statistic     p
##   <fct>  <chr>        <dbl> <dbl>
## 1 t1     rto          0.967 0.859
## 2 t2     rto          0.876 0.117
## 3 t3     rto          0.923 0.380

All p-values are greater than 5%, so normality is not rejected and the normality assumption holds at each time point.

#Sphericity assumption

Sphericity refers to the variances of the differences between pairs of time points, which should be equal.

res.aov <- anova_test(data=datos,
                       dv=rto,
                       wid = id, 
                       within= tiempo)
get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##   Effect DFn DFd      F        p p<.05   ges
## 1 tiempo   2  18 55.469 2.01e-08     * 0.829

#Sphericity

res.aov$`Mauchly's Test for Sphericity`

##   Effect     W     p p<.05
## 1 tiempo 0.551 0.092

get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##   Effect DFn DFd      F        p p<.05   ges
## 1 tiempo   2  18 55.469 2.01e-08     * 0.829

Mauchly's test gives p = 0.092, above 5%, so sphericity is not rejected. The ANOVA p-value is below 5%, so the null hypothesis is rejected: yield is not the same across the time points.
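Mauchly's test is also available in base R through mauchly.test(), which works on a multivariate linear model fitted to the wide-format data. A minimal sketch on simulated data (the data frame `wide` is hypothetical and only loosely shaped like the example, so its result will not reproduce the table above):

```r
set.seed(42)

# Hypothetical wide data: 10 subjects measured at three time points
wide <- data.frame(
  t1 = rnorm(10, mean = 3, sd = 0.6),
  t2 = rnorm(10, mean = 5, sd = 0.9),
  t3 = rnorm(10, mean = 7.5, sd = 1.1)
)

# Multivariate linear model with one response column per time point
mlm_fit <- lm(cbind(t1, t2, t3) ~ 1, data = wide)

# X = ~1 tests sphericity of the within-subject contrasts
mt <- mauchly.test(mlm_fit, X = ~1)
mt$p.value
```

A p-value above 5% would be read in the same way as in the rstatix output: sphericity is not rejected.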

#Use the adjusted p-value (p.adj)
datos %>%
  pairwise_t_test(
    rto ~ tiempo, paired = TRUE,
    p.adjust.method = "bonferroni")

## # A tibble: 3 × 10
##   .y.   group1 group2    n1    n2 statistic    df           p    p.adj p.adj.s…¹
## * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>       <dbl>    <dbl> <chr>    
## 1 rto   t1     t2        10    10     -4.97     9 0.000772    0.002    **       
## 2 rto   t1     t3        10    10    -13.2      9 0.000000334 0.000001 ****     
## 3 rto   t2     t3        10    10     -4.87     9 0.000886    0.003    **       
## # … with abbreviated variable name ¹p.adj.signif

Bonferroni is used instead of Tukey. All adjusted p-values are below 5%, so the null hypothesis is rejected for each pair: yields differ between every pair of time points.
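The Bonferroni adjustment multiplies each unadjusted p-value by the number of comparisons (three here) and caps the result at 1. As a quick check, the p.adj column can be recovered from the p column of the table with base R:

```r
# Unadjusted p-values from the pairwise table (t1-t2, t1-t3, t2-t3)
p_raw <- c(0.000772, 0.000000334, 0.000886)

# Bonferroni: multiply by the number of comparisons, capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bonf

# All three comparisons stay below the 5% level after adjustment
all(p_bonf < 0.05)
```

The values agree with the table up to the rounding applied when it was printed.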

##Two-way

data("selfesteem2", package = "datarium")

datos2 = selfesteem2
datos2$treatment = gl(2,12,24, c('con fert', 'sin fert'))
View(datos2)

Convert the data to long format

datos2 = datos2 %>% 
  gather(key='tiempo', value = 'rto',
         t1,t2,t3)
View(datos2)

#Statistical summary table

datos2 %>%
  group_by(treatment, tiempo) %>%
  summarise(media = mean(rto),
            desv = sd(rto),
            n = n(),
            cv = 100*desv/media)

## `summarise()` has grouped output by 'treatment'. You can override using the
## `.groups` argument.

## # A tibble: 6 × 6
## # Groups:   treatment [2]
##   treatment tiempo media  desv     n    cv
##   <fct>     <chr>  <dbl> <dbl> <int> <dbl>
## 1 con fert  t1      88    8.08    12  9.18
## 2 con fert  t2      83.8 10.2     12 12.2 
## 3 con fert  t3      78.7 10.5     12 13.4 
## 4 sin fert  t1      87.6  7.62    12  8.70
## 5 sin fert  t2      87.8  7.42    12  8.45
## 6 sin fert  t3      87.7  8.14    12  9.28

#Visual summary

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 4.2.2

ggplot(datos2)+
  aes(tiempo, rto, fill=treatment)+
  geom_boxplot()

#Outlier check

datos2 %>%
  group_by(treatment, tiempo) %>%
  identify_outliers(rto)

## [1] treatment  tiempo     id         rto        is.outlier is.extreme
## <0 rows> (or 0-length row.names)

#Normality assumption

datos2 %>%
  group_by(tiempo) %>%
  shapiro_test(rto)

## # A tibble: 3 × 4
##   tiempo variable statistic      p
##   <chr>  <chr>        <dbl>  <dbl>
## 1 t1     rto          0.893 0.0151
## 2 t2     rto          0.899 0.0205
## 3 t3     rto          0.919 0.0559
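The test above groups by tiempo only, so both treatments are pooled within each time point, and the pooled t1 and t2 groups show p-values below 5%. The summary and outlier steps group by treatment and tiempo, so it may be worth checking normality within each of the six cells as well. A base R sketch on simulated data (the data frame `sim` is hypothetical, so the p-values are not those of the example):

```r
set.seed(1)

# Hypothetical long-format data: 12 subjects x 2 treatments x 3 time points
sim <- expand.grid(
  id = factor(1:12),
  treatment = c("con fert", "sin fert"),
  tiempo = c("t1", "t2", "t3")
)
sim$rto <- rnorm(nrow(sim), mean = 85, sd = 8)

# Shapiro-Wilk p-value within each treatment-by-time cell
norm_check <- aggregate(
  rto ~ treatment + tiempo,
  data = sim,
  FUN = function(v) shapiro.test(v)$p.value
)
names(norm_check)[3] <- "p"
norm_check
```

With rstatix, the equivalent would presumably be to add treatment to the group_by() call before shapiro_test().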

res.aov <- anova_test(
  data = datos2,
  dv = rto,
  wid = id,
  within = c(treatment,
             tiempo)
  )
get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##             Effect  DFn   DFd      F        p p<.05   ges
## 1        treatment 1.00 11.00 15.541 2.00e-03     * 0.059
## 2           tiempo 1.31 14.37 27.369 5.03e-05     * 0.049
## 3 treatment:tiempo 2.00 22.00 30.424 4.63e-07     * 0.050

There is an interaction, because the p-value of the treatment:tiempo term is below 5%.

#Interaction plot

interaction.plot(datos2$tiempo,
                 datos2$treatment,
                 datos2$rto)

datos2 %>% 
  group_by(tiempo, treatment) %>% 
  summarise(mean_rto = mean(rto)) %>% 
  ggplot()+
  aes(tiempo, mean_rto,
      color=treatment,
      group=treatment)+
  geom_point(size=5)+
  geom_line(linewidth=3)

## `summarise()` has grouped output by 'tiempo'. You can override using the
## `.groups` argument.

The fertilizer drives the interaction: at t1 there is no difference between treatments, but at t3 there is, so the pattern changes over time.
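One way to back up this reading of the plot is to compare the two treatments separately at each time point and adjust the p-values for the three comparisons. A base R sketch using paired t-tests on simulated data (the data frame `sim` is hypothetical; this is one possible follow-up, not necessarily the one intended for this analysis):

```r
set.seed(7)

# Hypothetical data: 12 subjects, each measured under both treatments at 3 times
sim <- expand.grid(
  id = 1:12,
  treatment = c("con fert", "sin fert"),
  tiempo = c("t1", "t2", "t3")
)
sim$rto <- rnorm(nrow(sim), mean = 85, sd = 8)

# Sort so that subjects line up in the same order within each cell
sim <- sim[order(sim$tiempo, sim$treatment, sim$id), ]

# Paired t-test between treatments within each time point
p_raw <- sapply(split(sim, sim$tiempo), function(d) {
  con_f <- d$rto[d$treatment == "con fert"]
  sin_f <- d$rto[d$treatment == "sin fert"]
  t.test(con_f, sin_f, paired = TRUE)$p.value
})

# Bonferroni adjustment across the three time points
p_adj <- p.adjust(p_raw, method = "bonferroni")
p_adj
```

On the real data, a large adjusted p-value at t1 and a small one at t3 would support the pattern described above.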

res.aov$`Mauchly's Test for Sphericity`

##             Effect     W     p p<.05
## 1           tiempo 0.469 0.023     *
## 2 treatment:tiempo 0.616 0.089

##Three-way

Repeated measures design

Oscar A. Gómez

2023-06-02