Cuts:

t1: 30 days; t2: 60 days; t3: 90 days

library(dplyr)

## Warning: package 'dplyr' was built under R version 4.2.3

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)

## Warning: package 'tidyr' was built under R version 4.2.3

# install.packages("datarium")
data("selfesteem", package = "datarium")

datos = selfesteem

# Reshape the data from wide to long format
datos = datos  %>%
  gather(key = "tiempo",
         value = "rto",
         t1, t2, t3) %>%
  mutate_at(vars(id, tiempo), as.factor)
View(datos)

This code converts the data from wide format to long format, stacking t1, t2 and t3 into a single column (rto), with a second column (tiempo) recording which cut each value came from.
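The same reshaping can also be written with pivot_longer(), the newer tidyr function that supersedes gather(). The sketch below uses a small made-up data frame (the object names ancho and largo and all values are hypothetical, not the selfesteem data). Note that pivot_longer() orders the rows by subject rather than by time point, which does not affect the analysis.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in wide format: one row per plot, one column per cut
ancho <- data.frame(
  id = 1:3,
  t1 = c(3.1, 2.9, 3.4),
  t2 = c(4.8, 5.1, 4.6),
  t3 = c(7.5, 7.9, 7.2)
)

# Same reshaping as gather(), written with pivot_longer()
largo <- ancho %>%
  pivot_longer(cols = c(t1, t2, t3),
               names_to = "tiempo",
               values_to = "rto") %>%
  mutate(across(c(id, tiempo), as.factor))  # id and tiempo become factors

largo
```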

# Summary statistics
datos  %>%
  group_by(tiempo)  %>%
  summarise(media=mean(rto),
            desv = sd(rto),
             n = n(),
            cv = 100*desv/media)

## # A tibble: 3 × 5
##   tiempo media  desv     n    cv
##   <fct>  <dbl> <dbl> <int> <dbl>
## 1 t1      3.14 0.552    10  17.6
## 2 t2      4.93 0.863    10  17.5
## 3 t3      7.64 1.14     10  15.0

All coefficients of variation are below 20%. Values above 20% would point to a problem, for example in how the samples were taken.
Since all are below 20%, variability is low (which is good).
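As a quick check of the cv column, the coefficient of variation can be recomputed by hand from the rounded means and standard deviations printed in the table. The vector names below are chosen only for illustration, and small differences from the printed values come from rounding of the inputs.

```r
# Means and standard deviations copied from the summary table above
media <- c(t1 = 3.14, t2 = 4.93, t3 = 7.64)
desv  <- c(t1 = 0.552, t2 = 0.863, t3 = 1.14)

# Coefficient of variation in percent: 100 * sd / mean
cv <- 100 * desv / media
round(cv, 1)

# Rule of thumb used in these notes: a CV under 20% is acceptable
all(cv < 20)
```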

boxplot(datos$rto ~ datos$tiempo)

#Check for outliers

library(rstatix)

## Warning: package 'rstatix' was built under R version 4.2.3

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

datos %>%
  group_by(tiempo) %>%
  identify_outliers(rto)

## # A tibble: 2 × 5
##   tiempo id      rto is.outlier is.extreme
##   <fct>  <fct> <dbl> <lgl>      <lgl>     
## 1 t1     6      2.05 TRUE       FALSE     
## 2 t2     2      6.91 TRUE       FALSE

At t1, plot 6 (id 6) is flagged as an outlier (a suspected atypical value), but it is not extreme.
At t2, plot 2 (id 2) is flagged as an outlier (a suspected atypical value), but it is not extreme.
This check is NOT normally necessary here, since the coefficient of variation is already below 20%.
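The distinction between an outlier and an extreme point follows the boxplot rule documented for identify_outliers(): values beyond 1.5 times the IQR from the quartiles are outliers, and values beyond 3 times the IQR are extreme. A minimal base R sketch of that rule, on made-up yields (the vector x is hypothetical):

```r
# Hypothetical yields for one time point; 2.2 is deliberately low
x <- c(3.1, 3.3, 2.9, 3.4, 3.2, 2.2, 3.0, 3.5)

q   <- quantile(x, c(0.25, 0.75))  # first and third quartiles
iqr <- q[[2]] - q[[1]]             # interquartile range

# Outlier: beyond 1.5 * IQR from the quartiles
is_outlier <- x < q[[1]] - 1.5 * iqr | x > q[[2]] + 1.5 * iqr
# Extreme: beyond 3 * IQR from the quartiles
is_extreme <- x < q[[1]] - 3 * iqr | x > q[[2]] + 3 * iqr

data.frame(x, is_outlier, is_extreme)
```

In this toy vector, 2.2 falls below the 1.5 x IQR fence but above the 3 x IQR fence, so it is an outlier that is not extreme, the same situation as the two flagged plots.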

#Normality assumption (test). Normally the residuals are extracted first and the test is run on them.

datos %>%
  group_by(tiempo) %>%
  shapiro_test(rto)

## # A tibble: 3 × 4
##   tiempo variable statistic     p
##   <fct>  <chr>        <dbl> <dbl>
## 1 t1     rto          0.967 0.859
## 2 t2     rto          0.876 0.117
## 3 t3     rto          0.923 0.380

All three p-values are above 0.05, so normality is not rejected at any time point.
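One possible way to test normality on residuals instead of on the raw response is sketched below. It is only an illustration under stated assumptions: the data are simulated (not the selfesteem data), and the model treats the subject as a fixed block, lm(rto ~ tiempo + id), which is one of several reasonable choices for extracting residuals from a repeated-measures layout.

```r
set.seed(1)

# Simulated long-format data: 10 subjects measured at 3 time points
sim <- data.frame(
  id     = factor(rep(1:10, times = 3)),
  tiempo = factor(rep(c("t1", "t2", "t3"), each = 10)),
  rto    = c(rnorm(10, 3, 0.5), rnorm(10, 5, 0.8), rnorm(10, 7.5, 1))
)

# Assumed model: time effect plus a block effect for each subject
ajuste <- lm(rto ~ tiempo + id, data = sim)

# Shapiro-Wilk test on the pooled residuals (a single test, not one per group)
prueba <- shapiro.test(residuals(ajuste))
prueba
```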

#SPHERICITY ASSUMPTION (repeated measures only): it concerns the variances in the analysis of variance

res.aov <- anova_test(data =datos,
                      dv = rto,
                      wid = id,
                      within =tiempo)

#Sphericity = Mauchly's test
# The element name includes an apostrophe, so it must be quoted with backticks
res.aov$`Mauchly's Test for Sphericity`

get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##   Effect DFn DFd      F        p p<.05   ges
## 1 tiempo   2  18 55.469 2.01e-08     * 0.829

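*Comparisons: the sphericity test checks whether the variability between time points is the same, that is, whether the variances of the differences t1 - t2, t1 - t3 and t2 - t3 are equal.

Equal variances can be assumed, so the sphericity assumption holds: the ANOVA table reports the uncorrected degrees of freedom (2 and 18), and by default get_anova_table() applies the Greenhouse-Geisser correction only when Mauchly's test is significant.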
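To make the idea concrete, the quantities that sphericity compares can be computed directly: the variance of each pairwise difference between time points. The sketch below uses made-up wide-format values (the object names ancho2 and difs and the numbers are hypothetical); under sphericity the three variances should be roughly equal.

```r
# Hypothetical wide-format yields: one row per plot, one column per cut
ancho2 <- data.frame(
  t1 = c(3.1, 2.9, 3.4, 3.0, 3.6),
  t2 = c(4.8, 5.1, 4.6, 5.3, 4.9),
  t3 = c(7.5, 7.9, 7.2, 8.1, 7.0)
)

# Differences between every pair of time points, per plot
difs <- with(ancho2, data.frame(
  t1_t2 = t1 - t2,
  t1_t3 = t1 - t3,
  t2_t3 = t2 - t3
))

# Sphericity means these three variances are equal in the population
var_difs <- sapply(difs, var)
var_difs
```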

The only factor is time: 2 numerator degrees of freedom, F = 55.469, p-value = 2.01e-08 (below 5%, so the null hypothesis of equal means is rejected). ges is the generalized eta squared, a measure of effect size (not needed for now).

Based on the p-value, oil yield is not the same at the three cuts. Cut 3 has the highest mean oil yield (7.64).

#Pairwise comparisons of means with the Bonferroni adjustment (more advisable than Tukey for repeated measures).

datos %>%
  pairwise_t_test(
    rto ~ tiempo, paired = TRUE,
    p.adjust.method = "bonferroni")

## # A tibble: 3 × 10
##   .y.   group1 group2    n1    n2 statistic    df           p p.adj p.adj.signif
## * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>       <dbl> <dbl> <chr>       
## 1 rto   t1     t2        10    10     -4.97     9 0.000772     2e-3 **          
## 2 rto   t1     t3        10    10    -13.2      9 0.000000334  1e-6 ****        
## 3 rto   t2     t3        10    10     -4.87     9 0.000886     3e-3 **

The test compares every pair of time points using adjusted p-values (p.adj). All of them are below 5%, which means that yield differs between every pair of time points.

*Time 3 is the best: it is when yield is highest, so the plant can be carried through to the 3rd cut.
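The p.adj column can be reproduced from the raw p-values in the table: the Bonferroni adjustment multiplies each p-value by the number of comparisons (three here) and caps the result at 1. A short check with base R's p.adjust(), using the raw p-values copied from the output above (the vector names are chosen for illustration):

```r
# Raw p-values copied from the pairwise_t_test() output
p <- c(t1_t2 = 0.000772, t1_t3 = 0.000000334, t2_t3 = 0.000886)

# Bonferroni: multiply by the number of comparisons, cap at 1
p_adj <- p.adjust(p, method = "bonferroni")
p_adj

# All adjusted p-values remain below the 5% level
all(p_adj < 0.05)
```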

TWO-WAY REPEATED MEASURES:

data("selfesteem2", package = "datarium")

datos2 = selfesteem2
datos2$treatment = gl(2,12,24, c('con fert', 
                                 'sin fert'))
View(datos2)

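*We assume that the treatment is fertilizer and the control is unfertilized. The labels are 'con fert' (with fertilizer) and 'sin fert' (without fertilizer); gl() assigns 'con fert' to the first 12 rows and 'sin fert' to the last 12, so which group receives which label depends on the row order of selfesteem2.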

#Two-way: Time (t1, t2, t3) and Fertilization (control; fertilized)

#From wide format to long format:

datos2 = datos2 %>% 
            gather(key='tiempo', value = 'rto',t1,t2,t3)

datos2 %>%
  group_by(treatment, tiempo) %>%
  summarise(media = mean(rto),
            desv = sd(rto),
            n = n(),
            cv = 100*desv/media)

## `summarise()` has grouped output by 'treatment'. You can override using the
## `.groups` argument.

## # A tibble: 6 × 6
## # Groups:   treatment [2]
##   treatment tiempo media  desv     n    cv
##   <fct>     <chr>  <dbl> <dbl> <int> <dbl>
## 1 con fert  t1      88    8.08    12  9.18
## 2 con fert  t2      83.8 10.2     12 12.2 
## 3 con fert  t3      78.7 10.5     12 13.4 
## 4 sin fert  t1      87.6  7.62    12  8.70
## 5 sin fert  t2      87.8  7.42    12  8.45
## 6 sin fert  t3      87.7  8.14    12  9.28

All coefficients of variation are below 20%, so variability is low (which is good).

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 4.2.2

ggplot(datos2)+
  aes(tiempo, rto, fill=treatment)+
  geom_boxplot()

* With fertilizer, t1 and t2 behave similarly. There is a difference at t3, where not fertilizing is better. Even so, the plot alone does not suggest a marked overall difference between fertilizing and not fertilizing.

#Outlier check:

datos2 %>%
  group_by(treatment, tiempo) %>%
  identify_outliers(rto)

## [1] treatment  tiempo     id         rto        is.outlier is.extreme
## <0 rows> (or 0-length row.names)

#Normality assumption

datos2 %>%
  group_by(treatment, tiempo) %>%
  shapiro_test(rto)

## # A tibble: 6 × 5
##   treatment tiempo variable statistic      p
##   <fct>     <chr>  <chr>        <dbl>  <dbl>
## 1 con fert  t1     rto          0.828 0.0200
## 2 con fert  t2     rto          0.868 0.0618
## 3 con fert  t3     rto          0.887 0.107 
## 4 sin fert  t1     rto          0.919 0.279 
## 5 sin fert  t2     rto          0.923 0.316 
## 6 sin fert  t3     rto          0.886 0.104

*The test rejects normality only for 'con fert' at t1 (p = 0.020); the other five groups have p > 0.05. Note, however, that the test is being run on the yield data and not on the residuals.

#Analysis of variance:

res.aov <- anova_test(data =datos2,
                      dv = rto,
                      wid = id,
                      within = c(treatment, 
                                 tiempo)
)


get_anova_table(res.aov)

## ANOVA Table (type III tests)
## 
##             Effect  DFn   DFd      F        p p<.05   ges
## 1        treatment 1.00 11.00 15.541 2.00e-03     * 0.059
## 2           tiempo 1.31 14.37 27.369 5.03e-05     * 0.049
## 3 treatment:tiempo 2.00 22.00 30.424 4.63e-07     * 0.050

The p-value of the interaction (treatment:tiempo) is below 5%, so there is a significant interaction and the main effects are not compared separately.

#Interaction plot:

interaction.plot(datos2$tiempo, 
                 datos2$treatment,
                 datos2$rto)

datos2 %>% 
  group_by(tiempo, treatment) %>% 
  summarise(mean_rto = mean(rto)) %>% 
  ggplot()+
  aes(tiempo, mean_rto,
      color=treatment,
      group=treatment)+
  geom_point(size=5)+
  geom_line(linewidth=3)

## `summarise()` has grouped output by 'tiempo'. You can override using the
## `.groups` argument.

At t1 there is no visible difference between fertilizing and not fertilizing. At t3, however, the means are clearly apart, with higher yield without fertilizer (87.7 against 78.7). (Recommendation: do not fertilize at t3.)

#Sphericity:

res.aov$`Mauchly's Test for Sphericity`

##             Effect     W     p p<.05
## 1           tiempo 0.469 0.023     *
## 2 treatment:tiempo 0.616 0.089

The variances of the differences are not equal for time (p = 0.023, sphericity violated), but they are for the interaction (p = 0.089).
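This result is consistent with the fractional degrees of freedom shown for tiempo in the ANOVA table. Assuming get_anova_table() was run with its default automatic correction, the Greenhouse-Geisser correction multiplies both degrees of freedom by an epsilon factor when Mauchly's test is significant. A rough arithmetic sketch of the epsilon implied by the printed (rounded) values, with variable names chosen for illustration:

```r
# Uncorrected degrees of freedom for tiempo: 3 levels, 12 subjects
dfn_sin_corregir <- 3 - 1               # 2
dfd_sin_corregir <- (3 - 1) * (12 - 1)  # 22

# Corrected degrees of freedom printed in the ANOVA table
dfn_corregido <- 1.31
dfd_corregido <- 14.37

# Implied epsilon from each ratio; both should agree up to rounding
eps_num <- dfn_corregido / dfn_sin_corregir
eps_den <- dfd_corregido / dfd_sin_corregir
c(eps_num, eps_den)
```

The treatment:tiempo row keeps its whole-number degrees of freedom (2 and 22), which matches the non-significant Mauchly result for the interaction.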

Repeated-measures design

Geraldine Barbosa

2023-06-02

REPEATED-MEASURES DESIGN:

ONE-WAY
