title: “ANOVA”
author: “Ruben Ortiz Mendoza”
date: “Friday, August 08, 2014”
output: html_document
data(InsectSprays)
str(InsectSprays)
## 'data.frame':    72 obs. of  2 variables:
##  $ count: num  10 7 20 14 14 12 10 23 17 20 ...
##  $ spray: Factor w/ 6 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...

Compute the mean, standard deviation, and variance of each group

tapply(InsectSprays$count, InsectSprays$spray, mean)
##      A      B      C      D      E      F 
## 14.500 15.333  2.083  4.917  3.500 16.667
tapply(InsectSprays$count, InsectSprays$spray, sd)
##     A     B     C     D     E     F 
## 4.719 4.271 1.975 2.503 1.732 6.213
tapply(InsectSprays$count, InsectSprays$spray, var)
##      A      B      C      D      E      F 
## 22.273 18.242  3.902  6.265  3.000 38.606

Check the sample size of each group

tapply(InsectSprays$count, InsectSprays$spray, length)
##  A  B  C  D  E  F 
## 12 12 12 12 12 12
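
The four `tapply` calls above can also be collected into a single table. The following is a minimal sketch, not part of the original analysis; the name `group_summary` is hypothetical and chosen only for illustration.

```r
# Sketch: one summary table per spray instead of four separate tapply calls.
# `group_summary` is a hypothetical name introduced for this example.
data(InsectSprays)

# Split the counts by spray, then compute the four statistics for each group.
counts_by_spray <- split(InsectSprays$count, InsectSprays$spray)
group_summary <- do.call(
  rbind,
  lapply(counts_by_spray, function(x) {
    c(n = length(x), mean = mean(x), sd = sd(x), var = var(x))
  })
)

# One row per spray (A to F), one column per statistic.
print(round(group_summary, 3))
```

Every spray has the same number of observations, so the design is balanced, while the variances differ noticeably between groups.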

Visual representation of the data

boxplot(InsectSprays$count~ InsectSprays$spray, col="gray")

[Figure: boxplot of count by spray]

Insecto.aov<- aov(InsectSprays$count~ InsectSprays$spray)
summary(Insecto.aov)
##                    Df Sum Sq Mean Sq F value Pr(>F)    
## InsectSprays$spray  5   2669     534    34.7 <2e-16 ***
## Residuals          66   1015      15                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Insecto.aov
## Call:
##    aov(formula = InsectSprays$count ~ InsectSprays$spray)
## 
## Terms:
##                 InsectSprays$spray Residuals
## Sum of Squares                2669      1015
## Deg. of Freedom                  5        66
## 
## Residual standard error: 3.922
## Estimated effects may be unbalanced

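The mean square for InsectSprays$spray (534) is much larger than the residual mean square (15). Their ratio gives F = 34.7 with Pr(>F) < 2e-16, so the differences among sprays are highly significant. When the residual mean square is as large as or larger than the treatment mean square, F is close to or below 1 and there is no evidence of a difference among groups.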
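
The F value in the ANOVA table can be reproduced by hand from the between-group and within-group sums of squares. This is a minimal sketch of the standard one-way ANOVA arithmetic; all variable names (`ss_between`, `ss_within`, `f_value`, and so on) are hypothetical and not taken from the original report.

```r
# Sketch: rebuild the one-way ANOVA table entries by hand.
# All object names here are hypothetical, chosen for illustration.
data(InsectSprays)
y <- InsectSprays$count
g <- InsectSprays$spray

k <- nlevels(g)   # number of groups (6)
n <- length(y)    # total number of observations (72)

# ave() returns, for every observation, the mean of its own group.
fitted_means <- ave(y, g)

# Between-group and within-group (residual) sums of squares.
ss_between <- sum((fitted_means - mean(y))^2)
ss_within  <- sum((y - fitted_means)^2)

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (n - k)

# F statistic and its upper-tail p-value.
f_value <- ms_between / ms_within
p_value <- pf(f_value, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

# The residual standard error is the square root of the residual mean square.
residual_se <- sqrt(ms_within)

print(c(F = f_value, p = p_value, residual_se = residual_se))
```

The rounded results should agree with the `summary(Insecto.aov)` output: sums of squares 2669 and 1015, F of about 34.7, and a residual standard error of about 3.922.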

Tukey HSD test to check which groups differ

Tukey's test is applied only when the ANOVA shows a significant difference; if it does not, the analysis stops there.
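
That decision rule can be written out explicitly. The sketch below assumes a significance level of 0.05, which the report does not state, and uses hypothetical names (`fit`, `alpha`, `p_value`, `tukey`). It also fits the model with the `data =` argument, an equivalent alternative to the `InsectSprays$` form used in this report.

```r
# Sketch: run Tukey's HSD only if the overall ANOVA is significant.
# `alpha = 0.05` is an assumption; the object names are hypothetical.
data(InsectSprays)
fit <- aov(count ~ spray, data = InsectSprays)

# summary() on an aov fit returns a list whose first element is the
# ANOVA table; the first entry of "Pr(>F)" is the spray p-value.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
alpha <- 0.05

if (p_value < alpha) {
  # Significant overall effect: look at the pairwise comparisons.
  tukey <- TukeyHSD(fit)
  print(tukey)
} else {
  # No evidence of differences among groups: stop here.
  message("ANOVA not significant; no pairwise comparisons performed.")
}
```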

TukeyHSD(Insecto.aov)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = InsectSprays$count ~ InsectSprays$spray)
## 
## $`InsectSprays$spray`
##         diff     lwr    upr  p adj
## B-A   0.8333  -3.866  5.533 0.9952
## C-A -12.4167 -17.116 -7.717 0.0000
## D-A  -9.5833 -14.283 -4.884 0.0000
## E-A -11.0000 -15.699 -6.301 0.0000
## F-A   2.1667  -2.533  6.866 0.7542
## C-B -13.2500 -17.949 -8.551 0.0000
## D-B -10.4167 -15.116 -5.717 0.0000
## E-B -11.8333 -16.533 -7.134 0.0000
## F-B   1.3333  -3.366  6.033 0.9603
## D-C   2.8333  -1.866  7.533 0.4921
## E-C   1.4167  -3.283  6.116 0.9489
## F-C  14.5833   9.884 19.283 0.0000
## E-D  -1.4167  -6.116  3.283 0.9489
## F-D  11.7500   7.051 16.449 0.0000
## F-E  13.1667   8.467 17.866 0.0000
plot(TukeyHSD(Insecto.aov))

[Figure: Tukey HSD 95% family-wise confidence intervals for the pairwise differences in mean count]

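Intervals that cross the zero line show no significant difference between the two sprays, while intervals that do not cross zero show a significant difference. Taking count as the number of insects killed, I therefore prefer the spray that kills the most insects, which is F. F has the highest mean count (16.667), although its differences from A and B are not significant.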
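
The same reading of the plot can be done numerically by checking which confidence intervals exclude zero. This is a minimal sketch with hypothetical names (`fit`, `ci`, `excludes_zero`); it uses only the `lwr` and `upr` columns that `TukeyHSD` returns.

```r
# Sketch: list the pairs whose 95% family-wise interval excludes zero.
# Object names are hypothetical, chosen for illustration.
data(InsectSprays)
fit <- aov(count ~ spray, data = InsectSprays)

# TukeyHSD() returns a list with one matrix per factor; take the first.
ci <- TukeyHSD(fit)[[1]]

# An interval excludes zero when it lies entirely above or below zero.
excludes_zero <- ci[, "lwr"] > 0 | ci[, "upr"] < 0

cat("Significantly different pairs:\n")
print(rownames(ci)[excludes_zero])

cat("Pairs with no significant difference:\n")
print(rownames(ci)[!excludes_zero])
```

With this data the sprays fall into two sets, A, B, F and C, D, E: pairs within a set are not significantly different, and every pair across the two sets is.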