MSCA 605 HW 7

Question 1

One-Factor ANOVA Test

H0: The mean response is the same for all four drugs.

HA: At least one drug's mean response differs from the others.

## Call:
##    aov(formula = value ~ variable, data = mydrug2)
## 
## Terms:
##                 variable Residuals
## Sum of Squares     698.2     793.6
## Deg. of Freedom        3        16
## 
## Residual standard error: 7.042727
## Estimated effects may be unbalanced

## Analysis of Variance Table
## 
## Response: value
##           Df Sum Sq Mean Sq F value  Pr(>F)  
## variable   3  698.2  232.73  4.6922 0.01553 *
## Residuals 16  793.6   49.60                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
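The F value and Pr(>F) in the table follow directly from the printed sums of squares and degrees of freedom. The sketch below recomputes them in base R; the variable names are illustrative and are not taken from the original script.

```r
# Recompute the one-way ANOVA table entries from the printed sums of squares.
ss_between <- 698.2   # Sum Sq for `variable`
ss_within  <- 793.6   # Sum Sq for Residuals
df_between <- 3
df_within  <- 16

ms_between <- ss_between / df_between   # Mean Sq for `variable`
ms_within  <- ss_within / df_within     # Mean Sq for Residuals
f_value    <- ms_between / ms_within

# The upper tail of the F distribution gives Pr(>F).
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

# The residual standard error is the square root of the residual mean square.
rse <- sqrt(ms_within)

print(c(F = f_value, p = p_value, rse = rse))
```

The three printed quantities can be compared with the F value, Pr(>F), and residual standard error reported above.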

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  mydrug2$value and mydrug$variable 
## 
##       Drug1 Drug2 Drug3
## Drug2 1.000 -     -    
## Drug3 0.165 0.235 -    
## Drug4 1.000 1.000 0.012
## 
## P value adjustment method: bonferroni
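Because the design is balanced, the pooled-SD comparisons can be rebuilt from the group means and the residual mean square alone. The following is a minimal sketch under that assumption; it works from the summary numbers printed in this report rather than the raw data, and the object names are illustrative.

```r
# Group means (from the descriptive statistics) and common group size.
means <- c(Drug1 = 26.4, Drug2 = 25.6, Drug3 = 15.6, Drug4 = 32.0)
n     <- 5
mse   <- 793.6 / 16   # pooled variance = residual mean square
df    <- 16           # residual degrees of freedom

pairs <- combn(names(means), 2)            # all 6 unordered pairs
diffs <- means[pairs[2, ]] - means[pairs[1, ]]
se    <- sqrt(mse * (1 / n + 1 / n))       # identical for every pair here
p_raw <- 2 * pt(-abs(diffs / se), df)      # two-sided pooled-SD t tests

# Bonferroni multiplies each p-value by the number of tests and caps it at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
names(p_bonf) <- paste(pairs[2, ], pairs[1, ], sep = "-")
print(round(p_bonf, 3))
```

Only the Drug4-Drug3 comparison stays below .05 after the adjustment, which is the same pattern as in the pooled-SD table.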

## 
##  Pairwise comparisons using t tests with non-pooled SD 
## 
## data:  mydrug2$value and mydrug$variable 
## 
##       Drug1 Drug2 Drug3
## Drug2 1.00  -     -    
## Drug3 0.29  0.14  -    
## Drug4 1.00  1.00  0.04 
## 
## P value adjustment method: bonferroni

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = value ~ variable, data = mydrug2)
## 
## $variable
##              diff        lwr       upr     p adj
## Drug2-Drug1  -0.8 -13.543587 11.943587 0.9978542
## Drug3-Drug1 -10.8 -23.543587  1.943587 0.1122136
## Drug4-Drug1   5.6  -7.143587 18.343587 0.6014607
## Drug3-Drug2 -10.0 -22.743587  2.743587 0.1533294
## Drug4-Drug2   6.4  -6.343587 19.143587 0.4959400
## Drug4-Drug3  16.4   3.656413 29.143587 0.0097855
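The Tukey intervals and adjusted p-values come from the studentized range distribution. As a hedged illustration, the sketch below reconstructs the Drug4-Drug3 row from the printed means and residual mean square using base R's `qtukey` and `ptukey`; it assumes equal group sizes of 5, as shown in the descriptive statistics.

```r
mse <- 793.6 / 16   # residual mean square from the ANOVA table
n   <- 5            # observations per drug
k   <- 4            # number of drugs
df  <- 16           # residual degrees of freedom

# Standard error on the studentized-range scale.
se_q <- sqrt(mse / n)

# Half-width of every 95% family-wise interval (same for all pairs when n is equal).
half_width <- qtukey(0.95, nmeans = k, df = df) * se_q

# Drug4 minus Drug3, using the printed group means.
diff_43 <- 32.0 - 15.6
ci_43   <- diff_43 + c(-1, 1) * half_width
p_adj   <- ptukey(abs(diff_43) / se_q, nmeans = k, df = df, lower.tail = FALSE)

print(c(lwr = ci_43[1], upr = ci_43[2], p_adj = p_adj))
```

The interval excludes zero only for this pair, which is why it is the single significant Tukey comparison.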

The ANOVA gives F(3, 16) = 4.69 with p = 0.0155. Because the p-value is below .05, we reject the null hypothesis and conclude that not all of the drug means are equal. In the post hoc comparisons, Drug3 vs. Drug4 is the only pair that differs significantly (Bonferroni p = 0.012 with pooled SD and 0.04 with non-pooled SD; Tukey adjusted p = 0.0098).

Descriptive Statistics

The descriptive statistics show that Drug3 has the lowest mean (15.6) and Drug4 the highest (32.0), while the means for Drug1 (26.4) and Drug2 (25.6) are very close together.

## 
##  Descriptive statistics by group 
## group: Drug1
##           vars n mean   sd median trimmed  mad min max range  skew kurtosis
## variable*    1 5  1.0 0.00      1     1.0 0.00   1   1     0   NaN      NaN
## value        2 5 26.4 8.76     26    26.4 5.93  14  38    24 -0.09    -1.58
##             se
## variable* 0.00
## value     3.92
## ------------------------------------------------------------ 
## group: Drug2
##           vars n mean   sd median trimmed mad min max range skew kurtosis   se
## variable*    1 5  2.0 0.00      2     2.0 0.0   2   2     0  NaN      NaN 0.00
## value        2 5 25.6 6.54     28    25.6 8.9  18  34    16    0    -1.98 2.93
## ------------------------------------------------------------ 
## group: Drug3
##           vars n mean   sd median trimmed  mad min max range  skew kurtosis
## variable*    1 5  3.0 0.00      3     3.0 0.00   3   3     0   NaN      NaN
## value        2 5 15.6 3.85     16    15.6 2.97  10  20    10 -0.28    -1.72
##             se
## variable* 0.00
## value     1.72
## ------------------------------------------------------------ 
## group: Drug4
##           vars n mean sd median trimmed  mad min max range skew kurtosis   se
## variable*    1 5    4  0      4       4 0.00   4   4     0  NaN      NaN 0.00
## value        2 5   32  8     30      32 5.93  22  44    22 0.28     -1.5 3.58
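The descriptive statistics tie back to the ANOVA table: with equal group sizes, the residual mean square is the average of the group variances, and each standard error is the group SD divided by the square root of n. A small sketch using the rounded SDs printed above (so the results agree only to rounding):

```r
# Group standard deviations as printed (rounded to two decimals).
sds <- c(Drug1 = 8.76, Drug2 = 6.54, Drug3 = 3.85, Drug4 = 8.00)
n   <- 5

# With equal n, the pooled variance is the simple average of the group variances.
pooled_var <- mean(sds^2)

# Standard error of each group mean.
se <- sds / sqrt(n)

print(round(pooled_var, 2))
print(round(se, 2))
```

The pooled variance lands close to the residual Mean Sq of 49.60; the small gap comes from using rounded SDs.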

Visualization

The boxplot flags three data points as outliers. It also appears that Drug3 and Drug4 differ noticeably from Drug1 and Drug2.

Check Assumptions

Assumption 1: Normality

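The Shapiro-Wilk tests do not reject normality for any of the four drugs: every p-value (0.96, 0.54, 0.93, 0.73) is well above .05. Outliers were not removed, which may affect the results somewhat, but not enough to change this conclusion. With only five observations per group, however, the test has little power to detect non-normality.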

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.98549, p-value = 0.9617

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.92221, p-value = 0.5443

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.97872, p-value = 0.9276

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.9491, p-value = 0.7308

Assumption 2: Check for Outliers

Three outliers were identified: one for Drug1 and two for Drug4.

Multivariate Repeated Measure Analysis

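The mixed-effects model uses Drug1 as the reference level, so each coefficient is a difference from the Drug1 mean. Drug3's mean is 10.8 points lower than Drug1's (p = 0.0001) and Drug4's is 5.6 points higher (p = 0.0136); both differences are significant at the .05 significance level. Drug2 is very similar to Drug1 (difference of -0.8, p = 0.6872). The repeated-measures ANOVA agrees that the drug effect is significant, F(3, 12) = 24.76, p < .001, and Mauchly's test (p = 0.496) gives no evidence that sphericity is violated.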

## Linear mixed-effects model fit by maximum likelihood
##  Data: mydrug 
##        AIC      BIC    logLik
##   123.5902 129.5646 -55.79509
## 
## Random effects:
##  Formula: ~1 | Subjects
##         (Intercept) Residual
## StdDev:    5.670979 2.742262
## 
## Fixed effects: value ~ variable 
##               Value Std.Error DF   t-value p-value
## (Intercept)    26.4  3.149603 12  8.382008  0.0000
## variableDrug2  -0.8  1.939072 12 -0.412568  0.6872
## variableDrug3 -10.8  1.939072 12 -5.569675  0.0001
## variableDrug4   5.6  1.939072 12  2.887979  0.0136
##  Correlation: 
##               (Intr) vrblD2 vrblD3
## variableDrug2 -0.308              
## variableDrug3 -0.308  0.500       
## variableDrug4 -0.308  0.500  0.500
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -1.53063969 -0.57831784 -0.04002289  0.69521552  1.52978267 
## 
## Number of Observations: 20
## Number of Groups: 5
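Each row of the fixed-effects table is a Wald-type t test: the estimate divided by its standard error, referred to a t distribution with the listed degrees of freedom. The sketch below recomputes the t-values and p-values from the printed estimates; the names are illustrative.

```r
# Printed estimates: differences from the Drug1 reference level.
est <- c(Drug2 = -0.8, Drug3 = -10.8, Drug4 = 5.6)
se  <- 1.939072   # common standard error of the three contrasts
df  <- 12         # denominator degrees of freedom reported by the model

t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df)   # two-sided p-values

print(round(cbind(t = t_val, p = p_val), 4))
```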

## Warning: Converting "Subjects" to factor for ANOVA.

## $ANOVA
##        Effect DFn DFd     SSn   SSd        F            p p<.05       ges
## 1 (Intercept)   1   4 12400.2 680.8 72.85664 1.033910e-03     * 0.9398505
## 2    variable   3  12   698.2 112.8 24.75887 1.992501e-05     * 0.4680252
## 
## $`Mauchly's Test for Sphericity`
##     Effect         W         p p<.05
## 2 variable 0.1864953 0.4957455      
## 
## $`Sphericity Corrections`
##     Effect      GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2 variable 0.604874 0.0006490326         * 1.078854 1.992501e-05         *
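The repeated-measures F is much larger than the one-factor F because the between-subject variation is removed from the error term. The sketch below recomputes the key entries of the table from the printed sums of squares; it assumes the Greenhouse-Geisser correction scales both degrees of freedom by the printed epsilon, and that the generalized eta squared for this within-subject design is the effect SS over the sum of the effect SS and both error SS.

```r
ss_drug <- 698.2   # SSn for `variable`
ss_subj <- 680.8   # SSd on the intercept row (between-subject variation)
ss_err  <- 112.8   # SSd for `variable` (within-subject error)
df_drug <- 3
df_err  <- 12

# The one-factor residual SS (793.6) splits into subject and error parts.
ss_resid_oneway <- ss_subj + ss_err

f_rm <- (ss_drug / df_drug) / (ss_err / df_err)
p_rm <- pf(f_rm, df_drug, df_err, lower.tail = FALSE)

# Greenhouse-Geisser: multiply both degrees of freedom by epsilon.
gg_eps <- 0.604874
p_gg   <- pf(f_rm, df_drug * gg_eps, df_err * gg_eps, lower.tail = FALSE)

# Generalized eta squared.
ges <- ss_drug / (ss_drug + ss_subj + ss_err)

print(c(F = f_rm, p = p_rm, p_GG = p_gg, ges = ges))
```

The Huynh-Feldt epsilon printed above exceeds 1, so that correction leaves the degrees of freedom unchanged and its p-value equals the uncorrected one.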

Question 2

Boxplots

The boxplots show a few outliers. In addition, versicolor and virginica (the second and third plots) have similar measurements in each category.

Covariances and Correlations

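The matrices below are grouped by species: setosa first, then versicolor, then virginica. For each species the covariance matrix is followed by the correlation matrix and by a single value that equals the determinant of that correlation matrix. Setosa shows only one strong correlation, between Sepal.Length and Sepal.Width (0.74); its remaining correlations are weak (0.18 to 0.33). Versicolor is moderately to strongly correlated across every pair (0.53 to 0.79), and virginica has the strongest single correlation, between Sepal.Length and Petal.Length (0.86). The determinants tell the same story: a value closer to 0 indicates stronger overall correlation, and setosa's (0.35) is the largest of the three, so setosa is the least correlated species overall.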

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   0.12424898 0.099216327  0.016355102 0.010330612
## Sepal.Width    0.09921633 0.143689796  0.011697959 0.009297959
## Petal.Length   0.01635510 0.011697959  0.030159184 0.006069388
## Petal.Width    0.01033061 0.009297959  0.006069388 0.011106122

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000   0.7425467    0.2671758   0.2780984
## Sepal.Width     0.7425467   1.0000000    0.1777000   0.2327520
## Petal.Length    0.2671758   0.1777000    1.0000000   0.3316300
## Petal.Width     0.2780984   0.2327520    0.3316300   1.0000000

## [1] 0.3533595

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   0.26643265  0.08518367   0.18289796  0.05577959
## Sepal.Width    0.08518367  0.09846939   0.08265306  0.04120408
## Petal.Length   0.18289796  0.08265306   0.22081633  0.07310204
## Petal.Width    0.05577959  0.04120408   0.07310204  0.03910612

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000   0.5259107    0.7540490   0.5464611
## Sepal.Width     0.5259107   1.0000000    0.5605221   0.6639987
## Petal.Length    0.7540490   0.5605221    1.0000000   0.7866681
## Petal.Width     0.5464611   0.6639987    0.7866681   1.0000000

## [1] 0.08359418

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length   0.40434286  0.09376327   0.30328980  0.04909388
## Sepal.Width    0.09376327  0.10400408   0.07137959  0.04762857
## Petal.Length   0.30328980  0.07137959   0.30458776  0.04882449
## Petal.Width    0.04909388  0.04762857   0.04882449  0.07543265

##              Sepal.Length Sepal.Width Petal.Length Petal.Width
## Sepal.Length    1.0000000   0.4572278    0.8642247   0.2811077
## Sepal.Width     0.4572278   1.0000000    0.4010446   0.5377280
## Petal.Length    0.8642247   0.4010446    1.0000000   0.3221082
## Petal.Width     0.2811077   0.5377280    0.3221082   1.0000000

## [1] 0.1373902
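The per-species matrices can be reproduced from R's built-in `iris` data. This is a minimal sketch, not necessarily the original script; the object names are illustrative.

```r
# Split the four measurements by species (iris ships with base R).
by_species <- split(iris[, 1:4], iris$Species)

cov_list <- lapply(by_species, cov)   # covariance matrix per species
cor_list <- lapply(by_species, cor)   # correlation matrix per species

# Determinant of each correlation matrix: 1 means uncorrelated variables,
# values near 0 mean strong linear dependence among them.
det_cor <- sapply(cor_list, det)

print(round(cor_list$setosa, 4))
print(round(det_cor, 7))
```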

Shapiro Tests

At the .05 significance level, the Shapiro-Wilk tests reject normality for versicolor (p = 0.0057) and virginica (p = 0.0080). For setosa the test does not reject normality (p = 0.079), although the result is close to the threshold.

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.95878, p-value = 0.07906

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.93043, p-value = 0.005739

## 
##  Shapiro-Wilk normality test
## 
## data:  Z
## W = 0.93414, p-value = 0.007955

Profile

The profile plot shows that the mean profiles of the three species are not parallel. The tests below support this: the hypotheses of parallel profiles, equal levels, and flat profiles are all rejected, with p-values far below .05.

## 
## Data Summary:
##              setosa versicolor virginica
## Sepal.Length  5.006      5.936     6.588
## Sepal.Width   3.428      2.770     2.974
## Petal.Length  1.462      4.260     5.552
## Petal.Width   0.246      1.326     2.026

## Call:
## pbg(data = iris[, 1:4], group = iris[, 5], original.names = TRUE, 
##     profile.plot = TRUE)
## 
## Hypothesis Tests:
## $`Ho: Profiles are parallel`
##   Multivariate.Test   Statistic   Approx.F num.df den.df       p.value
## 1             Wilks  0.04115317  189.92337      6    290  2.395832e-97
## 2            Pillai  0.96909246   45.74852      6    292  2.472886e-39
## 3  Hotelling-Lawley 23.05050400  553.21210      6    288 7.441946e-155
## 4               Roy 23.03969817 1121.26531      3    146 1.477137e-100
## 
## $`Ho: Profiles have equal levels`
##              Df Sum Sq Mean Sq F value Pr(>F)    
## group         2  77.40   38.70   422.4 <2e-16 ***
## Residuals   147  13.47    0.09                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## $`Ho: Profiles are flat`
##          F df1 df2       p-value
## 1 4847.465   3 145 3.788058e-145
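Two of the profile-analysis hypotheses can be approximated with base R alone. The sketch below does not call `pbg`; it assumes the equal-levels test is a one-way ANOVA on each flower's average measurement and the parallelism test is a MANOVA on differences between adjacent variables. The object names are illustrative.

```r
measurements <- as.matrix(iris[, 1:4])
species      <- iris$Species

# Equal levels: one-way ANOVA on the per-flower mean of the four measurements.
level_score <- rowMeans(measurements)
levels_tab  <- anova(lm(level_score ~ species))

# Parallelism: MANOVA on adjacent-variable differences
# (Sepal.Width - Sepal.Length, Petal.Length - Sepal.Width, Petal.Width - Petal.Length).
diffs        <- measurements[, 2:4] - measurements[, 1:3]
parallel_fit <- manova(diffs ~ species)
wilks        <- summary(parallel_fit, test = "Wilks")$stats[1, "Wilks"]

print(levels_tab)
print(wilks)
```

The ANOVA table can be compared with the "Profiles have equal levels" output, and the Wilks statistic with the first row of the parallelism output.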