Homework Week 5

Problem 3.7

Null hypothesis: All the means are the same

Alternative hypothesis: At least one mean is different

library(agricolae)

## Warning: package 'agricolae' was built under R version 4.0.5

trt1 <- c(3129,3000,2865,2890)
trt2 <- c(3200,3300,2975,3150)    # Treatment Data
trt3 <- c(2800,2900,2985,3050)
trt4 <- c(2600,2700,2600,2765)
strength <- c(trt1,trt2,trt3,trt4)
technique <- rep(1:4, each = 4)    # technique label for each observation
df <- data.frame(technique = as.factor(technique), strength = strength)
aov.model<-aov(strength~technique,data=df)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## technique    3 489740  163247   12.73 0.000489 ***
## Residuals   12 153908   12826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

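t <- 2.179 # critical value t(0.025, 12): alpha = 0.05, 12 error degrees of freedom
MSE <- 12826 # mean square error from the ANOVA table
n = 4 # number of observations per treatment
LSD = t*sqrt(2*MSE/n) # Fisher's least significant difference, about 174.5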
abs(mean(trt1)-mean(trt2))

## [1] 185.25

abs(mean(trt1)-mean(trt3))

## [1] 37.25

abs(mean(trt1)-mean(trt4)) # means comparisons

## [1] 304.75

abs(mean(trt2)-mean(trt3))

## [1] 222.5

abs(mean(trt2)-mean(trt4))

## [1] 490

abs(mean(trt3)-mean(trt4))

## [1] 267.5

plot(aov.model)

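(c) Using Fisher's LSD method with LSD = 2.179*sqrt(2*12826/4), which is about 174.5, the only pair of means that does not differ significantly is mu1 and mu3, because their difference (37.25) is less than the LSD. Every other pair differs by more than the LSD, so those means are significantly different.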
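The LSD comparison in (c) can also be tabulated in one pass. The following is a minimal sketch in base R, assuming the same data as above; it takes the critical value from `qt()` rather than the tabulated 2.179, and the names `fit`, `lsd`, `diffs`, and `significant` are illustrative, not part of the original solution.

```r
# Data from Problem 3.7: four techniques, four observations each
dat <- data.frame(
  technique = factor(rep(1:4, each = 4)),
  strength  = c(3129, 3000, 2865, 2890,
                3200, 3300, 2975, 3150,
                2800, 2900, 2985, 3050,
                2600, 2700, 2600, 2765)
)

fit <- aov(strength ~ technique, data = dat)
mse <- deviance(fit) / df.residual(fit)   # residual sum of squares / error df
n   <- 4                                  # observations per treatment

# Fisher's LSD at alpha = 0.05 (two-sided), about 174.5 for these data
lsd <- qt(0.975, df.residual(fit)) * sqrt(2 * mse / n)

means <- tapply(dat$strength, dat$technique, mean)
diffs <- abs(outer(means, means, "-"))    # |mean_i - mean_j| for every pair
significant <- diffs > lsd                # TRUE where a pair differs by more than the LSD

print(lsd)
print(significant)
```

Entry [i, j] of `significant` is the LSD decision for techniques i and j, so the whole set of six comparisons can be read from the upper triangle.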

(d) The points on the normal probability plot fall roughly along a straight line, so the normality assumption appears reasonable.

(e) The plot shows a similar spread of residuals for each treatment, so there is no evidence against the equal-variance assumption.
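The visual checks in (d) and (e) can be supplemented with formal tests. This is only a sketch, assuming the same data as above: it uses the Shapiro-Wilk test on the residuals and Bartlett's test on the groups, both from base R. These tests are an addition for illustration and were not part of the original solution; with only four observations per group they have little power, so they complement the plots rather than replace them.

```r
# Data from Problem 3.7: four techniques, four observations each
dat <- data.frame(
  technique = factor(rep(1:4, each = 4)),
  strength  = c(3129, 3000, 2865, 2890,
                3200, 3300, 2975, 3150,
                2800, 2900, 2985, 3050,
                2600, 2700, 2600, 2765)
)

fit <- aov(strength ~ technique, data = dat)

# H0: the residuals come from a normal distribution
normality <- shapiro.test(residuals(fit))

# H0: all treatment groups share the same variance
equal_var <- bartlett.test(strength ~ technique, data = dat)

print(normality$p.value)
print(equal_var$p.value)
```

A large p-value in either test means the data give no evidence against the corresponding assumption; it does not prove the assumption holds.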

Problem 3.10

trt1 <- c(7,7,15,11,9)
trt2 <- c(12,17,12,18,18)
trt3 <- c(14,19,19,18,18)
trt4 <- c(19,25,22,19,23)
trt5 <- c(7,10,11,15,11)
strength <- c(trt1,trt2,trt3,trt4)
group <- rep(1:5, each = 5)    # group label for each observation
df <- data.frame(group = as.factor(group), strength = strength)
aov.model<-aov(strength~group,data=df)
summary(aov.model)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## group        4  475.8  118.94   14.76 9.13e-06 ***
## Residuals   20  161.2    8.06                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

LSD.test(aov.model,"group",p.adj = "none",console=TRUE)

## 
## Study: aov.model ~ "group"
## 
## LSD t Test for strength 
## 
## Mean Square Error:  8.06 
## 
## group,  means and individual ( 95 %) CI
## 
##   strength      std r       LCL      UCL Min Max
## 1      9.8 3.346640 5  7.151566 12.44843   7  15
## 2     15.4 3.130495 5 12.751566 18.04843  12  18
## 3     17.6 2.073644 5 14.951566 20.24843  14  19
## 4     21.6 2.607681 5 18.951566 24.24843  19  25
## 5     10.8 2.863564 5  8.151566 13.44843   7  15
## 
## Alpha: 0.05 ; DF Error: 20
## Critical Value of t: 2.085963 
## 
## least Significant Difference: 3.745452 
## 
## Treatments with the same letter are not significantly different.
## 
##   strength groups
## 4     21.6      a
## 3     17.6      b
## 2     15.4      b
## 5     10.8      c
## 1      9.8      c

plot(aov.model)

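(b) Fisher's LSD method shows that mu2 and mu3 do not differ significantly, and neither do mu1 and mu5: each of these pairs shares a letter in the grouping above, and its difference in means is smaller than the LSD of 3.745. All other pairs of means differ significantly.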
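As a check on the `LSD.test` output, the least significant difference and the pairwise decisions can be reproduced by hand. This is a minimal sketch that takes the group means, the mean square error (8.06), the error degrees of freedom (20), and the replication (5) from the output above; the variable names are illustrative.

```r
# Group means as reported by LSD.test
means <- c("1" = 9.8, "2" = 15.4, "3" = 17.6, "4" = 21.6, "5" = 10.8)
mse      <- 8.06   # mean square error
df_error <- 20     # error degrees of freedom
r        <- 5      # observations per group

# Fisher's LSD at alpha = 0.05 (two-sided)
lsd <- qt(0.975, df_error) * sqrt(2 * mse / r)

# Every pair of groups and the absolute difference of their means
pairs <- combn(names(means), 2)
result <- data.frame(
  pair = paste(pairs[1, ], pairs[2, ], sep = "-"),
  diff = abs(unname(means[pairs[1, ]]) - unname(means[pairs[2, ]]))
)
result$differs <- result$diff > lsd

print(lsd)
print(result)
```

The rows with `differs` equal to FALSE correspond to the pairs that share a letter in the `LSD.test` grouping.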

(c) The residuals show no obvious pattern and have a roughly constant spread, so the model appears adequate.

Problem 3.44

library(pwr)

## Warning: package 'pwr' was built under R version 4.0.5

mu1 = 50
mu2 = 60
mu3 = 50   # population means
mu4 = 60
v = 25  # error variance
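avg_u = (mu1+mu2+mu3+mu4)/4  # Average of population means
sd_u = sqrt(((mu1-avg_u)^2+(mu2-avg_u)^2+(mu3-avg_u)^2+(mu4-avg_u)^2)/4)  # SD of the means about avg_u (equals 5)
pwr.anova.test(k=4,n=NULL,f=sd_u/sqrt(v),sig.level=0.05,power=0.90)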

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 4.658119
##               f = 1
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

n = 4.658119, so rounding up, 5 observations are needed from each population
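The effect size f = 1 passed to `pwr.anova.test` is the standard deviation of the four population means divided by the error standard deviation. The sketch below derives it from the means and cross-checks the sample size with base R's `power.anova.test`, which needs no extra package. Note the different convention: `between.var` there is the sample variance of the means (divisor k - 1), while Cohen's f uses divisor k. The variable names are illustrative.

```r
mu     <- c(50, 60, 50, 60)   # population means from the problem
sigma2 <- 25                  # error variance
k      <- length(mu)

# Cohen's f: SD of the group means (divisor k) over the error SD
f <- sqrt(sum((mu - mean(mu))^2) / k) / sqrt(sigma2)

# Cross-check with base R; var(mu) uses divisor k - 1, as power.anova.test expects
res <- power.anova.test(groups = k, between.var = var(mu),
                        within.var = sigma2, sig.level = 0.05, power = 0.90)

n_required <- ceiling(res$n)  # round up to a whole number of observations

print(f)
print(n_required)
```

Rounding up with `ceiling()` is what guarantees the achieved power is at least the 0.90 target.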

Problem 3.45

mu1 = 50
mu2 = 60
mu3 = 50   # population means
mu4 = 60
v = 36  # error variance
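avg_u = (mu1+mu2+mu3+mu4)/4  # Average of population means
sd_u = sqrt(((mu1-avg_u)^2+(mu2-avg_u)^2+(mu3-avg_u)^2+(mu4-avg_u)^2)/4)  # SD of the means about avg_u (equals 5)
pwr.anova.test(k=4,n=NULL,f=sd_u/sqrt(v),sig.level=0.05,power=0.90)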

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 6.180857
##               f = 0.8333333
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

pwr.anova.test(k=4,n=NULL,f=5/sqrt(49),sig.level=0.05,power=0.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 7.998751
##               f = 0.7142857
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

(a) With an error variance of 36, n = 6.180857, so rounding up, 7 observations are needed from each population

(b) With an error variance of 49, n = 7.998751, so rounding up, 8 observations are needed from each population

(c) As the estimate of the error variance increases, the number of observations needed from each population increases.

(d) When no reliable estimate of variability is available, this approach can be used to estimate roughly how many observations will be needed over a range of plausible variances.
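The recommendation in (d) can be sketched as a sweep over candidate variances. The helper `min_n` below is hypothetical, written for this illustration: it evaluates the power of the one-way ANOVA F test directly from the noncentral F distribution and returns the smallest whole number of observations per group that reaches the target. The variances 64 and 100 are added only to extend the range and do not come from the problem.

```r
# Hypothetical helper: smallest per-group n whose power reaches the target
min_n <- function(mu, sigma2, alpha = 0.05, target = 0.90, n_max = 1000) {
  k <- length(mu)
  for (n in 2:n_max) {
    df2 <- k * (n - 1)                            # error degrees of freedom
    ncp <- n * sum((mu - mean(mu))^2) / sigma2    # noncentrality parameter
    crit <- qf(1 - alpha, k - 1, df2)             # rejection threshold for F
    power <- pf(crit, k - 1, df2, ncp = ncp, lower.tail = FALSE)
    if (power >= target) return(n)
  }
  NA_integer_                                     # target not reached by n_max
}

mu <- c(50, 60, 50, 60)
variances <- c(25, 36, 49, 64, 100)               # 64 and 100 are illustrative
n_needed <- sapply(variances, function(v) min_n(mu, v))

print(data.frame(variance = variances, n_per_group = n_needed))
```

Searching over whole numbers avoids the rounding step entirely, and the resulting table shows how quickly the required sample size grows as the assumed variance grows.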

Yonatan Nega

10/1/2021
