Homework, week 5

3.7 (c,d,e,f)

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:

# Read in the data table for 3.7
dat37<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/37Table.csv")
names(dat37) <- sub('X', '', names(dat37))
dat37

##      1    2    3    4
## 1 3129 3200 2800 2600
## 2 3000 3300 2900 2700
## 3 2865 2975 2985 2600
## 4 2890 3150 3050 2765

(c) Use the Fisher LSD method with α = 0.05 to make comparisons between pairs of means.

Given:

N = 16
a = 4
n = 4
α = 0.05

Calculate:

y̅₁ = 2971
y̅₂ = 3156.25
y̅₃ = 2933.75
y̅₄ = 2666.25
MS_Error = 12826

# One way ANOVA test 
aov37 <- aov(values ~ ind,data=(stack(dat37)))
summary(aov37)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## ind          3 489740  163247   12.73 0.000489 ***
## Residuals   12 153908   12826                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Table Lookup:

t_α/2,N-a = t_0.025,16-4 = 2.179

Calculate:

LSD = t_α/2,N-a √(2(MS_Error)/n) = 2.179 √(2(12826)/4) = 174.495
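The table lookup and the LSD arithmetic above can also be reproduced in R. This is a minimal sketch that assumes the values quoted above (MS_Error = 12826 on N - a = 12 degrees of freedom, n = 4); the variable names `ms_error`, `t_crit`, and `lsd` are illustrative and not part of the original solution.

```r
# Fisher LSD for problem 3.7, using values quoted from the ANOVA table above
alpha    <- 0.05
N        <- 16      # total number of observations
a        <- 4       # number of treatments (mixing techniques)
n        <- 4       # replicates per treatment
ms_error <- 12826   # residual mean square, as reported by summary(aov37)

# qt() replaces the t-table lookup: upper alpha/2 point with N - a df
t_crit <- qt(1 - alpha / 2, df = N - a)

# Least significant difference for a balanced design
lsd <- t_crit * sqrt(2 * ms_error / n)
round(c(t_crit = t_crit, lsd = lsd), 3)
```

Because `qt()` carries more digits than a printed table, the result may differ from the hand calculation in the last decimal place.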

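Now test each difference of averages against the LSD:

|y̅₁ - y̅₂| = |2971 - 3156.25| = 185.25 > LSD, significant
|y̅₁ - y̅₃| = |2971 - 2933.75| = 37.25 < LSD, not significant
|y̅₁ - y̅₄| = |2971 - 2666.25| = 304.75 > LSD, significant
|y̅₂ - y̅₃| = |3156.25 - 2933.75| = 222.5 > LSD, significant
|y̅₂ - y̅₄| = |3156.25 - 2666.25| = 490 > LSD, significant
|y̅₃ - y̅₄| = |2933.75 - 2666.25| = 267.5 > LSD, significant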
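The pairwise tests for 3.7 can be tabulated programmatically. The sketch below is self-contained: it retypes the data from the table above instead of reading the CSV, and the names `dat`, `fit`, and `lsd_table` are hypothetical. It computes MS_Error from the fitted model, so the LSD may differ slightly from a hand calculation based on rounded values.

```r
# Problem 3.7 data, typed in from the table above (columns = mixing techniques)
dat <- data.frame(
  `1` = c(3129, 3000, 2865, 2890),
  `2` = c(3200, 3300, 2975, 3150),
  `3` = c(2800, 2900, 2985, 3050),
  `4` = c(2600, 2700, 2600, 2765),
  check.names = FALSE
)
fit <- aov(values ~ ind, data = stack(dat))

# LSD from the fitted model: t critical value times SE of a difference of means
n   <- nrow(dat)                          # replicates per treatment
mse <- deviance(fit) / df.residual(fit)   # residual mean square
lsd <- qt(0.975, df.residual(fit)) * sqrt(2 * mse / n)

# Absolute difference of treatment means for every pair of techniques
means <- colMeans(dat)
pairs <- t(combn(names(means), 2))
diffs <- abs(means[pairs[, 1]] - means[pairs[, 2]])

lsd_table <- data.frame(
  pair        = paste(pairs[, 1], "vs", pairs[, 2]),
  abs_diff    = unname(diffs),
  significant = unname(diffs > lsd)
)
lsd_table
```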

(d) Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?

The points on the normal probability plot of the residuals fall close to a straight line, so there is no strong evidence against the normality assumption.

# AOV residual normal probability plot 
aov37<-aov(values ~ ind,data=(stack(dat37)))
plot(aov37, 2)
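As a numerical complement to the visual check, a Shapiro-Wilk test can be run on the same residuals. This test is not part of the original exercise; it is offered only as a sketch, and with just 16 residuals it has limited power, so the plot remains the primary evidence. The data are retyped from the table above so that the block runs on its own.

```r
# Problem 3.7 data, typed in from the table above
dat <- data.frame(
  `1` = c(3129, 3000, 2865, 2890),
  `2` = c(3200, 3300, 2975, 3150),
  `3` = c(2800, 2900, 2985, 3050),
  `4` = c(2600, 2700, 2600, 2765),
  check.names = FALSE
)
fit <- aov(values ~ ind, data = stack(dat))

# Residuals = observation minus its treatment mean
res <- residuals(fit)

# Shapiro-Wilk test: a small p-value would suggest departure from normality
sw <- shapiro.test(res)
sw
```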

(e) Plot the residuals versus the predicted tensile strength. Comment on the plot.

The spread of the residuals differs across the fitted values (one per mixing technique), so the assumption of equal error variance is not well supported. Because the LSD test relies on a single pooled MS_Error, its comparisons should be interpreted with caution.

# AOV residuals vs. fitted values plot
aov37<-aov(values ~ ind,data=(stack(dat37)))
plot(aov37, 1)

(f) Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

# Scatter plot (strip chart) of tensile strength for each mixing technique
stripchart(values ~ ind,data=stack(dat37), vertical = TRUE, pch = 20, col = 'steelblue')

3.10 (b,c)

A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.

# Read in the data table for 3.10
dat310<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/310Table.csv")
names(dat310) <- sub('X', '', names(dat310))
names(dat310) <- sub("\\.", '', names(dat310))
dat310

##   15 20 25 30 35
## 1  7 12 14 19  7
## 2  7 17 19 25 10
## 3 15 12 19 22 11
## 4 11 18 18 19 15
## 5  9 18 18 23 11

(b) Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?

Given:

N = 25
a = 5
n = 5
α = 0.05

Calculate:

y̅_15% = 9.8
y̅_20% = 15.4
y̅_25% = 17.6
y̅_30% = 21.6
y̅_35% = 10.8
MS_Error = 8.06

# One way ANOVA test 
aov310 <- aov(values ~ ind,data=(stack(dat310)))
summary(aov310)

##             Df Sum Sq Mean Sq F value   Pr(>F)    
## ind          4  475.8  118.94   14.76 9.13e-06 ***
## Residuals   20  161.2    8.06                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Table Lookup:

t_α/2,N-a = t_0.025,25-5 = 2.086

Calculate:

LSD = t_α/2,N-a √(2(MS_Error)/n) = 2.086 √(2(8.06)/5) = 3.745

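Now test each difference of averages against the LSD:

|y̅_15% - y̅_20%| = |9.8 - 15.4| = 5.6 > LSD, significant
|y̅_15% - y̅_25%| = |9.8 - 17.6| = 7.8 > LSD, significant
|y̅_15% - y̅_30%| = |9.8 - 21.6| = 11.8 > LSD, significant
|y̅_15% - y̅_35%| = |9.8 - 10.8| = 1.0 < LSD, not significant
|y̅_20% - y̅_25%| = |15.4 - 17.6| = 2.2 < LSD, not significant
|y̅_20% - y̅_30%| = |15.4 - 21.6| = 6.2 > LSD, significant
|y̅_20% - y̅_35%| = |15.4 - 10.8| = 4.6 > LSD, significant
|y̅_25% - y̅_30%| = |17.6 - 21.6| = 4.0 > LSD, significant
|y̅_25% - y̅_35%| = |17.6 - 10.8| = 6.8 > LSD, significant
|y̅_30% - y̅_35%| = |21.6 - 10.8| = 10.8 > LSD, significant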

CONCLUSION: All pairs are significantly different except the strengths of 15% vs. 35% cotton content and 20% vs. 25% cotton content.
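The same comparisons can be cross-checked with `pairwise.t.test()` from base R. With a pooled standard deviation and no p-value adjustment, it should be equivalent to the Fisher LSD procedure: a pair differs significantly at α = 0.05 exactly when its p-value is below 0.05. This is a sketch under that assumption, with the data retyped from the table above; the names `dat`, `long`, and `pw` are illustrative.

```r
# Problem 3.10 data, typed in from the table above (columns = cotton percent)
dat <- data.frame(
  `15` = c(7, 7, 15, 11, 9),
  `20` = c(12, 17, 12, 18, 18),
  `25` = c(14, 19, 19, 18, 18),
  `30` = c(19, 25, 22, 19, 23),
  `35` = c(7, 10, 11, 15, 11),
  check.names = FALSE
)
long <- stack(dat)

# Unadjusted pairwise t tests with a pooled SD (the Fisher LSD setting)
pw <- pairwise.t.test(long$values, long$ind,
                      p.adjust.method = "none", pool.sd = TRUE)

# Lower-triangular matrix of p-values; TRUE marks a significant pair
pw$p.value < 0.05
```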

(c) Analyze the residuals from this experiment and comment on model adequacy.

The points on the normal probability plot of the residuals fall close to a straight line, so there is no strong evidence against the normality assumption.

# AOV residual normal probability plot 
aov310<-aov(values ~ ind,data=(stack(dat310)))
plot(aov310, 2)

The spread of the residuals differs across the fitted values (one per cotton percentage), so the assumption of equal error variance is not well supported. Because the LSD test relies on a single pooled MS_Error, its comparisons should be interpreted with caution.

# AOV residuals vs. fitted values plot
aov310<-aov(values ~ ind,data=(stack(dat310)))
plot(aov310, 1)
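A formal test of equal variances can supplement the residual plot. The sketch below uses Bartlett's test from base R; it is not part of the original solution, and Bartlett's test is sensitive to non-normality, so its result should be read alongside the plots, not in place of them. The data are retyped from the table above.

```r
# Problem 3.10 data, typed in from the table above (columns = cotton percent)
dat <- data.frame(
  `15` = c(7, 7, 15, 11, 9),
  `20` = c(12, 17, 12, 18, 18),
  `25` = c(14, 19, 19, 18, 18),
  `30` = c(19, 25, 22, 19, 23),
  `35` = c(7, 10, 11, 15, 11),
  check.names = FALSE
)
long <- stack(dat)

# Sample variance within each cotton percentage, for a first look
tapply(long$values, long$ind, var)

# Bartlett's test: H0 is that all five group variances are equal
bt <- bartlett.test(values ~ ind, data = long)
bt
```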

3.44

Suppose that four normal populations have means of µ₁ = 50, µ₂ = 60, µ₃ = 50, and µ₄ = 60. How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that α = 0.05 and that a reasonable estimate of the error variance is σ² = 25.

Given

groups: k = 4
σ² = 25
effect size: f = √(Σ(µᵢ - µ̄)²/k) / σ, where µ̄ = 55 is the mean of the four population means; each µᵢ differs from µ̄ by 5, so f = √(5²/25) = 1
α = 0.05
power (1-β) > 0.90
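The effect size can be computed from the stated means and variance instead of by hand. This is a small sketch of Cohen's f for a balanced one-way layout (the standard deviation of the population means divided by σ), which is the quantity `pwr.anova.test()` expects as `f`; the variable names are illustrative.

```r
# Population means and error variance from the problem statement
mu     <- c(50, 60, 50, 60)
sigma2 <- 25

# Cohen's f: SD of the means about their grand mean, divided by sigma
f <- sqrt(mean((mu - mean(mu))^2) / sigma2)
f
```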

We need at least 5 observations from each population (the computed n = 4.66, rounded up).

# Balanced one-way ANOVA power calculation; effect size f = sqrt(5^2/25) = 1
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(5^2/25),sig.level=0.05,power=.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 4.658119
##               f = 1
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group
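Since the reported n is fractional, it can help to check the power at the neighboring integer sample sizes. The sketch below does this with base R's noncentral F distribution, assuming the noncentrality parameter λ = k·n·f² for a balanced design; `anova_power` is a hypothetical helper, not a function from the pwr package.

```r
# Power of the one-way ANOVA F test for k groups of size n and effect size f
anova_power <- function(n, k, f, alpha = 0.05) {
  df1    <- k - 1          # treatment degrees of freedom
  df2    <- k * (n - 1)    # error degrees of freedom
  ncp    <- k * n * f^2    # noncentrality parameter (assumed form)
  f_crit <- qf(1 - alpha, df1, df2)
  # Probability that a noncentral F exceeds the critical value
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Power at n = 4 and n = 5 per group for problem 3.44 (k = 4, f = 1)
power_by_n <- sapply(c(n4 = 4, n5 = 5), anova_power, k = 4, f = 1)
power_by_n
```

The smallest integer n whose power reaches 0.90 is the sample size to use.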

3.45

Refer to Problem 3.44.

(a) How would your answer change if a reasonable estimate of the experimental error variance were σ² = 36?

We need at least 7 observations from each population (the computed n = 6.18, rounded up).

# Balanced one-way ANOVA power calculation; effect size f = sqrt(5^2/36) = 0.833
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(5^2/36),sig.level=0.05,power=.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 6.180857
##               f = 0.8333333
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

(b) How would your answer change if a reasonable estimate of the experimental error variance were σ² = 49?

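We need at least 8 observations from each population. The computed n = 7.9988 is so close to 8 that the power at n = 8 barely reaches 0.90, so 9 would give a safety margin.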

# Balanced one-way ANOVA power calculation; effect size f = sqrt(5^2/49) = 0.714
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(5^2/49),sig.level=0.05,power=.90)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 7.998751
##               f = 0.7142857
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

(c) Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of σ affects the decision about sample size?

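The required sample size is noticeably sensitive to the estimate of σ²: as the variance estimate rises from 25 to 36 to 49, the per-group n needed for 90% power rises from 5 to 7 to 8. A larger error variance requires more observations to maintain the same power.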
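The sensitivity can be explored by sweeping the variance estimate and finding the smallest integer n that reaches the power target. This sketch uses base R's noncentral F distribution with λ = k·n·f²; `anova_power` and `min_n` are hypothetical helpers, and the mean squared deviation of the population means (25) is taken from Problem 3.44.

```r
# Power of the one-way ANOVA F test for k groups of size n and effect size f
anova_power <- function(n, k, f, alpha = 0.05) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  pf(qf(1 - alpha, df1, df2), df1, df2,
     ncp = k * n * f^2, lower.tail = FALSE)
}

# Smallest per-group n that reaches the target power for a given variance
min_n <- function(sigma2, k = 4, delta2 = 25, target = 0.90) {
  f <- sqrt(delta2 / sigma2)   # delta2 = mean squared deviation of the means
  n <- 2
  while (anova_power(n, k, f) < target) n <- n + 1
  n
}

sigma2 <- c(25, 36, 49)
ns     <- sapply(sigma2, min_n)
data.frame(sigma2 = sigma2, n = ns)
```

Extending `sigma2` to a finer grid shows where the required n crosses from one integer to the next.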

(d) Can you make any recommendations about how we should use this general approach to choosing n in practice?

In practice, σ² is rarely known in advance, so it is useful to compute n over a range of plausible variances for the given power target. This rules out experiments where the required n is impractically large and supports those where n is a reasonable sample size.

Homework, week 5 - IE 5342

Forest Kingfisher

2021-10-02
