Problems

  • 3.7 (c,d,e,f)
  • 3.10 (b,c)
  • 3.44
  • 3.45

Setup

Setup Libraries

# setup Libraries
library(dplyr)
library(car)
library(agricolae)
library(knitr)
library(lawstat)
library(BSDA)
library(tidyr)
library(pwr)
library(WebPower)

3.7 (c,d,e,f)

The tensile strength of Portland cement is being studied. Four different mixing techniques can be used economically. A completely randomized experiment was conducted and the following data were collected:

cementTestOG <- data.frame(
  MixTech1 = c(3129,3000,2865,2890),
  MixTech2 = c(3200,3300,2975,3150),
  MixTech3 = c(2800,2900,2985,3050),
  MixTech4 = c(2600,2700,2600,2765))

cementTest <- pivot_longer(cementTestOG,names(cementTestOG))
cementTest
## # A tibble: 16 x 2
##    name     value
##    <chr>    <dbl>
##  1 MixTech1  3129
##  2 MixTech2  3200
##  3 MixTech3  2800
##  4 MixTech4  2600
##  5 MixTech1  3000
##  6 MixTech2  3300
##  7 MixTech3  2900
##  8 MixTech4  2700
##  9 MixTech1  2865
## 10 MixTech2  2975
## 11 MixTech3  2985
## 12 MixTech4  2600
## 13 MixTech1  2890
## 14 MixTech2  3150
## 15 MixTech3  3050
## 16 MixTech4  2765

(c)

Use the Fisher LSD method with alpha = 0.05 to make comparisons between pairs of means.

cementAnalysis <- (aov(value ~ name, data = cementTest))
cementAnalysisLSD <- LSD.test(cementAnalysis,"name")
cementAnalysisLSD
## $statistics
##    MSerror Df     Mean       CV  t.value      LSD
##   12825.69 12 2931.812 3.862817 2.178813 174.4798
## 
## $parameters
##         test p.ajusted name.t ntr alpha
##   Fisher-LSD      none   name   4  0.05
## 
## $means
##            value       std r      LCL      UCL  Min  Max     Q25    Q50     Q75
## MixTech1 2971.00 120.55704 4 2847.624 3094.376 2865 3129 2883.75 2945.0 3032.25
## MixTech2 3156.25 135.97641 4 3032.874 3279.626 2975 3300 3106.25 3175.0 3225.00
## MixTech3 2933.75 108.27242 4 2810.374 3057.126 2800 3050 2875.00 2942.5 3001.25
## MixTech4 2666.25  80.97067 4 2542.874 2789.626 2600 2765 2600.00 2650.0 2716.25
## 
## $comparison
## NULL
## 
## $groups
##            value groups
## MixTech2 3156.25      a
## MixTech1 2971.00      b
## MixTech3 2933.75      b
## MixTech4 2666.25      c
## 
## attr(,"class")
## [1] "group"
plot(cementAnalysisLSD)

plot(cementAnalysisLSD,variation="IQR")

plot(cementAnalysisLSD,variation="SD")
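
As a cross-check on the LSD.test output, the least significant difference for a balanced design can be computed by hand as t(alpha/2, N - a) * sqrt(2 * MSE / n). The sketch below uses only base R. The object names (cement, fit, lsd, diffs) are illustrative and not part of the original analysis.

```r
# Rebuild the cement data in long form with base R (illustrative names)
cement <- stack(data.frame(
  MixTech1 = c(3129, 3000, 2865, 2890),
  MixTech2 = c(3200, 3300, 2975, 3150),
  MixTech3 = c(2800, 2900, 2985, 3050),
  MixTech4 = c(2600, 2700, 2600, 2765)))
fit <- aov(values ~ ind, data = cement)

alpha <- 0.05
n     <- 4                          # replicates per mixing technique
dfErr <- df.residual(fit)           # error degrees of freedom
mse   <- deviance(fit) / dfErr      # mean square error (SSE / df)
lsd   <- qt(1 - alpha / 2, dfErr) * sqrt(2 * mse / n)
lsd

# Absolute differences between every pair of treatment means
means <- sapply(split(cement$values, cement$ind), mean)
diffs <- abs(outer(means, means, "-"))
diffs > lsd                         # TRUE marks pairs whose means differ
```

Pairs flagged TRUE correspond to treatments that receive different letters in the $groups table above.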

(d)

Construct a normal probability plot of the residuals. What conclusion would you draw about the validity of the normality assumption?

qqnorm(cementAnalysis$residuals)
qqline(cementAnalysis$residuals)
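
A normal probability plot is judged by eye, so a formal test can serve as a complement. The sketch below applies the Shapiro-Wilk test (shapiro.test from base R) to the same residuals. This test is an addition for illustration and was not requested by the problem. A large p-value would be consistent with the normality assumption, although with only 16 residuals the test has limited power.

```r
# Rebuild the cement data and the one-way ANOVA fit (illustrative names)
cement <- stack(data.frame(
  MixTech1 = c(3129, 3000, 2865, 2890),
  MixTech2 = c(3200, 3300, 2975, 3150),
  MixTech3 = c(2800, 2900, 2985, 3050),
  MixTech4 = c(2600, 2700, 2600, 2765)))
fit <- aov(values ~ ind, data = cement)

# Shapiro-Wilk test on the ANOVA residuals
res <- residuals(fit)
sw  <- shapiro.test(res)
sw$statistic   # W statistic; values near 1 suggest approximate normality
sw$p.value     # a small p-value would cast doubt on normality
```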

(e)

Plot the residuals versus the predicted tensile strength. Comment on the plot.

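# Residuals against the fitted (predicted) values, not the observed values
plot(fitted(cementAnalysis), residuals(cementAnalysis),
     xlab = "Predicted tensile strength", ylab = "Residual")
abline(h = 0)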

(f)

Prepare a scatter plot of the results to aid the interpretation of the results of this experiment.

stripchart(value ~ name, data = cementTest, vertical = TRUE, pch = 16,
           xlab = "Mixing technique", ylab = "Tensile strength")

3.10 (b,c)

A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts. Strength is usually affected by the percentage of cotton used in the blend of materials for the fiber. The engineer conducts a completely randomized experiment with five levels of cotton content and replicates the experiment five times. The data are shown in the following table.

fiberStrOG <- data.frame(
  obv1 = c(7 ,7 ,15,11, 9),
  obv2 = c(12,17,12,18,18),
  obv3 = c(14,19,19,18,18),
  obv4 = c(19,25,22,19,23),
  obv5 = c(7 ,10,11,15,11))

fiberStr <- pivot_longer(fiberStrOG,names(fiberStrOG))
fiberStr
## # A tibble: 25 x 2
##    name  value
##    <chr> <dbl>
##  1 obv1      7
##  2 obv2     12
##  3 obv3     14
##  4 obv4     19
##  5 obv5      7
##  6 obv1      7
##  7 obv2     17
##  8 obv3     19
##  9 obv4     25
## 10 obv5     10
## # ... with 15 more rows

(b)

Use the Fisher LSD method to make comparisons between the pairs of means. What conclusions can you draw?

fiberStrAnalysis <- (aov(value ~ name, data = fiberStr))
fiberStrAnalysisLSD <- LSD.test(fiberStrAnalysis,"name")
fiberStrAnalysisLSD
## $statistics
##   MSerror Df  Mean       CV  t.value      LSD
##      8.06 20 15.04 18.87642 2.085963 3.745452
## 
## $parameters
##         test p.ajusted name.t ntr alpha
##   Fisher-LSD      none   name   5  0.05
## 
## $means
##      value      std r       LCL      UCL Min Max Q25 Q50 Q75
## obv1   9.8 3.346640 5  7.151566 12.44843   7  15   7   9  11
## obv2  15.4 3.130495 5 12.751566 18.04843  12  18  12  17  18
## obv3  17.6 2.073644 5 14.951566 20.24843  14  19  18  18  19
## obv4  21.6 2.607681 5 18.951566 24.24843  19  25  19  22  23
## obv5  10.8 2.863564 5  8.151566 13.44843   7  15  10  11  11
## 
## $comparison
## NULL
## 
## $groups
##      value groups
## obv4  21.6      a
## obv3  17.6      b
## obv2  15.4      b
## obv5  10.8      c
## obv1   9.8      c
## 
## attr(,"class")
## [1] "group"
plot(fiberStrAnalysisLSD)

plot(fiberStrAnalysisLSD,variation="IQR")

plot(fiberStrAnalysisLSD,variation="SD")
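
The letter groups can be cross-checked with unadjusted pairwise t tests that use the pooled standard deviation, which is equivalent to comparing each difference in means against the Fisher LSD. The sketch below uses pairwise.t.test from base R. The object names (fiber, pw) are illustrative.

```r
# Rebuild the fiber data in long form with base R (illustrative names)
fiber <- stack(data.frame(
  obv1 = c(7, 7, 15, 11, 9),
  obv2 = c(12, 17, 12, 18, 18),
  obv3 = c(14, 19, 19, 18, 18),
  obv4 = c(19, 25, 22, 19, 23),
  obv5 = c(7, 10, 11, 15, 11)))

# Unadjusted pairwise t tests with a pooled SD (the Fisher LSD comparisons)
pw <- pairwise.t.test(fiber$values, fiber$ind,
                      p.adjust.method = "none", pool.sd = TRUE)
pw$p.value          # lower-triangular matrix of p-values
pw$p.value < 0.05   # TRUE marks pairs that differ at alpha = 0.05
```

Pairs that share a letter in $groups (for example obv3 and obv2) should show p-values above 0.05 here.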

(c)

Analyze the residuals from this experiment and comment on model adequacy.

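# Residuals against the fitted values, not the observed values
plot(fitted(fiberStrAnalysis), residuals(fiberStrAnalysis),
     xlab = "Fitted value", ylab = "Residual")
abline(h = 0)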

Nothing out of the ordinary; there are just a few more residuals on the positive side of zero than on the negative side.
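
The visual check of the residuals can be supplemented with a formal test of equal variances across the five groups. The sketch below uses Bartlett's test (bartlett.test from base R). It is an illustrative addition, not part of the original answer, and the test is known to be sensitive to departures from normality.

```r
# Rebuild the fiber data in long form with base R (illustrative names)
fiber <- stack(data.frame(
  obv1 = c(7, 7, 15, 11, 9),
  obv2 = c(12, 17, 12, 18, 18),
  obv3 = c(14, 19, 19, 18, 18),
  obv4 = c(19, 25, 22, 19, 23),
  obv5 = c(7, 10, 11, 15, 11)))

# Bartlett's test: H0 is that all five group variances are equal
bt <- bartlett.test(values ~ ind, data = fiber)
bt$statistic   # Bartlett's K-squared
bt$parameter   # degrees of freedom (number of groups minus 1)
bt$p.value     # a large p-value gives no evidence against equal variances
```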

3.44

Suppose that four normal populations have means of mu1 = 50, mu2 = 60, mu3 = 50, and mu4 = 60. How many observations should be taken from each population so that the probability of rejecting the null hypothesis of equal population means is at least 0.90? Assume that alpha = 0.05 and that a reasonable estimate of the error variance is sigma^2 = 25.

alpha = 0.05
power = 0.90
sigmaSq = 25
mu = c(50,60,50,60)
obvsNeeded = power.anova.test(n = NULL,
               groups = length(mu),
               between.var = var(mu),
               within.var = sigmaSq,
               sig.level = alpha,
               power=power)
obvsNeeded
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 4.658128
##     between.var = 33.33333
##      within.var = 25
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group
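
Because n must be a whole number, the fractional solution is rounded up. The sketch below rounds up and then checks the achieved power directly from the noncentral F distribution, using the noncentrality parameter lambda = n * sum((mu_i - mean(mu))^2) / sigma^2. The helper anovaPower and the other names are illustrative, not part of the original solution.

```r
alpha   <- 0.05
sigmaSq <- 25
mu      <- c(50, 60, 50, 60)
a       <- length(mu)               # number of populations

# Fractional per-group sample size for power 0.90, then round up
sol <- power.anova.test(groups = a, between.var = var(mu),
                        within.var = sigmaSq, sig.level = alpha,
                        power = 0.90)
nPerGroup <- ceiling(sol$n)
nPerGroup

# Power of the one-way ANOVA F test for a given per-group n
anovaPower <- function(n) {
  lambda <- n * sum((mu - mean(mu))^2) / sigmaSq   # noncentrality parameter
  fCrit  <- qf(1 - alpha, a - 1, a * (n - 1))      # critical F value
  pf(fCrit, a - 1, a * (n - 1), ncp = lambda, lower.tail = FALSE)
}
anovaPower(nPerGroup)       # power with the rounded-up n
anovaPower(nPerGroup - 1)   # power with one fewer observation per group
```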

3.45

Refer to Problem 3.44.

(a)

How would your answer change if a reasonable estimate of the experimental error variance were sigma^2 = 36?

sigmaSq = 36
obvsNeededA = power.anova.test(n = NULL,
                              groups = length(mu),
                              between.var = var(mu),
                              within.var = sigmaSq,
                              sig.level = alpha,
                              power=power)
obvsNeededA
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 6.180885
##     between.var = 33.33333
##      within.var = 36
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

(b)

How would your answer change if a reasonable estimate of the experimental error variance were sigma^2 = 49?

sigmaSq = 49
obvsNeededB = power.anova.test(n = NULL,
                               groups = length(mu),
                               between.var = var(mu),
                               within.var = sigmaSq,
                               sig.level = alpha,
                               power=power)
obvsNeededB
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 7.998751
##     between.var = 33.33333
##      within.var = 49
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group

(c)

Can you draw any conclusions about the sensitivity of your answer in this particular situation about how your estimate of sigma affects the decision about sample size?

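As sigma increases, so does the required sample size: with sigma^2 = 25, 36, and 49, the computed n per group is 4.66, 6.18, and 8.00, which round up to 5, 7, and 8 observations.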
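
The three calculations can be collected into one table so the trend is easier to see. The sketch below repeats the same power.anova.test call for each variance estimate. The names sigmaSqValues, nExact, and sensitivity are illustrative.

```r
mu            <- c(50, 60, 50, 60)
sigmaSqValues <- c(25, 36, 49)

# Solve for the per-group n at each error-variance estimate
nExact <- sapply(sigmaSqValues, function(s2) {
  power.anova.test(groups = length(mu), between.var = var(mu),
                   within.var = s2, sig.level = 0.05, power = 0.90)$n
})

# Fractional solution alongside the whole-number sample size
sensitivity <- data.frame(sigmaSq  = sigmaSqValues,
                          nExact   = nExact,
                          nRounded = ceiling(nExact))
sensitivity
```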

(d)

Can you make any recommendations about how we should use this general approach to choosing n in practice?

A higher sigma should result in selecting a higher n value, so when the estimate of sigma is uncertain it is safer to plan with a larger value.
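
One way to see what is at stake is to fix n at the value chosen under sigma^2 = 25 and ask what power the experiment would actually have if the true error variance were larger. The sketch below does this with power.anova.test. The names nPlanned and achieved are illustrative.

```r
mu       <- c(50, 60, 50, 60)
nPlanned <- 5                        # per-group n chosen assuming sigma^2 = 25
sigmaSqValues <- c(25, 36, 49)       # candidate true error variances

# Achieved power at the planned n under each candidate variance
achieved <- sapply(sigmaSqValues, function(s2) {
  power.anova.test(groups = length(mu), n = nPlanned,
                   between.var = var(mu), within.var = s2,
                   sig.level = 0.05)$power
})
names(achieved) <- paste0("sigmaSq=", sigmaSqValues)
achieved
```

The power falls below the 0.90 target once the true variance exceeds the planning value, which is the practical cost of underestimating sigma.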

Source Code

# clear memory and environment
rm(list = ls())
gc()

# Homework 5
# IE 5342 - Dr. Timothy I. Matis
# Jesus R. Rosila Mares
# 02 Oct 2021
# Problems:

# 3.7  (c,d,e,f)
# 3.10 (b,c)
# 3.44
# 3.45

# setup Libraries
library(dplyr)
library(car)
library(agricolae)
library(knitr)
library(lawstat)
library(BSDA)
library(tidyr)
library(pwr)
library(WebPower)

#------------------------------------------------------------------------------

# 3.7 (c,d,e,f)
# The tensile strength of Portland cement is being studied.
#   Four different mixing techniques can be used economically.
#   A completely randomized experiment was conducted
#   and the following data were collected:

cementTestOG <- data.frame(
  MixTech1 = c(3129,3000,2865,2890),
  MixTech2 = c(3200,3300,2975,3150),
  MixTech3 = c(2800,2900,2985,3050),
  MixTech4 = c(2600,2700,2600,2765))

cementTest <- pivot_longer(cementTestOG,names(cementTestOG))

# (c) Use the Fisher LSD method with alpha = 0.05 to make
#      comparisons between pairs of means.
cementAnalysis <- (aov(value ~ name, data = cementTest))
cementAnalysisLSD <- LSD.test(cementAnalysis,"name")
summary(cementAnalysisLSD)
plot(cementAnalysisLSD)
plot(cementAnalysisLSD,variation="IQR")
plot(cementAnalysisLSD,variation="SD")

TukeyHSD(cementAnalysis)
plot(TukeyHSD(cementAnalysis))
# (d) Construct a normal probability plot of the residuals.
#       What conclusion would you draw about the validity of
#       the normality assumption?
qqnorm(cementAnalysis$residuals)
qqline(cementAnalysis$residuals)

# (e) Plot the residuals versus the predicted tensile strength.
#       Comment on the plot.
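# Residuals against the fitted (predicted) values, not the observed values
plot(fitted(cementAnalysis), residuals(cementAnalysis),
     xlab = "Predicted tensile strength", ylab = "Residual")
abline(h = 0)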

# (f) Prepare a scatter plot of the results to aid the interpretation
#      of the results of this experiment.
stripchart(value ~ name, data = cementTest, vertical = TRUE, pch = 16,
           xlab = "Mixing technique", ylab = "Tensile strength")

#------------------------------------------------------------------------------

# 3.10 (b,c)
# A product developer is investigating the tensile strength
#   of a new synthetic fiber that will be used to make cloth for
#   men's shirts. Strength is usually affected by the percentage of
#   cotton used in the blend of materials for the fiber. The engineer
#   conducts a completely randomized experiment with five levels
#   of cotton content and replicates the experiment five times. The
#   data are shown in the following table.

fiberStrOG <- data.frame(
  obv1 = c(7 ,7 ,15,11, 9),
  obv2 = c(12,17,12,18,18),
  obv3 = c(14,19,19,18,18),
  obv4 = c(19,25,22,19,23),
  obv5 = c(7 ,10,11,15,11))

fiberStr <- pivot_longer(fiberStrOG,names(fiberStrOG))
  
# fiberPrettyTable <- fiberStr %>% spread(idx,obv)
# kable(fiberPrettyTable) %>% kable_styling("striped",full_width = F)

# (b) Use the Fisher LSD method to make comparisons between 
#       the pairs of means. What conclusions can you draw?
fiberStrAnalysis <- (aov(value ~ name, data = fiberStr))
fiberStrAnalysisLSD <- LSD.test(fiberStrAnalysis,"name")
summary(fiberStrAnalysisLSD)
plot(fiberStrAnalysisLSD)
plot(fiberStrAnalysisLSD,variation="IQR")
plot(fiberStrAnalysisLSD,variation="SD")


# (c) Analyze the residuals from this experiment and comment
#      on model adequacy.
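# Residuals against the fitted values, not the observed values
plot(fitted(fiberStrAnalysis), residuals(fiberStrAnalysis),
     xlab = "Fitted value", ylab = "Residual")
abline(h = 0)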

# Nothing out of the ordinary; there are just a few more residuals on the
# positive side of zero than on the negative side.

#------------------------------------------------------------------------------

# 3.44
# Suppose that four normal populations have means of
#   mu1 = 50, mu2 = 60, mu3 = 50, and mu4 = 60. How many observations
#   should be taken from each population so that the
#   probability of rejecting the null hypothesis of equal population
#   means is at least 0.90? Assume that alpha = 0.05 and that a
#   reasonable estimate of the error variance is sigma^2 = 25.
alpha = 0.05
power = 0.90
sigmaSq = 25
mu = c(50,60,50,60)
obvsNeeded = power.anova.test(n = NULL,
               groups = length(mu),
               between.var = var(mu),
               within.var = sigmaSq,
               sig.level = alpha,
               power=power)
obvsNeeded

#------------------------------------------------------------------------------

# 3.45
# Refer to Problem 3.44.

# (a) How would your answer change if a reasonable estimate
#      of the experimental error variance were sigma^2 = 36?
sigmaSq = 36
obvsNeededA = power.anova.test(n = NULL,
                              groups = length(mu),
                              between.var = var(mu),
                              within.var = sigmaSq,
                              sig.level = alpha,
                              power=power)
obvsNeededA

# (b) How would your answer change if a reasonable estimate
#      of the experimental error variance were sigma^2 = 49?
sigmaSq = 49
obvsNeededB = power.anova.test(n = NULL,
                               groups = length(mu),
                               between.var = var(mu),
                               within.var = sigmaSq,
                               sig.level = alpha,
                               power=power)
obvsNeededB

# (c) Can you draw any conclusions about the sensitivity of
#      your answer in this particular situation about how your
#      estimate of sigma affects the decision about sample size?

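# As sigma increases, so does the required sample size: with sigma^2 = 25,
# 36, and 49, the computed n per group is 4.66, 6.18, and 8.00, which round
# up to 5, 7, and 8 observations.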

# (d) Can you make any recommendations about how we
#      should use this general approach to choosing n in
#      practice?

# A higher sigma should result in selecting a higher n value, so when the
# estimate of sigma is uncertain it is safer to plan with a larger value.

#------------------------------------------------------------------------------