We are testing our assumptions about reality by experimenting.

Statistical design of experiments is one of many methods that allows to implement the scientific method. This example presents the systematic solution of a problem with uknown causes by experimenting; therefore, we can draw some lessons from it.

The engine

Engine and piston rings
Engine and piston rings

Piston ring working in a gasoline engine.

The problem

Example of casting defects.

Defects
Defects

A gray–iron casting product is been molded with a green sand stacking molding process; which is known to produce pin–hole defects. The current yield is 76%. The plant manager wants to increase the plant’s yield.

Pouring molten metal into the ladle
Pouring molten metal into the ladle

The pin–holes are associated with the molten metal temperature at the furnace, at the time of the pouring process. The acceptable minimum is 1127 C. However, the gray iron alloy is known to be stable at 1204 C.

Pouring molten metal into the stack of molds
Pouring molten metal into the stack of molds

An experiment was conducted to verify if:

\[yield=f(temp)+\epsilon\] This is the research question. That leads to the experimental hypothesis:

The casting’s yield depends on the temperature at the laddle.

This experimental hypothesis can be modelled with the following pair of statistical hypotheses:

\[H_0:\mu_{1127}=\mu_{1178}=\mu_{1204}\] \[H_1:\mu_{1127}\neq\mu_{1178}\quad \mu_{1127}\neq\mu_{1204}\quad \mu_{1178}\neq\mu_{1204}\] at three temperatures. The runs’ results are shown in table 1.

1127 1178 1204
73.4 74.4 78.9
76.3 74.2 79.3
76.9 72.7 77.9
77.9 73.5 78.9
77.7 71.4 79.5

\(Yield=f(T)\)?

Data

data.set <- data.frame(yield=
                     c(73.4,76.3,76.9,77.9,77.7,
                       74.4,74.2,72.7,73.5,71.4,
                       78.9,79.3,77.9,78.9,79.5),
                     temp=as.factor(
                       c(rep('1127',5),
                         rep('1178',5),
                         rep('1204',5)
                         )))
                     
data.set
##    yield temp
## 1   73.4 1127
## 2   76.3 1127
## 3   76.9 1127
## 4   77.9 1127
## 5   77.7 1127
## 6   74.4 1178
## 7   74.2 1178
## 8   72.7 1178
## 9   73.5 1178
## 10  71.4 1178
## 11  78.9 1204
## 12  79.3 1204
## 13  77.9 1204
## 14  78.9 1204
## 15  79.5 1204

Descriptive statistics

#Boxplot
fn <- yield~temp
boxplot (fn,data.set)

library(gplots)
plotmeans(fn,data.set,ci.label=T,n.label=F)

Variance Homogeneity

bartlett.test(fn,data.set)
## 
##  Bartlett test of homogeneity of variances
## 
## data:  yield by temp
## Bartlett's K-squared = 3.6215, df = 2, p-value = 0.1635
fligner.test(fn,data.set)
## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  yield by temp
## Fligner-Killeen:med chi-squared = 1.9389, df = 2, p-value = 0.3793

Statistical Model—ANOVA 1w.

The potential relation of the yield and the temperature are explored as a fixed effect model:

\[Y_{ij}=\mu+\tau_i+\epsilon_{ij}\]

Where:

  • \(Y_{ij}\)—The runs’ results, \(i\in(1,...,k)\) treatments (levels) with \(j\)-observations;
  • \(\mu\)—The mean of means;
  • \(\tau\)—The fixed treatment’s effects;
  • \(\epsilon_{ij}\)—The runs’ errors.
fit_1 <- aov(fn,data.set)
summary(fit_1)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## temp         2  80.55   40.27   23.32 7.34e-05 ***
## Residuals   12  20.72    1.73                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
TukeyHSD(fit_1)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = fn, data = data.set)
## 
## $temp
##            diff        lwr        upr     p adj
## 1178-1127 -3.20 -5.4173783 -0.9826217 0.0060433
## 1204-1127  2.46  0.2426217  4.6773783 0.0297891
## 1204-1178  5.66  3.4426217  7.8773783 0.0000517

Power analysis

power.anova.test(groups = 3, n = 5, between.var = 40.27, within.var = 1.73, power = NULL)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 3
##               n = 5
##     between.var = 40.27
##      within.var = 1.73
##       sig.level = 0.05
##           power = 1
## 
## NOTE: n is number in each group

Marginal effects

pacman::p_load(phia)
IM.t = interactionMeans(fit_1)
IM.t ; plot(IM.t)
##   temp adjusted mean std. error
## 1 1127         76.44  0.5877074
## 2 1178         73.24  0.5877074
## 3 1204         78.90  0.5877074

Graphical analysis \((\tau_i)\)

## $grandsum
##     Grandmean        df.bet       df.with        MS.bet       MS.with 
##         76.19          2.00         12.00         40.27          1.73 
##        F.stat        F.prob SS.bet/SS.tot 
##         23.32          0.00          0.80 
## 
## $stats
##       Size Contrast Coef Wt'd Mean  Mean Trim'd Mean Var. St. Dev.
## X1178    5         -2.95     73.24 73.24       73.47 1.50     1.23
## X1127    5          0.25     76.44 76.44       76.97 3.30     1.82
## X1204    5          2.71     78.90 78.90       79.03 0.38     0.62

Error bars

#Barplot with CI
require (sciplot)
## Loading required package: sciplot
bargraph.CI(x.factor=temp, yield, data = data.set)