The Experiment

Myers and Montgomery (1995) describe a gasoline blending experiment involving three mixture components. There are no constraints on the mixture components, and the following 10-run design was used:

library(mixexp)
## Warning: package 'mixexp' was built under R version 4.1.3
## Loading required package: lattice
## Loading required package: grid
## Loading required package: daewr
## Warning: package 'daewr' was built under R version 4.1.3
## Registered S3 method overwritten by 'DoE.base':
##   method           from       
##   factorize.factor conf.design
# y is mpg
gasoline.data <- read.table(header = TRUE, text="
point   x1    x2    x3    y
1       0     0     0     24.5
1       0     0     0     25.1
2       0     1     0     24.8
2       0     1     0     23.9
3       0     0     1     22.7
3       0     0     1     23.6
4       0.5   0.5   0     25.1
5       0.5   0     0.5   24.3
6       0     0.5   0.5   23.5
7       0.333 0.334 0.333 24.8
7       0.333 0.334 0.333 24.1
8       0.666 0.167 0.167 24.2
9       0.167 0.666 0.167 23.9
10      0.167 0.167 0.666 23.7
")

1

What type of design did the experimenters use?

# They used a simplex design of (3,5) 
SLD(3,5)
##     x1  x2  x3
## 1  1.0 0.0 0.0
## 2  0.8 0.2 0.0
## 3  0.6 0.4 0.0
## 4  0.4 0.6 0.0
## 5  0.2 0.8 0.0
## 6  0.0 1.0 0.0
## 7  0.8 0.0 0.2
## 8  0.6 0.2 0.2
## 9  0.4 0.4 0.2
## 10 0.2 0.6 0.2
## 11 0.0 0.8 0.2
## 12 0.6 0.0 0.4
## 13 0.4 0.2 0.4
## 14 0.2 0.4 0.4
## 15 0.0 0.6 0.4
## 16 0.4 0.0 0.6
## 17 0.2 0.2 0.6
## 18 0.0 0.4 0.6
## 19 0.2 0.0 0.8
## 20 0.0 0.2 0.8
## 21 0.0 0.0 1.0

2

Fit a quadratic mixture model to the data. Is the model adequate? (consider the r-square value and p-values for the predictors).

MixturePlot(x=gasoline.data$x1, y=gasoline.data$x2, z=gasoline.data$x3, w=gasoline.data$y, 
            cols=TRUE, mod=2, n.breaks=9,
            corner.labs = c("x1", "x2", "x3"),
            x1lab="fraction x1", 
            x2lab="fraction x2", 
            x3lab="fraction x3")

summary(
  testQ2<-MixModel(gasoline.data,"y",mixcomps=c("x1","x2","x3"),model=2)  
)
##      
##       coefficients   Std.err    t.value         Prob
## x1       24.744651 0.3224457 76.7405194 9.265921e-13
## x2       24.311247 0.3224501 75.3953812 1.067146e-12
## x3       23.177523 0.3224457 71.8803897 1.562750e-12
## x2:x1     1.513408 1.8159639  0.8333908 4.288029e-01
## x3:x1     1.111051 1.8167043  0.6115751 5.577941e-01
## x2:x3    -1.087369 1.8159639 -0.5987835 5.658822e-01
##      
## Residual standard error:  0.4651956  on  8 degrees of freedom 
## Corrected Multiple R-squared:  0.7092421
## 
## Call:
## lm(formula = mixmodnI, data = frame)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.4659 -0.2239  0.1097  0.4498 25.1000 
## 
## Coefficients:
##       Estimate Std. Error t value Pr(>|t|)  
## x1      23.385     43.487   0.538   0.6054  
## x2      24.300      8.608   2.823   0.0224 *
## x3      23.166      8.608   2.691   0.0274 *
## x1:x2    4.263     98.872   0.043   0.9667  
## x1:x3    3.861     98.906   0.039   0.9698  
## x2:x3   -1.572     50.761  -0.031   0.9761  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.41 on 8 degrees of freedom
## Multiple R-squared:  0.8494, Adjusted R-squared:  0.7364 
## F-statistic: 7.518 on 6 and 8 DF,  p-value: 0.00598
# We see that the only significant predictors with a p-value less then 0.05 are x2 and x3.
# overall the model is significant with a p-value = 0.00598 and 73.64% of the variation in the model is explained by the predictors.

3

Plot the response surface contours. What blend would you recommend to maximize the miles per gallon?

ModelPlot(model = testQ2,dimensions = list(x1="x1",x2="x2",x3="x3"), 
           main="Thread y",constraints=FALSE,contour=TRUE,
           at=c(23,23.5,24,24.5,25,25.5,26),fill=FALSE,
           axislabs=c("X1", "X2", "X3"),
           cornerlabs = c("X1", "X2", "X3"),pseudo=FALSE)

shapiro.test(testQ2$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  testQ2$residuals
## W = 0.4648, p-value = 3.176e-06

The region around the point (0.5, 0.5, 0.5) appears to have the highest responses.

4

Estimate the response in your optimal region.

testpoints <- data.frame(x1=c(0.5, 0.5, 0.5), 
                         x2=c(0.0, 0.0, 0.0), 
                         x3=c(0.5, 0.5, 0.8))
predict(testQ2, newdata=testpoints, interval="prediction")
##        fit       lwr      upr
## 1 24.24074 -13.17085 61.65232
## 2 24.24074 -13.17085 61.65232
## 3 31.76970 -25.01354 88.55294