Setting up RMarkdown enables you to create dynamic, reproducible, and visually appealing reports, presentations, and documents that can help you communicate your data analysis and research findings more effectively.
R is an open-source programming language and software environment for statistical computing and graphics. It is widely used by data analysts and statisticians for data analysis, visualization, and modeling. R provides a comprehensive set of tools for statistical analysis, including linear and nonlinear modeling, classical statistical tests, time-series analysis, and data visualization. RStudio is an integrated development environment (IDE) for R that provides a user-friendly interface for working with R. It includes a code editor, debugging tools, and data visualization tools that make it easier to work with R code. RStudio also supports the creation of interactive web applications using the Shiny package, making it a powerful tool for data visualization and analysis.
One of the key benefits of R and RStudio is their flexibility and extensibility. R has a large community of developers who contribute packages that extend its functionality, making it possible to perform a wide range of analyses with just a few lines of code. RStudio provides an interface for managing these packages and installing new ones. R and RStudio can be used for a wide range of data analysis tasks, including data cleaning and preparation, statistical modeling, and data visualization. They are particularly useful for working with large datasets, as they can handle data from a variety of sources and perform complex analyses with ease.
In summary, R and RStudio are powerful tools for statistical computing and data analysis. Their flexibility, extensibility, and user-friendly interface make them popular among data analysts and statisticians, and their ability to handle large datasets and perform complex analyses makes them essential tools for many data-driven industries.
Response Surface Methodology (RSM) is a statistical technique used to model and optimize the relationship between a response variable and one or more independent variables. RSM involves designing experiments, fitting mathematical models to the experimental data, and using the models to predict the optimal conditions for a process. RSM is particularly useful for optimizing processes that involve multiple variables, as it allows researchers to explore the relationship between the variables and determine the optimal settings for each.
RSM has been used in a wide range of fields, including chemistry, engineering, agriculture, and social sciences. In chemistry, RSM has been used to optimize the conditions for chemical reactions, such as temperature, pressure, and concentrations of reactants. In engineering, RSM has been used to optimize the parameters of manufacturing processes, such as cutting speed and feed rate. In agriculture, RSM has been used to optimize the conditions for crop growth, such as fertilizer application and irrigation.
The main goal of RSM is to find the optimal values of the independent variables that result in the maximum or minimum value of the response variable. RSM accomplishes this by constructing a response surface, which is a mathematical model that describes the relationship between the independent variables and the response variable. The response surface can be visualized as a contour plot or a three-dimensional surface, which allows researchers to identify the optimal values of the independent variables that result in the desired response.
Overall, RSM is a powerful statistical technique that is used to optimize processes and improve the efficiency of experiments. Its ability to model the relationship between multiple variables makes it a valuable tool for researchers and practitioners in a wide range of fields.
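Before the worked examples, here is a minimal sketch (simulated data with made-up coefficients, not from the notes) of the full second-order model that RSM fits in two coded factors, using base lm(); the sections below fit the same form with rsm::rsm().
# sketch: a full second-order response surface in two coded factors
set.seed(1)
dat <- data.frame(x1 = runif(20, -1, 1), x2 = runif(20, -1, 1))
dat$y <- 80 + 10 * dat$x1 + 4 * dat$x2 -
8 * dat$x1^2 - 5 * dat$x2^2 - 7 * dat$x1 * dat$x2 + rnorm(20, sd = 2)
fit <- lm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2, data = dat)
coef(fit) # intercept, linear, pure-quadratic, and interaction terms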
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_TAB_02-08.txt"
df.2.8 <- read.table(fn.data, header=TRUE, skip=1)
str(df.2.8)
'data.frame': 12 obs. of 3 variables:
$ x1: int -1 1 -1 1 -9 9 0 0 0 0 ...
$ x2: int -1 -1 1 1 0 0 -9 9 0 0 ...
$ y : int 43 78 69 73 48 76 65 74 76 79 ...
head(df.2.8,5)
x1 x2 y
1 -1 -1 43
2 1 -1 78
3 -1 1 69
4 1 1 73
5 -9 0 48
df.2.8[,c("x1","x2")] <- replace(df.2.8[,c("x1","x2")], (df.2.8[,c("x1","x2")] == 9)
, sqrt(2))
df.2.8[,c("x1","x2")] <- replace(df.2.8[,c("x1","x2")], (df.2.8[,c("x1","x2")] == -9)
, -sqrt(2))
head(df.2.8,5)
x1 x2 y
1 -1.000000 -1 43
2 1.000000 -1 78
3 -1.000000 1 69
4 1.000000 1 73
5 -1.414214 0 48
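The replace() calls above do the job; an equivalent recoding with ifelse() may be easier to read. A sketch, assuming the data still hold the original +/-9 placeholders:
# sketch: recode the +/-9 axial-point placeholders to +/-sqrt(2)
for (v in c("x1", "x2")) {
  df.2.8[[v]] <- ifelse(df.2.8[[v]] == 9, sqrt(2)
                 , ifelse(df.2.8[[v]] == -9, -sqrt(2), df.2.8[[v]]))
}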
The scatterplot matrix below shows some relationships between y and the other variables.
library(ggplot2)
library(reshape)
suppressMessages(suppressWarnings(library(GGally)))
p <- ggpairs(df.2.8, alpha = 0.1)
# put scatterplots on top so y axis is vertical
#p <- ggpairs(df.2.8, upper = list(continuous = "points")
# , lower = list(continuous = "cor")
# )
print(p)
# detach package after use so reshape2 works (old reshape (v.1) conflicts)
detach("package:GGally", unload=TRUE)
detach("package:reshape", unload=TRUE)
The correlation matrix indicates that some (linear) correlations with y differ from zero, but if curvature exists, this summary is not very meaningful.
# correlation matrix and associated p-values testing "H0: rho == 0"
library(Hmisc)
rcorr(as.matrix(df.2.8))
x1 x2 y
x1 1.00 0.00 0.66
x2 0.00 1.00 0.28
y 0.66 0.28 1.00
n= 12
P
x1 x2 y
x1 1.0000 0.0193
x2 1.0000 0.3718
y 0.0193 0.3718
library(corrplot)
library(dplyr)
library(magrittr)
df.2.8 %>%
dplyr::select(x1:y) %>%
cor() %>%
round(3) %>%
corrplot(method = "color", addCoef.col="white", type = "upper",
title="Correlation between response variable and Predictor variables",
mar=c(0,0,2,0),
tl.cex=0.5, number.cex = 0.4)
Because this is a special kind of model (a full second-order model), we can get the tests for higher-order terms and lack of fit simply by using rsm().
# load the rsm package
library(rsm)
# fit second-order (SO) model
# -- look up ?SO and see other options
rsm.2.8.y.SOx12 <- rsm(y ~ SO(x1, x2), data = df.2.8)
# which variables are available in the rsm object?
names(rsm.2.8.y.SOx12)
[1] "coefficients" "residuals" "effects" "rank"
[5] "fitted.values" "assign" "qr" "df.residual"
[9] "xlevels" "call" "terms" "model"
[13] "data" "b" "order" "B"
[17] "newlabs"
names(summary(rsm.2.8.y.SOx12))
[1] "call" "terms" "residuals" "coefficients"
[5] "aliased" "sigma" "df" "r.squared"
[9] "adj.r.squared" "fstatistic" "cov.unscaled" "canonical"
[13] "lof"
summary(rsm.2.8.y.SOx12)
Call:
rsm(formula = y ~ SO(x1, x2), data = df.2.8)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 79.75000 1.21361 65.7133 8.352e-10 ***
x1 9.82475 0.85815 11.4488 2.665e-05 ***
x2 4.21599 0.85815 4.9129 0.0026759 **
x1:x2 -7.75000 1.21361 -6.3859 0.0006938 ***
x1^2 -8.87500 0.95944 -9.2502 9.017e-05 ***
x2^2 -5.12500 0.95944 -5.3417 0.0017585 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.98, Adjusted R-squared: 0.9634
F-statistic: 58.85 on 5 and 6 DF, p-value: 5.119e-05
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 914.40 457.20 77.6054 5.155e-05
TWI(x1, x2) 1 240.25 240.25 40.7801 0.0006938
PQ(x1, x2) 2 578.92 289.46 49.1328 0.0001906
Residuals 6 35.35 5.89
Lack of fit 3 8.60 2.87 0.3214 0.8119178
Pure error 3 26.75 8.92
Stationary point of response surface:
x1 x2
0.55819271 -0.01073203
Eigenanalysis:
eigen() decomposition
$values
[1] -2.695206 -11.304794
$vectors
[,1] [,2]
x1 0.5312434 -0.8472193
x2 -0.8472193 -0.5312434
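The stationary point and eigenanalysis can be reproduced by hand: the rsm object stores the linear coefficients in $b and the symmetric quadratic-form matrix in $B (see the names() listing above). A short sketch:
# sketch: the stationary point solves b + 2 B xs = 0, i.e., xs = -B^{-1} b / 2
b.lin <- rsm.2.8.y.SOx12$b # linear coefficients
B.quad <- rsm.2.8.y.SOx12$B # pure-quadratic and interaction coefficients
-0.5 * solve(B.quad, b.lin) # matches the stationary point above
eigen(B.quad)$values # both negative, so the stationary point is a maximum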
rsm.2.8.y.SOx12$studres <- rstudent(rsm.2.8.y.SOx12)
rsm.2.8.y.SOx12$studres
1 2 3 4 5 6
-0.61072616 -0.71491975 0.71491975 0.61072616 -0.06495026 0.06495026
7 8 9 10 11 12
0.98067965 -0.98067965 -2.37660148 -0.32922247 1.81973167 0.55959278
The following illustrates fitting several model types using rsm().
# fit the first-order model
rsm.2.8.y.FOx12 <- rsm(y ~ FO(x1, x2), data = df.2.8)
# externally Studentized residuals
rsm.2.8.y.FOx12$studres <- rstudent(rsm.2.8.y.FOx12)
summary(rsm.2.8.y.FOx12)
Call:
rsm(formula = y ~ FO(x1, x2), data = df.2.8)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.4167 2.8129 25.0338 1.244e-09 ***
x1 9.8247 3.4450 2.8519 0.01903 *
x2 4.2160 3.4450 1.2238 0.25210
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.5169, Adjusted R-squared: 0.4096
F-statistic: 4.815 on 2 and 9 DF, p-value: 0.03785
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 914.40 457.20 4.8154 0.03785
Residuals 9 854.51 94.95
Lack of fit 6 827.76 137.96 15.4722 0.02333
Pure error 3 26.75 8.92
Direction of steepest ascent (at radius 1):
x1 x2
0.9189626 0.3943447
Corresponding increment in original units:
x1 x2
0.9189626 0.3943447
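For a first-order fit, the direction of steepest ascent is simply the normalized coefficient (gradient) vector; a quick sketch that reproduces the radius-1 direction above:
# sketch: steepest ascent at radius 1 = unit-length gradient of the FO fit
b.fo <- coef(rsm.2.8.y.FOx12)[c("FO(x1, x2)x1", "FO(x1, x2)x2")]
b.fo / sqrt(sum(b.fo^2)) # 0.919 and 0.394, as above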
# fit the first-order with two-way interaction model.
rsm.2.8.y.TWIx12 <- rsm(y ~ FO(x1, x2) + TWI(x1, x2), data = df.2.8)
# externally Studentized residuals
rsm.2.8.y.TWIx12$studres <- rstudent(rsm.2.8.y.TWIx12)
summary(rsm.2.8.y.TWIx12)
Call:
rsm(formula = y ~ FO(x1, x2) + TWI(x1, x2), data = df.2.8)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.4167 2.5295 27.8377 2.993e-09 ***
x1 9.8247 3.0980 3.1713 0.01317 *
x2 4.2160 3.0980 1.3609 0.21066
x1:x2 -7.7500 4.3813 -1.7689 0.11488
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.6527, Adjusted R-squared: 0.5225
F-statistic: 5.013 on 3 and 8 DF, p-value: 0.03039
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 914.40 457.20 5.9544 0.02607
TWI(x1, x2) 1 240.25 240.25 3.1289 0.11488
Residuals 8 614.26 76.78
Lack of fit 5 587.51 117.50 13.1779 0.02966
Pure error 3 26.75 8.92
Stationary point of response surface:
x1 x2
0.5439987 1.2677094
Eigenanalysis:
eigen() decomposition
$values
[1] 3.875 -3.875
$vectors
[,1] [,2]
x1 -0.7071068 -0.7071068
x2 0.7071068 -0.7071068
# Fit the second-order without interactions model
rsm.2.8.y.PQx12 <- rsm(y ~ FO(x1, x2) + PQ(x1, x2), data = df.2.8)
# externally Studentized residuals
rsm.2.8.y.PQx12$studres <- rstudent(rsm.2.8.y.PQx12)
summary(rsm.2.8.y.PQx12)
Call:
rsm(formula = y ~ FO(x1, x2) + PQ(x1, x2), data = df.2.8)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 79.7500 3.1373 25.4198 3.724e-08 ***
x1 9.8247 2.2184 4.4287 0.003049 **
x2 4.2160 2.2184 1.9004 0.099141 .
x1^2 -8.8750 2.4803 -3.5782 0.008997 **
x2^2 -5.1250 2.4803 -2.0663 0.077639 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.8442, Adjusted R-squared: 0.7552
F-statistic: 9.482 on 4 and 7 DF, p-value: 0.005903
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 914.40 457.20 11.613 0.005978
PQ(x1, x2) 2 578.92 289.46 7.352 0.019052
Residuals 7 275.60 39.37
Lack of fit 4 248.85 62.21 6.977 0.071196
Pure error 3 26.75 8.92
Stationary point of response surface:
x1 x2
0.5535069 0.4113161
Eigenanalysis:
eigen() decomposition
$values
[1] -5.125 -8.875
$vectors
[,1] [,2]
x1 0 -1
x2 -1 0
# compare the reduced first-order model to the full second-order model
anova(rsm.2.8.y.FOx12, rsm.2.8.y.SOx12)
Analysis of Variance Table
Model 1: y ~ FO(x1, x2)
Model 2: y ~ FO(x1, x2) + TWI(x1, x2) + PQ(x1, x2)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 9 854.51
2 6 35.35 3 819.17 46.349 0.0001524 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## conf int for parameters
confint(rsm.2.8.y.SOx12)
2.5 % 97.5 %
(Intercept) 76.780415 82.719585
FO(x1, x2)x1 7.724934 11.924561
FO(x1, x2)x2 2.116176 6.315804
TWI(x1, x2) -10.719585 -4.780415
PQ(x1, x2)x1^2 -11.222663 -6.527337
PQ(x1, x2)x2^2 -7.472663 -2.777337
# conf int for regression line
predict(rsm.2.8.y.SOx12, newdata = df.2.8, interval = "confidence")
fit lwr upr
1 43.95926 39.26394 48.65459
2 79.10876 74.41343 83.80408
3 67.89124 63.19592 72.58657
4 72.04074 67.34541 76.73606
5 48.10571 43.41038 52.80104
6 75.89429 71.19896 80.58962
7 63.53769 58.84236 68.23302
8 75.46231 70.76698 80.15764
9 79.75000 76.78041 82.71959
10 79.75000 76.78041 82.71959
11 79.75000 76.78041 82.71959
12 79.75000 76.78041 82.71959
# pred int for new observations
predict(rsm.2.8.y.SOx12, newdata = df.2.8, interval = "prediction")
fit lwr upr
1 43.95926 36.38828 51.53025
2 79.10876 71.53777 86.67974
3 67.89124 60.32026 75.46223
4 72.04074 64.46975 79.61172
5 48.10571 40.53472 55.67670
6 75.89429 68.32330 83.46528
7 63.53769 55.96670 71.10868
8 75.46231 67.89132 83.03330
9 79.75000 73.10981 86.39019
10 79.75000 73.10981 86.39019
11 79.75000 73.10981 86.39019
12 79.75000 73.10981 86.39019
The lack-of-fit (LOF) test is equivalent to a comparison between two models. First, we define the full model by setting up a categorical group variable that takes a unique value for each distinct (x1, x2) pair; this fits a model equivalent to a one-way ANOVA. The SSE for the full model is 26.75 with 3 df, which is the pure-error SS and df (a sketch verifying this follows below). Second, we define the reduced model (reduced relative to the one-way ANOVA above) as the regression fit, taking x1 and x2 as continuous variables. The SSE for the reduced model is 35.35 with 6 df, which is the residual SS. The LOF SS is the difference in SSE between the reduced and full models: 35.35 - 26.75 = 8.6 with 6 - 3 = 3 df. The F-test then compares the LOF SS and df against the pure-error SS and df: F = (8.6/3)/(26.75/3) = 0.32, with 3 numerator df and 3 denominator df.
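To try the two-model comparison yourself, here is a sketch (grp is a helper column introduced only for the illustration):
# sketch: the LOF test as a comparison of two models
# full model: one-way ANOVA over the 9 distinct design points (pure error: SSE = 26.75, 3 df)
df.2.8$grp <- interaction(df.2.8$x1, df.2.8$x2)
lm.full <- lm(y ~ grp, data = df.2.8)
# reduced model: the quadratic regression (residual: SSE = 35.35, 6 df)
lm.red <- lm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2, data = df.2.8)
anova(lm.red, lm.full) # LOF F = (8.6/3)/(26.75/3) = 0.32 on 3 and 3 df
df.2.8$grp <- NULL # remove the helper column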
summary(rsm.2.8.y.SOx12)$lof
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 914.40 457.20 77.6054 5.155e-05 ***
TWI(x1, x2) 1 240.25 240.25 40.7801 0.0006938 ***
PQ(x1, x2) 2 578.92 289.46 49.1328 0.0001906 ***
Residuals 6 35.35 5.89
Lack of fit 3 8.60 2.87 0.3214 0.8119178
Pure error 3 26.75 8.92
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot diagnostics
par(mfrow=c(2,4))
plot(df.2.8$x1, rsm.2.8.y.SOx12$studres, main="Residuals vs x1")
# horizontal line at zero
abline(h = 0, col = "gray75")
plot(df.2.8$x2, rsm.2.8.y.SOx12$studres, main="Residuals vs x2")
# horizontal line at zero
abline(h = 0, col = "gray75")
# residuals vs order of data
plot(rsm.2.8.y.SOx12$studres, main="Residuals vs Order of data")
# horizontal line at zero
abline(h = 0, col = "gray75")
plot(rsm.2.8.y.SOx12, which = c(1,4,6))
# Normality of Residuals
library(car)
qqPlot(rsm.2.8.y.SOx12$studres, las = 1, id = 3, main="QQ Plot")
[1] 9 11
# first-order model
par(mfrow=c(2,2))
contour(rsm.2.8.y.FOx12, ~ x1 + x2, image = TRUE, main="first-order model")
persp(rsm.2.8.y.FOx12, x2 ~ x1, zlab = "y", main="first-order model")
# second-order model
contour(rsm.2.8.y.SOx12, ~ x1 + x2, image = TRUE, main="second-order model")
persp(rsm.2.8.y.SOx12, x2 ~ x1, zlab = "y", main="second-order model")
#### 3.1
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_03-01.txt"
df.3.1 <- read.table(fn.data, header=TRUE)
str(df.3.1)
'data.frame': 8 obs. of 5 variables:
$ a : int -1 1 -1 1 -1 1 -1 1
$ b : int -1 -1 1 1 -1 -1 1 1
$ c : int -1 -1 -1 -1 1 1 1 1
$ y1: int 247 470 429 435 837 551 775 660
$ y2: int 400 446 405 445 850 670 865 530
head(df.3.1,5)
a b c y1 y2
1 -1 -1 -1 247 400
2 1 -1 -1 470 446
3 -1 1 -1 429 405
4 1 1 -1 435 445
5 -1 -1 1 837 850
library(reshape2)
df.3.1.long <- melt(df.3.1, id.vars = c("a", "b", "c")
, variable.name = "rep", value.name = "y")
head(df.3.1.long,10)
a b c rep y
1 -1 -1 -1 y1 247
2 1 -1 -1 y1 470
3 -1 1 -1 y1 429
4 1 1 -1 y1 435
5 -1 -1 1 y1 837
6 1 -1 1 y1 551
7 -1 1 1 y1 775
8 1 1 1 y1 660
9 -1 -1 -1 y2 400
10 1 -1 -1 y2 446
library(rsm)
rsm.3.1.y.SOabc <- rsm(y ~ SO(a, b, c), data = df.3.1.long)
rsm.3.1.y.SOabc
Call:
rsm(formula = y ~ SO(a, b, c), data = df.3.1.long)
Coefficients:
(Intercept) FO(a, b, c)a FO(a, b, c)b FO(a, b, c)c
563.438 -37.562 4.563 153.812
TWI(a, b, c)a:b TWI(a, b, c)a:c TWI(a, b, c)b:c PQ(a, b, c)a^2
-12.937 -76.938 -14.312 NA
PQ(a, b, c)b^2 PQ(a, b, c)c^2
NA NA
rsm.3.1.y.SOabc$studres <- rstudent(rsm.3.1.y.SOabc)
summary(rsm.3.1.y.SOabc)
Call:
rsm(formula = y ~ SO(a, b, c), data = df.3.1.long)
Residuals:
Min 1Q Median 3Q Max
-91.438 -27.469 5.688 27.719 79.937
Coefficients: (3 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 563.438 15.716 35.852 5.06e-11 ***
FO(a, b, c)a -37.562 15.716 -2.390 0.040549 *
FO(a, b, c)b 4.563 15.716 0.290 0.778154
FO(a, b, c)c 153.812 15.716 9.787 4.28e-06 ***
TWI(a, b, c)a:b -12.937 15.716 -0.823 0.431654
TWI(a, b, c)a:c -76.938 15.716 -4.896 0.000853 ***
TWI(a, b, c)b:c -14.312 15.716 -0.911 0.386188
PQ(a, b, c)a^2 NA NA NA NA
PQ(a, b, c)b^2 NA NA NA NA
PQ(a, b, c)c^2 NA NA NA NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 62.86 on 9 degrees of freedom
Multiple R-squared: 0.9339, Adjusted R-squared: 0.8898
F-statistic: 21.18 on 6 and 9 DF, p-value: 7.875e-05
lm.3.1.y.3WIabc <- lm(y ~ (a + b + c)^3, data = df.3.1.long)
# externally Studentized residuals
lm.3.1.y.3WIabc$studres <- rstudent(lm.3.1.y.3WIabc)
summary(lm.3.1.y.3WIabc)
Call:
lm(formula = y ~ (a + b + c)^3, data = df.3.1.long)
Residuals:
Min 1Q Median 3Q Max
-76.50 -20.25 0.00 20.25 76.50
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 563.438 15.810 35.637 4.21e-10 ***
a -37.562 15.810 -2.376 0.04484 *
b 4.563 15.810 0.289 0.78024
c 153.812 15.810 9.729 1.04e-05 ***
a:b -12.937 15.810 -0.818 0.43688
a:c -76.938 15.810 -4.866 0.00125 **
b:c -14.312 15.810 -0.905 0.39177
a:b:c 14.937 15.810 0.945 0.37242
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 63.24 on 8 degrees of freedom
Multiple R-squared: 0.9405, Adjusted R-squared: 0.8884
F-statistic: 18.06 on 7 and 8 DF, p-value: 0.0002605
Here are the residual diagnostic plots, but we’ll skip over them since our interest is in the interaction plots below.
# plot diagnostics
par(mfrow=c(2,4))
plot(df.3.1.long$a, lm.3.1.y.3WIabc$studres, main="Residuals vs a")
# horizontal line at zero
abline(h = 0, col = "gray75")
plot(df.3.1.long$b, lm.3.1.y.3WIabc$studres, main="Residuals vs b")
# horizontal line at zero
abline(h = 0, col = "gray75")
plot(df.3.1.long$c, lm.3.1.y.3WIabc$studres, main="Residuals vs c")
# horizontal line at zero
abline(h = 0, col = "gray75")
# residuals vs order of data
plot(lm.3.1.y.3WIabc$studres, main="Residuals vs Order of data")
# horizontal line at zero
abline(h = 0, col = "gray75")
plot(lm.3.1.y.3WIabc, which = c(4,6))
# Normality of Residuals
library(car)
qqPlot(lm.3.1.y.3WIabc$studres, las = 1, id = 3, main="QQ Plot")
[1] 1 9
cooks.distance(lm.3.1.y.3WIabc)
1 2 3 4 5 6
0.365817068 0.009001266 0.009001266 0.001562720 0.002640996 0.221296745
7 8 9 10 11 12
0.126580300 0.264099639 0.365817068 0.009001266 0.009001266 0.001562720
13 14 15 16
0.002640996 0.221296745 0.126580300 0.264099639
# interaction plot
par(mfcol=c(2,3))
interaction.plot(df.3.1.long$a, df.3.1.long$b, df.3.1.long$y)
interaction.plot(df.3.1.long$b, df.3.1.long$a, df.3.1.long$y)
interaction.plot(df.3.1.long$a, df.3.1.long$c, df.3.1.long$y)
interaction.plot(df.3.1.long$c, df.3.1.long$a, df.3.1.long$y)
interaction.plot(df.3.1.long$b, df.3.1.long$c, df.3.1.long$y)
interaction.plot(df.3.1.long$c, df.3.1.long$b, df.3.1.long$y)
# interaction plot
par(mfcol=c(2,3))
interaction.plot(df.3.1.long$a, df.3.1.long$b * df.3.1.long$c, df.3.1.long$y)
interaction.plot(df.3.1.long$b * df.3.1.long$c, df.3.1.long$a, df.3.1.long$y)
interaction.plot(df.3.1.long$b, df.3.1.long$a * df.3.1.long$c, df.3.1.long$y)
interaction.plot(df.3.1.long$a * df.3.1.long$c, df.3.1.long$b, df.3.1.long$y)
interaction.plot(df.3.1.long$c, df.3.1.long$a * df.3.1.long$b, df.3.1.long$y)
interaction.plot(df.3.1.long$a * df.3.1.long$b, df.3.1.long$c, df.3.1.long$y)
# Interaction plots, ggplot
library(plyr)
# Calculate the cell means for each (a, b) combination
# create factor version for ggplot categories
df.3.1.factor <- df.3.1.long
df.3.1.factor$a <- factor(df.3.1.factor$a)
df.3.1.factor$b <- factor(df.3.1.factor$b)
df.3.1.factor$c <- factor(df.3.1.factor$c)
#mean(df.3.1.factor[, "y"])
df.3.1.factor.mean <- ddply(df.3.1.factor, .(), summarise, m = mean(y))
#df.3.1.factor.mean
df.3.1.factor.mean.a <- ddply(df.3.1.factor, .(a), summarise, m = mean(y))
#df.3.1.factor.mean.a
df.3.1.factor.mean.b <- ddply(df.3.1.factor, .(b), summarise, m = mean(y))
#df.3.1.factor.mean.b
df.3.1.factor.mean.c <- ddply(df.3.1.factor, .(c), summarise, m = mean(y))
#df.3.1.factor.mean.c
df.3.1.factor.mean.ab <- ddply(df.3.1.factor, .(a,b), summarise, m = mean(y))
#df.3.1.factor.mean.ab
df.3.1.factor.mean.ac <- ddply(df.3.1.factor, .(a,c), summarise, m = mean(y))
#df.3.1.factor.mean.ac
df.3.1.factor.mean.bc <- ddply(df.3.1.factor, .(b,c), summarise, m = mean(y))
#df.3.1.factor.mean.bc
df.3.1.factor.mean.abc <- ddply(df.3.1.factor, .(a,b,c), summarise, m = mean(y))
#df.3.1.factor.mean.abc
library(ggplot2)
## (a, b)
p <- ggplot(df.3.1.factor, aes(x = a, y = y, colour = b, shape = b))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.ab, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.ab, aes(y = m, group = b), size = 1.5)
p <- p + labs(title = "Interaction plot, b by a")
print(p)
## ymax not defined: adjusting position using y instead
p <- ggplot(df.3.1.factor, aes(x = b, y = y, colour = a, shape = a))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.ab, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.ab, aes(y = m, group = a), size = 1.5)
p <- p + labs(title = "Interaction plot, a by b")
print(p)
## ymax not defined: adjusting position using y instead
## (a, c)
p <- ggplot(df.3.1.factor, aes(x = a, y = y, colour = c, shape = c))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.ac, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.ac, aes(y = m, group = c), size = 1.5)
p <- p + labs(title = "Interaction plot, c by a")
print(p)
## ymax not defined: adjusting position using y instead
p <- ggplot(df.3.1.factor, aes(x = c, y = y, colour = a, shape = a))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.ac, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.ac, aes(y = m, group = a), size = 1.5)
p <- p + labs(title = "Interaction plot, a by c")
print(p)
## ymax not defined: adjusting position using y instead
## (b, c)
p <- ggplot(df.3.1.factor, aes(x = b, y = y, colour = c, shape = c))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.bc, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.bc, aes(y = m, group = c), size = 1.5)
p <- p + labs(title = "Interaction plot, c by b")
print(p)
## ymax not defined: adjusting position using y instead
p <- ggplot(df.3.1.factor, aes(x = c, y = y, colour = b, shape = b))
p <- p + geom_hline(aes(yintercept = 0), colour = "black"
, linetype = "solid", size = 0.2, alpha = 0.3)
p <- p + geom_boxplot(alpha = 0.25, outlier.size=0.1)
p <- p + geom_point(alpha = 0.5, position=position_dodge(width=0.75))
p <- p + geom_point(data = df.3.1.factor.mean.bc, aes(y = m), size = 4)
p <- p + geom_line(data = df.3.1.factor.mean.bc, aes(y = m, group = b), size = 1.5)
p <- p + labs(title = "Interaction plot, b by c")
print(p)
design <- expand.grid("a" = c(-1, 1)
, "b" = c(-1, 1)
, "c" = c(-1, 1)
, "d" = c(-1, 1)
, "y" = NA)
design
a b c d y
1 -1 -1 -1 -1 NA
2 1 -1 -1 -1 NA
3 -1 1 -1 -1 NA
4 1 1 -1 -1 NA
5 -1 -1 1 -1 NA
6 1 -1 1 -1 NA
7 -1 1 1 -1 NA
8 1 1 1 -1 NA
9 -1 -1 -1 1 NA
10 1 -1 -1 1 NA
11 -1 1 -1 1 NA
12 1 1 -1 1 NA
13 -1 -1 1 1 NA
14 1 -1 1 1 NA
15 -1 1 1 1 NA
16 1 1 1 1 NA
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_03-02.txt"
df.3.2 <- read.table(fn.data, header=TRUE)
str(df.3.2)
'data.frame': 16 obs. of 5 variables:
$ a: int -1 1 -1 1 -1 1 -1 1 -1 1 ...
$ b: int -1 -1 1 1 -1 -1 1 1 -1 -1 ...
$ c: int -1 -1 -1 -1 1 1 1 1 -1 -1 ...
$ d: int -1 -1 -1 -1 -1 -1 -1 -1 1 1 ...
$ y: int 45 71 48 65 68 60 80 65 43 100 ...
head(df.3.2,5)
a b c d y
1 -1 -1 -1 -1 45
2 1 -1 -1 -1 71
3 -1 1 -1 -1 48
4 1 1 -1 -1 65
5 -1 -1 1 -1 68
lm.3.2.y.4WIabcd <- lm(y ~ (a + b + c + d)^4, data = df.3.2)
## externally Studentized residuals
lm.3.2.y.4WIabcd$studres <- rstudent(lm.3.2.y.4WIabcd)
summary(lm.3.2.y.4WIabcd)
Call:
lm(formula = y ~ (a + b + c + d)^4, data = df.3.2)
Residuals:
ALL 16 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.0625 NaN NaN NaN
a 10.8125 NaN NaN NaN
b 1.5625 NaN NaN NaN
c 4.9375 NaN NaN NaN
d 7.3125 NaN NaN NaN
a:b 0.0625 NaN NaN NaN
a:c -9.0625 NaN NaN NaN
a:d 8.3125 NaN NaN NaN
b:c 1.1875 NaN NaN NaN
b:d -0.1875 NaN NaN NaN
c:d -0.5625 NaN NaN NaN
a:b:c 0.9375 NaN NaN NaN
a:b:d 2.0625 NaN NaN NaN
a:c:d -0.8125 NaN NaN NaN
b:c:d -1.3125 NaN NaN NaN
a:b:c:d 0.6875 NaN NaN NaN
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 15 and 0 DF, p-value: NA
library(BsMD)
par(mfrow=c(1,3))
LenthPlot(lm.3.2.y.4WIabcd, alpha = 0.05, main = "Lenth Plot") # , adj = 0.2
alpha PSE ME SME
0.050000 2.625000 6.747777 13.698960
DanielPlot(lm.3.2.y.4WIabcd, main = "Normal Plot")
DanielPlot(lm.3.2.y.4WIabcd, half = TRUE, main = "Half-Normal Plot")
In Example 3.2, no terms involving B were significant. So, we project the design onto a 2^3 design in factors A, C, and D (an exploratory step, not a confirmatory one). Each (A, C, D) combination now has two replicates, which gives us an error term, so we can run an ANOVA and related tests; additional confirmatory experiments can follow. A quick check of the replication appears below, then we fit the first-order model with three-way interaction.
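This sketch verifies the two-replicate claim by tabulating the projected design:
# sketch: after dropping b, each (a, c, d) cell should contain exactly two runs
table(df.3.2$a, df.3.2$c, df.3.2$d)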
lm.3.2.y.4WIacd <- lm(y ~ (a + c + d)^3, data = df.3.2)
lm.3.2.y.4WIacd$studres <- rstudent(lm.3.2.y.4WIacd)
summary(lm.3.2.y.4WIacd)
Call:
lm(formula = y ~ (a + c + d)^3, data = df.3.2)
Residuals:
Min 1Q Median 3Q Max
-6.0 -2.5 0.0 2.5 6.0
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.0625 1.1842 59.164 7.40e-12 ***
a 10.8125 1.1842 9.131 1.67e-05 ***
c 4.9375 1.1842 4.169 0.003124 **
d 7.3125 1.1842 6.175 0.000267 ***
a:c -9.0625 1.1842 -7.653 6.00e-05 ***
a:d 8.3125 1.1842 7.019 0.000110 ***
c:d -0.5625 1.1842 -0.475 0.647483
a:c:d -0.8125 1.1842 -0.686 0.512032
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.737 on 8 degrees of freedom
Multiple R-squared: 0.9687, Adjusted R-squared: 0.9413
F-statistic: 35.35 on 7 and 8 DF, p-value: 2.119e-05
library(rsm)
# help for cube, see examples
?cube
# create the first block
block1 <- cube( basis = ~ x1 + x2 + x3 + x4
, n0 = 0
, blockgen = ~ c(x1 * x2 * x3, x1 * x3 * x4)
, randomize = FALSE
, bid = 1)
block1
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is
1 1 1 -1 -1 -1 -1
6 2 2 1 -1 1 -1
12 3 3 1 1 -1 1
15 4 4 -1 1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
block <- list()
for (i.block in 1:4) {
block[[i.block]] <- cube( basis = ~ x1 + x2 + x3 + x4
, n0 = 0
, blockgen = ~ c(x1 * x2 * x3, x1 * x3 * x4)
, randomize = FALSE
, bid = i.block)
}
block
[[1]]
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is
1 1 1 -1 -1 -1 -1
6 2 2 1 -1 1 -1
12 3 3 1 1 -1 1
15 4 4 -1 1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
[[2]]
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is
3 1 1 -1 1 -1 -1
8 2 2 1 1 1 -1
10 3 3 1 -1 -1 1
13 4 4 -1 -1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
[[3]]
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is
4 1 1 1 1 -1 -1
7 2 2 -1 1 1 -1
9 3 3 -1 -1 -1 1
14 4 4 1 -1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
[[4]]
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is
2 1 1 1 -1 -1 -1
5 2 2 -1 -1 1 -1
11 3 3 -1 1 -1 1
16 4 4 1 1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
Alternatively, the FrF2 package is the most comprehensive R package for creating 2-level fractional factorial designs, as well as Plackett-Burman-type screening designs. Regular fractional factorials default to maximum-resolution, minimum-aberration designs and can be customized in various ways. FrF2 can also show the alias structure for regular fractional factorials of 2-level factors (a sketch of this follows the blocked design below).
library(FrF2)
k = 4 # number of factors, defaults to names = A, B, ...
FrF2( nruns = 2^k
, nfactors = k
, blocks = 4
, alias.block.2fis=TRUE
, randomize = FALSE
)
run.no run.no.std.rp Blocks A B C D
1 1 1.1.1 1 -1 -1 -1 -1
2 2 4.1.2 1 -1 -1 1 1
3 3 14.1.3 1 1 1 -1 1
4 4 15.1.4 1 1 1 1 -1
run.no run.no.std.rp Blocks A B C D
5 5 5.2.1 2 -1 1 -1 -1
6 6 8.2.2 2 -1 1 1 1
7 7 10.2.3 2 1 -1 -1 1
8 8 11.2.4 2 1 -1 1 -1
run.no run.no.std.rp Blocks A B C D
9 9 6.3.1 3 -1 1 -1 1
10 10 7.3.2 3 -1 1 1 -1
11 11 9.3.3 3 1 -1 -1 -1
12 12 12.3.4 3 1 -1 1 1
run.no run.no.std.rp Blocks A B C D
13 13 2.4.1 4 -1 -1 -1 1
14 14 3.4.2 4 -1 -1 1 -1
15 15 13.4.3 4 1 1 -1 -1
16 16 16.4.4 4 1 1 1 1
class=design, type= FrF2.blocked
NOTE: columns run.no and run.no.std.rp are annotation,
not part of the data frame
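The alias structure mentioned above can be inspected directly; a small sketch using a regular 2^(4-1) half fraction (my own example, not from the notes):
# sketch: generator and alias structure of a regular half fraction
d.half <- FrF2(nruns = 8, nfactors = 4, randomize = FALSE)
generators(d.half) # the design generator (D = ABC for this default design)
summary(d.half) # prints the design together with its alias structure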
The original design is a resolution III fractional factorial design. Below, I specify generators I = ABD = ACE. A, B, and C in the notes are x3, x4, and x5, respectively; E = AC corresponds to x1 = x3x5 below, and D = AB corresponds to x2 = x1x4 below.
#### Generate design
library(rsm)
# help for cube, see examples
?cube
# create the first block
block1 <- cube( basis = ~ x1 + x2 + x3 + x4 + x5
, n0 = 0
, blockgen = ~ c(x1 * x2 * x4, x1 * x3 * x5)
, randomize = FALSE
, bid = 1)
block1
run.order std.order x1.as.is x2.as.is x3.as.is x4.as.is x5.as.is
1 1 1 -1 -1 -1 -1 -1
8 2 2 1 1 1 -1 -1
11 3 3 -1 1 -1 1 -1
14 4 4 1 -1 1 1 -1
20 5 5 1 1 -1 -1 1
21 6 6 -1 -1 1 -1 1
26 7 7 1 -1 -1 1 1
31 8 8 -1 1 1 1 1
Data are stored in coded form using these coding formulas ...
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
x4 ~ x4.as.is
x5 ~ x5.as.is
To put this in the same order as in the notes, change the order of the basis.
# create the first block
block1 <- cube( basis = ~ x4 + x5 + x1 + x2 + x3
, n0 = 0
, blockgen = ~ c(x1 * x2 * x4, x1 * x3 * x5)
, randomize = FALSE
, bid = 1)
block1
run.order std.order x4.as.is x5.as.is x1.as.is x2.as.is x3.as.is
1 1 1 -1 -1 -1 -1 -1
8 2 2 1 1 1 -1 -1
10 3 3 1 -1 -1 1 -1
15 4 4 -1 1 1 1 -1
19 5 5 -1 1 -1 -1 1
22 6 6 1 -1 1 -1 1
28 7 7 1 1 -1 1 1
29 8 8 -1 -1 1 1 1
Data are stored in coded form using these coding formulas ...
x4 ~ x4.as.is
x5 ~ x5.as.is
x1 ~ x1.as.is
x2 ~ x2.as.is
x3 ~ x3.as.is
#### 4.5a
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_04-05a.txt"
df.4.5a <- read.table(fn.data, header=TRUE)
str(df.4.5a)
'data.frame': 8 obs. of 4 variables:
$ a : int -1 1 -1 1 -1 1 -1 1
$ b : int -1 -1 1 1 -1 -1 1 1
$ c : int -1 -1 -1 -1 1 1 1 1
$ time: num 85.5 75.1 93.2 145.4 83.7 ...
head(df.4.5a,5)
a b c time
1 -1 -1 -1 85.5
2 1 -1 -1 75.1
3 -1 1 -1 93.2
4 1 1 -1 145.4
5 -1 -1 1 83.7
lm.4.5a.time.3WIabc <- lm(time ~ (a + b + c)^3, data = df.4.5a)
## externally Studentized residuals
#lm.4.5a.time.3WIabc$studres <- rstudent(lm.4.5a.time.3WIabc)
summary(lm.4.5a.time.3WIabc)
Call:
lm.default(formula = time ~ (a + b + c)^3, data = df.4.5a)
Residuals:
ALL 8 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 99.6625 NaN NaN NaN
a 10.3125 NaN NaN NaN
b 19.1875 NaN NaN NaN
c -0.1375 NaN NaN NaN
a:b 14.4375 NaN NaN NaN
a:c -0.1375 NaN NaN NaN
b:c -0.3125 NaN NaN NaN
a:b:c -1.2125 NaN NaN NaN
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 7 and 0 DF, p-value: NA
# BsMD package has unreplicated factorial tests (Daniel plots (aka normal), and Lenth)
library(BsMD)
par(mfrow=c(1,3))
LenthPlot(lm.4.5a.time.3WIabc, alpha = 0.05, main = "Lenth Plot") # , adj = 0.2
alpha PSE ME SME
0.050000 0.675000 2.540783 6.080607
DanielPlot(lm.4.5a.time.3WIabc, main = "Normal Plot")
DanielPlot(lm.4.5a.time.3WIabc, half = TRUE, main = "Half-Normal Plot")
#### 4.5b
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_04-05b.txt"
df.4.5b <- read.table(fn.data, header=TRUE)
str(df.4.5b)
'data.frame': 8 obs. of 4 variables:
$ a : int 1 -1 1 -1 1 -1 1 -1
$ b : int 1 1 -1 -1 1 1 -1 -1
$ c : int 1 1 1 1 -1 -1 -1 -1
$ time: num 91.3 136.7 82.4 73.4 94.1 ...
head(df.4.5b,5)
a b c time
1 1 1 1 91.3
2 -1 1 1 136.7
3 1 -1 1 82.4
4 -1 -1 1 73.4
5 1 1 -1 94.1
lm.4.5b.time.3WIabc <- lm(time ~ (a + b + c)^3, data = df.4.5b)
## externally Studentized residuals
#lm.4.5b.time.3WIabc$studres <- rstudent(lm.4.5b.time.3WIabc)
summary(lm.4.5b.time.3WIabc)
Call:
lm.default(formula = time ~ (a + b + c)^3, data = df.4.5b)
Residuals:
ALL 8 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 97.6125 NaN NaN NaN
a -8.8375 NaN NaN NaN
b 18.8625 NaN NaN NaN
c -1.6625 NaN NaN NaN
a:b -14.9375 NaN NaN NaN
a:c -0.2625 NaN NaN NaN
b:c -0.8125 NaN NaN NaN
a:b:c 1.3375 NaN NaN NaN
Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 7 and 0 DF, p-value: NA
## BsMD package has unreplicated factorial tests (Daniel plots (aka normal), and Lenth)
library(BsMD)
par(mfrow=c(1,3))
LenthPlot(lm.4.5b.time.3WIabc, alpha = 0.05, main = "Lenth Plot") # , adj = 0.2
alpha PSE ME SME
0.05000 3.22500 12.13930 29.05179
DanielPlot(lm.4.5b.time.3WIabc, main = "Normal Plot")
DanielPlot(lm.4.5b.time.3WIabc, half = TRUE, main = "Half-Normal Plot")
# combine two data sets
df.4.5 <- rbind(df.4.5a, df.4.5b)
str(df.4.5)
'data.frame': 16 obs. of 4 variables:
$ a : int -1 1 -1 1 -1 1 -1 1 1 -1 ...
$ b : int -1 -1 1 1 -1 -1 1 1 1 1 ...
$ c : int -1 -1 -1 -1 1 1 1 1 1 1 ...
$ time: num 85.5 75.1 93.2 145.4 83.7 ...
head(df.4.5,5)
a b c time
1 -1 -1 -1 85.5
2 1 -1 -1 75.1
3 -1 1 -1 93.2
4 1 1 -1 145.4
5 -1 -1 1 83.7
lm.4.5.time.3WIabc <- lm(time ~ (a + b + c)^3, data = df.4.5)
## externally Studentized residuals
#lm.4.5.time.3WIabc$studres <- rstudent(lm.4.5.time.3WIabc)
summary(lm.4.5.time.3WIabc)
Call:
lm.default(formula = time ~ (a + b + c)^3, data = df.4.5)
Residuals:
Min 1Q Median 3Q Max
-25.65 -10.31 0.00 10.31 25.65
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 98.6375 6.2325 15.826 2.54e-07 ***
a 0.7375 6.2325 0.118 0.9087
b 19.0250 6.2325 3.053 0.0158 *
c -0.9000 6.2325 -0.144 0.8888
a:b -0.2500 6.2325 -0.040 0.9690
a:c -0.2000 6.2325 -0.032 0.9752
b:c -0.5625 6.2325 -0.090 0.9303
a:b:c 0.0625 6.2325 0.010 0.9922
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 24.93 on 8 degrees of freedom
Multiple R-squared: 0.5393, Adjusted R-squared: 0.1361
F-statistic: 1.338 on 7 and 8 DF, p-value: 0.344
# BsMD package has unreplicated factorial tests (Daniel plots (aka normal), and Lenth)
library(BsMD)
par(mfrow=c(1,3))
LenthPlot(lm.4.5.time.3WIabc, alpha = 0.05, main = "Lenth Plot") # , adj = 0.2
alpha PSE ME SME
0.050000 1.218750 4.587525 10.978874
DanielPlot(lm.4.5.time.3WIabc, main = "Normal Plot")
DanielPlot(lm.4.5.time.3WIabc, half = TRUE, main = "Half-Normal Plot")
library(FrF2)
pb(nruns=8, randomize=FALSE)
A B C D E F G
1 1 1 1 -1 1 -1 -1
2 -1 1 1 1 -1 1 -1
3 -1 -1 1 1 1 -1 1
4 1 -1 -1 1 1 1 -1
5 -1 1 -1 -1 1 1 1
6 1 -1 1 -1 -1 1 1
7 1 1 -1 1 -1 -1 1
8 -1 -1 -1 -1 -1 -1 -1
class=design, type= pb
pb(nruns=12, randomize=FALSE)
A B C D E F G H J K L
1 1 1 -1 1 1 1 -1 -1 -1 1 -1
2 -1 1 1 -1 1 1 1 -1 -1 -1 1
3 1 -1 1 1 -1 1 1 1 -1 -1 -1
4 -1 1 -1 1 1 -1 1 1 1 -1 -1
5 -1 -1 1 -1 1 1 -1 1 1 1 -1
6 -1 -1 -1 1 -1 1 1 -1 1 1 1
7 1 -1 -1 -1 1 -1 1 1 -1 1 1
8 1 1 -1 -1 -1 1 -1 1 1 -1 1
9 1 1 1 -1 -1 -1 1 -1 1 1 -1
10 -1 1 1 1 -1 -1 -1 1 -1 1 1
11 1 -1 1 1 1 -1 -1 -1 1 -1 1
12 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
class=design, type= pb
#### 5.1
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_05-01.txt"
df.5.1 <- read.table(fn.data, header=TRUE)
str(df.5.1)
'data.frame': 8 obs. of 3 variables:
$ a: int -1 1 -1 1 0 0 0 0
$ b: int -1 -1 1 1 0 0 0 0
$ y: int 775 670 890 730 745 760 780 720
head(df.5.1,5)
a b y
1 -1 -1 775
2 1 -1 670
3 -1 1 890
4 1 1 730
5 0 0 745
Fit a first-order model with two-way interaction. This fit differs slightly from the one in the text: the intercept differs, as does the significance of the factors.
library(rsm)
rsm.5.1.y.TWIab <- rsm(y ~ FO(a, b) + TWI(a, b), data = df.5.1)
# externally Studentized residuals
#rsm.5.1.y.TWIab$studres <- rstudent(rsm.5.1.y.TWIab)
summary(rsm.5.1.y.TWIab)
Call:
rsm(formula = y ~ FO(a, b) + TWI(a, b), data = df.5.1)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 758.7500 8.6037 88.1889 9.911e-08 ***
a -66.2500 12.1675 -5.4449 0.005525 **
b 43.7500 12.1675 3.5957 0.022846 *
a:b -13.7500 12.1675 -1.1301 0.321622
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9164, Adjusted R-squared: 0.8537
F-statistic: 14.62 on 3 and 4 DF, p-value: 0.01273
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(a, b) 2 25212.5 12606.2 21.2876 0.007376
TWI(a, b) 1 756.3 756.3 1.2770 0.321622
Residuals 4 2368.7 592.2
Lack of fit 1 450.0 450.0 0.7036 0.463156
Pure error 3 1918.7 639.6
Stationary point of response surface:
a b
3.181818 -4.818182
Eigenanalysis:
eigen() decomposition
$values
[1] 6.875 -6.875
$vectors
[,1] [,2]
a -0.7071068 -0.7071068
b 0.7071068 -0.7071068
par(mfrow=c(1,2))
contour(rsm.5.1.y.TWIab, ~ a + b, image = TRUE)
persp(rsm.5.1.y.TWIab, b ~ a, zlab = "y")
steepest.5.1 <- steepest(rsm.5.1.y.TWIab, dist = seq(0, 7, by = 1))
Path of steepest ascent from ridge analysis:
steepest.5.1
dist a b | yhat
1 0 0.000 0.000 | 758.750
2 1 -0.805 0.593 | 844.589
3 2 -1.573 1.235 | 943.704
4 3 -2.321 1.901 | 1056.353
5 4 -3.058 2.579 | 1182.614
6 5 -3.787 3.265 | 1322.495
7 6 -4.511 3.956 | 1476.055
8 7 -5.232 4.650 | 1643.329
library(rsm)
rsm.5.1.y.FOab <- rsm(y ~ FO(a, b), data = df.5.1)
# externally Studentized residuals
#rsm.5.1.y.FOab$studres <- rstudent(rsm.5.1.y.FOab)
summary(rsm.5.1.y.FOab)
Call:
rsm(formula = y ~ FO(a, b), data = df.5.1)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 758.7500 8.8388 85.843 4.066e-09 ***
a -66.2500 12.5000 -5.300 0.003192 **
b 43.7500 12.5000 3.500 0.017284 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.8897, Adjusted R-squared: 0.8456
F-statistic: 20.17 on 2 and 5 DF, p-value: 0.004039
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(a, b) 2 25212.5 12606.2 20.170 0.004039
Residuals 5 3125.0 625.0
Lack of fit 2 1206.3 603.1 0.943 0.481119
Pure error 3 1918.7 639.6
Direction of steepest ascent (at radius 1):
a b
-0.8344646 0.5510615
Corresponding increment in original units:
a b
-0.8344646 0.5510615
par(mfrow=c(1,2))
contour(rsm.5.1.y.FOab, ~ a + b, image = TRUE)
persp(rsm.5.1.y.FOab, b ~ a, zlab = "y")
summary(rsm.5.1.y.FOab)$sa
a b
-0.8344646 0.5510615
steepest.5.1 <- steepest(rsm.5.1.y.FOab, dist = seq(0, 7, by = 1))
Path of steepest ascent from ridge analysis:
steepest.5.1$a[2]
[1] -0.834
# step size chosen so that a changes by about one unit per step
steepest.5.1b <- steepest(rsm.5.1.y.FOab, dist = seq(0, 7, by = 1/abs(steepest.5.1$a[2])))
Path of steepest ascent from ridge analysis:
steepest.5.1b
dist a b | yhat
1 0.000000 0.000 0.000 | 758.750
2 1.199041 -1.001 0.661 | 853.985
3 2.398082 -2.001 1.321 | 949.110
4 3.597122 -3.002 1.982 | 1044.345
5 4.796163 -4.002 2.643 | 1139.514
6 5.995204 -5.003 3.304 | 1234.749
#### 6.2
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_06-02.txt"
df.6.2 <- read.table(fn.data, header=TRUE)
head(df.6.2,5)
x1 x2 x3 y
1 -1 -1 -1 57
2 1 -1 -1 40
3 -1 1 -1 19
4 1 1 -1 40
5 -1 -1 1 54
# code the variables
library(rsm)
cd.6.2 <- as.coded.data(df.6.2, x1 ~ (SodiumCitrate - 3) / 0.7
, x2 ~ (Glycerol - 8) / 3
, x3 ~ (EquilibrationTime - 16) / 6)
# the data are stored in coded form; the coding formulas are kept as attributes
str(cd.6.2)
Classes 'coded.data' and 'data.frame': 15 obs. of 4 variables:
$ x1: int -1 1 -1 1 -1 1 -1 1 0 -2 ...
$ x2: int -1 -1 1 1 -1 -1 1 1 0 0 ...
$ x3: int -1 -1 -1 -1 1 1 1 1 0 0 ...
$ y : int 57 40 19 40 54 41 21 43 63 28 ...
- attr(*, "codings")=List of 3
..$ x1:Class 'formula' language x1 ~ (SodiumCitrate - 3)/0.7
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ x2:Class 'formula' language x2 ~ (Glycerol - 8)/3
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ x3:Class 'formula' language x3 ~ (EquilibrationTime - 16)/6
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
- attr(*, "rsdes")=List of 2
..$ primary: chr [1:3] "x1" "x2" "x3"
..$ call : language as.coded.data(data = df.6.2, x1 ~ (SodiumCitrate - 3)/0.7, x2 ~ (Glycerol - 8)/3, x3 ~ (EquilibrationTime - 16)/6)
cd.6.2
SodiumCitrate Glycerol EquilibrationTime y
1 2.3 5 10 57
2 3.7 5 10 40
3 2.3 11 10 19
4 3.7 11 10 40
5 2.3 5 22 54
6 3.7 5 22 41
7 2.3 11 22 21
8 3.7 11 22 43
9 3.0 8 16 63
10 1.6 8 16 28
11 4.4 8 16 11
12 3.0 2 16 2
13 3.0 14 16 18
14 3.0 8 4 56
15 3.0 8 28 46
Data are stored in coded form using these coding formulas ...
x1 ~ (SodiumCitrate - 3)/0.7
x2 ~ (Glycerol - 8)/3
x3 ~ (EquilibrationTime - 16)/6
# this prints the coded values
print(cd.6.2, decode = FALSE)
x1 x2 x3 y
1 -1 -1 -1 57
2 1 -1 -1 40
3 -1 1 -1 19
4 1 1 -1 40
5 -1 -1 1 54
6 1 -1 1 41
7 -1 1 1 21
8 1 1 1 43
9 0 0 0 63
10 -2 0 0 28
11 2 0 0 11
12 0 -2 0 2
13 0 2 0 18
14 0 0 -2 56
15 0 0 2 46
Variable codings ...
x1 ~ (SodiumCitrate - 3)/0.7
x2 ~ (Glycerol - 8)/3
x3 ~ (EquilibrationTime - 16)/6
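The coding formulas can also be applied in both directions with rsm's code2val() and val2code() helpers; a small sketch (expected results noted in comments):
# sketch: convert a coded point to original units and back
code2val(data.frame(x1 = 1, x2 = -1, x3 = 0), codings = codings(cd.6.2))
# SodiumCitrate = 3.7, Glycerol = 5, EquilibrationTime = 16
val2code(data.frame(SodiumCitrate = 3.7, Glycerol = 5, EquilibrationTime = 16)
, codings = codings(cd.6.2))
# x1 = 1, x2 = -1, x3 = 0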
# note that the coded values (-1, +1) are used in the modelling.
library(rsm)
rsm.6.2.y.SOx1x2x3 <- rsm(y ~ SO(x1, x2, x3), data = cd.6.2)
# externally Studentized residuals
rsm.6.2.y.SOx1x2x3$studres <- rstudent(rsm.6.2.y.SOx1x2x3)
summary(rsm.6.2.y.SOx1x2x3)
Call:
rsm(formula = y ~ SO(x1, x2, x3), data = cd.6.2)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 66.1111 11.5215 5.7380 0.002252 **
x1 -1.3125 3.2660 -0.4019 0.704388
x2 -2.3125 3.2660 -0.7080 0.510551
x3 -1.0625 3.2660 -0.3253 0.758114
x1:x2 9.1250 4.6189 1.9756 0.105171
x1:x3 0.6250 4.6189 0.1353 0.897643
x2:x3 0.8750 4.6189 0.1894 0.857199
x1^2 -11.2639 3.9253 -2.8696 0.035013 *
x2^2 -13.6389 3.9253 -3.4746 0.017761 *
x3^2 -3.3889 3.9253 -0.8633 0.427411
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.8216, Adjusted R-squared: 0.5004
F-statistic: 2.558 on 9 and 5 DF, p-value: 0.1567
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2, x3) 3 131.19 43.73 0.2562 0.85418
TWI(x1, x2, x3) 3 675.38 225.13 1.3190 0.36605
PQ(x1, x2, x3) 3 3123.00 1041.00 6.0994 0.03998
Residuals 5 853.37 170.67
Lack of fit 5 853.37 170.67 NaN NaN
Pure error 0 0.00 NaN
Stationary point of response surface:
x1 x2 x3
-0.1157917 -0.1294177 -0.1841474
Stationary point in original units:
SodiumCitrate Glycerol EquilibrationTime
2.918946 7.611747 14.895115
Eigenanalysis:
eigen() decomposition
$values
[1] -3.327052 -7.796973 -17.167642
$vectors
[,1] [,2] [,3]
x1 0.08493000 0.7870012 0.61107775
x2 0.07971567 0.6059608 -0.79149032
x3 0.99319299 -0.1159337 0.01127208
## summary statistics
summary(rsm.6.2.y.SOx1x2x3)$canonical$xs
x1 x2 x3
-0.1157917 -0.1294177 -0.1841474
The plot below is like that in Figure 6.11. The contour plots below indicate that increasing a increases thickness a great deal, that c has almost no effect on thickness, and that decreasing b increases thickness very little.
par(mfrow = c(2,2))
# this is the stationary point
canonical(rsm.6.2.y.SOx1x2x3)$xs
x1 x2 x3
-0.1157917 -0.1294177 -0.1841474
contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2, image=TRUE
, at = canonical(rsm.6.2.y.SOx1x2x3)$xs, main = "through stationary point")
contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2, image=TRUE
, at = data.frame(x1 = 0, x2 = 0, x3 = -1/3), main = "through (0, 0, ET=14)")
contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2, image=TRUE
, at = data.frame(x1 = 0, x2 = 0, x3 = -2/3), main = "through (0, 0, ET=12)")
contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2, image=TRUE
, at = data.frame(x1 = 0, x2 = 0, x3 = -1 ), main = "through (0, 0, ET=10)")
### Some additional contour plots
# op <- par(no.readonly = TRUE)
# par(mfrow = c(4,3), oma = c(0,0,0,0), mar = c(4,4,2,0))
# contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2 + x3, image=TRUE, at = data.frame(x1 = 0, x2 = 0, x3 # contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2 + x3, image=TRUE, at = data.frame(x1 = 0, x2 = 0, x3 # contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2 + x3, image=TRUE, at = data.frame(x1 = 0, x2 = 0, x3 # contour(rsm.6.2.y.SOx1x2x3, ~ x1 + x2 + x3, image=TRUE, at = canonical(rsm.6.2.y.SOx1x2x3)$xs, # par(op)
#### 6.3
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_06-03.txt"
df.6.3 <- read.table(fn.data, header=TRUE)
str(df.6.3)
'data.frame': 25 obs. of 5 variables:
$ x1: num -1 1 -1 1 -1 1 -1 1 -1 1 ...
$ x2: num -1 -1 1 1 -1 -1 1 1 -1 -1 ...
$ x3: num -1 -1 -1 -1 1 1 1 1 -1 -1 ...
$ x4: num -1 -1 -1 -1 -1 -1 -1 -1 1 1 ...
$ y : num 58.2 23.4 21.9 21.8 14.3 6.3 4.5 21.8 46.7 53.2 ...
head(df.6.3,5)
x1 x2 x3 x4 y
1 -1 -1 -1 -1 58.2
2 1 -1 -1 -1 23.4
3 -1 1 -1 -1 21.9
4 1 1 -1 -1 21.8
5 -1 -1 1 -1 14.3
library(rsm)
rsm.6.3.y.SOx1x2x3x4 <- rsm(y ~ SO(x1, x2, x3, x4), data = df.6.3)
# externally Studentized residuals
rsm.6.3.y.SOx1x2x3x4$studres <- rstudent(rsm.6.3.y.SOx1x2x3x4)
summary(rsm.6.3.y.SOx1x2x3x4)
Call:
rsm(formula = y ~ SO(x1, x2, x3, x4), data = df.6.3)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 40.198215 8.321708 4.8305 0.0006912 ***
x1 -1.511044 3.151968 -0.4794 0.6419677
x2 1.284137 3.151968 0.4074 0.6922929
x3 -8.738956 3.151968 -2.7725 0.0197013 *
x4 4.954819 3.151968 1.5720 0.1470308
x1:x2 2.193750 3.516952 0.6238 0.5467465
x1:x3 -0.143750 3.516952 -0.0409 0.9682013
x1:x4 1.581250 3.516952 0.4496 0.6625813
x2:x3 8.006250 3.516952 2.2765 0.0460610 *
x2:x4 2.806250 3.516952 0.7979 0.4434516
x3:x4 0.293750 3.516952 0.0835 0.9350832
x1^2 -6.332399 5.035510 -1.2575 0.2371289
x2^2 -4.291583 5.035510 -0.8523 0.4140130
x3^2 0.019642 5.035510 0.0039 0.9969645
x4^2 -2.505869 5.035510 -0.4976 0.6294983
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.6622, Adjusted R-squared: 0.1892
F-statistic: 1.4 on 14 and 10 DF, p-value: 0.3004
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2, x3, x4) 4 2088.7 522.16 2.6385 0.09725
TWI(x1, x2, x3, x4) 6 1270.3 211.72 1.0698 0.43976
PQ(x1, x2, x3, x4) 4 519.8 129.95 0.6566 0.63566
Residuals 10 1979.0 197.90
Lack of fit 10 1979.0 197.90 NaN NaN
Pure error 0 0.0 NaN
Stationary point of response surface:
x1 x2 x3 x4
0.2646871 1.0336457 0.2905784 1.6679609
Eigenanalysis:
eigen() decomposition
$values
[1] 2.604001 -2.159312 -6.008325 -7.546573
$vectors
[,1] [,2] [,3] [,4]
x1 -0.07414003 0.2150982 0.7687739 0.59768112
x2 -0.52824021 0.1373869 0.4567510 -0.70247107
x3 -0.82641775 -0.3071076 -0.2857634 0.37557672
x4 -0.18027545 0.9168111 -0.3445351 0.09085044
# the stationary point:
summary(rsm.6.3.y.SOx1x2x3x4)$canonical$xs
x1 x2 x3 x4
0.2646871 1.0336457 0.2905784 1.6679609
canonical(rsm.6.3.y.SOx1x2x3x4)$xs
x1 x2 x3 x4
0.2646871 1.0336457 0.2905784 1.6679609
summary(rsm.6.3.y.SOx1x2x3x4)$canonical$eigen
eigen() decomposition
$values
[1] 2.604001 -2.159312 -6.008325 -7.546573
$vectors
[,1] [,2] [,3] [,4]
x1 -0.07414003 0.2150982 0.7687739 0.59768112
x2 -0.52824021 0.1373869 0.4567510 -0.70247107
x3 -0.82641775 -0.3071076 -0.2857634 0.37557672
x4 -0.18027545 0.9168111 -0.3445351 0.09085044
canonical(rsm.6.3.y.SOx1x2x3x4)$eigen
eigen() decomposition
$values
[1] 2.604001 -2.159312 -6.008325 -7.546573
$vectors
[,1] [,2] [,3] [,4]
x1 -0.07414003 0.2150982 0.7687739 0.59768112
x2 -0.52824021 0.1373869 0.4567510 -0.70247107
x3 -0.82641775 -0.3071076 -0.2857634 0.37557672
x4 -0.18027545 0.9168111 -0.3445351 0.09085044
The stationary point is just outside the region of experimentation for x4. The contour plots below go through the stationary point. The saddle is most apparent in the (x2, x3) plot.
par(mfrow = c(2,3), oma = c(0,0,0,0), mar = c(4,4,2,0))
contour(rsm.6.3.y.SOx1x2x3x4, ~ x1 + x2 + x3 + x4, image=TRUE, at = canonical(rsm.6.3.y.SOx1x2x3x4)$xs, main = "through stationary point")
#### calculate predicted increase along ridge of steepest ascent
steepest.6.3 <- steepest(rsm.6.3.y.SOx1x2x3x4, dist = seq(0, 2, by = 0.1))
Path of steepest ascent from ridge analysis:
# include SE of response in the table, too
predict.6.3 <- predict(rsm.6.3.y.SOx1x2x3x4
, newdata = steepest.6.3[,c("x1","x2","x3","x4")], se.fit = TRUE)
steepest.6.3$StdError <- predict.6.3$se.fit
# steepest.6.3
par(mfrow = c(1, 2))
# plot expected response vs radius
plot (steepest.6.3$dist, steepest.6.3$yhat, pch = "y"
, main = "Ridge plot: Estimated maximum +- SE vs radius")
points(steepest.6.3$dist, steepest.6.3$yhat, type = "l")
points(steepest.6.3$dist, steepest.6.3$yhat - predict.6.3$se.fit, type = "l", col = "red")
points(steepest.6.3$dist, steepest.6.3$yhat + predict.6.3$se.fit, type = "l", col = "red")
# plot change of factor variables vs radius
plot (steepest.6.3$dist, steepest.6.3$x1, pch = "1", col = "red"
, main = "Ridge plot: Factor values vs radius"
, ylim = c(-2, 0.25))
points(steepest.6.3$dist, steepest.6.3$x1, type = "l", col = "red")
points(steepest.6.3$dist, steepest.6.3$x2, pch = "2", col = "blue")
points(steepest.6.3$dist, steepest.6.3$x2, type = "l", col = "blue")
points(steepest.6.3$dist, steepest.6.3$x3, pch = "3", col = "green")
points(steepest.6.3$dist, steepest.6.3$x3, type = "l", col = "green")
points(steepest.6.3$dist, steepest.6.3$x4, pch = "4", col = "purple")
points(steepest.6.3$dist, steepest.6.3$x4, type = "l", col = "purple")
steepest.6.3
dist x1 x2 x3 x4 | yhat StdError
1 0.0 0.000 0.000 0.000 0.000 | 40.198 8.321708
2 0.1 -0.013 0.006 -0.087 0.047 | 41.206 8.304672
3 0.2 -0.022 0.001 -0.177 0.090 | 42.193 8.254748
4 0.3 -0.029 -0.014 -0.270 0.127 | 43.177 8.175279
5 0.4 -0.035 -0.038 -0.364 0.158 | 44.162 8.073586
6 0.5 -0.040 -0.069 -0.459 0.181 | 45.156 7.960484
7 0.6 -0.045 -0.104 -0.554 0.199 | 46.171 7.849435
8 0.7 -0.050 -0.144 -0.649 0.212 | 47.218 7.757770
9 0.8 -0.056 -0.187 -0.744 0.220 | 48.302 7.707911
10 0.9 -0.061 -0.232 -0.838 0.225 | 49.420 7.724549
11 1.0 -0.067 -0.279 -0.931 0.226 | 50.572 7.832800
12 1.1 -0.073 -0.327 -1.023 0.225 | 51.764 8.055343
13 1.2 -0.079 -0.377 -1.115 0.222 | 53.012 8.414972
14 1.3 -0.085 -0.426 -1.206 0.217 | 54.292 8.922153
15 1.4 -0.091 -0.477 -1.296 0.211 | 55.621 9.579913
16 1.5 -0.098 -0.528 -1.386 0.203 | 57.002 10.395542
17 1.6 -0.104 -0.579 -1.475 0.194 | 58.421 11.354753
18 1.7 -0.111 -0.630 -1.564 0.185 | 59.895 12.460097
19 1.8 -0.117 -0.682 -1.652 0.174 | 61.411 13.691907
20 1.9 -0.124 -0.734 -1.740 0.163 | 62.983 15.054297
21 2.0 -0.131 -0.786 -1.828 0.151 | 64.608 16.540864
## Functions for desirability
## Maximum
f.d.max <- function(y, L, T, r) {
  # Desirability function (Derringer and Suich)
  # when the objective is to maximize the response
  # y = response
  # L = lower unacceptability boundary
  # T = target acceptability boundary
  # r = exponent
  y.L <- min(T, max(L, y)) # clamp y to [L, T]
  d <- ((y.L - L) / (T - L))^r # desirability
  return(d)
}
## Target
f.d.target <- function(y, L, T, U, r1, r2) {
  # Desirability function (Derringer and Suich)
  # when the objective is to hit a target value
  # y = response
  # L = lower unacceptability boundary
  # T = target acceptability boundary
  # U = upper unacceptability boundary
  # r1 = exponent for the L side
  # r2 = exponent for the U side
  y.L <- min(T, max(L, y)) # clamp y to [L, T]
  y.U <- max(T, min(U, y)) # clamp y to [T, U]
  d <- (((y.L - L) / (T - L))^r1) * (((U - y.U) / (U - T))^r2) # desirability
  return(d)
}
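A quick sanity check of these helpers against values that appear in the desirability tables below:
# sketch: spot-check the desirability functions
f.d.max(80, L = 78.5, T = 85, r = 1) # (80 - 78.5)/(85 - 78.5) = 0.2308
f.d.target(66, L = 62, T = 65, U = 68, r1 = 1, r2 = 1) # (68 - 66)/(68 - 65) = 0.6667
f.d.max(-3360, L = -3400, T = -3300, r = 1) # minimization via negation: 0.4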
#### 6.6
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_06-06.txt"
df.6.6 <- read.table(fn.data, header=TRUE)
str(df.6.6)
'data.frame': 13 obs. of 5 variables:
$ x1: num -1 1 -1 1 0 ...
$ x2: num -1 -1 1 1 0 0 0 0 0 0 ...
$ y1: num 76.5 77 78 79.5 79.9 80.3 80 79.7 79.8 78.4 ...
$ y2: int 62 60 66 59 72 69 68 70 71 68 ...
$ y3: int 2940 3470 3680 3890 3480 3200 3410 3290 3500 3360 ...
head(df.6.6,5)
x1 x2 y1 y2 y3
1 -1 -1 76.5 62 2940
2 1 -1 77.0 60 3470
3 -1 1 78.0 66 3680
4 1 1 79.5 59 3890
5 0 0 79.9 72 3480
Fit second-order linear models for each response variable. Based on the canonical analysis, y1 has a maximum, y2 has a maximum (one eigenvalue is essentially zero, indicating a stationary ridge), and y3 has a saddle point.
# 6.6, y1
library(rsm)
rsm.6.6.y1.SOx1x2 <- rsm(y1 ~ SO(x1, x2), data = df.6.6)
## externally Studentized residuals
#rsm.6.6.y1.SOx1x2$studres <- rstudent(rsm.6.6.y1.SOx1x2)
#summary(rsm.6.6.y1.SOx1x2)
# 6.6, y2
library(rsm)
rsm.6.6.y2.SOx1x2 <- rsm(y2 ~ SO(x1, x2), data = df.6.6)
## externally Studentized residuals
#rsm.6.6.y2.SOx1x2$studres <- rstudent(rsm.6.6.y2.SOx1x2)
#summary(rsm.6.6.y2.SOx1x2)
# 6.6, y3
library(rsm)
rsm.6.6.y3.SOx1x2 <- rsm(y3 ~ SO(x1, x2), data = df.6.6)
## externally Studentized residuals
#rsm.6.6.y3.SOx1x2$studres <- rstudent(rsm.6.6.y3.SOx1x2)
#summary(rsm.6.6.y3.SOx1x2)
canonical(rsm.6.6.y1.SOx1x2)$eigen$values
[1] -0.9634986 -1.4142867
canonical(rsm.6.6.y2.SOx1x2)$eigen$values
[1] 0.000000 -6.753528
canonical(rsm.6.6.y3.SOx1x2)$eigen$values
[1] 72.31441 -55.77167
Optimize the response subject to \[y_1 \geq 78.5, \quad 62 \le y_2 \le 68, \quad y_3 \le 3400.\] In particular (from p. 259):
yield: maximize y1: L = 78.5, (T = 85), r = 1
time: target y2: L = 62, T = 65, U = 68, r1 = r2 = 1
temperature: minimize y3: (T = 3300), U = 3400, r = 1
### Create empty columns to populate with the desirability values
df.6.6$d1 <- NA
df.6.6$d2 <- NA
df.6.6$d3 <- NA
df.6.6$D <- NA
# For each data value, calculate desirability
for (i.x in 1:dim(df.6.6)[1]) {
d1 <- f.d.max (df.6.6$y1[i.x]
, L = 78.5, T = 85, r = 1)
d2 <- f.d.target(df.6.6$y2[i.x]
, L = 62, T = 65, U = 68, r1 = 1, r2 = 1)
d3 <- f.d.max (-df.6.6$y3[i.x]
, L = -3400, T = -3300, r = 1)
# Combined desirability
D <- (d1 * d2 * d3)^(1/3)
df.6.6[i.x, c("d1", "d2", "d3", "D")] <- c(d1, d2, d3, D)
}
df.6.6
x1 x2 y1 y2 y3 d1 d2 d3 D
1 -1.000 -1.000 76.5 62 2940 0.0000000 0.0000000 1.0 0
2 1.000 -1.000 77.0 60 3470 0.0000000 0.0000000 0.0 0
3 -1.000 1.000 78.0 66 3680 0.0000000 0.6666667 0.0 0
4 1.000 1.000 79.5 59 3890 0.1538462 0.0000000 0.0 0
5 0.000 0.000 79.9 72 3480 0.2153846 0.0000000 0.0 0
6 0.000 0.000 80.3 69 3200 0.2769231 0.0000000 1.0 0
7 0.000 0.000 80.0 68 3410 0.2307692 0.0000000 0.0 0
8 0.000 0.000 79.7 70 3290 0.1846154 0.0000000 1.0 0
9 0.000 0.000 79.8 71 3500 0.2000000 0.0000000 0.0 0
10 -1.414 0.000 78.4 68 3360 0.0000000 0.0000000 0.4 0
11 1.414 0.000 75.6 71 3020 0.0000000 0.0000000 1.0 0
12 0.000 -1.414 78.5 58 3630 0.0000000 0.0000000 0.0 0
13 0.000 1.414 77.0 57 3150 0.0000000 0.0000000 1.0 0
# max among these points
df.6.6[which(df.6.6$D == max(df.6.6$D)),]
x1 x2 y1 y2 y3 d1 d2 d3 D
1 -1.000 -1.000 76.5 62 2940 0.0000000 0.0000000 1.0 0
2 1.000 -1.000 77.0 60 3470 0.0000000 0.0000000 0.0 0
3 -1.000 1.000 78.0 66 3680 0.0000000 0.6666667 0.0 0
4 1.000 1.000 79.5 59 3890 0.1538462 0.0000000 0.0 0
5 0.000 0.000 79.9 72 3480 0.2153846 0.0000000 0.0 0
6 0.000 0.000 80.3 69 3200 0.2769231 0.0000000 1.0 0
7 0.000 0.000 80.0 68 3410 0.2307692 0.0000000 0.0 0
8 0.000 0.000 79.7 70 3290 0.1846154 0.0000000 1.0 0
9 0.000 0.000 79.8 71 3500 0.2000000 0.0000000 0.0 0
10 -1.414 0.000 78.4 68 3360 0.0000000 0.0000000 0.4 0
11 1.414 0.000 75.6 71 3020 0.0000000 0.0000000 1.0 0
12 0.000 -1.414 78.5 58 3630 0.0000000 0.0000000 0.0 0
13 0.000 1.414 77.0 57 3150 0.0000000 0.0000000 1.0 0
# Create empty columns to populate with the desirability values
df.6.6$d1 <- NA
df.6.6$d2 <- NA
df.6.6$d3 <- NA
df.6.6$D <- NA
# For each data value, calculate desirability
for (i.x in 1:dim(df.6.6)[1]) {
d1 <- f.d.max (df.6.6$y1[i.x]
, L = 70, T = 85, r = 1)
d2 <- f.d.target(df.6.6$y2[i.x]
, L = 58, T = 65, U = 72, r1 = 1, r2 = 1)
d3 <- f.d.max (-df.6.6$y3[i.x]
, L = -3800, T = -3300, r = 1)
# Combined desirability
D <- (d1 * d2 * d3)^(1/3)
df.6.6[i.x, c("d1", "d2", "d3", "D")] <- c(d1, d2, d3, D)
}
df.6.6
x1 x2 y1 y2 y3 d1 d2 d3 D
1 -1.000 -1.000 76.5 62 2940 0.4333333 0.5714286 1.00 0.6279543
2 1.000 -1.000 77.0 60 3470 0.4666667 0.2857143 0.66 0.4447960
3 -1.000 1.000 78.0 66 3680 0.5333333 0.8571429 0.24 0.4787268
4 1.000 1.000 79.5 59 3890 0.6333333 0.1428571 0.00 0.0000000
5 0.000 0.000 79.9 72 3480 0.6600000 0.0000000 0.64 0.0000000
6 0.000 0.000 80.3 69 3200 0.6866667 0.4285714 1.00 0.6651553
7 0.000 0.000 80.0 68 3410 0.6666667 0.5714286 0.78 0.6673010
8 0.000 0.000 79.7 70 3290 0.6466667 0.2857143 1.00 0.5695574
9 0.000 0.000 79.8 71 3500 0.6533333 0.1428571 0.60 0.3825862
10 -1.414 0.000 78.4 68 3360 0.5600000 0.5714286 0.88 0.6554570
11 1.414 0.000 75.6 71 3020 0.3733333 0.1428571 1.00 0.3764144
12 0.000 -1.414 78.5 58 3630 0.5666667 0.0000000 0.34 0.0000000
13 0.000 1.414 77.0 57 3150 0.4666667 0.0000000 1.00 0.0000000
# max among these points
df.6.6[which(df.6.6$D == max(df.6.6$D)),]
x1 x2 y1 y2 y3 d1 d2 d3 D
7 0 0 80 68 3410 0.6666667 0.5714286 0.78 0.667301
# D as response
library(rsm)
rsm.6.6.D.SOx1x2 <- rsm(D ~ SO(x1, x2), data = df.6.6)
# externally Studentized residuals
rsm.6.6.D.SOx1x2$studres <- rstudent(rsm.6.6.D.SOx1x2)
summary(rsm.6.6.D.SOx1x2)
Call:
rsm(formula = D ~ SO(x1, x2), data = df.6.6)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.456904 0.107092 4.2665 0.003717 **
x1 -0.132076 0.084670 -1.5599 0.162750
x2 -0.074264 0.084670 -0.8771 0.409523
x1:x2 -0.073892 0.119733 -0.6171 0.556673
x1^2 0.062025 0.090811 0.6830 0.516563
x2^2 -0.196021 0.090811 -2.1586 0.067753 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.5676, Adjusted R-squared: 0.2587
F-statistic: 1.838 on 5 and 7 DF, p-value: 0.2242
Analysis of Variance Table
Response: D
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 0.18365 0.091823 1.6013 0.2675
TWI(x1, x2) 1 0.02184 0.021840 0.3809 0.5567
PQ(x1, x2) 2 0.32142 0.160710 2.8026 0.1276
Residuals 7 0.40140 0.057344
Lack of fit 3 0.08679 0.028931 0.3678 0.7812
Pure error 4 0.31461 0.078653
Stationary point of response surface:
x1 x2
0.8557887 -0.3507283
Eigenanalysis:
eigen() decomposition
$values
[1] 0.06721042 -0.20120656
$vectors
[,1] [,2]
x1 -0.9902933 0.1389935
x2 0.1389935 0.9902933
par(mfrow = c(1,1))
contour(rsm.6.6.D.SOx1x2, ~ x1 + x2
, bounds = list(x1 = c(-2, 2), x2 = c(-2, 2))
, image=TRUE, main = "D")
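A perspective plot (a sketch using rsm's persp method for fitted rsm/lm objects) gives another view of the same fitted surface:
# 3-D view of the fitted desirability surface over the same region
persp(rsm.6.6.D.SOx1x2, ~ x1 + x2
, bounds = list(x1 = c(-2, 2), x2 = c(-2, 2))
, zlab = "D", main = "D")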
A brute-force search for the optimum predicted desirability, D, over the ±2-unit square in coded units gives the result below.
### x-values
D.cube <- expand.grid(seq(-2, 2, by=0.1), seq(-2, 2, by=0.1))
colnames(D.cube) <- c("x1", "x2")
# predicted D
D.cube$D <- predict(rsm.6.6.D.SOx1x2, newdata = D.cube)
# predicted optimum
D.opt <- D.cube[which(D.cube$D == max(D.cube$D)),]
D.opt
x1 x2 D
903 -2 0.2 0.9760195
# predicted response at the optimal D
y1 <- predict(rsm.6.6.y1.SOx1x2, newdata = D.opt)
y2 <- predict(rsm.6.6.y2.SOx1x2, newdata = D.opt)
y3 <- predict(rsm.6.6.y3.SOx1x2, newdata = D.opt)
c(y1, y2, y3)
903 903 903
74.83096 68.71269 3190.54579
Always check that the optimizing point satisfies the specified constraints; here the predicted y1 = 74.8 violates y1 >= 78.5, so this unconstrained optimum is not acceptable. Summary: with the original desirability bounds, D is zero at every design point; with the wider bounds, the response-surface analysis (RSA) of D gives a saddle point.
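A sketch of how that check can be automated: predict all three responses over the same grid, keep only the points satisfying the constraints, and maximize D over that feasible subset.
# predicted responses over the grid
D.cube$y1 <- predict(rsm.6.6.y1.SOx1x2, newdata = D.cube)
D.cube$y2 <- predict(rsm.6.6.y2.SOx1x2, newdata = D.cube)
D.cube$y3 <- predict(rsm.6.6.y3.SOx1x2, newdata = D.cube)
# feasible points under the original constraints
D.feas <- subset(D.cube, (y1 >= 78.5) & (y2 >= 62) & (y2 <= 68) & (y3 <= 3400))
# constrained optimum, if any feasible grid point exists
if (nrow(D.feas) > 0) { D.feas[which.max(D.feas$D), ] }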
Alternatively, perform the RSA on the desirability function D evaluated at the predicted responses at the design points. At the center point, the individual desirabilities and the combined desirability D are:
d1 <- f.d.max (predict(rsm.6.6.y1.SOx1x2, newdata = data.frame(x1 = 0, x2 = 0))
, L = 78.5, T = 85, r = 1)
d2 <- f.d.target(predict(rsm.6.6.y2.SOx1x2, newdata = data.frame(x1 = 0, x2 = 0))
, L = 62, T = 65, U = 68, r1 = 1, r2 = 1)
d3 <- f.d.max (-predict(rsm.6.6.y3.SOx1x2, newdata = data.frame(x1 = 0, x2 = 0))
, L = -3400, T = -3300, r = 1)
# Combined desirability
D <- (d1 * d2 * d3)^(1/3)
c(d1, d2, d3, D)
[1] 0.2215315 0.0000000 0.2402477 0.0000000
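A sketch of the full version of this approach (assuming vectorized desirability functions, as in the sketch above): compute D from the fitted responses at every design point, then fit the second-order model to those smoothed D values.
# predicted responses at the design points
yhat1 <- predict(rsm.6.6.y1.SOx1x2)
yhat2 <- predict(rsm.6.6.y2.SOx1x2)
yhat3 <- predict(rsm.6.6.y3.SOx1x2)
d1 <- f.d.max (yhat1, L = 78.5, T = 85, r = 1)
d2 <- f.d.target(yhat2, L = 62, T = 65, U = 68, r1 = 1, r2 = 1)
d3 <- f.d.max (-yhat3, L = -3400, T = -3300, r = 1)
df.6.6$D.hat <- (d1 * d2 * d3)^(1/3)
# RSA on the desirability of the fitted responses
rsm.6.6.Dhat.SOx1x2 <- rsm(D.hat ~ SO(x1, x2), data = df.6.6)
#summary(rsm.6.6.Dhat.SOx1x2)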
#### 7.6
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_07-06.txt"
df.7.6 <- read.table(fn.data, header=TRUE)
df.7.6$block <- factor(df.7.6$block)
str(df.7.6)
'data.frame': 14 obs. of 4 variables:
$ x1 : num -1 -1 1 1 0 0 0 0 0 0 ...
$ x2 : num -1 1 -1 1 0 0 0 0 0 0 ...
$ block: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 2 2 2 ...
$ y : num 80.5 81.5 82 83.5 83.9 84.3 84 79.7 79.8 79.5 ...
head(df.7.6,10)
x1 x2 block y
1 -1 -1 1 80.5
2 -1 1 1 81.5
3 1 -1 1 82.0
4 1 1 1 83.5
5 0 0 1 83.9
6 0 0 1 84.3
7 0 0 1 84.0
8 0 0 2 79.7
9 0 0 2 79.8
10 0 0 2 79.5
library(rsm)
rsm.7.6.y.SOx1x2 <- rsm(y ~ SO(x1, x2), data = df.7.6)
# externally Studentized residuals
rsm.7.6.y.SOx1x2$studres <- rstudent(rsm.7.6.y.SOx1x2)
summary(rsm.7.6.y.SOx1x2)
Call:
rsm(formula = y ~ SO(x1, x2), data = df.7.6)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 81.86621 1.20528 67.9228 2.457e-12 ***
x1 0.93254 1.04388 0.8933 0.3978
x2 0.57771 1.04388 0.5534 0.5951
x1:x2 0.12500 1.47616 0.0847 0.9346
x1^2 -1.30816 1.08667 -1.2038 0.2631
x2^2 -0.93305 1.08667 -0.8586 0.4155
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.2826, Adjusted R-squared: -0.1658
F-statistic: 0.6303 on 5 and 8 DF, p-value: 0.683
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 9.626 4.8128 0.5522 0.5962
TWI(x1, x2) 1 0.063 0.0625 0.0072 0.9346
PQ(x1, x2) 2 17.779 8.8896 1.0199 0.4031
Residuals 8 69.730 8.7162
Lack of fit 3 40.557 13.5189 2.3170 0.1928
Pure error 5 29.173 5.8347
Stationary point of response surface:
x1 x2
0.3724143 0.3345289
Eigenanalysis:
eigen() decomposition
$values
[1] -0.922910 -1.318302
$vectors
[,1] [,2]
x1 -0.1601375 -0.9870947
x2 -0.9870947 0.1601375
Refit including the block effect; accounting for the block-to-block difference drops the residual mean square from 8.72 to 0.027.
library(rsm)
rsm.7.6.y.b.SOx1x2 <- rsm(y ~ block + SO(x1, x2), data = df.7.6)
# externally Studentized residuals
rsm.7.6.y.b.SOx1x2$studres <- rstudent(rsm.7.6.y.b.SOx1x2)
summary(rsm.7.6.y.b.SOx1x2)
Call:
rsm(formula = y ~ block + SO(x1, x2), data = df.7.6)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 84.095427 0.079631 1056.067 < 2.2e-16 ***
block2 -4.457530 0.087226 -51.103 2.877e-10 ***
x1 0.932541 0.057699 16.162 8.444e-07 ***
x2 0.577712 0.057699 10.012 2.122e-05 ***
x1:x2 0.125000 0.081592 1.532 0.1694
x1^2 -1.308555 0.060064 -21.786 1.083e-07 ***
x2^2 -0.933442 0.060064 -15.541 1.104e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9981, Adjusted R-squared: 0.9964
F-statistic: 607.2 on 6 and 7 DF, p-value: 3.811e-09
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
block 1 69.531 69.531 2611.0950 2.879e-10
FO(x1, x2) 2 9.626 4.813 180.7341 9.450e-07
TWI(x1, x2) 1 0.063 0.063 2.3470 0.1694
PQ(x1, x2) 2 17.791 8.896 334.0539 1.135e-07
Residuals 7 0.186 0.027
Lack of fit 3 0.053 0.018 0.5307 0.6851
Pure error 4 0.133 0.033
Stationary point of response surface:
x1 x2
0.3722954 0.3343802
Eigenanalysis:
eigen() decomposition
$values
[1] -0.9233027 -1.3186949
$vectors
[,1] [,2]
x1 -0.1601375 -0.9870947
x2 -0.9870947 0.1601375
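Since rsm objects inherit from lm, the two fits can also be compared directly with anova() to test the block effect formally (a sketch; output not shown):
# nested-model comparison: unblocked vs. blocked second-order fit
anova(rsm.7.6.y.SOx1x2, rsm.7.6.y.b.SOx1x2)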
Equivalently, code the block as a centered numeric variable (±0.5); the resulting fit matches the factor version.
df.7.6$block.num <- as.numeric(df.7.6$block) - 1.5
head(df.7.6,10)
x1 x2 block y block.num
1 -1 -1 1 80.5 -0.5
2 -1 1 1 81.5 -0.5
3 1 -1 1 82.0 -0.5
4 1 1 1 83.5 -0.5
5 0 0 1 83.9 -0.5
6 0 0 1 84.3 -0.5
7 0 0 1 84.0 -0.5
8 0 0 2 79.7 0.5
9 0 0 2 79.8 0.5
10 0 0 2 79.5 0.5
library(rsm)
rsm.7.6.y.bn.SOx1x2 <- rsm(y ~ block.num + SO(x1, x2), data = df.7.6)
# externally Studentized residuals
rsm.7.6.y.bn.SOx1x2$studres <- rstudent(rsm.7.6.y.bn.SOx1x2)
summary(rsm.7.6.y.bn.SOx1x2)
Call:
rsm(formula = y ~ block.num + SO(x1, x2), data = df.7.6)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 81.866662 0.066620 1228.863 < 2.2e-16 ***
block.num -4.457530 0.087226 -51.103 2.877e-10 ***
x1 0.932541 0.057699 16.162 8.444e-07 ***
x2 0.577712 0.057699 10.012 2.122e-05 ***
x1:x2 0.125000 0.081592 1.532 0.1694
x1^2 -1.308555 0.060064 -21.786 1.083e-07 ***
x2^2 -0.933442 0.060064 -15.541 1.104e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9981, Adjusted R-squared: 0.9964
F-statistic: 607.2 on 6 and 7 DF, p-value: 3.811e-09
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
block.num 1 69.531 69.531 2611.0950 2.879e-10
FO(x1, x2) 2 9.626 4.813 180.7341 9.450e-07
TWI(x1, x2) 1 0.063 0.063 2.3470 0.1694
PQ(x1, x2) 2 17.791 8.896 334.0539 1.135e-07
Residuals 7 0.186 0.027
Lack of fit 3 0.053 0.018 0.5307 0.6851
Pure error 4 0.133 0.033
Stationary point of response surface:
x1 x2
0.3722954 0.3343802
Eigenanalysis:
eigen() decomposition
$values
[1] -0.9233027 -1.3186949
$vectors
[,1] [,2]
x1 -0.1601375 -0.9870947
x2 -0.9870947 0.1601375
8.1 Evaluating design efficiency

Here are some ideas for evaluating the efficiency of designs. Let's start with a single replicate of a 2^3 design.
#### 8.1
df.8.1 <- read.table(text="
x1 x2 x3
-1 -1 -1
1 -1 -1
-1 1 -1
1 1 -1
-1 -1 1
1 -1 1
-1 1 1
1 1 1
", header=TRUE)
str(df.8.1)
'data.frame': 8 obs. of 3 variables:
$ x1: int -1 1 -1 1 -1 1 -1 1
$ x2: int -1 -1 1 1 -1 -1 1 1
$ x3: int -1 -1 -1 -1 1 1 1 1
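The same candidate set could also be generated programmatically (a sketch; expand.grid varies its first factor fastest, matching the run order above):
# full-factorial 2^3 design via expand.grid
df.8.1 <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1))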
You can use the eval.design() function from the AlgDesign package to evaluate the D- and A-efficiencies of the design. The efficiency depends on the model formula you specify.
library(AlgDesign)
eval.design(~ x1 + x2 + x3, df.8.1)
$determinant
[1] 1
$A
[1] 1
$diagonality
[1] 1
$gmean.variances
[1] 1
# these can be assigned to variables
D.eff.lin <- eval.design(~ x1 + x2 + x3, df.8.1)$determinant
# note that A-efficiency is 1/A given from eval.design()
A.eff.lin <- 1 / eval.design(~ x1 + x2 + x3, df.8.1)$A
c(D.eff.lin, A.eff.lin)
[1] 1 1
Now we calculate the same quantities after adding 3 center points.
df.8.1c <- rbind(df.8.1, data.frame(x1 = rep(0,3), x2 = rep(0,3), x3 = rep(0,3)))
df.8.1c
x1 x2 x3
1 -1 -1 -1
2 1 -1 -1
3 -1 1 -1
4 1 1 -1
5 -1 -1 1
6 1 -1 1
7 -1 1 1
8 1 1 1
9 0 0 0
10 0 0 0
11 0 0 0
library(AlgDesign)
eval.design(~ x1 + x2 + x3, df.8.1c)
$determinant
[1] 0.7875406
$A
[1] 1.28125
$diagonality
[1] 1
$gmean.variances
[1] 1.375
D.eff.lin <- eval.design(~ x1 + x2 + x3, df.8.1c)$determinant
A.eff.lin <- 1 / eval.design(~ x1 + x2 + x3, df.8.1c)$A
c(D.eff.lin, A.eff.lin)
[1] 0.7875406 0.7804878
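For intuition, these quantities can be reproduced directly from the model matrix (a sketch; AlgDesign's documented definitions are determinant = det(X'X/n)^(1/p) and A = average of diag(n (X'X)^-1)):
# reproduce eval.design()'s D and A from the model matrix
X <- model.matrix(~ x1 + x2 + x3, df.8.1c)
n <- nrow(X)
p <- ncol(X)
# D-efficiency: determinant of the scaled moment matrix, to the 1/p power
D.eff.byhand <- det(crossprod(X) / n)^(1/p)
# A: average scaled variance of the coefficient estimates
A.byhand <- mean(diag(n * solve(crossprod(X))))
c(D.eff.byhand, 1 / A.byhand) # should match 0.7875406 0.7804878 above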
Sometimes it is desirable to augment a design in an optimal way. Suppose it is expected that a linear model will suffice for the experiment, and so the design in df.8.2 is run.
df.8.2 <- read.table(text="
x1 x2
-1 -1
1 -1
-1 1
0 0
0 0
", header=TRUE)
str(df.8.2)
'data.frame': 5 obs. of 2 variables:
$ x1: int -1 1 -1 0 0
$ x2: int -1 -1 1 0 0
Later, the experimenters realize that a second-order design would be more appropriate. They construct a sensible list of design points to include (based on a face-centered CCD), evaluate the axial (star) points at several distances (0.5, 1.0, 1.5, 2.0), and find the D-optimal augmentation at each.
for (i.alpha in c(0.5, 1.0, 1.5, 2.0)) {
# augment design candidate points
D.candidate.points <- i.alpha * data.frame(x1 = c( 1, -1, 0, 0)
, x2 = c( 0, 0, 1, -1))
# combine original points with candidate points
D.all <- rbind(df.8.2, D.candidate.points)
# keep the original points and augment the design with 4 points
library(AlgDesign)
D.aug <- optFederov(~ quad(.), D.all, nTrials = dim(df.8.2)[1] + 4, rows = 1:dim(df.8.2)[1]
, augment = TRUE, criterion = "D", maxIteration = 1e4, nRepeats = 1e2)
## same as:
#D.aug <- optFederov(~ x1 + x2 + x1:x2 + x1^2 + x2^2
# , D.all, nTrials = dim(df.8.2)[1] + 4, rows = 1:dim(df.8.2)[1]
# , augment = TRUE, criterion = "D", maxIteration = 1e4, nRepeats = 1e2)
# Added points
print(D.aug$design[(dim(df.8.2)[1]+1):dim(D.aug$design)[1],])
library(AlgDesign)
# quadratic
#rm(D.eff.lin, A.eff.lin)
D.eff.lin <- eval.design(~ quad(.), D.aug$design)$determinant
A.eff.lin <- 1 / eval.design(~ quad(.), D.aug$design)$A
print(c(i.alpha, D.eff.lin, A.eff.lin))
}
x1 x2
6 0.5 0.0
7 -0.5 0.0
8 0.0 0.5
9 0.0 -0.5
[1] 0.50000000 0.18812090 0.05286691
x1 x2
6 1 0
7 -1 0
8 0 1
9 0 -1
[1] 1.0000000 0.3812509 0.2556391
x1 x2
6 1.5 0.0
7 -1.5 0.0
8 0.0 1.5
9 0.0 -1.5
[1] 1.5000000 0.6583697 0.4208625
x1 x2
6 2 0
7 -2 0
8 0 2
9 0 -2
[1] 2.0000000 1.1079479 0.6066351
Funding comes through, so they are not limited to adding just the four axial points but can afford to add any 12 points they like to the initial runs. They consider a grid of candidate (x1, x2) points and find the D-optimal augmentation.
### augment design candidate points
D.candidate.points <- expand.grid(x1 = seq(-2, 2, by = 0.5)
, x2 = seq(-2, 2, by = 0.5))
str(D.candidate.points)
'data.frame': 81 obs. of 2 variables:
$ x1: num -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 -2 ...
$ x2: num -2 -2 -2 -2 -2 -2 -2 -2 -2 -1.5 ...
- attr(*, "out.attrs")=List of 2
..$ dim : Named int [1:2] 9 9
.. ..- attr(*, "names")= chr [1:2] "x1" "x2"
..$ dimnames:List of 2
.. ..$ x1: chr [1:9] "x1=-2.0" "x1=-1.5" "x1=-1.0" "x1=-0.5" ...
.. ..$ x2: chr [1:9] "x2=-2.0" "x2=-1.5" "x2=-1.0" "x2=-0.5" ...
head(D.candidate.points)
x1 x2
1 -2.0 -2
2 -1.5 -2
3 -1.0 -2
4 -0.5 -2
5 0.0 -2
6 0.5 -2
tail(D.candidate.points)
x1 x2
76 -0.5 2
77 0.0 2
78 0.5 2
79 1.0 2
80 1.5 2
81 2.0 2
# combine original points with candidate points
D.all <- rbind(df.8.2, D.candidate.points)
# keep the original points and augment the design with 4 points
library(AlgDesign)
D.aug <- optFederov(~ quad(.), D.all, nTrials = dim(df.8.2)[1] + 12, rows = 1:dim(df.8.2)[1]
, augment = TRUE, criterion = "D", maxIteration = 1e4, nRepeats = 1e2)
# Added points
print(D.aug$design[(dim(df.8.2)[1]+1):dim(D.aug$design)[1],])
x1 x2
6 -2.0 -2.0
7 -1.5 -2.0
10 0.0 -2.0
14 2.0 -2.0
23 2.0 -1.5
33 -2.0 -0.5
59 2.0 0.5
69 -2.0 1.5
78 -2.0 2.0
82 0.0 2.0
85 1.5 2.0
86 2.0 2.0
library(AlgDesign)
# quadratic
#rm(D.eff.lin, A.eff.lin)
D.eff.lin <- eval.design(~ quad(.), D.aug$design)$determinant
A.eff.lin <- 1 / eval.design(~ quad(.), D.aug$design)$A
print(c(D.eff.lin, A.eff.lin))
[1] 2.504373 1.017305
### full design
library(plyr)
D.aug.ordered <- arrange(D.aug$design, x1, x2)
D.aug.ordered
x1 x2
1 -2.0 -2.0
2 -2.0 -0.5
3 -2.0 1.5
4 -2.0 2.0
5 -1.5 -2.0
6 -1.0 -1.0
7 -1.0 1.0
8 0.0 -2.0
9 0.0 0.0
10 0.0 0.0
11 0.0 2.0
12 1.0 -1.0
13 1.5 2.0
14 2.0 -2.0
15 2.0 -1.5
16 2.0 0.5
17 2.0 2.0
The AlgDesign package can also be used to create computer-generated designs from scratch, given a list of permitted/suggested candidate design points.
\[ \text{D-optimal } 3 \times 2 \times 2 \text{ design for } y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \epsilon \]
#### 8.10
# design candidate values 3 * 2 * 2
D.candidate.points <- expand.grid(x1 = seq(-1, 1, by = 1)
, x2 = c(-1, 1)
, x3 = c(-1, 1))
D.candidate.points
x1 x2 x3
1 -1 -1 -1
2 0 -1 -1
3 1 -1 -1
4 -1 1 -1
5 0 1 -1
6 1 1 -1
7 -1 -1 1
8 0 -1 1
9 1 -1 1
10 -1 1 1
11 0 1 1
12 1 1 1
# choose the 12 points based on D-criterion
library(AlgDesign)
D.gen <- optFederov(~ x1 + x2 + x3 + x1^2, D.candidate.points, nTrials = 12
, criterion = "D", evaluateI = TRUE, maxIteration = 1e4, nRepeats = 1e2)
D.gen
$D
[1] 0.903602
$A
[1] 1.125
$I
[1] 4
$Ge
[1] 0.889
$Dea
[1] 0.882
$design
x1 x2 x3
1 -1 -1 -1
2 0 -1 -1
3 1 -1 -1
4 -1 1 -1
5 0 1 -1
6 1 1 -1
7 -1 -1 1
8 0 -1 1
9 1 -1 1
10 -1 1 1
11 0 1 1
12 1 1 1
$rows
[1] 1 2 3 4 5 6 7 8 9 10 11 12
A.eff <- 1/D.gen$A # A efficiency is the reciprocal of $A
# easier to read when ordered
library(plyr)
D.gen.ordered <- arrange(D.gen$design, x1, x2, x3)
D.gen.ordered
x1 x2 x3
1 -1 -1 -1
2 -1 -1 1
3 -1 1 -1
4 -1 1 1
5 0 -1 -1
6 0 -1 1
7 0 1 -1
8 0 1 1
9 1 -1 -1
10 1 -1 1
11 1 1 -1
12 1 1 1
\[ \text{D-optimal } 5 \times 2 \times 2 \text{ design for } y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \epsilon \]
#### 8.10
# design candidate values 5 * 2 * 2
D.candidate.points <- expand.grid(x1 = seq(-1, 1, by = 0.5)
, x2 = c(-1, 1)
, x3 = c(-1, 1))
D.candidate.points
x1 x2 x3
1 -1.0 -1 -1
2 -0.5 -1 -1
3 0.0 -1 -1
4 0.5 -1 -1
5 1.0 -1 -1
6 -1.0 1 -1
7 -0.5 1 -1
8 0.0 1 -1
9 0.5 1 -1
10 1.0 1 -1
11 -1.0 -1 1
12 -0.5 -1 1
13 0.0 -1 1
14 0.5 -1 1
15 1.0 -1 1
16 -1.0 1 1
17 -0.5 1 1
18 0.0 1 1
19 0.5 1 1
20 1.0 1 1
# choose the 12 points based on D-criterion
library(AlgDesign)
D.gen <- optFederov(~ x1 + x2 + x3 + x1^2, D.candidate.points, nTrials = 12
, criterion = "D", evaluateI = TRUE, maxIteration = 1e4, nRepeats = 1e2)
D.gen
$D
[1] 0.9306049
$A
[1] 1.083333
$I
[1] 3.666667
$Ge
[1] 0.923
$Dea
[1] 0.92
$design
x1 x2 x3
1 -1.0 -1 -1
2 -0.5 -1 -1
5 1.0 -1 -1
6 -1.0 1 -1
9 0.5 1 -1
10 1.0 1 -1
11 -1.0 -1 1
14 0.5 -1 1
15 1.0 -1 1
16 -1.0 1 1
17 -0.5 1 1
20 1.0 1 1
$rows
[1] 1 2 5 6 9 10 11 14 15 16 17 20
A.eff <- 1/D.gen$A # A efficiency is the reciprocal of $A
# easier to read when ordered
library(plyr)
D.gen.ordered <- arrange(D.gen$design, x1, x2, x3)
D.gen.ordered
x1 x2 x3
1 -1.0 -1 -1
2 -1.0 -1 1
3 -1.0 1 -1
4 -1.0 1 1
5 -0.5 -1 -1
6 -0.5 1 1
7 0.5 -1 1
8 0.5 1 -1
9 1.0 -1 -1
10 1.0 -1 1
11 1.0 1 -1
12 1.0 1 1
$$ \text{I-optimal } 5 \times 2 \times 2 \text{ design for } y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \epsilon $$
#### 8.10
# design candidate values 5 * 2 * 2
D.candidate.points <- expand.grid(x1 = seq(-1, 1, by = 0.5)
, x2 = c(-1, 1)
, x3 = c(-1, 1))
D.candidate.points
x1 x2 x3
1 -1.0 -1 -1
2 -0.5 -1 -1
3 0.0 -1 -1
4 0.5 -1 -1
5 1.0 -1 -1
6 -1.0 1 -1
7 -0.5 1 -1
8 0.0 1 -1
9 0.5 1 -1
10 1.0 1 -1
11 -1.0 -1 1
12 -0.5 -1 1
13 0.0 -1 1
14 0.5 -1 1
15 1.0 -1 1
16 -1.0 1 1
17 -0.5 1 1
18 0.0 1 1
19 0.5 1 1
20 1.0 1 1
# choose the 12 points based on I-criterion
library(AlgDesign)
D.gen <- optFederov(~ x1 + x2 + x3 + x1^2, D.candidate.points, nTrials = 12
, criterion = "I", evaluateI = TRUE, maxIteration = 1e4, nRepeats = 1e2)
D.gen
$D
[1] 0.9306049
$A
[1] 1.083333
$I
[1] 3.666667
$Ge
[1] 0.923
$Dea
[1] 0.92
$design
x1 x2 x3
1 -1.0 -1 -1
4 0.5 -1 -1
5 1.0 -1 -1
6 -1.0 1 -1
7 -0.5 1 -1
10 1.0 1 -1
11 -1.0 -1 1
12 -0.5 -1 1
15 1.0 -1 1
16 -1.0 1 1
19 0.5 1 1
20 1.0 1 1
$rows
[1] 1 4 5 6 7 10 11 12 15 16 19 20
A.eff <- 1/D.gen$A # A efficiency is the reciprocal of $A
# easier to read when ordered
library(plyr)
D.gen.ordered <- arrange(D.gen$design, x1, x2, x3)
D.gen.ordered
x1 x2 x3
1 -1.0 -1 -1
2 -1.0 -1 1
3 -1.0 1 -1
4 -1.0 1 1
5 -0.5 -1 1
6 -0.5 1 -1
7 0.5 -1 -1
8 0.5 1 1
9 1.0 -1 -1
10 1.0 -1 1
11 1.0 1 -1
12 1.0 1 1
$$ \text{A-optimal } 5 \times 2 \times 2 \text{ design for } y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \epsilon $$
#### 8.10
# design candidate values
D.candidate.points <- expand.grid(x1 = seq(-1, 1, by = 0.5)
, x2 = c(-1, 1)
, x3 = c(-1, 1))
#D.candidate.points
# choose the 12 points based on A-criterion
library(AlgDesign)
D.gen <- optFederov(~ x1 + x2 + x3 + x1^2, D.candidate.points, nTrials = 12
, criterion = "A", evaluateI = TRUE, maxIteration = 1e4, nRepeats = 1e2)
D.gen
$D
[1] 0.9306049
$A
[1] 1.083333
$I
[1] 3.666667
$Ge
[1] 0.923
$Dea
[1] 0.92
$design
x1 x2 x3
1 -1.0 -1 -1
2 -0.5 -1 -1
5 1.0 -1 -1
6 -1.0 1 -1
9 0.5 1 -1
10 1.0 1 -1
11 -1.0 -1 1
14 0.5 -1 1
15 1.0 -1 1
16 -1.0 1 1
17 -0.5 1 1
20 1.0 1 1
$rows
[1] 1 2 5 6 9 10 11 14 15 16 17 20
A.eff <- 1/D.gen$A # A efficiency is the reciprocal of $A
# easier to read when ordered
library(plyr)
D.gen.ordered <- arrange(D.gen$design, x1, x2, x3)
D.gen.ordered
x1 x2 x3
1 -1.0 -1 -1
2 -1.0 -1 1
3 -1.0 1 -1
4 -1.0 1 1
5 -0.5 -1 -1
6 -0.5 1 1
7 0.5 -1 1
8 0.5 1 -1
9 1.0 -1 -1
10 1.0 -1 1
11 1.0 1 -1
12 1.0 1 1
The function varfcn() computes the scaled variance function for a design, based on a specified model, and plots it either as a function of the radius in a specified direction or as a contour plot.
library(rsm)
par(mfrow = c(2,2))
varfcn(D.gen.ordered, ~ x1 + x2 + x3 + x1^2)
# the contour plots will plot the first two variables in the formula
varfcn(D.gen.ordered, ~ x1 + x2 + x3 + x1^2, contour = TRUE)
varfcn(D.gen.ordered, ~ x2 + x3 + x1 + x1^2, contour = TRUE)
varfcn(D.gen.ordered, ~ x1 + x3 + x2 + x1^2, contour = TRUE)
#### 10.5
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_10-05.txt"
df.10.5 <- read.table(fn.data, header=TRUE)
df.10.5
x1 x2 y1 y2 y3 y4
1 -1 -1 33.5021 41.2268 25.2683 31.9930
2 -1 0 35.8234 38.0689 32.7928 34.0383
3 -1 1 33.0773 31.8435 36.2500 34.0162
4 0 -1 30.4481 41.2870 15.1493 23.9883
5 0 0 34.8679 40.2276 27.7724 31.1321
6 0 1 35.2202 37.1008 33.3280 35.2085
7 1 -1 21.1553 34.1086 0.7917 15.7450
8 1 0 27.6736 38.1477 15.5132 25.9873
9 1 1 32.1245 38.1193 26.1673 32.1622
# calculate some summary statistics, especially SNR
df.10.5a <- df.10.5
df.10.5a$m <- apply(df.10.5a[,c("y1","y2","y3","y4")], 1, mean)
df.10.5a$s <- apply(df.10.5a[,c("y1","y2","y3","y4")], 1, sd)
df.10.5a$s2 <- apply(df.10.5a[,c("y1","y2","y3","y4")], 1, var)
df.10.5a$logs2 <- log(df.10.5a$s2)
df.10.5a$SNR <- apply(df.10.5a[,c("y1","y2","y3","y4")], 1
, function(x) {
-10 * log10( sum(1 / x^2) / length(x) )
})
# average SNR over each condition, independently
library(plyr)
df.10.5a.SNR <- rbind(
ddply(df.10.5a, .(x1), function(.X) {
data.frame(
var = "x1"
, level = .X$x1[1]
, m.SNR = mean(.X$SNR)
, s.SNR = sd(.X$SNR)
, min.SNR = min(.X$SNR)
, max.SNR = max(.X$SNR)
)
})[,-1]
,
ddply(df.10.5a, .(x2), function(.X) {
data.frame(
var = "x2"
, level = .X$x2[1]
, m.SNR = mean(.X$SNR)
, s.SNR = sd(.X$SNR)
, min.SNR = min(.X$SNR)
, max.SNR = max(.X$SNR)
)
})[,-1]
)
df.10.5a.SNR
var level m.SNR s.SNR min.SNR max.SNR
1 x1 -1 30.46985 0.4599576 29.975634 30.88540
2 x1 0 29.43202 2.0275222 27.121776 30.91566
3 x1 1 20.35925 14.2560878 3.972453 29.90935
4 x2 -1 20.35662 14.2606748 3.972453 29.97563
5 x2 0 29.44667 1.9741986 27.195963 30.88540
6 x2 1 30.45784 0.5092463 29.909346 30.91566
First, model the mean and the log of the variance separately, as suggested on p. 519 of the text. Fit second-order linear models for \( \bar{y} \) and \( \log(s^2) \).
library(rsm)
rsm.10.5a.m.SOx1x2 <- rsm(m ~ SO(x1, x2), data = df.10.5a)
# externally Studentized residuals
rsm.10.5a.m.SOx1x2$studres <- rstudent(rsm.10.5a.m.SOx1x2)
summary(rsm.10.5a.m.SOx1x2)
Call:
rsm(formula = m ~ SO(x1, x2), data = df.10.5a)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.388881 0.071727 465.498 2.186e-08 ***
x1 -4.175204 0.039287 -106.275 1.837e-06 ***
x2 3.748096 0.039287 95.404 2.539e-06 ***
x1:x2 3.348494 0.048116 69.592 6.538e-06 ***
x1^2 -2.327671 0.068046 -34.207 5.493e-05 ***
x2^2 -1.867046 0.068046 -27.438 0.0001063 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.9999, Adjusted R-squared: 0.9997
F-statistic: 5432 on 5 and 3 DF, p-value: 3.94e-06
Analysis of Variance Table
Response: m
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 188.883 94.442 10198.17 1.783e-06
TWI(x1, x2) 1 44.850 44.850 4843.03 6.538e-06
PQ(x1, x2) 2 17.808 8.904 961.48 6.148e-05
Residuals 3 0.028 0.009
Lack of fit 3 0.028 0.009 NaN NaN
Pure error 0 0.000 NaN
Stationary point of response surface:
x1 x2
-0.4926412 0.5619813
Eigenanalysis:
eigen() decomposition
$values
[1] -0.4073446 -3.7873721
$vectors
[,1] [,2]
x1 -0.6571611 -0.7537501
x2 -0.7537501 0.6571611
library(rsm)
rsm.10.5a.logs2.SOx1x2 <- rsm(logs2 ~ SO(x1, x2), data = df.10.5a)
# externally Studentized residuals
rsm.10.5a.logs2.SOx1x2$studres <- rstudent(rsm.10.5a.logs2.SOx1x2)
summary(rsm.10.5a.logs2.SOx1x2)
Call:
rsm(formula = logs2 ~ SO(x1, x2), data = df.10.5a)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.986329 0.541521 5.5147 0.01174 *
x1 1.034956 0.296603 3.4894 0.03979 *
x2 -1.421199 0.296603 -4.7916 0.01729 *
x1:x2 0.109458 0.363263 0.3013 0.78285
x1^2 0.251571 0.513732 0.4897 0.65792
x2^2 0.026179 0.513732 0.0510 0.96256
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.922, Adjusted R-squared: 0.792
F-statistic: 7.094 on 5 and 3 DF, p-value: 0.06883
Analysis of Variance Table
Response: logs2
Df Sum Sq Mean Sq F value Pr(>F)
FO(x1, x2) 2 18.5457 9.2728 17.5674 0.02206
TWI(x1, x2) 1 0.0479 0.0479 0.0908 0.78285
PQ(x1, x2) 2 0.1279 0.0640 0.1212 0.88999
Residuals 3 1.5835 0.5278
Lack of fit 3 1.5835 0.5278 NaN NaN
Pure error 0 0.0000 NaN
Stationary point of response surface:
x1 x2
-14.60436 57.67617
Eigenanalysis:
eigen() decomposition
$values
[1] 0.26415693 0.01359235
$vectors
[,1] [,2]
x1 -0.9745607 0.2241237
x2 -0.2241237 -0.9745607
par(mfrow = c(1,2))
# this is the stationary point
canonical(rsm.10.5a.m.SOx1x2)$xs
x1 x2
-0.4926412 0.5619813
contour(rsm.10.5a.m.SOx1x2, ~ x1 + x2, image=TRUE
, at = canonical(rsm.10.5a.m.SOx1x2)$xs, main = "mean")
# this is the stationary point
canonical(rsm.10.5a.logs2.SOx1x2)$xs
x1 x2
-1.2730074 -0.2927588
contour(rsm.10.5a.logs2.SOx1x2, ~ x1 + x2, image=TRUE
, at = canonical(rsm.10.5a.logs2.SOx1x2)$xs, main = "log(s^2)")
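A common next step in this dual-response approach (a sketch, not part of the original example) is a grid search for control settings that keep the predicted mean high while holding the predicted variance down:
# grid over the design region
grid.10.5 <- expand.grid(x1 = seq(-1, 1, by = 0.05)
, x2 = seq(-1, 1, by = 0.05))
grid.10.5$m <- predict(rsm.10.5a.m.SOx1x2, newdata = grid.10.5)
grid.10.5$logs2 <- predict(rsm.10.5a.logs2.SOx1x2, newdata = grid.10.5)
# among the 10% lowest-variance points, take the highest predicted mean
low.var <- subset(grid.10.5, logs2 <= quantile(logs2, 0.10))
low.var[which.max(low.var$m), ]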
#### 10.5b
fn.data <- "http://statacumen.com/teach/RSM/data/RSM_EX_10-05.txt"
df.10.5b <- read.table(fn.data, header=TRUE)
df.10.5b
x1 x2 y1 y2 y3 y4
1 -1 -1 33.5021 41.2268 25.2683 31.9930
2 -1 0 35.8234 38.0689 32.7928 34.0383
3 -1 1 33.0773 31.8435 36.2500 34.0162
4 0 -1 30.4481 41.2870 15.1493 23.9883
5 0 0 34.8679 40.2276 27.7724 31.1321
6 0 1 35.2202 37.1008 33.3280 35.2085
7 1 -1 21.1553 34.1086 0.7917 15.7450
8 1 0 27.6736 38.1477 15.5132 25.9873
9 1 1 32.1245 38.1193 26.1673 32.1622
library(reshape2)
df.10.5b.long <- melt(df.10.5b, id.vars = c("x1", "x2"), variable.name = "rep", value.name = "y")
# create z1 and z2 variables
df.10.5b.long$z1 <- NA
df.10.5b.long$z2 <- NA
df.10.5b.long[(df.10.5b.long$rep == "y1"), c("z1", "z2")] <- data.frame(z1 = -1, z2 = -1)
df.10.5b.long[(df.10.5b.long$rep == "y2"), c("z1", "z2")] <- data.frame(z1 = -1, z2 = 1)
df.10.5b.long[(df.10.5b.long$rep == "y3"), c("z1", "z2")] <- data.frame(z1 = 1, z2 = -1)
df.10.5b.long[(df.10.5b.long$rep == "y4"), c("z1", "z2")] <- data.frame(z1 = 1, z2 = 1)
df.10.5b.long
x1 x2 rep y z1 z2
1 -1 -1 y1 33.5021 -1 -1
2 -1 0 y1 35.8234 -1 -1
3 -1 1 y1 33.0773 -1 -1
4 0 -1 y1 30.4481 -1 -1
5 0 0 y1 34.8679 -1 -1
6 0 1 y1 35.2202 -1 -1
7 1 -1 y1 21.1553 -1 -1
8 1 0 y1 27.6736 -1 -1
9 1 1 y1 32.1245 -1 -1
10 -1 -1 y2 41.2268 -1 1
11 -1 0 y2 38.0689 -1 1
12 -1 1 y2 31.8435 -1 1
13 0 -1 y2 41.2870 -1 1
14 0 0 y2 40.2276 -1 1
15 0 1 y2 37.1008 -1 1
16 1 -1 y2 34.1086 -1 1
17 1 0 y2 38.1477 -1 1
18 1 1 y2 38.1193 -1 1
19 -1 -1 y3 25.2683 1 -1
20 -1 0 y3 32.7928 1 -1
21 -1 1 y3 36.2500 1 -1
22 0 -1 y3 15.1493 1 -1
23 0 0 y3 27.7724 1 -1
24 0 1 y3 33.3280 1 -1
25 1 -1 y3 0.7917 1 -1
26 1 0 y3 15.5132 1 -1
27 1 1 y3 26.1673 1 -1
28 -1 -1 y4 31.9930 1 1
29 -1 0 y4 34.0383 1 1
30 -1 1 y4 34.0162 1 1
31 0 -1 y4 23.9883 1 1
32 0 0 y4 31.1321 1 1
33 0 1 y4 35.2085 1 1
34 1 -1 y4 15.7450 1 1
35 1 0 y4 25.9873 1 1
36 1 1 y4 32.1622 1 1
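With the data in long format, a natural next step (a sketch of the standard combined-array model from robust parameter design; the exact model fit later in these notes may differ) is to regress y on the control variables x1 and x2, the noise variables z1 and z2, and the control-by-noise interactions:
# combined-array model: second-order in the controls, first-order in the
# noise, plus control-by-noise interactions
lm.10.5b <- lm(y ~ x1 + x2 + x1:x2 + I(x1^2) + I(x2^2)
+ z1 + z2 + x1:z1 + x1:z2 + x2:z1 + x2:z2
, data = df.10.5b.long)
summary(lm.10.5b)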