For this recipe, we use the same dataset that was analyzed in Project 3: automobile design and performance metrics from the 1974 Motor Trend magazine. MPG is the response variable, which depends on factors including number of cylinders, car weight, V or straight engine, and automatic or manual transmission. There are 32 observations. The data can be found at vincentarelbundock.github.io/Rdatasets/datasets.html.
library("Ecdat")
## Warning: package 'Ecdat' was built under R version 3.1.3
## Loading required package: Ecfun
## Warning: package 'Ecfun' was built under R version 3.1.3
##
## Attaching package: 'Ecdat'
## The following object is masked from 'package:datasets':
##
## Orange
data(mtcars)
library(FrF2)
## Warning: package 'FrF2' was built under R version 3.1.3
## Loading required package: DoE.base
## Warning: package 'DoE.base' was built under R version 3.1.3
## Loading required package: grid
## Loading required package: conf.design
## Warning: package 'conf.design' was built under R version 3.1.3
##
## Attaching package: 'DoE.base'
## The following objects are masked from 'package:stats':
##
## aov, lm
## The following object is masked from 'package:graphics':
##
## plot.design
library("rsm")
## Warning: package 'rsm' was built under R version 3.1.3
summary(mtcars)
## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
## Median :19.20 Median :6.000 Median :196.3 Median :123.0
## Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
## Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
## drat wt qsec vs
## Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
## Median :3.695 Median :3.325 Median :17.71 Median :0.0000
## Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
## 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
## Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Min. :0.0000 Min. :3.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
## Median :0.0000 Median :4.000 Median :2.000
## Mean :0.4062 Mean :3.688 Mean :2.812
## 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000 Max. :8.000
The two 2-level factors being considered are:
vs: Engine (0 = V, 1 = Straight)
am: Transmission (0 = automatic, 1 = manual)
The two 3-level factors being considered are:
cyl: Number of cylinders (“4”=2, “6”= 1, “8”=0)
wt: Weight in 1000s of lbs (“1.304 to 2.817” = 2, “2.818 to 4.120” = 1, “4.121 to 5.424” = 0)
mtcars$cyl[mtcars$cyl >= 4 & mtcars$cyl < 6] = 2
mtcars$cyl[mtcars$cyl >= 6 & mtcars$cyl < 8] = 1
mtcars$cyl[mtcars$cyl >= 8 & mtcars$cyl < 9] = 0
mtcars$wt[as.numeric(mtcars$wt) >= 1.304 & as.numeric(mtcars$wt) < 2.818] = 2
mtcars$wt[as.numeric(mtcars$wt) >= 2.818 & as.numeric(mtcars$wt) < 4.121] = 1
mtcars$wt[as.numeric(mtcars$wt) >= 4.121 & as.numeric(mtcars$wt) < 5.424] = 0
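These assignments recode cyl and split the continuous wt data into three defined levels. Note that the last wt condition uses a strict "<", so the heaviest car (wt = 5.424, the maximum in the summary above) is not recoded and later appears as a fourth factor level, "5.424".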
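As an alternative, the same binning can be sketched with base R's cut(), which can close the last interval so that the maximum weight also lands in level 0. This is only an illustration on a fresh copy of the data; the names cars and wt_level are assumptions and not part of the original analysis, and using it would change the weight factor from four levels to three.

```r
# Work on a fresh copy so the recoded mtcars used in the text is untouched.
cars <- datasets::mtcars

# right = FALSE gives intervals [a, b); include.lowest = TRUE then also
# closes the final interval, so wt = 5.424 falls into [4.121, 5.424].
wt_level <- cut(cars$wt,
                breaks = c(1.304, 2.818, 4.121, 5.424),
                labels = c("2", "1", "0"),
                right = FALSE, include.lowest = TRUE)

# Number of cars in each weight level.
table(wt_level)
```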
mtcars$vs=as.factor(mtcars$vs)
mtcars$am=as.factor(mtcars$am)
mtcars$cyl=as.factor(mtcars$cyl)
mtcars$wt=as.factor(mtcars$wt)
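Assignment of a binary structure for the factors. In the loop below, each factor is coded 1 when its level is "1" and 0 otherwise. For the three-level factors cyl and wt, this means 1 marks the middle level (6 cylinders; 2.818 to 4.120 thousand lbs) and 0 marks either of the other levels.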
r=nrow(mtcars)
vsnum = data.frame(1)
amnum = data.frame(1)
cylnum = data.frame(1)
wtnum = data.frame(1)
for (i in 1:r){
  if (mtcars$vs[i] == "1"){
    vsnum[i,1] <- 1
  } else {
    vsnum[i,1] <- 0
  }
  if (mtcars$am[i] == "1"){
    amnum[i,1] <- 1
  } else {
    amnum[i,1] <- 0
  }
  if (mtcars$cyl[i] == "1"){
    cylnum[i,1] <- 1
  } else {
    cylnum[i,1] <- 0
  }
  if (mtcars$wt[i] == "1"){
    wtnum[i,1] <- 1
  } else {
    wtnum[i,1] <- 0
  }
}
car1 <- cbind(vsnum,amnum,cylnum,wtnum,mtcars$mpg)
colnames(car1) <- c("vs","am","cyl","wt","mpg")
head(car1)
## vs am cyl wt mpg
## 1 0 1 1 0 21.0
## 2 0 1 1 1 21.0
## 3 1 1 0 0 22.8
## 4 1 0 1 1 21.4
## 5 0 0 0 1 18.7
## 6 1 0 1 1 18.1
str(car1)
## 'data.frame': 32 obs. of 5 variables:
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ cyl: num 1 1 0 1 0 1 0 0 0 1 ...
## $ wt : num 0 1 0 1 1 1 1 1 1 1 ...
## $ mpg: num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
summary(car1)
## vs am cyl wt
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.0000 Median :0.0000 Median :1.0000
## Mean :0.4375 Mean :0.4062 Mean :0.2188 Mean :0.5625
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.0000
## mpg
## Min. :10.40
## 1st Qu.:15.43
## Median :19.20
## Mean :20.09
## 3rd Qu.:22.80
## Max. :33.90
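The element-by-element loop that builds car1 can also be written in vectorized form. The sketch below starts from a fresh copy of the data and applies the same rules directly to the raw values (6 cylinders and the middle weight bin map to 1). The name car1_vec is an assumption used only for illustration.

```r
# Fresh copy of the raw data (not the recoded mtcars from the text).
cars <- datasets::mtcars

# Each comparison yields TRUE/FALSE; as.numeric() turns that into 1/0.
car1_vec <- data.frame(
  vs  = as.numeric(cars$vs == 1),                        # 1 = straight engine
  am  = as.numeric(cars$am == 1),                        # 1 = manual transmission
  cyl = as.numeric(cars$cyl == 6),                       # level "1" = 6 cylinders
  wt  = as.numeric(cars$wt >= 2.818 & cars$wt < 4.121),  # level "1" = middle weight bin
  mpg = cars$mpg
)

head(car1_vec)
colMeans(car1_vec[, c("vs", "am", "cyl", "wt")])
```

If the sketch is faithful to the loop, the column means should agree with the summary(car1) output above (0.4375, 0.4062, 0.2188, 0.5625).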
All of the factors in this experiment are discrete. The only continuous variable is MPG, which is dependent upon the factors, although its recorded values are rounded to one decimal place.
Deciding upon a response variable in this experiment was relatively simple and intuitive. The MPG of an automobile is affected by weight, transmission, engine, and number of cylinders, along with other factors which are not discussed (e.g. displacement, fuel type, aerodynamics, gear ratios, horsepower). These factors combined determine MPG. MPG does not determine the factors, so it was relatively simple to see where the dependency lies. MPG is the response variable.
The structure of the automobile data is as follows:
#analyzing structure
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : Factor w/ 3 levels "0","1","2": 2 2 3 2 1 2 1 3 3 2 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : Factor w/ 4 levels "0","1","2","5.424": 3 2 3 2 2 2 2 2 2 2 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : Factor w/ 2 levels "0","1": 1 1 2 2 1 2 1 2 2 2 ...
## $ am : Factor w/ 2 levels "0","1": 2 2 2 1 1 1 1 1 1 1 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
First and last 6 observations of the dataset:
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 1 160 110 3.90 2 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 1 160 110 3.90 1 17.02 0 1 4 4
## Datsun 710 22.8 2 108 93 3.85 2 18.61 1 1 4 1
## Hornet 4 Drive 21.4 1 258 110 3.08 1 19.44 1 0 3 1
## Hornet Sportabout 18.7 0 360 175 3.15 1 17.02 0 0 3 2
## Valiant 18.1 1 225 105 2.76 1 20.22 1 0 3 1
tail(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Porsche 914-2 26.0 2 120.3 91 4.43 2 16.7 0 1 5 2
## Lotus Europa 30.4 2 95.1 113 3.77 2 16.9 1 1 5 2
## Ford Pantera L 15.8 0 351.0 264 4.22 1 14.5 0 1 5 4
## Ferrari Dino 19.7 1 145.0 175 3.62 2 15.5 0 1 5 6
## Maserati Bora 15.0 0 301.0 335 3.54 1 14.6 0 1 5 8
## Volvo 142E 21.4 2 121.0 109 4.11 2 18.6 1 1 4 2
The design of the experiment is a 2^k fractional factorial design. This design is used to obtain the lowest-resolution (resolution III) design. The full factorial design of the experiment would be a 2^6 design, because each three-level factor is ultimately represented by two two-level factors (2 + 2 x 2 = 6 two-level factors). We also use a linear ANOVA model to analyze main effects.
After that, we use the response surface method (RSM) to estimate residuals and coefficients.
We use the linear ANOVA design because it allows us to analyze the main effects of the factors on the response variable. We use the fractional factorial design to analyze confounding between interaction effects.
We then use RSM for goal optimization, which entails combining factor levels to maximize the response. We are ultimately in search of the combination of engine, transmission, weight, and cylinders that produces the highest MPG.
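As a simple descriptive complement to the optimization goal, the observed mean MPG can be tabulated for every factor combination present in the data. This is a sketch on a fresh copy of the raw data, not a model-based optimum; the names cars, wt_bin, cell_means, and best are assumptions, and the weight bins are closed at 5.424 so that every car is assigned to a bin.

```r
# Fresh copy of the raw data with the three weight bins from the text.
cars <- datasets::mtcars
cars$wt_bin <- cut(cars$wt,
                   breaks = c(1.304, 2.818, 4.121, 5.424),
                   labels = c("light", "medium", "heavy"),
                   right = FALSE, include.lowest = TRUE)

# Mean MPG for every observed combination of the four factors.
cell_means <- aggregate(mpg ~ vs + am + cyl + wt_bin, data = cars, FUN = mean)

# Combination with the highest observed mean MPG.
best <- cell_means[which.max(cell_means$mpg), ]
print(best)
```

Many combinations contain only one or two cars, so these cell means are best read as a sanity check on the fitted models rather than as a result in their own right.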
We are assuming random selection of vehicles, which will have random MPG and factor values.
There are replicates; the study observed the exact same factors on each vehicle. Each vehicle is an observation, but also a replicate. They would be “repeated” if the same study were performed on the same vehicle more than once.
Blocking was applied only in the sense of restricting the analysis to the four factors above. The variables set aside include engine displacement, horsepower, rear axle ratio, 1/4 mile time, number of forward gears, and number of carburetors. These were judged less important than the 4 which were selected.
#checking for relationships
boxplot(mtcars$mpg~mtcars$vs, xlab="V or Straight", ylab="MPG")
boxplot(mtcars$mpg~mtcars$am, xlab="Automatic or Manual", ylab="MPG")
boxplot(mtcars$mpg~mtcars$cyl, xlab="# of Cylinders", ylab="MPG")
boxplot(mtcars$mpg~mtcars$wt, xlab="Weight", ylab="MPG")
#anova test
model1=aov(mtcars$mpg~mtcars$vs+mtcars$am+mtcars$cyl+mtcars$wt)
anova(model1)
## Analysis of Variance Table
##
## Response: mtcars$mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## mtcars$vs 1 496.53 496.53 53.5014 1.484e-07 ***
## mtcars$am 1 276.03 276.03 29.7429 1.324e-05 ***
## mtcars$cyl 2 94.59 47.30 5.0961 0.0143 *
## mtcars$wt 3 36.16 12.05 1.2987 0.2977
## Residuals 24 222.74 9.28
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
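From the linear ANOVA model on the recoded factors, we can see that the type of engine (p = 1.484e-07) and the type of transmission (p = 1.324e-05) have highly significant effects on MPG, and the number of cylinders is significant at the 0.05 level (p = 0.0143). Weight, as binned here, is not significant (p = 0.2977). Its 3 degrees of freedom reflect the four levels of the wt factor shown in the structure output above. The table reports sequential sums of squares, so each term is tested after the terms listed before it.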
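The comparison of each p-value with a significance level can also be done programmatically. The sketch below rebuilds the factors on a fresh copy of the data (including the separate fourth weight level) and flags terms with p < 0.05. It calls stats::aov explicitly because DoE.base masks aov; the names cars, fit, tab, p_values, and significant are assumptions for illustration.

```r
cars <- datasets::mtcars

# Rebuild the factors as they appear in str(mtcars) above, including the
# separate fourth weight level for the unrecoded maximum (5.424).
cars$vs  <- factor(cars$vs)
cars$am  <- factor(cars$am)
cars$cyl <- factor(cars$cyl)
cars$wt  <- cut(cars$wt,
                breaks = c(1.304, 2.818, 4.121, 5.424, Inf),
                labels = c("2", "1", "0", "5.424"),
                right = FALSE)

fit <- stats::aov(mpg ~ vs + am + cyl + wt, data = cars)
tab <- anova(fit)

# Keep the four model terms (drop Residuals) and compare with alpha = 0.05.
p_values <- setNames(tab[["Pr(>F)"]], rownames(tab))[c("vs", "am", "cyl", "wt")]
significant <- p_values < 0.05

print(round(p_values, 4))
print(significant)
```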
qqnorm(residuals(model1))
qqline(residuals(model1))
After observing the normal Q-Q plot of the residuals, we can tell that they are approximately normal, so we can assume normality when proceeding.
Full Factorial Design of 64 runs:
library(FrF2)
u= FrF2(64,6,res3 = T)
## creating full factorial with 64 runs ...
print(u)
## A B C D E F
## 1 1 -1 1 1 -1 1
## 2 -1 -1 -1 1 -1 1
## 3 1 -1 -1 1 1 -1
## 4 -1 -1 1 1 1 -1
## 5 1 1 1 1 -1 -1
## 6 1 -1 -1 -1 1 -1
## 7 -1 -1 1 -1 1 1
## 8 1 1 -1 1 1 1
## 9 1 -1 1 -1 -1 -1
## 10 1 1 -1 -1 -1 1
## 11 1 -1 1 -1 1 1
## 12 -1 -1 1 1 -1 1
## 13 -1 -1 1 1 1 1
## 14 1 -1 1 1 -1 -1
## 15 1 1 -1 1 -1 -1
## 16 1 1 1 1 1 1
## 17 -1 1 1 -1 -1 -1
## 18 1 1 1 -1 1 -1
## 19 -1 1 1 1 1 1
## 20 -1 1 1 -1 1 -1
## 21 -1 -1 1 -1 1 -1
## 22 -1 1 -1 1 1 1
## 23 1 -1 -1 -1 -1 -1
## 24 1 1 1 -1 1 1
## 25 -1 1 -1 1 -1 -1
## 26 -1 -1 1 -1 -1 -1
## 27 -1 1 -1 1 -1 1
## 28 1 -1 1 1 1 1
## 29 -1 1 -1 -1 1 -1
## 30 -1 1 -1 1 1 -1
## 31 -1 1 1 -1 -1 1
## 32 1 1 1 1 1 -1
## 33 1 1 -1 1 -1 1
## 34 1 -1 -1 1 1 1
## 35 -1 1 1 1 -1 1
## 36 1 -1 -1 1 -1 -1
## 37 -1 1 -1 -1 -1 -1
## 38 -1 1 1 1 1 -1
## 39 -1 -1 -1 1 1 -1
## 40 -1 -1 1 1 -1 -1
## 41 -1 -1 -1 1 -1 -1
## 42 1 1 -1 -1 1 -1
## 43 1 1 -1 -1 -1 -1
## 44 1 -1 1 1 1 -1
## 45 1 -1 -1 1 -1 1
## 46 -1 1 1 1 -1 -1
## 47 -1 -1 1 -1 -1 1
## 48 -1 1 -1 -1 1 1
## 49 -1 1 -1 -1 -1 1
## 50 -1 1 1 -1 1 1
## 51 -1 -1 -1 1 1 1
## 52 -1 -1 -1 -1 1 1
## 53 1 -1 1 -1 -1 1
## 54 1 -1 -1 -1 -1 1
## 55 1 1 1 -1 -1 1
## 56 1 1 -1 -1 1 1
## 57 -1 -1 -1 -1 1 -1
## 58 1 -1 -1 -1 1 1
## 59 -1 -1 -1 -1 -1 -1
## 60 -1 -1 -1 -1 -1 1
## 61 1 1 1 -1 -1 -1
## 62 1 1 -1 1 1 -1
## 63 1 -1 1 -1 1 -1
## 64 1 1 1 1 -1 1
## class=design, type= full factorial
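The count of 64 runs follows from treating each three-level factor as a pair of two-level pseudo-factors, which gives six two-level factors in total. A minimal sketch of one common mapping is shown here; the mapping and the names pseudo, X1, and X2 are illustrative assumptions, since the text does not say which encoding was intended.

```r
# Two pseudo-factors (each -1/+1) encode one three-level factor:
# (-1,-1) -> level 0, mixed signs -> level 1, (+1,+1) -> level 2.
pseudo <- expand.grid(X1 = c(-1, 1), X2 = c(-1, 1))
pseudo$level <- (pseudo$X1 + pseudo$X2) / 2 + 1
print(pseudo)

# Two genuine two-level factors plus two pseudo-factors for each of the
# two three-level factors.
n_factors <- 2 + 2 * 2
runs_full <- 2^n_factors        # full factorial
runs_frac <- 2^(n_factors - 3)  # the 2^(6-3) fraction
c(factors = n_factors, full = runs_full, fraction = runs_frac)
```

Under this mapping the middle level is produced by two of the four sign combinations, so it appears twice as often as the outer levels.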
Fractional factorial design of 8 runs:
library(FrF2)
s= FrF2(8,6,res3 = T)
print(s)
## A B C D E F
## 1 -1 -1 1 1 -1 -1
## 2 -1 -1 -1 1 1 1
## 3 1 1 1 1 1 1
## 4 1 -1 1 -1 1 -1
## 5 -1 1 1 -1 -1 1
## 6 -1 1 -1 -1 1 -1
## 7 1 -1 -1 -1 -1 1
## 8 1 1 -1 1 -1 -1
## class=design, type= FrF2
We can recognize the confounding in the alias structure below. Since the design is resolution III, main effects (MEs) are confounded with two-factor interactions (2fis), and some 2fis are confounded with each other.
aliasprint(s)
## $legend
## [1] A=A B=B C=C D=D E=E F=F
##
## $main
## [1] A=BD=CE B=AD=CF C=AE=BF D=AB=EF E=AC=DF F=BC=DE
##
## $fi2
## [1] AF=BE=CD
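The alias structure can be checked by hand in base R. The sketch below builds a full 2^3 design in A, B, and C and generates the remaining columns as D = AB, E = AC, and F = BC, which is consistent with the aliases printed above; the run order differs from the FrF2 output, and the object names base and design are assumptions.

```r
# Full factorial in three base factors (8 runs).
base <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Generate the three extra factors from two-factor interactions.
design <- transform(base, D = A * B, E = A * C, F = B * C)

# D is identical to the AB column in every run, so the main effect of D
# cannot be separated from the AB interaction.
all(design$D == design$A * design$B)

# Likewise A = BD = CE, matching the first entry under $main.
with(design, all(A == B * D) && all(A == C * E))

# The interactions AF, BE, and CD all reduce to ABC, matching $fi2.
with(design, all(A * F == B * E) && all(B * E == C * D))
```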
library("rsm")
car.rsm=rsm(mpg ~ SO(vs,am,cyl,wt), data=car1)
## Warning in rsm(mpg ~ SO(vs, am, cyl, wt), data = car1): Some coefficients are aliased - cannot use 'rsm' methods.
## Returning an 'lm' object.
summary(car.rsm)
##
## Call:
## rsm(formula = mpg ~ SO(vs, am, cyl, wt), data = car1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.971 -1.037 0.000 1.391 5.529
##
## Coefficients: (5 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.833 1.695 6.983 5.22e-07 ***
## FO(vs, am, cyl, wt)vs 9.667 3.389 2.852 0.009266 **
## FO(vs, am, cyl, wt)am 14.167 3.389 4.180 0.000389 ***
## FO(vs, am, cyl, wt)cyl -5.650 3.595 -1.572 0.130287
## FO(vs, am, cyl, wt)wt 4.289 1.957 2.192 0.039265 *
## TWI(vs, am, cyl, wt)vs:am -7.295 4.619 -1.580 0.128492
## TWI(vs, am, cyl, wt)vs:cyl -10.075 4.403 -2.288 0.032087 *
## TWI(vs, am, cyl, wt)vs:wt -2.189 4.093 -0.535 0.598146
## TWI(vs, am, cyl, wt)am:cyl NA NA NA NA
## TWI(vs, am, cyl, wt)am:wt -14.889 4.093 -3.638 0.001453 **
## TWI(vs, am, cyl, wt)cyl:wt 11.250 5.084 2.213 0.037584 *
## PQ(vs, am, cyl, wt)vs^2 NA NA NA NA
## PQ(vs, am, cyl, wt)am^2 NA NA NA NA
## PQ(vs, am, cyl, wt)cyl^2 NA NA NA NA
## PQ(vs, am, cyl, wt)wt^2 NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.935 on 22 degrees of freedom
## Multiple R-squared: 0.8317, Adjusted R-squared: 0.7628
## F-statistic: 12.08 on 9 and 22 DF, p-value: 1.233e-06
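After reviewing the results of the RSM model, we can reject the null hypotheses at the 0.01 level for first-order vs, first-order am, and the two-way interaction between am and wt. At the 0.05 level, first-order wt and the vs:cyl and cyl:wt interactions are also significant. Because every factor is coded 0/1, each pure quadratic term is identical to its first-order term and cannot be estimated (NA); the am:cyl interaction is aliased as well, which is why rsm returns an ordinary 'lm' object.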
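The reason the pure quadratic terms come back as NA can be seen with a single 0/1 factor: squaring a 0/1 column reproduces the column, so the model matrix loses a rank. A small sketch on the raw am column, using stats::lm explicitly because DoE.base masks lm (the names cars and fit are assumptions):

```r
cars <- datasets::mtcars

# For a 0/1 variable, x^2 equals x in every row.
all(cars$am^2 == cars$am)

# The squared term is therefore perfectly collinear with the linear term,
# and lm() reports its coefficient as NA.
fit <- stats::lm(mpg ~ am + I(am^2), data = cars)
coef(fit)
```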
The following contour plots demonstrate the MPG in response to two variables per plot.
par(mfrow=c(2,3))
contour(car.rsm, ~vs + am + cyl + wt, image=TRUE, at=summary(car.rsm$canonical$xs))
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
The following images are perspective plots for the interactions.
par(mfrow=c(1,1))
persp(car.rsm, ~ vs + am, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
par(mfrow=c(1,1))
persp(car.rsm, ~ vs + cyl, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
par(mfrow=c(1,1))
persp(car.rsm, ~ vs + wt, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
par(mfrow=c(1,1))
persp(car.rsm, ~ am + cyl, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
par(mfrow=c(1,1))
persp(car.rsm, ~ am + wt, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
par(mfrow=c(1,1))
persp(car.rsm, ~ cyl + wt, image=TRUE, at = c(summary(car.rsm)$canonical$xs, Block="B2"), contour="colors", zlab="MPG", theta=30)
## Warning in predict.lm(lmobj, newdata = newdata): prediction from a rank-
## deficient fit may be misleading
## Warning in persp.default(dat$x, dat$y, dat$z, zlim = dat$zlim, theta =
## theta, : "image" is not a graphical parameter
## Warning in persp.default(dat$x, dat$y, dat$z, xlab = dat$labs[1], ylab =
## dat$labs[2], : "image" is not a graphical parameter
## Warning in title(sub = dat$labs[5], ...): "image" is not a graphical
## parameter
We will use the Shapiro-Wilk test to check normality. Its null hypothesis is that the sample (i.e. our vehicles' MPG values) came from a normally distributed population, so a small p-value is evidence against normality.
shapiro.test(car1$mpg)
##
## Shapiro-Wilk normality test
##
## data: car1$mpg
## W = 0.9476, p-value = 0.1229
shapiro.test(1/(car1$mpg))
##
## Shapiro-Wilk normality test
##
## data: 1/(car1$mpg)
## W = 0.9388, p-value = 0.06922
shapiro.test((car1$mpg)^2)
##
## Shapiro-Wilk normality test
##
## data: (car1$mpg)^2
## W = 0.8752, p-value = 0.00153
shapiro.test(sqrt(car1$mpg))
##
## Shapiro-Wilk normality test
##
## data: sqrt(car1$mpg)
## W = 0.9695, p-value = 0.4849
shapiro.test(log(car1$mpg))
##
## Shapiro-Wilk normality test
##
## data: log(car1$mpg)
## W = 0.9767, p-value = 0.699
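Based on the results, we reject the null hypothesis only for the squared transformation (p = 0.00153). For this transformation there is evidence that the data are not normally distributed; for the untransformed MPG and the reciprocal, square-root, and log transformations we fail to reject normality at the 0.05 level.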
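The five tests above can be collected into one table of p-values. This is a compact sketch of the same calls; the list name transforms and its labels are assumptions introduced for illustration.

```r
mpg <- datasets::mtcars$mpg

# The transformations tested in the text, as named functions.
transforms <- list(
  identity   = identity,
  reciprocal = function(x) 1 / x,
  square     = function(x) x^2,
  sqrt       = sqrt,
  log        = log
)

# Shapiro-Wilk p-value for each transformed response.
p_values <- sapply(transforms, function(f) shapiro.test(f(mpg))$p.value)

print(round(p_values, 4))
print(p_values < 0.05)  # TRUE marks transformations where normality is rejected
```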
One last normality check is the normal Q-Q plot of the RSM residuals.
qqnorm(residuals(car.rsm), ylab="MPG")
qqline(residuals(car.rsm))
It is evident that a majority of the residuals follow a normal distribution. There are some outliers, which ideally we would exclude or truncate.
Based on the above RSM analysis, I would suggest that the data are approximately normally distributed. The optimal designs make sense for the problem.
Raw Data: http://vincentarelbundock.github.io/Rdatasets/datasets.html