The data used in this experiment is the Cars93 dataset, which ships with the MASS package in R (the documentation linked in the references is the MASS entry). The full dataset contains 93 observations of 27 variables describing vehicles that were for sale in the United States in 1993; this experiment uses five of them.
This experiment is a continuation of Project #3. We are interested in the effect of four factors on the price of the vehicle. As per the design requirements, we examine two 2-level factors and two 3-level factors. The analysis is conducted by expressing each of the two 3-level factors as a combination of 2-level factors. A Taguchi design will be created and carried out to analyze the main effects. The results obtained from the Taguchi design will be compared to the results obtained from the fractional factorial design in Project #3.
Cars93 <- read.delim("C:/Users/wheels/Desktop/Design of Experiments/Project #3/Cars93.txt")
mantrans is a 2-level categorical variable that states whether or not a car is available with a manual transmission. The two levels are No and Yes.
origin is a 2-level categorical variable that states whether or not a car was produced in the United States. The two levels are non-USA and USA.
airbags is a 3-level categorical variable that lists the type of airbags that the car has. The three levels are None, Driver only, and Driver & Passenger.
drive is a 3-level categorical variable that lists the type of drive train that the car has. The three levels are Rear, Front, and 4WD.
Since this experiment examines the effect of certain vehicle features on the price of the vehicle, the response variable is price, a continuous dependent variable.
head(Cars93)
## mantrans origin airbags drive price
## 1 Yes non-USA None Front 15.9
## 2 Yes non-USA Driver & Passenger Front 33.9
## 3 Yes non-USA Driver only Front 29.1
## 4 Yes non-USA Driver & Passenger Front 37.7
## 5 Yes non-USA Driver only Rear 30.0
## 6 No USA Driver only Front 15.7
str(Cars93)
## 'data.frame': 93 obs. of 5 variables:
## $ mantrans: Factor w/ 2 levels "No","Yes": 2 2 2 2 2 1 1 1 1 1 ...
## $ origin : Factor w/ 2 levels "non-USA","USA": 1 1 1 1 1 2 2 2 2 2 ...
## $ airbags : Factor w/ 3 levels "Driver & Passenger",..: 3 1 2 1 2 2 2 2 2 2 ...
## $ drive : Factor w/ 3 levels "4WD","Front",..: 2 2 2 2 3 2 2 3 2 2 ...
## $ price : num 15.9 33.9 29.1 37.7 30 15.7 20.8 23.7 26.3 34.7 ...
summary(Cars93)
## mantrans origin airbags drive price
## No :32 non-USA:45 Driver & Passenger:16 4WD :10 Min. : 7.40
## Yes:61 USA :48 Driver only :43 Front:67 1st Qu.:12.20
## None :34 Rear :16 Median :17.70
## Mean :19.51
## 3rd Qu.:23.30
## Max. :61.90
To be able to carry out a Taguchi design, the categorical variables need to be replaced with numeric codes that represent the factor levels. The codes were assigned to each factor as follows:
mantrans: No= 0, Yes= 1
origin: non-USA= 0, USA= 1
airbag: Driver & Passenger= 0, Driver only= 1, None= 2
drive: 4WD= 0, Front= 1, Rear= 2
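The coding above can be sketched with named lookup vectors. This is a minimal illustration, not the code used for the report: it rebuilds only the first six rows shown in the head(Cars93) output, and recode() is a hypothetical helper introduced here.

```r
# First six rows as printed by head(Cars93) earlier in this report.
cars <- data.frame(
  mantrans = c("Yes", "Yes", "Yes", "Yes", "Yes", "No"),
  origin   = c("non-USA", "non-USA", "non-USA", "non-USA", "non-USA", "USA"),
  airbags  = c("None", "Driver & Passenger", "Driver only",
               "Driver & Passenger", "Driver only", "Driver only"),
  drive    = c("Front", "Front", "Front", "Front", "Rear", "Front"),
  price    = c(15.9, 33.9, 29.1, 37.7, 30.0, 15.7),
  stringsAsFactors = FALSE
)

# Lookup tables mirroring the level assignments listed above.
mantrans_codes <- c("No" = 0, "Yes" = 1)
origin_codes   <- c("non-USA" = 0, "USA" = 1)
airbag_codes   <- c("Driver & Passenger" = 0, "Driver only" = 1, "None" = 2)
drive_codes    <- c("4WD" = 0, "Front" = 1, "Rear" = 2)

# Hypothetical helper: translate labels to numeric codes by name.
recode <- function(x, codes) unname(codes[as.character(x)])

car <- data.frame(
  mantrans = recode(cars$mantrans, mantrans_codes),
  origin   = recode(cars$origin, origin_codes),
  airbag   = recode(cars$airbags, airbag_codes),
  drive    = recode(cars$drive, drive_codes),
  price    = cars$price
)
print(car)
```

The six coded rows agree with the first six rows of the manipulated dataset printed in this report.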
This manipulated dataset can be seen here:
## mantrans origin airbag drive price
## 1 1 0 2 1 15.9
## 2 1 0 0 1 33.9
## 3 1 0 1 1 29.1
## 4 1 0 0 1 37.7
## 5 1 0 1 2 30.0
## 6 0 1 1 1 15.7
## 7 0 1 1 1 20.8
## 8 0 1 1 2 23.7
## 9 0 1 1 1 26.3
## 10 0 1 1 1 34.7
## 'data.frame': 93 obs. of 5 variables:
## $ mantrans: num 1 1 1 1 1 0 0 0 0 0 ...
## $ origin : num 0 0 0 0 0 1 1 1 1 1 ...
## $ airbag : num 2 0 1 0 1 1 1 1 1 1 ...
## $ drive : num 1 1 1 1 2 1 1 2 1 1 ...
## $ price : num 15.9 33.9 29.1 37.7 30 15.7 20.8 23.7 26.3 34.7 ...
This experiment observes the effects of several vehicle features on the price of the vehicle. A Taguchi design is used, which reduces the number of experimental runs while still allowing the main effects to be estimated.
The Taguchi design is a more efficient way to look at the main effects than a full factorial design. Each 3-level factor is decomposed into two 2-level factors whose sum reproduces the level that was stored in the original 3-level factor. Finally, ANOVA is conducted and the final model is presented.
The Taguchi design reduces the resources necessary to conduct the analysis, compared to a full factorial design. A full factorial design would take 2^6 = 64 runs with the factors decomposed as they are now, or 2^2 x 3^2 = 36 runs without decomposition. The Taguchi design needs only 8 runs and still provides estimates of the main effects.
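The decomposition and the run counts can be made concrete with a short sketch. The exact pseudo-factor mapping used in Project #3 is not shown in this report, so the mapping below (level = x1 + x2) is an assumption based on the description of the sum of the 2-level factors; the names pseudo, x1 and x2 are illustrative.

```r
# Two 2-level pseudo-factors whose sum gives the level of one 3-level factor.
pseudo <- expand.grid(x1 = 0:1, x2 = 0:1)
pseudo$level <- pseudo$x1 + pseudo$x2   # 0, 1, 1, 2

# Labels follow the airbag coding used in this report.
pseudo$airbag <- c("Driver & Passenger", "Driver only", "None")[pseudo$level + 1]
print(pseudo)

# Run counts for the designs discussed above.
runs_full_original   <- prod(c(2, 2, 3, 3))  # 36: two 2-level and two 3-level factors
runs_full_decomposed <- 2^6                  # 64: six 2-level factors
runs_l8              <- 8                    # runs in the L8 array
runs_l8 / runs_full_decomposed               # 0.125
```

Note that the middle level appears twice among the four pseudo-factor combinations, which is a known side effect of this kind of decomposition.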
Randomization was utilized in this experiment. While we can’t comment on the data collection method, we can randomize by randomly ordering the 8 runs and by randomly selecting one observation from the subset of the data that matches each run. This experiment does not use replication, and there is no blocking.
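A minimal sketch of the two randomization steps, under stated assumptions: draw_one() is a hypothetical helper, the three prices passed to it are illustrative values only, and the seed matches the one set in the appendix.

```r
set.seed(1)  # same seed value as in the appendix

# Step 1: random run order for the 8 runs.
n_runs <- 8
run_order <- sample(n_runs)  # a random permutation of 1..8
print(run_order)

# Step 2: draw one observation at random from the rows matching a run.
# Hypothetical helper; indexing through sample.int() stays correct
# even when the subset holds a single price.
draw_one <- function(prices) prices[sample.int(length(prices), 1)]

draw_one(c(15.9, 29.1, 30.0))  # illustrative prices only
```

Indexing through sample.int() avoids the pitfall that sample(x, 1) treats a single number x as the range 1:x.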
A boxplot showing the combinations of the independent factors and their effect on the response variable price is shown below.
As you can see from this boxplot, there are some outliers, but overall, there are no large issues that would stop us from moving forward with the experiment.
The hypothesis for this test will be:
Null Hypothesis - There is no statistically significant difference between the prices of the vehicles due to the changing factor levels of the independent variables.
Alternate Hypothesis - There is a statistically significant difference between the prices of the vehicles due to the changing factor levels of the independent variables.
A Taguchi design will be used to estimate the main effects, and then ANOVA will be used to further analyze the main effects and build a model.
As previously mentioned, a full factorial design would take 64 runs with the decomposed factors and 36 runs with the original factors. The decomposed factors were chosen for the Taguchi design because this helps to reduce the number of runs needed: the two original 2-level factors plus two 2-level factors for each of the two 3-level factors give six 2-level factors. taguchiChoose() was used to determine the possible design options. Since L8_2 had the fewest runs, it was used for this experiment.
## Warning: package 'qualityTools' was built under R version 3.3.2
## Loading required package: Rsolnp
## Warning: package 'Rsolnp' was built under R version 3.3.2
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked _by_ '.GlobalEnv':
##
## Cars93
##
## Attaching package: 'qualityTools'
## The following object is masked from 'package:stats':
##
## sigma
## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
##
## Possible Designs:
##
## L8_2 L12_2 L16_2 L32_2
##
## Use taguchiDesign("L8_2") or different to create a taguchi design object
## [1] "L8_2" "L12_2" "L16_2" "L32_2"
The chosen design was input into the taguchiDesign function to create a design matrix, as shown below:
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## StandOrder RunOrder Replicate A B C D E F G y
## 1 5 1 1 2 1 2 1 2 1 2 NA
## 2 7 2 1 2 2 1 1 2 2 1 NA
## 3 1 3 1 1 1 1 1 1 1 1 NA
## 4 3 4 1 1 2 2 1 1 2 2 NA
## 5 4 5 1 1 2 2 2 2 1 1 NA
## 6 8 6 1 2 2 1 2 1 1 2 NA
## 7 6 7 1 2 1 2 2 1 2 1 NA
## 8 2 8 1 1 1 1 2 2 2 2 NA
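The structure of the design matrix above can be checked in base R. The sketch below rebuilds the eight rows in standard order from three base columns and verifies that every pair of columns is balanced, which is what allows the main effects to be estimated independently. The construction by XOR of level codes is a common way to write the L8 array and is an assumption here, not something taken from the qualityTools source; xor_col() is a hypothetical helper.

```r
# Base columns of the L8 array in standard order (levels coded 1 and 2).
a <- rep(1:2, each = 4)
b <- rep(rep(1:2, each = 2), times = 2)
d <- rep(1:2, times = 4)

# Hypothetical helper: XOR of two 2-level columns, returned on the 1/2 coding.
xor_col <- function(x, y) ((x - 1) + (y - 1)) %% 2 + 1

l8 <- data.frame(
  A = a,
  B = b,
  C = xor_col(a, b),
  D = d,
  E = xor_col(a, d),
  F = xor_col(b, d),
  G = xor_col(xor_col(a, b), d)
)
print(l8)

# Orthogonality check: each pair of columns shows every level
# combination exactly twice.
col_pairs <- combn(names(l8), 2)
balanced <- apply(col_pairs, 2, function(p) {
  all(table(l8[[p[1]]], l8[[p[2]]]) == 2)
})
all(balanced)
```

Row 5 of this array is 2 1 2 1 2 1 2, which matches the printed row with StandOrder 5.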
Since there are only 6 factors being tested, the last column will be dropped. A subset of the dataset will be created for each of the treatment levels shown.
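A response is obtained for each run by randomly drawing one observation from the subset that matches the run's level combination (the appendix code stores these draws in a vector named means, which is why the response column is labelled means below). This vector is put into the Taguchi design and the main effects are calculated.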
## StandOrder RunOrder Replicate A B C D E F G means
## 1 5 1 1 2 1 2 1 2 1 2 15.7
## 2 7 2 1 2 2 1 1 2 2 1 9.1
## 3 1 3 1 1 1 1 1 1 1 1 20.8
## 4 3 4 1 1 2 2 1 1 2 2 47.9
## 5 4 5 1 1 2 2 2 2 1 1 17.7
## 6 8 6 1 2 2 1 2 1 1 2 19.9
## 7 6 7 1 2 1 2 2 1 2 1 16.3
## 8 2 8 1 1 1 1 2 2 2 2 19.0
The results of the main effects calculations are shown below in both a table and effect plots.
## A
## A -5.15
## B 10.40
## C 11.55
## D 6.35
## E 3.90
## F -8.55
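Each main effect is the mean response at level 2 of a column minus the mean response at level 1. A minimal sketch of that calculation, driven directly by the design columns, follows. The columns and responses are copied from the printed design table with its rows in run order, and main_effect() is a hypothetical helper rather than part of qualityTools. Computed this way, the values do not match the table above: the contrasts in the appendix group the responses by position (for example, the last four responses against the first four for A), which coincides with the L8 columns only when the responses are listed in standard order. The alignment between responses and design rows is therefore worth checking before the effects are interpreted.

```r
# Design columns A-F and responses, copied from the printed table (run order).
design <- data.frame(
  A = c(2, 2, 1, 1, 1, 2, 2, 1),
  B = c(1, 2, 1, 2, 2, 2, 1, 1),
  C = c(2, 1, 1, 2, 2, 1, 2, 1),
  D = c(1, 1, 1, 1, 2, 2, 2, 2),
  E = c(2, 2, 1, 1, 2, 1, 1, 2),
  F = c(1, 2, 1, 2, 1, 1, 2, 2)
)
y <- c(15.7, 9.1, 20.8, 47.9, 17.7, 19.9, 16.3, 19.0)

# Hypothetical helper: mean at level 2 minus mean at level 1.
main_effect <- function(levels, y) mean(y[levels == 2]) - mean(y[levels == 1])

effects_by_design <- sapply(design, main_effect, y = y)
round(effects_by_design, 2)
# A = -11.10, B = 5.70, C = 7.20, D = -5.15, E = -10.85, F = 4.55

# Position-based contrast used for A in the appendix.
effect_a_by_position <- mean(y[5:8]) - mean(y[1:4])
round(effect_a_by_position, 2)  # -5.15
```

With the rows as printed, the design-driven contrast for D reproduces the value reported for A (-5.15), because column D is the one that separates the first four rows from the last four.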
The table and effect plots make it easy to see the main effects of the factors on the response variable. These main effects can be compared to the main effects that were calculated in Project #3:
## Taguchi FFD
## A -5.15 0.275
## B 10.40 11.075
## C 11.55 -10.925
## D 6.35 -2.375
## E 3.90 -2.725
## F -8.55 3.675
The main effects from the Taguchi design differ from the main effects that were calculated with the fractional factorial design. Effect B is similar in both designs (10.40 versus 11.075), while the other five effects have opposite signs, and several of them also differ noticeably in magnitude.
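The comparison can be tabulated directly from the two columns of values printed above. This is a small sketch; the column names same_sign and abs_diff are introduced here for illustration.

```r
# Main effects copied from the comparison table in this report.
effects <- data.frame(
  Taguchi = c(-5.15, 10.40, 11.55, 6.35, 3.90, -8.55),
  FFD     = c(0.275, 11.075, -10.925, -2.375, -2.725, 3.675),
  row.names = c("A", "B", "C", "D", "E", "F")
)

# Flag effects whose sign agrees and measure how far apart they are.
effects$same_sign <- sign(effects$Taguchi) == sign(effects$FFD)
effects$abs_diff  <- abs(effects$Taguchi - effects$FFD)
print(effects)

rownames(effects)[effects$same_sign]  # "B"
```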
ANOVA was conducted to observe the significance that each of the factors has on price. Below is a summary of ANOVA.
## Df Sum Sq Mean Sq F value Pr(>F)
## mantrans 1 915 915.1 17.843 5.83e-05 ***
## origin 1 611 611.1 11.915 0.000858 ***
## airbag 1 2198 2198.4 42.865 3.79e-09 ***
## drive 1 346 346.1 6.748 0.011000 *
## Residuals 88 4513 51.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
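Because airbag and drive are stored as numeric codes, the ANOVA above gives each of them a single degree of freedom, that is, a linear trend across the codes 0, 1 and 2. A sketch of the alternative, in which the two 3-level variables are kept as factors with two degrees of freedom each, is given here. It assumes that the MASS package is installed and that the five columns of this report correspond to Man.trans.avail, Origin, AirBags, DriveTrain and Price in MASS::Cars93; that mapping is an assumption, since the report reads its data from a local text file.

```r
# Assumed mapping from MASS::Cars93 columns to the names used in this report.
cars <- data.frame(
  mantrans = MASS::Cars93$Man.trans.avail,
  origin   = MASS::Cars93$Origin,
  airbags  = MASS::Cars93$AirBags,
  drive    = MASS::Cars93$DriveTrain,
  price    = MASS::Cars93$Price
)

# Main-effects ANOVA with the 3-level variables treated as factors.
fit_factor <- aov(price ~ mantrans + origin + airbags + drive, data = cars)
summary(fit_factor)

# Degrees of freedom per term: 1, 1, 2, 2, then the residuals.
df_terms <- summary(fit_factor)[[1]]$Df
df_terms
```

Keeping the variables as factors does not assume that the step from code 0 to 1 has the same effect on price as the step from 1 to 2.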
From the ANOVA, all four main effects are significant at the 0.05 level. This is in agreement with the fractional factorial design from Project #3 and indicates that we should reject the null hypothesis for each variable. The model can be represented by the following:
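\(\text{price} = 30.186 + 3.848 X_1 - 6.189 X_2 - 6.916 X_3 - 5.520 X_4\), with \(X_1\) to \(X_4\) taken to be the coded mantrans, origin, airbag and drive values, following the order of the terms in the lm() call in the appendix.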
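To read the fitted equation, it can help to evaluate it for one car. The sketch below hard-codes the coefficients reported above and assumes that the four predictors are mantrans, origin, airbag and drive in that order; predict_price() is a hypothetical helper, not output from the fitted lm object.

```r
# Coefficients copied from the model equation in this report.
# Assumption: predictors enter in the order mantrans, origin, airbag, drive.
predict_price <- function(mantrans, origin, airbag, drive) {
  30.186 + 3.848 * mantrans - 6.189 * origin - 6.916 * airbag - 5.520 * drive
}

# First row of the coded dataset: mantrans = 1, origin = 0, airbag = 2, drive = 1.
predict_price(mantrans = 1, origin = 0, airbag = 2, drive = 1)  # 14.682
```

The observed price for that row is 15.9, so under these assumptions the model is off by about 1.2 for this car.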
It is difficult to interpret this model because its coefficients differ in sign and magnitude from the main effects that were calculated in Project #3. However, we can still check the adequacy of this model.
Model adequacy checking was done to assess the fit of the model. The residuals vs. fitted plot shows that the residuals are not spread as evenly as we would like to see. This might be due to the small number of observations, or just the nature of this type of data. The Q-Q plot has very little deviation, with the exception of a few outliers, which suggests that the residuals are approximately normally distributed. There isn’t much change from Project #3 in the model fit.
This Taguchi design produced different main effect estimates than the fractional factorial design, even though both designs use the same number of runs to calculate the main effects. The results seem reasonably accurate, but it would probably help if the dataset had more observations and replicates in each subset. We can’t determine which design is more accurate without repeating the experiment with a full factorial design. This Taguchi design seems to be a decent approximation of the main effects.
D. C. Montgomery, Design and Analysis of Experiments, 8th ed. Hoboken, NJ: John Wiley & Sons, Inc., 2013.
https://vincentarelbundock.github.io/Rdatasets/doc/MASS/Cars93.html
ISYE 6020 class resources
The Cars93 data set is available in the MASS package in R. More information on this data set can be found at: https://vincentarelbundock.github.io/Rdatasets/doc/MASS/Cars93.html
#Shamus Wheeler
#Project 4
#load qualityTools package
library(qualityTools)
#show first observations in dataset
head(Cars93)
#show structure of dataset
str(Cars93)
#show summary of dataset
summary(Cars93)
#recode factor levels as numbers; match() gives the position of each label, so subtracting 1 yields codes starting at 0
car <- data.frame(
  #mantrans: No = 0, Yes = 1
  mantrans = match(Cars93$mantrans, c("No", "Yes")) - 1,
  #origin: non-USA = 0, USA = 1
  origin = match(Cars93$origin, c("non-USA", "USA")) - 1,
  #airbags: Driver & Passenger = 0, Driver only = 1, None = 2
  airbag = match(Cars93$airbags, c("Driver & Passenger", "Driver only", "None")) - 1,
  #drive: 4WD = 0, Front = 1, Rear = 2
  drive = match(Cars93$drive, c("4WD", "Front", "Rear")) - 1,
  #response variable
  price = Cars93$price
)
#show head of new dataset
head(car,10)
#show structure of new dataset
str(car)
#boxplot of price for each combination of factor levels (price is the response only, not a grouping variable)
boxplot(price ~ mantrans + origin + airbag + drive, data = car, xlab="mantrans.origin.airbag.drive", ylab="Price", main="Analysis of Factors")
#set seed for reproducible project results
set.seed(1)
# find correct Taguchi design for data set
t <- taguchiChoose(6,0,2,0)
print(t)
#show structure of Taguchi design matrix
t <- taguchiDesign("L8_2")
print(t)
# Subset creation for factorial design pulled randomly from the table
subseta <- subset(car, mantrans == "1" & origin == "0" & airbag == "1" & drive == "1")
subsetb <- subset(car, mantrans == "1" & origin == "0" & airbag == "2" & drive == "1")
subsetc <- subset(car, mantrans == "0" & origin == "1" & airbag == "1" & drive == "1")
subsetd <- subset(car, mantrans == "0" & origin == "0" & airbag == "1" & drive == "2")
subsete <- subset(car, mantrans == "1" & origin == "1" & airbag == "0" & drive == "2")
subsetf <- subset(car, mantrans == "1" & origin == "1" & airbag == "1" & drive == "0")
subsetg <- subset(car, mantrans == "0" & origin == "1" & airbag == "2" & drive == "1")
subseth <- subset(car, mantrans == "0" & origin == "1" & airbag == "1" & drive == "0")
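#function to return the price of one randomly chosen row of a subset
func <- function(df){
  #random permutation of the row indices
  a <- sample(nrow(df))
  #keep the first index of the permutation
  b <- a[1]
  return(df$price[b])
}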
# Use function to get group samples
m_a <- func(subseta)
m_b <- func(subsetb)
m_c <- func(subsetc)
m_d <- func(subsetd)
m_e <- func(subsete)
m_f <- func(subsetf)
m_g <- func(subsetg)
m_h <- func(subseth)
#create vector
means_vec <- c(m_a[1], m_b[1], m_c[1], m_d[1], m_e[1], m_f[1], m_g[1], m_h[1])
#convert to matrix
means <- as.matrix(means_vec)
#insert response into Taguchi matrix
response(t) <- means
#show matrix with responses
print(t)
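#calculate main effects as mean(high-level runs) - mean(low-level runs)
#the groupings below follow the L8 column pattern in standard order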
mea <- 1/4 * ((m_e[1]+m_f[1]+m_g[1]+m_h[1])-(m_b[1]+m_a[1]+m_c[1]+m_d[1]))
meb <- 1/4 * ((m_c[1]+m_d[1]+m_g[1]+m_h[1])-(m_a[1]+m_e[1]+m_f[1]+m_b[1]))
mec <- 1/4 * ((m_c[1]+m_d[1]+m_e[1]+m_f[1])-(m_b[1]+m_a[1]+m_g[1]+m_h[1]))
med <- 1/4 * ((m_b[1]+m_d[1]+m_f[1]+m_h[1])-(m_a[1]+m_c[1]+m_e[1]+m_g[1]))
mee <- 1/4 * ((m_b[1]+m_d[1]+m_e[1]+m_g[1])-(m_a[1]+m_c[1]+m_f[1]+m_h[1]))
mef <- 1/4 * ((m_b[1]+m_c[1]+m_f[1]+m_g[1])-(m_a[1]+m_d[1]+m_h[1]+m_e[1]))
#create main effect vector
me_vec <- matrix(c(mea,meb,mec,med,mee,mef),ncol=1)
#convert to table
me_table <- as.table(me_vec)
#add names to rows
rownames(me_table) <- c("A","B","C","D","E","F")
#display table
me_table
#create main effect plots
effectPlot(t, factors = c("A","B","C","D","E","F"))
#create compare table
me_compare_table <- matrix(c(mea,meb,mec,med,mee,mef,0.275,11.075,-10.925,-2.375,-2.725,3.675), ncol = 2)
#add names to columns
colnames(me_compare_table) <- c("Taguchi", "FFD")
#add names to rows
rownames(me_compare_table) <- c("A","B","C","D","E","F")
#display table
me_compare_table
#perform ANOVA
anova <- aov(price~mantrans + origin + airbag + drive, data=car)
#display summary of ANOVA
summary(anova)
#find coefficients for model
fit <- lm(price~mantrans+origin+airbag+drive, data=car)
fit
#Anova for model
anova2 <- aov(price~ mantrans + origin + airbag + drive, data=car)
#model diagnostics
plot(anova2)