Project

Optimal Designs-Taguchi Design

Kaan UNNU

RPI

12.14.2016 V.1

1. Setting

After examining the available data sets in Ecdat, Housing data set has been selected.

Installation of the necessary packages and selection of the data-set is shown below:

#install.packages("pid")
#install.packages("Ecdat")
library(pid)
library(Ecdat)
library(FrF2)
library(knitr)
library(tidyr)
library(kimisc)

Warning: package 'kimisc' was built under R version 3.3.2

library(qualityTools)

Warning: package 'qualityTools' was built under R version 3.3.2

Warning: package 'Rsolnp' was built under R version 3.3.2

Prj <- Housing

System under test

Housing data set was collected for Fractional Factorial Design Project and also for Optimal Designs-Taguchi Design consist of 12 variables and 546 observations. It represents Sales Prices of Houses in the City of Windsor in 1987 and the main source of the data is Journal of Applied Econometrics.

Definitions of the variables in data set are as below:

price : sale price of a house
lotsize : the lot size of a property in square feet
bedrooms : number of bedrooms
bathrms : number of full bathrooms
stories : number of stories excluding basement
driveway : does the house has a driveway ?
recroom : does the house has a recreational room ?
fullbase : does the house has a full finished basement ?
gashw : does the house uses gas for hot water heating ?
airco : does the house has central air conditioning ?
garagepl : number of garage places
prefarea : is the house located in the preferred neighborhood of the city ?

head(Prj)

  price lotsize bedrooms bathrms stories driveway recroom fullbase gashw
1 42000    5850        3       1       2      yes      no      yes    no
2 38500    4000        2       1       1      yes      no       no    no
3 49500    3060        3       1       1      yes      no       no    no
4 60500    6650        3       1       2      yes     yes       no    no
5 61000    6360        2       1       1      yes      no       no    no
6 66000    4160        3       1       1      yes     yes      yes    no
  airco garagepl prefarea
1    no        1       no
2    no        0       no
3    no        0       no
4    no        0       no
5    no        0       no
6   yes        0       no

Factors and Levels

Factors and levels of each factor are listed as below:

str(Prj)

'data.frame':   546 obs. of  12 variables:
 $ price   : num  42000 38500 49500 60500 61000 66000 66000 69000 83800 88500 ...
 $ lotsize : num  5850 4000 3060 6650 6360 4160 3880 4160 4800 5500 ...
 $ bedrooms: num  3 2 3 3 2 3 3 3 3 3 ...
 $ bathrms : num  1 1 1 1 1 1 2 1 1 2 ...
 $ stories : num  2 1 1 2 1 1 2 3 1 4 ...
 $ driveway: Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
 $ recroom : Factor w/ 2 levels "no","yes": 1 1 1 2 1 2 1 1 2 2 ...
 $ fullbase: Factor w/ 2 levels "no","yes": 2 1 1 1 1 2 2 1 2 1 ...
 $ gashw   : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ airco   : Factor w/ 2 levels "no","yes": 1 1 1 1 1 2 1 1 1 2 ...
 $ garagepl: num  1 0 0 0 0 0 2 0 0 1 ...
 $ prefarea: Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...

However, for our project we should select two 2-level factors and two 3-level factors in the given data and re-organize some of the variables according to this constrain.

2-level factors selected are:

fullbase : does the house has a full finished basement ? > already a 2 level factor (yes / no)
prefarea : is the house located in the preferred neighborhood of the city ? > already a 2 level factor (yes / no)

3-level factors selected and needs categorization are:

lotsize : the lot size of a property in square feet > numeric variable we will group it in three levels. Smaller than 3600 sqfeet will be defined as “Low”, 3600-6360 range will be defined as “Med” and finally lotsize bigger than 6360 as “High”.
bedrooms : number of bedrooms > numeric variable we will group it in three levels similar to the lotsize factor definitions will be 1-2 bedrooms as “Low” , 3 bedrooms “Med” and anyone bigger will be “High”.

Our new data set is now :

for (i in 1:546){
if(Prj$lotsize[i] <= 3600){Prj$lotsize2[i] <- "Low"}
if(Prj$lotsize[i] > 3600 & Prj$lotsize[i] <= 6360){Prj$lotsize2[i] <- "Med"}
if(Prj$lotsize[i] > 6360){Prj$lotsize2[i] <- "High"}

if(Prj$bedrooms[i] <= 2){Prj$bedrooms2[i] <- "Low"}
if(Prj$bedrooms[i] == 3){Prj$bedrooms2[i] <- "Med"}
if(Prj$bedrooms[i] > 3){Prj$bedrooms2[i] <- "High"}
} 

Prj$lotsize2 <- as.factor(Prj$lotsize2)
Prj$lotsize2 <- factor(Prj$lotsize2, levels= c("Low","Med","High"))
Prj$bedrooms2 <- as.factor(Prj$bedrooms2)
Prj$bedrooms2 <- factor(Prj$bedrooms2, levels= c("Low","Med","High"))

Prj <- Prj[,c(1,8,12,13,14)]
Prj.Orj <- Prj
str(Prj)

'data.frame':   546 obs. of  5 variables:
 $ price    : num  42000 38500 49500 60500 61000 66000 66000 69000 83800 88500 ...
 $ fullbase : Factor w/ 2 levels "no","yes": 2 1 1 1 1 2 2 1 2 1 ...
 $ prefarea : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ lotsize2 : Factor w/ 3 levels "Low","Med","High": 2 2 1 3 2 2 2 2 2 2 ...
 $ bedrooms2: Factor w/ 3 levels "Low","Med","High": 2 1 2 2 1 2 2 2 2 2 ...

Continuous variables & Response variables

In the data set we have one continuous variable which is also our Response Variable.

price : Sale price of a house

2. (Experimental) Design

How will the experiment be organized ?

We will use taguchi designs in our experiment as we know we have 2 three-level and 2 two-level factors. We could use taguchiChoose function for selecting the best orthogonal design.

taguchiChoose(factors1 = 2, factors2 = 2, level1 = 3, level2 = 2)

## 2 factors on 3 levels and 2 factors on 2 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L36_2_3_a L36_2_3_b
## 
## Use taguchiDesign("L36_2_3_a") or different to create a taguchi design object

For these suggested arrays we need 36 runs:

taguchiDesign("L36_2_3_a")

##    StandOrder RunOrder Replicate A B C D E F G H J K L M N O P Q R S T U V
## 1          15        1         1 1 2 2 1 2 2 1 2 1 2 1 3 1 2 3 2 1 3 2 2 1
## 2           3        2         1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3
## 3          19        3         1 2 1 2 2 1 1 2 2 1 2 1 1 2 1 3 3 3 1 2 2 1
## 4          35        4         1 2 2 1 1 2 1 2 1 2 2 1 2 1 2 3 1 3 1 2 3 3
## 5          23        5         1 2 1 2 1 2 2 2 1 1 1 2 2 3 3 1 1 2 3 2 2 1
## 6          29        6         1 2 2 2 1 1 1 1 2 2 1 2 2 1 3 3 3 2 2 1 3 1
## 7          22        7         1 2 1 2 1 2 2 2 1 1 1 2 1 2 2 3 3 1 2 1 1 3
## 8          17        8         1 1 2 2 2 1 2 2 1 2 1 1 2 3 1 3 2 2 1 3 1 1
## 9          32        9         1 2 2 1 2 1 2 1 1 1 2 2 2 1 1 1 3 1 3 3 2 3
## 10         13       10         1 1 2 2 1 2 2 1 2 1 2 1 1 2 3 1 3 2 1 3 3 2
## 11         28       11         1 2 2 2 1 1 1 1 2 2 1 2 1 3 2 2 2 1 1 3 2 3
## 12          1       12         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 13         27       13         1 2 1 1 2 2 2 1 2 2 1 1 3 2 1 3 1 2 2 3 2 3
## 14         33       14         1 2 2 1 2 1 2 1 1 1 2 2 3 2 2 2 1 2 1 1 3 1
## 15          4       15         1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 3 3
## 16          6       16         1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 1 1 1 1 2 2
## 17         25       17         1 2 1 1 2 2 2 1 2 2 1 1 1 3 2 1 2 3 3 1 3 1
## 18         20       18         1 2 1 2 2 1 1 2 2 1 2 1 2 3 2 1 1 1 2 3 3 2
## 19         24       19         1 2 1 2 1 2 2 2 1 1 1 2 3 1 1 2 2 3 1 3 3 2
## 20          2       20         1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
## 21         31       21         1 2 2 1 2 1 2 1 1 1 2 2 1 3 3 3 2 3 2 2 1 2
## 22         14       22         1 1 2 2 1 2 2 1 2 1 2 1 2 3 1 2 1 3 2 1 1 3
## 23          5       23         1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 1 1
## 24         18       24         1 1 2 2 2 1 2 2 1 2 1 1 3 1 2 1 3 3 2 1 2 2
## 25         26       25         1 2 1 1 2 2 2 1 2 2 1 1 2 1 3 2 3 1 1 2 1 2
## 26         16       26         1 1 2 2 2 1 2 2 1 2 1 1 1 2 3 2 1 1 3 2 3 3
## 27         21       27         1 2 1 2 2 1 1 2 2 1 2 1 3 1 3 2 2 2 3 1 1 3
## 28         10       28         1 1 2 1 2 2 1 2 2 1 1 2 1 1 3 2 1 3 2 3 2 1
## 29         11       29         1 1 2 1 2 2 1 2 2 1 1 2 2 2 1 3 2 1 3 1 3 2
## 30          8       30         1 1 1 2 2 2 1 1 1 2 2 2 2 2 3 1 2 3 1 1 2 3
## 31         30       31         1 2 2 2 1 1 1 1 2 2 1 2 3 2 1 1 1 3 3 2 1 2
## 32         34       32         1 2 2 1 1 2 1 2 1 2 2 1 1 3 1 2 3 2 3 1 2 2
## 33         36       33         1 2 2 1 1 2 1 2 1 2 2 1 3 2 3 1 2 1 2 3 1 1
## 34         12       34         1 1 2 1 2 2 1 2 2 1 1 2 3 3 2 1 3 2 1 2 1 3
## 35          7       35         1 1 1 2 2 2 1 1 1 2 2 2 1 1 2 3 1 2 3 3 1 2
## 36          9       36         1 1 1 2 2 2 1 1 1 2 2 2 3 3 1 2 3 1 2 2 3 1
##    W X  y
## 1  3 1 NA
## 2  3 3 NA
## 3  2 3 NA
## 4  1 2 NA
## 5  1 3 NA
## 6  2 1 NA
## 7  3 2 NA
## 8  3 2 NA
## 9  2 2 NA
## 10 1 2 NA
## 11 1 3 NA
## 12 1 1 NA
## 13 1 1 NA
## 14 3 3 NA
## 15 3 3 NA
## 16 2 2 NA
## 17 2 2 NA
## 18 3 1 NA
## 19 2 1 NA
## 20 2 2 NA
## 21 1 1 NA
## 22 2 3 NA
## 23 1 1 NA
## 24 1 3 NA
## 25 3 3 NA
## 26 2 1 NA
## 27 1 2 NA
## 28 3 2 NA
## 29 1 3 NA
## 30 3 1 NA
## 31 3 2 NA
## 32 3 1 NA
## 33 2 3 NA
## 34 2 1 NA
## 35 2 3 NA
## 36 1 2 NA

So the other options for would be either decomposing the 3 level factors to two level factors and design the dataset similiar to the Fractional Factorial Designs. In this situation we will have 6 factors all of them in 2 levels:

taguchiChoose(factors1 = 6, level1 = 2)

## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L8_2 L12_2 L16_2 L32_2
## 
## Use taguchiDesign("L8_2") or different to create a taguchi design object

For these suggested arrays we can use L8_2 with 8 runs:

taguchiDesign("L8_2")

##   StandOrder RunOrder Replicate A B C D E F G  y
## 1          6        1         1 2 1 2 2 1 2 1 NA
## 2          3        2         1 1 2 2 1 1 2 2 NA
## 3          7        3         1 2 2 1 1 2 2 1 NA
## 4          8        4         1 2 2 1 2 1 1 2 NA
## 5          2        5         1 1 1 1 2 2 2 2 NA
## 6          4        6         1 1 2 2 2 2 1 1 NA
## 7          1        7         1 1 1 1 1 1 1 1 NA
## 8          5        8         1 2 1 2 1 2 1 2 NA

However in the course examples : Mavruz and Ogulata, Taguchi Approach for the optimization of the Bursting Strength of Knitted Fabrics, Fiber & Textiles in Eastern Europe 2010, Vol. 18, No. 2 (79), pp. 78-83 they used 2 level factors in a three level Taguchi array. So we can also try this apporach and design the experiment as 4 three-level factors:

taguchiChoose(factors1 = 4, level1 = 3)

## 4 factors on 3 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L9_3 L27_3
## 
## Use taguchiDesign("L9_3") or different to create a taguchi design object

For these suggested arrays we can use L9_3 with 9 runs:

taguchiDesign("L9_3")

##   StandOrder RunOrder Replicate A B C D  y
## 1          1        1         1 1 1 1 1 NA
## 2          9        2         1 3 3 2 1 NA
## 3          6        3         1 2 3 1 2 NA
## 4          3        4         1 1 3 3 3 NA
## 5          8        5         1 3 2 1 3 NA
## 6          4        6         1 2 1 2 3 NA
## 7          2        7         1 1 2 2 2 NA
## 8          5        8         1 2 2 3 1 NA
## 9          7        9         1 3 1 3 2 NA

What is the rationale for this design?

Data set has been retrieved from Ecdat, and factors had been already chosen and we have more than 500 observations. Similiar to the fractional factorial desing we will again start with a very small number of experimental trials. We will design our experiment with L9 array and modify the two level factors for 9 experiments.

Randomization,Blocking and Replication

Replication, local control(blocking) and randomization are fundamentals of a designing an experiment.Replication and blocking increases the precision in the experiment and randomization helps reducing bias.

Randomize: What is the Randomization Scheme?

Randomization is possible with below three conditions:

Random selection : Selection of your sample set from the population.
Random assignment : Assigning the samples to different groups or treatments
Random execution : Order of the experiments

If all these are realized in a random behavior we can assume the analyze is randomized. In our analyze we do not know the initial collection of the data but we will : * Randomly order our 9 experiments and * Randomly assign our matching measurements form our complete set to the experiment

Replicate: Are there replicates and/or repeated measures?

Similiar with the fractional design also in this dataset we are not going to use a replication. Since our target is to define the main effects with minimum cost/time we will not increase the number of experiements.

3. (Statistical) Analysis

(Exploratory Data Analysis) Graphics and descriptive summary

Since we are going to use only 9 treatments in our Taguchi Design, these initial descriptive analysis will only be for our interest and will not affect our further analysis . Normally the researcher would not have these data beforehand. However, for comparing our findings, factors can be represented with below boxplots and also our response variable histogram will follow:

summary(Prj)

     price        fullbase  prefarea  lotsize2   bedrooms2 
 Min.   : 25000   no :355   no :418   Low :144   Low :138  
 1st Qu.: 49125   yes:191   yes:128   Med :267   Med :301  
 Median : 62000                       High:135   High:107  
 Mean   : 68122                                            
 3rd Qu.: 82000                                            
 Max.   :190000

str(Prj)

'data.frame':   546 obs. of  5 variables:
 $ price    : num  42000 38500 49500 60500 61000 66000 66000 69000 83800 88500 ...
 $ fullbase : Factor w/ 2 levels "no","yes": 2 1 1 1 1 2 2 1 2 1 ...
 $ prefarea : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ lotsize2 : Factor w/ 3 levels "Low","Med","High": 2 2 1 3 2 2 2 2 2 2 ...
 $ bedrooms2: Factor w/ 3 levels "Low","Med","High": 2 1 2 2 1 2 2 2 2 2 ...

head(Prj)

  price fullbase prefarea lotsize2 bedrooms2
1 42000      yes       no      Med       Med
2 38500       no       no      Med       Low
3 49500       no       no      Low       Med
4 60500       no       no     High       Med
5 61000       no       no      Med       Low
6 66000      yes       no      Med       Med

boxplot(Prj$price~Prj$lotsize2, xlab="Lot Size", ylab="House Price")
title("Lot Size(sqf)")
means1 <- by(Prj$price, Prj$lotsize2, mean)
points(1:3, means1, pch = 23, cex = 2, bg = "red")
text(1:3 - 0.1, means1,labels = format(means1, format = "f", digits = 2),pos = 3, cex = 0.9, col = "red")

boxplot(Prj$price~Prj$bedrooms2, xlab="Bedrooms", ylab="House Price")
title("Quantitiy Of Bedrooms")
means1 <- by(Prj$price, Prj$bedrooms2, mean)
points(1:3, means1, pch = 23, cex = 2, bg = "red")
text(1:3 - 0.1, means1,labels = format(means1, format = "f", digits = 2),pos = 3, cex = 0.9, col = "red")

boxplot(Prj$price~Prj$fullbase, xlab="Full Basement", ylab="House Price")
title("Full Basement")
means1 <- by(Prj$price, Prj$fullbase, mean)
points(1:2, means1, pch = 23, cex = 2, bg = "red")
text(1:2 - 0.1, means1,labels = format(means1, format = "f", digits = 2),pos = 3, cex = 0.9, col = "red")

boxplot(Prj$price~Prj$prefarea, xlab="Preferred Area", ylab="House Price")
title("Preferred Area")
means1 <- by(Prj$price, Prj$prefarea, mean)
points(1:2, means1, pch = 23, cex = 2, bg = "red")
text(1:2 - 0.1, means1,labels = format(means1, format = "f", digits = 2),pos = 3, cex = 0.9, col = "red")

Testing

At first for controlling the experiment, we will assing our data to an ordered L9_3 Taguchi Design

K <- as.data.frame(taguchiDesign("L9_3", randomize = FALSE))
A <- c(as.integer(K[,4]))
B <- c(as.integer(K[,5]))
C <- c(as.integer(K[,6]))
D <- c(as.integer(K[,7]))

Plan <- as.data.frame(cbind(A,B,C,D))
Plan

n = nrow(K)
for (i in 1:n){
  if (Plan$A[i] == 1){Plan$fullbase[i] <- "no"}
  if (Plan$A[i] == 2){Plan$fullbase[i] <- "yes"}
  if (Plan$A[i] == 3){Plan$fullbase[i] <- "no"}
  
  if (Plan$B[i] == 1){Plan$prefarea[i] <- "no"}
  if (Plan$B[i] == 2){Plan$prefarea[i] <- "yes"}
  if (Plan$B[i] == 3){Plan$prefarea[i] <- "no"}
    
  if (Plan$C[i] == 1){Plan$lotsize2[i] <- "Low"}
  if (Plan$C[i] == 2){Plan$lotsize2[i] <- "Med"}
  if (Plan$C[i] == 3){Plan$lotsize2[i] <- "High"}
  
  if (Plan$D[i] == 1){Plan$bedrooms2[i] <- "Low"}
  if (Plan$D[i] == 2){Plan$bedrooms2[i] <- "Med"}
  if (Plan$D[i] == 3){Plan$bedrooms2[i] <- "High"}
}
Prjplan <- cbind(K,Plan$fullbase,Plan$prefarea,Plan$lotsize2,Plan$bedrooms2)
kable(Prjplan, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	y	Plan$fullbase	Plan$prefarea	Plan$lotsize2	Plan$bedrooms2
1	1	1	1	1	1	1	NA	no	no	Low	Low
2	2	1	1	2	2	2	NA	no	yes	Med	Med
3	3	1	1	3	3	3	NA	no	no	High	High
4	4	1	2	1	2	3	NA	yes	no	Med	High
5	5	1	2	2	3	1	NA	yes	yes	High	Low
6	6	1	2	3	1	2	NA	yes	no	Low	Med
7	7	1	3	1	3	2	NA	no	no	High	Med
8	8	1	3	2	1	3	NA	no	yes	Low	High
9	9	1	3	3	2	1	NA	no	no	Med	Low

Prjplan1<-unite(Prjplan, refcol , c(9,10,11,12), remove=FALSE)
kable(Prjplan1, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	y	refcol	Plan$fullbase	Plan$prefarea	Plan$lotsize2	Plan$bedrooms2
1	1	1	1	1	1	1	NA	no_no_Low_Low	no	no	Low	Low
2	2	1	1	2	2	2	NA	no_yes_Med_Med	no	yes	Med	Med
3	3	1	1	3	3	3	NA	no_no_High_High	no	no	High	High
4	4	1	2	1	2	3	NA	yes_no_Med_High	yes	no	Med	High
5	5	1	2	2	3	1	NA	yes_yes_High_Low	yes	yes	High	Low
6	6	1	2	3	1	2	NA	yes_no_Low_Med	yes	no	Low	Med
7	7	1	3	1	3	2	NA	no_no_High_Med	no	no	High	Med
8	8	1	3	2	1	3	NA	no_yes_Low_High	no	yes	Low	High
9	9	1	3	3	2	1	NA	no_no_Med_Low	no	no	Med	Low

Plan.refcol <- c(Prjplan1$refcol)


Prj1<-unite(Prj, refcol , c(2,3,4,5), remove=FALSE)
head(Prj1)

  price         refcol fullbase prefarea lotsize2 bedrooms2
1 42000 yes_no_Med_Med      yes       no      Med       Med
2 38500  no_no_Med_Low       no       no      Med       Low
3 49500  no_no_Low_Med       no       no      Low       Med
4 60500 no_no_High_Med       no       no     High       Med
5 61000  no_no_Med_Low       no       no      Med       Low
6 66000 yes_no_Med_Med      yes       no      Med       Med

for (j in 1:546){if (Prjplan1$refcol[1] == Prj1$refcol[j]){Prj1$RO1[j] <- Prj1$price[j]}else{Prj1$RO1[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[2] == Prj1$refcol[j]){Prj1$RO2[j] <- Prj1$price[j]}else{Prj1$RO2[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[3] == Prj1$refcol[j]){Prj1$RO3[j] <- Prj1$price[j]}else{Prj1$RO3[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[4] == Prj1$refcol[j]){Prj1$RO4[j] <- Prj1$price[j]}else{Prj1$RO4[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[5] == Prj1$refcol[j]){Prj1$RO5[j] <- Prj1$price[j]}else{Prj1$RO5[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[6] == Prj1$refcol[j]){Prj1$RO6[j] <- Prj1$price[j]}else{Prj1$RO6[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[7] == Prj1$refcol[j]){Prj1$RO7[j] <- Prj1$price[j]}else{Prj1$RO7[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[8] == Prj1$refcol[j]){Prj1$RO8[j] <- Prj1$price[j]}else{Prj1$RO8[j] <- 0}}
for (j in 1:546){if (Prjplan1$refcol[9] == Prj1$refcol[j]){Prj1$RO9[j] <- Prj1$price[j]}else{Prj1$RO9[j] <- 0}}


numRO1 <- nrow(Prj1[Prj1$RO1 > 0,])
numRO2 <- nrow(Prj1[Prj1$RO2 > 0,])
numRO3 <- nrow(Prj1[Prj1$RO3 > 0,])
numRO4 <- nrow(Prj1[Prj1$RO4 > 0,])
numRO5 <- nrow(Prj1[Prj1$RO5 > 0,])
numRO6 <- nrow(Prj1[Prj1$RO6 > 0,])
numRO7 <- nrow(Prj1[Prj1$RO7 > 0,])
numRO8 <- nrow(Prj1[Prj1$RO8 > 0,])
numRO9 <- nrow(Prj1[Prj1$RO9 > 0,])

numHist <- c(numRO1,numRO2,numRO3,numRO4,numRO5,numRO6,numRO7,numRO8,numRO9)
barplot(numHist, names.arg = c("ER1","ER2","ER3","ER4","ER5","ER6","ER7","ER8","ER9"),xlab = "Exprimental Runs",ylab = "Observations")

However when we plan to use in above order one of the L9 array designs/experiments cannot be found in our dataset Experimental Run 8 where Fullbase=No, PrefArea=Yes, Lotsize=Low, Bedrooms=High. Whic can be also seen from the histogram plotted.

To avoid this problem we can re-assign our factors to the design, this situation also represents the importance of the initial design. We will change the order of Lotsize to factor D , Bedrooms to factor C :

K <- as.data.frame(taguchiDesign("L9_3", randomize = FALSE))
A <- c(as.integer(K[,4]))
B <- c(as.integer(K[,5]))
C <- c(as.integer(K[,6]))
D <- c(as.integer(K[,7]))

Plan <- as.data.frame(cbind(A,B,C,D))
Plan

n = nrow(K)
for (i in 1:n){
  if (Plan$A[i] == 1){Plan$fullbase[i] <- "no"}
  if (Plan$A[i] == 2){Plan$fullbase[i] <- "yes"}
  if (Plan$A[i] == 3){Plan$fullbase[i] <- "no"}
  
  if (Plan$B[i] == 1){Plan$prefarea[i] <- "no"}
  if (Plan$B[i] == 2){Plan$prefarea[i] <- "yes"}
  if (Plan$B[i] == 3){Plan$prefarea[i] <- "no"}
    
  if (Plan$C[i] == 1){Plan$bedrooms2[i] <- "Low"}
  if (Plan$C[i] == 2){Plan$bedrooms2[i] <- "Med"}
  if (Plan$C[i] == 3){Plan$bedrooms2[i] <- "High"}
  
  if (Plan$D[i] == 1){Plan$lotsize2[i] <- "Low"}
  if (Plan$D[i] == 2){Plan$lotsize2[i] <- "Med"}
  if (Plan$D[i] == 3){Plan$lotsize2[i] <- "High"}
  
}
Prjplan <- cbind(K,Plan$fullbase,Plan$prefarea,Plan$bedrooms2,Plan$lotsize2)
kable(Prjplan, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	y	Plan$fullbase	Plan$prefarea	Plan$bedrooms2	Plan$lotsize2
1	1	1	1	1	1	1	NA	no	no	Low	Low
2	2	1	1	2	2	2	NA	no	yes	Med	Med
3	3	1	1	3	3	3	NA	no	no	High	High
4	4	1	2	1	2	3	NA	yes	no	Med	High
5	5	1	2	2	3	1	NA	yes	yes	High	Low
6	6	1	2	3	1	2	NA	yes	no	Low	Med
7	7	1	3	1	3	2	NA	no	no	High	Med
8	8	1	3	2	1	3	NA	no	yes	Low	High
9	9	1	3	3	2	1	NA	no	no	Med	Low

Prjplan2<-unite(Prjplan, refcol , c(9,10,11,12), remove=FALSE)
kable(Prjplan2, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	y	refcol	Plan$fullbase	Plan$prefarea	Plan$bedrooms2	Plan$lotsize2
1	1	1	1	1	1	1	NA	no_no_Low_Low	no	no	Low	Low
2	2	1	1	2	2	2	NA	no_yes_Med_Med	no	yes	Med	Med
3	3	1	1	3	3	3	NA	no_no_High_High	no	no	High	High
4	4	1	2	1	2	3	NA	yes_no_Med_High	yes	no	Med	High
5	5	1	2	2	3	1	NA	yes_yes_High_Low	yes	yes	High	Low
6	6	1	2	3	1	2	NA	yes_no_Low_Med	yes	no	Low	Med
7	7	1	3	1	3	2	NA	no_no_High_Med	no	no	High	Med
8	8	1	3	2	1	3	NA	no_yes_Low_High	no	yes	Low	High
9	9	1	3	3	2	1	NA	no_no_Med_Low	no	no	Med	Low

Plan.refcol <- c(Prjplan2$refcol)


Prj2<-unite(Prj, refcol , c(2,3,5,4), remove=FALSE)
head(Prj2)

  price         refcol fullbase prefarea lotsize2 bedrooms2
1 42000 yes_no_Med_Med      yes       no      Med       Med
2 38500  no_no_Low_Med       no       no      Med       Low
3 49500  no_no_Med_Low       no       no      Low       Med
4 60500 no_no_Med_High       no       no     High       Med
5 61000  no_no_Low_Med       no       no      Med       Low
6 66000 yes_no_Med_Med      yes       no      Med       Med

for (j in 1:546){if (Prjplan2$refcol[1] == Prj2$refcol[j]){Prj2$RO1[j] <- Prj2$price[j]}else{Prj2$RO1[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[2] == Prj2$refcol[j]){Prj2$RO2[j] <- Prj2$price[j]}else{Prj2$RO2[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[3] == Prj2$refcol[j]){Prj2$RO3[j] <- Prj2$price[j]}else{Prj2$RO3[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[4] == Prj2$refcol[j]){Prj2$RO4[j] <- Prj2$price[j]}else{Prj2$RO4[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[5] == Prj2$refcol[j]){Prj2$RO5[j] <- Prj2$price[j]}else{Prj2$RO5[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[6] == Prj2$refcol[j]){Prj2$RO6[j] <- Prj2$price[j]}else{Prj2$RO6[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[7] == Prj2$refcol[j]){Prj2$RO7[j] <- Prj2$price[j]}else{Prj2$RO7[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[8] == Prj2$refcol[j]){Prj2$RO8[j] <- Prj2$price[j]}else{Prj2$RO8[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[9] == Prj2$refcol[j]){Prj2$RO9[j] <- Prj2$price[j]}else{Prj2$RO9[j] <- 0}}

numRO1 <- nrow(Prj2[Prj2$RO1 > 0,])
numRO2 <- nrow(Prj2[Prj2$RO2 > 0,])
numRO3 <- nrow(Prj2[Prj2$RO3 > 0,])
numRO4 <- nrow(Prj2[Prj2$RO4 > 0,])
numRO5 <- nrow(Prj2[Prj2$RO5 > 0,])
numRO6 <- nrow(Prj2[Prj2$RO6 > 0,])
numRO7 <- nrow(Prj2[Prj2$RO7 > 0,])
numRO8 <- nrow(Prj2[Prj2$RO8 > 0,])
numRO9 <- nrow(Prj2[Prj2$RO9 > 0,])

numHist <- c(numRO1,numRO2,numRO3,numRO4,numRO5,numRO6,numRO7,numRO8,numRO9)
barplot(numHist, names.arg = c("ER1","ER2","ER3","ER4","ER5","ER6","ER7","ER8","ER9"),xlab = "Exprimental Runs",ylab = "Observations")

Now we can select experimental runs for our defining a vector of Response Variable with a randomized Taguchi Design:

plan.design <- taguchiDesign("L9_3", randomize = TRUE)
values(plan.design) = list(A = c("no","yes","no"), 
                   B = c("no","yes","no"), C = c("Low","Med","High") , D = c("Low","Med","High"))
names(plan.design) = c("fullbase", "prefarea", "bedrooms", "lotsize")

K <- as.data.frame(plan.design)
A <- c(as.integer(K[,4]))
B <- c(as.integer(K[,5]))
C <- c(as.integer(K[,6]))
D <- c(as.integer(K[,7]))

Plan <- as.data.frame(cbind(A,B,C,D))
Plan

n = nrow(K)
for (i in 1:n){
  if (Plan$A[i] == 1){Plan$fullbase[i] <- "no"}
  if (Plan$A[i] == 2){Plan$fullbase[i] <- "yes"}
  if (Plan$A[i] == 3){Plan$fullbase[i] <- "no"}
  
  if (Plan$B[i] == 1){Plan$prefarea[i] <- "no"}
  if (Plan$B[i] == 2){Plan$prefarea[i] <- "yes"}
  if (Plan$B[i] == 3){Plan$prefarea[i] <- "no"}
    
  if (Plan$C[i] == 1){Plan$bedrooms2[i] <- "Low"}
  if (Plan$C[i] == 2){Plan$bedrooms2[i] <- "Med"}
  if (Plan$C[i] == 3){Plan$bedrooms2[i] <- "High"}
  
  if (Plan$D[i] == 1){Plan$lotsize2[i] <- "Low"}
  if (Plan$D[i] == 2){Plan$lotsize2[i] <- "Med"}
  if (Plan$D[i] == 3){Plan$lotsize2[i] <- "High"}
  
}
Prjplan <- cbind(K,Plan$fullbase,Plan$prefarea,Plan$bedrooms2,Plan$lotsize2)

Prjplan3<-unite(Prjplan, refcol , c(9,10,11,12), remove=FALSE)
kable(Prjplan3, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	y	refcol	Plan$fullbase	Plan$prefarea	Plan$bedrooms2	Plan$lotsize2
7	1	1	3	1	3	2	NA	no_no_High_Med	no	no	High	Med
5	2	1	2	2	3	1	NA	yes_yes_High_Low	yes	yes	High	Low
8	3	1	3	2	1	3	NA	no_yes_Low_High	no	yes	Low	High
4	4	1	2	1	2	3	NA	yes_no_Med_High	yes	no	Med	High
3	5	1	1	3	3	3	NA	no_no_High_High	no	no	High	High
2	6	1	1	2	2	2	NA	no_yes_Med_Med	no	yes	Med	Med
6	7	1	2	3	1	2	NA	yes_no_Low_Med	yes	no	Low	Med
9	8	1	3	3	2	1	NA	no_no_Med_Low	no	no	Med	Low
1	9	1	1	1	1	1	NA	no_no_Low_Low	no	no	Low	Low

Plan.refcol <- c(Prjplan3$refcol)


Prj3<-unite(Prj, refcol , c(2,3,5,4), remove=FALSE)
head(Prj3)

  price         refcol fullbase prefarea lotsize2 bedrooms2
1 42000 yes_no_Med_Med      yes       no      Med       Med
2 38500  no_no_Low_Med       no       no      Med       Low
3 49500  no_no_Med_Low       no       no      Low       Med
4 60500 no_no_Med_High       no       no     High       Med
5 61000  no_no_Low_Med       no       no      Med       Low
6 66000 yes_no_Med_Med      yes       no      Med       Med

for (j in 1:546){if (Prjplan3$refcol[1] == Prj3$refcol[j]){Prj3$RO1[j] <- Prj3$price[j]}else{Prj3$RO1[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[2] == Prj3$refcol[j]){Prj3$RO2[j] <- Prj3$price[j]}else{Prj3$RO2[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[3] == Prj3$refcol[j]){Prj3$RO3[j] <- Prj3$price[j]}else{Prj3$RO3[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[4] == Prj3$refcol[j]){Prj3$RO4[j] <- Prj3$price[j]}else{Prj3$RO4[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[5] == Prj3$refcol[j]){Prj3$RO5[j] <- Prj3$price[j]}else{Prj3$RO5[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[6] == Prj3$refcol[j]){Prj3$RO6[j] <- Prj3$price[j]}else{Prj3$RO6[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[7] == Prj3$refcol[j]){Prj3$RO7[j] <- Prj3$price[j]}else{Prj3$RO7[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[8] == Prj3$refcol[j]){Prj3$RO8[j] <- Prj3$price[j]}else{Prj3$RO8[j] <- 0}}
for (j in 1:546){if (Prjplan3$refcol[9] == Prj3$refcol[j]){Prj3$RO9[j] <- Prj3$price[j]}else{Prj3$RO9[j] <- 0}}

numRO1 <- nrow(Prj3[Prj3$RO1 > 0,])
numRO2 <- nrow(Prj3[Prj3$RO2 > 0,])
numRO3 <- nrow(Prj3[Prj3$RO3 > 0,])
numRO4 <- nrow(Prj3[Prj3$RO4 > 0,])
numRO5 <- nrow(Prj3[Prj3$RO5 > 0,])
numRO6 <- nrow(Prj3[Prj3$RO6 > 0,])
numRO7 <- nrow(Prj3[Prj3$RO7 > 0,])
numRO8 <- nrow(Prj3[Prj3$RO8 > 0,])
numRO9 <- nrow(Prj3[Prj3$RO9 > 0,])

numHist <- c(numRO1,numRO2,numRO3,numRO4,numRO5,numRO6,numRO7,numRO8,numRO9)
barplot(numHist, names.arg = c("ER1","ER2","ER3","ER4","ER5","ER6","ER7","ER8","ER9"),xlab = "Exprimental Runs",ylab = "Observations", col = "red")

R1 <- sample.rows(subset(Prj3, RO1 > 0), 1)
R2 <- sample.rows(subset(Prj3, RO2 > 0), 1)
R3 <- sample.rows(subset(Prj3, RO3 > 0), 1)
R4 <- sample.rows(subset(Prj3, RO4 > 0), 1)
R5 <- sample.rows(subset(Prj3, RO5 > 0), 1)
R6 <- sample.rows(subset(Prj3, RO6 > 0), 1)
R7 <- sample.rows(subset(Prj3, RO7 > 0), 1)
R8 <- sample.rows(subset(Prj3, RO8 > 0), 1)
R9 <- sample.rows(subset(Prj3, RO9 > 0), 1)

RV <- c(R1$RO1,R2$RO2,R3$RO3,R4$RO4,R5$RO5,R6$RO6,R7$RO7,R8$RO8,R9$RO9)
RV

[1]  78000  56000  61500 124000 175000  65500  67000  49500  38000

Our response variable for the selected experimental runs is defined with RV and now can add our response variable to our design with response function:

response(plan.design) = RV
summary(plan.design)

## Taguchi SINGLE Design
## Information about the factors:
## 
##                A        B        C       D
## value 1       no       no      Low     Low
## value 2      yes      yes      Med     Med
## value 3       no       no     High    High
## name    fullbase prefarea bedrooms lotsize
## unit                                      
## type     numeric  numeric  numeric numeric
## 
## -----------
## 
##   StandOrder RunOrder Replicate A B C D     RV
## 1          7        1         1 3 1 3 2  78000
## 2          5        2         1 2 2 3 1  56000
## 3          8        3         1 3 2 1 3  61500
## 4          4        4         1 2 1 2 3 124000
## 5          3        5         1 1 3 3 3 175000
## 6          2        6         1 1 2 2 2  65500
## 7          6        7         1 2 3 1 2  67000
## 8          9        8         1 3 3 2 1  49500
## 9          1        9         1 1 1 1 1  38000
## 
## -----------

Now we are ready for plotting the effects :

effectPlot(plan.design)

From the effect plots we cannot identify a direct effect. In the two level factors A:full base and B:prefarea according to our design Level1 and Level3 represents the same level, so the graphs normally should represent this but the samples selected for our experiment are not refleting this.

Additionally from the initial box plots we normally expect a rising trend for bedrooms and lotsizes but also these are not visible in the selected data.

ANOVA For the Taguchi Design experiment :

Since we have only 2 levels for two of our factors we should also revise these before the running ANOVA :

K2 <- as.data.frame(plan.design)
fullbase <- c(as.integer(K2[,4]))
fullbase <- replace(fullbase, fullbase==3, 1)

prefarea <- c(as.integer(K2[,5]))
prefarea <- replace(prefarea, prefarea==3, 1)

bedrooms <- c(as.integer(K2[,6]))
lotsize <- c(as.integer(K2[,7]))

HousePrice <- c(as.integer(K2[,8]))

K3 <- as.data.frame(cbind(fullbase,prefarea,bedrooms,lotsize,HousePrice))
K3$fullbase <- as.factor(K3$fullbase)
K3$prefarea <- as.factor(K3$prefarea)
K3$bedrooms <- as.factor(K3$bedrooms)
K3$lotsize <- as.factor(K3$lotsize)

K3

  fullbase prefarea bedrooms lotsize HousePrice
1        1        1        3       2      78000
2        2        2        3       1      56000
3        1        2        1       3      61500
4        2        1        2       3     124000
5        1        1        3       3     175000
6        1        2        2       2      65500
7        2        1        1       2      67000
8        1        1        2       1      49500
9        1        1        1       1      38000

summary(K3)

 fullbase prefarea bedrooms lotsize   HousePrice    
 1:6      1:6      1:3      1:3     Min.   : 38000  
 2:3      2:3      2:3      2:3     1st Qu.: 56000  
                   3:3      3:3     Median : 65500  
                                    Mean   : 79389  
                                    3rd Qu.: 78000  
                                    Max.   :175000

Now we can run ANOVA for this new organized dataset:

DNM2=lm(HousePrice~fullbase+prefarea+bedrooms+lotsize,data=K3)
anova(DNM2)

Analysis of Variance Table

Response: HousePrice
          Df     Sum Sq    Mean Sq F value Pr(>F)
fullbase   1   39013889   39013889  0.0439 0.8534
prefarea   1 1521680556 1521680556  1.7126 0.3208
bedrooms   2 3384722222 1692361111  1.9047 0.3443
lotsize    2 8230888889 4115444444  4.6317 0.1776
Residuals  2 1777083333  888541667

All p-values are above 0.05 and none of the factors have a significant effect on the house price. Also following q-q plots and residual plots are not covering the assumptions of ANOVA:

qqnorm(residuals(DNM2))
qqline(residuals(DNM2))

par(mfrow=c(2,2))
plot(DNM2)

This findings was also similiar to our previous study with fractional factorial designs. Dependig on the samples selected during the analysis main effect plots and also estimates are changing.

To eliminate any desing problem possibilities we can also re-arrange our experiment with Taguchi L8 array, similiar to the fractional factorial design with 6 factors 2 levels all :

plan.design <- taguchiDesign("L8_2")
K <- as.data.frame(plan.design)
A <- c(as.integer(K[,4]))
B <- c(as.integer(K[,5]))
C <- c(as.integer(K[,6]))
D <- c(as.integer(K[,7]))
CD <- C*D
E <- c(as.integer(K[,8]))
F <- c(as.integer(K[,9]))
EF <- E*F
Plan <- as.data.frame(cbind(A,B,C,D,E,F,CD,EF))

n = nrow(K)
for (i in 1:n){
  if (Plan$A[i] == 1){Plan$fullbase[i] <- "no"}
  if (Plan$A[i] == 2){Plan$fullbase[i] <- "yes"}
 
  if (Plan$B[i] == 1){Plan$prefarea[i] <- "no"}
  if (Plan$B[i] == 2){Plan$prefarea[i] <- "yes"}

  if (Plan$CD[i] == 1){Plan$bedrooms2[i] <- "Low"}
  if (Plan$CD[i] == 2){Plan$bedrooms2[i] <- "Med"}
  if (Plan$CD[i] == 4){Plan$bedrooms2[i] <- "High"}
  
  if (Plan$EF[i] == 1){Plan$lotsize2[i] <- "Low"}
  if (Plan$EF[i] == 2){Plan$lotsize2[i] <- "Med"}
  if (Plan$EF[i] == 4){Plan$lotsize2[i] <- "High"}
  

}
Prjplan <- cbind(K,Plan$fullbase,Plan$prefarea,Plan$lotsize2,Plan$bedrooms2)
summary(Prjplan)

   StandOrder      RunOrder      Replicate       A             B      
 Min.   :1.00   Min.   :1.00   Min.   :1   Min.   :1.0   Min.   :1.0  
 1st Qu.:2.75   1st Qu.:2.75   1st Qu.:1   1st Qu.:1.0   1st Qu.:1.0  
 Median :4.50   Median :4.50   Median :1   Median :1.5   Median :1.5  
 Mean   :4.50   Mean   :4.50   Mean   :1   Mean   :1.5   Mean   :1.5  
 3rd Qu.:6.25   3rd Qu.:6.25   3rd Qu.:1   3rd Qu.:2.0   3rd Qu.:2.0  
 Max.   :8.00   Max.   :8.00   Max.   :1   Max.   :2.0   Max.   :2.0  
       C             D             E             F             G      
 Min.   :1.0   Min.   :1.0   Min.   :1.0   Min.   :1.0   Min.   :1.0  
 1st Qu.:1.0   1st Qu.:1.0   1st Qu.:1.0   1st Qu.:1.0   1st Qu.:1.0  
 Median :1.5   Median :1.5   Median :1.5   Median :1.5   Median :1.5  
 Mean   :1.5   Mean   :1.5   Mean   :1.5   Mean   :1.5   Mean   :1.5  
 3rd Qu.:2.0   3rd Qu.:2.0   3rd Qu.:2.0   3rd Qu.:2.0   3rd Qu.:2.0  
 Max.   :2.0   Max.   :2.0   Max.   :2.0   Max.   :2.0   Max.   :2.0  
    y           Plan$fullbase Plan$prefarea Plan$lotsize2 Plan$bedrooms2
 Mode:logical   no :4         no :4         High:2        High:2        
 NA's:8         yes:4         yes:4         Low :2        Low :2        
                                            Med :4        Med :4

kable(Prjplan, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	E	F	G	y	Plan$fullbase	Plan$prefarea	Plan$lotsize2	Plan$bedrooms2
8	1	1	2	2	1	2	1	1	2	NA	yes	yes	Low	Med
6	2	1	2	1	2	2	1	2	1	NA	yes	no	Med	High
5	3	1	2	1	2	1	2	1	2	NA	yes	no	Med	Med
1	4	1	1	1	1	1	1	1	1	NA	no	no	Low	Low
7	5	1	2	2	1	1	2	2	1	NA	yes	yes	High	Low
4	6	1	1	2	2	2	2	1	1	NA	no	yes	Med	High
3	7	1	1	2	2	1	1	2	2	NA	no	yes	Med	Med
2	8	1	1	1	1	2	2	2	2	NA	no	no	High	Med

Prjplan2<-unite(Prjplan, refcol , c(12,13,14,15), remove=FALSE)
kable(Prjplan2, align = 'c')

StandOrder	RunOrder	Replicate	A	B	C	D	E	F	G	y	refcol	Plan$fullbase	Plan$prefarea	Plan$lotsize2	Plan$bedrooms2
8	1	1	2	2	1	2	1	1	2	NA	yes_yes_Low_Med	yes	yes	Low	Med
6	2	1	2	1	2	2	1	2	1	NA	yes_no_Med_High	yes	no	Med	High
5	3	1	2	1	2	1	2	1	2	NA	yes_no_Med_Med	yes	no	Med	Med
1	4	1	1	1	1	1	1	1	1	NA	no_no_Low_Low	no	no	Low	Low
7	5	1	2	2	1	1	2	2	1	NA	yes_yes_High_Low	yes	yes	High	Low
4	6	1	1	2	2	2	2	1	1	NA	no_yes_Med_High	no	yes	Med	High
3	7	1	1	2	2	1	1	2	2	NA	no_yes_Med_Med	no	yes	Med	Med
2	8	1	1	1	1	2	2	2	2	NA	no_no_High_Med	no	no	High	Med

Plan.refcol <- c(Prjplan2$refcol)


Prj<-unite(Prj, refcol , c(2,3,4,5), remove=FALSE)
head(Prj)

  price         refcol fullbase prefarea lotsize2 bedrooms2
1 42000 yes_no_Med_Med      yes       no      Med       Med
2 38500  no_no_Med_Low       no       no      Med       Low
3 49500  no_no_Low_Med       no       no      Low       Med
4 60500 no_no_High_Med       no       no     High       Med
5 61000  no_no_Med_Low       no       no      Med       Low
6 66000 yes_no_Med_Med      yes       no      Med       Med

tail(Prj)

     price         refcol fullbase prefarea lotsize2 bedrooms2
541  85000 no_no_High_Med       no       no     High       Med
542  91500  no_no_Med_Med       no       no      Med       Med
543  94000  no_no_Med_Med       no       no      Med       Med
544 103000  no_no_Med_Med       no       no      Med       Med
545 105000  no_no_Med_Med       no       no      Med       Med
546 105000  no_no_Med_Med       no       no      Med       Med

for (j in 1:546){if (Prjplan2$refcol[1] == Prj$refcol[j]){Prj$RO1[j] <- Prj$price[j]}else{Prj$RO1[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[2] == Prj$refcol[j]){Prj$RO2[j] <- Prj$price[j]}else{Prj$RO2[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[3] == Prj$refcol[j]){Prj$RO3[j] <- Prj$price[j]}else{Prj$RO3[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[4] == Prj$refcol[j]){Prj$RO4[j] <- Prj$price[j]}else{Prj$RO4[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[5] == Prj$refcol[j]){Prj$RO5[j] <- Prj$price[j]}else{Prj$RO5[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[6] == Prj$refcol[j]){Prj$RO6[j] <- Prj$price[j]}else{Prj$RO6[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[7] == Prj$refcol[j]){Prj$RO7[j] <- Prj$price[j]}else{Prj$RO7[j] <- 0}}
for (j in 1:546){if (Prjplan2$refcol[8] == Prj$refcol[j]){Prj$RO8[j] <- Prj$price[j]}else{Prj$RO8[j] <- 0}}
R1 <- sample.rows(subset(Prj, RO1 > 0), 1)
R2 <- sample.rows(subset(Prj, RO2 > 0), 1)
R3 <- sample.rows(subset(Prj, RO3 > 0), 1)
R4 <- sample.rows(subset(Prj, RO4 > 0), 1)
R5 <- sample.rows(subset(Prj, RO5 > 0), 1)
R6 <- sample.rows(subset(Prj, RO6 > 0), 1)
R7 <- sample.rows(subset(Prj, RO7 > 0), 1)
R8 <- sample.rows(subset(Prj, RO8 > 0), 1)

RV <- c(R1$RO1,R2$RO2,R3$RO3,R4$RO4,R5$RO5,R6$RO6,R7$RO7,R8$RO8)

Our response variables randomly selected for our experiment are:

RV

[1]  52000  56000  42000  34000  95500  75000  80000 120000

response(plan.design) = RV
summary(plan.design)

## Taguchi SINGLE Design
## Information about the factors:
## 
##               A       B       C       D       E       F       G
## value 1       1       1       1       1       1       1       1
## value 2       2       2       2       2       2       2       2
## name                                                           
## unit                                                           
## type    numeric numeric numeric numeric numeric numeric numeric
## 
## -----------
## 
##   StandOrder RunOrder Replicate A B C D E F G     RV
## 1          8        1         1 2 2 1 2 1 1 2  52000
## 2          6        2         1 2 1 2 2 1 2 1  56000
## 3          5        3         1 2 1 2 1 2 1 2  42000
## 4          1        4         1 1 1 1 1 1 1 1  34000
## 5          7        5         1 2 2 1 1 2 2 1  95500
## 6          4        6         1 1 2 2 2 2 1 1  75000
## 7          3        7         1 1 2 2 1 1 2 2  80000
## 8          2        8         1 1 1 1 2 2 2 2 120000
## 
## -----------

Now we are ready for plotting the effects :

effectPlot(plan.design)

Finally the ANOVA for this design :

Z <- as.data.frame(plan.design)
Z

##   StandOrder RunOrder Replicate A B C D E F G     RV
## 1          8        1         1 2 2 1 2 1 1 2  52000
## 2          6        2         1 2 1 2 2 1 2 1  56000
## 3          5        3         1 2 1 2 1 2 1 2  42000
## 4          1        4         1 1 1 1 1 1 1 1  34000
## 5          7        5         1 2 2 1 1 2 2 1  95500
## 6          4        6         1 1 2 2 2 2 1 1  75000
## 7          3        7         1 1 2 2 1 1 2 2  80000
## 8          2        8         1 1 1 1 2 2 2 2 120000

fullbase <- c(as.integer(Z[,4]))
prefarea <- c(as.integer(Z[,5]))
bedrooms1 <- c(as.integer(Z[,6]))
bedrooms2 <- c(as.integer(Z[,7]))
lotsize1 <- c(as.integer(Z[,8]))
lotsize2 <- c(as.integer(Z[,9]))

HousePrice <- c(as.integer(Z[,11]))

Z1 <- as.data.frame(cbind(fullbase,prefarea,bedrooms1,bedrooms2,lotsize1,lotsize2,HousePrice))
Z1

##   fullbase prefarea bedrooms1 bedrooms2 lotsize1 lotsize2 HousePrice
## 1        2        2         1         2        1        1      52000
## 2        2        1         2         2        1        2      56000
## 3        2        1         2         1        2        1      42000
## 4        1        1         1         1        1        1      34000
## 5        2        2         1         1        2        2      95500
## 6        1        2         2         2        2        1      75000
## 7        1        2         2         1        1        2      80000
## 8        1        1         1         2        2        2     120000

Z1$fullbase <- as.factor(Z1$fullbase)
Z1$prefarea <- as.factor(Z1$prefarea)
Z1$bedrooms1 <- as.factor(Z1$bedrooms1)
Z1$lotsize1 <- as.factor(Z1$lotsize1)
Z1$bedrooms2 <- as.factor(Z1$bedrooms2)
Z1$lotsize2 <- as.factor(Z1$lotsize2)

Z1

##   fullbase prefarea bedrooms1 bedrooms2 lotsize1 lotsize2 HousePrice
## 1        2        2         1         2        1        1      52000
## 2        2        1         2         2        1        2      56000
## 3        2        1         2         1        2        1      42000
## 4        1        1         1         1        1        1      34000
## 5        2        2         1         1        2        2      95500
## 6        1        2         2         2        2        1      75000
## 7        1        2         2         1        1        2      80000
## 8        1        1         1         2        2        2     120000

summary(Z1)

##  fullbase prefarea bedrooms1 bedrooms2 lotsize1 lotsize2   HousePrice    
##  1:4      1:4      1:4       1:4       1:4      1:4      Min.   : 34000  
##  2:4      2:4      2:4       2:4       2:4      2:4      1st Qu.: 49500  
##                                                          Median : 65500  
##                                                          Mean   : 69313  
##                                                          3rd Qu.: 83875  
##                                                          Max.   :120000

zan=lm(HousePrice~fullbase+prefarea+bedrooms1+bedrooms2+lotsize1+lotsize2,data=Z1)
anova(zan)

## Analysis of Variance Table
## 
## Response: HousePrice
##           Df     Sum Sq    Mean Sq F value Pr(>F)
## fullbase   1  504031250  504031250  3.5930 0.3090
## prefarea   1  318781250  318781250  2.2724 0.3729
## bedrooms1  1  294031250  294031250  2.0960 0.3848
## bedrooms2  1  331531250  331531250  2.3633 0.3671
## lotsize1   1 1526281250 1526281250 10.8802 0.1874
## lotsize2   1 2756531250 2756531250 19.6500 0.1413
## Residuals  1  140281250  140281250

Again we donot find any significant effects, both arrays of Taguchi designs and previously generated fractional factorial analysis have similiar findings.

However when we compare our results with fulldata, it represents a complete different senario, and all these factors are significant :

Full Data ANOVA :

DNM3=lm(price~fullbase+prefarea+lotsize2+bedrooms2,data=Prj.Orj)
anova(DNM3)

Analysis of Variance Table

Response: price
           Df     Sum Sq    Mean Sq F value    Pr(>F)    
fullbase    1 1.3476e+10 1.3476e+10  31.412 3.334e-08 ***
prefarea    1 3.3656e+10 3.3656e+10  78.454 < 2.2e-16 ***
lotsize2    2 7.7017e+10 3.8508e+10  89.765 < 2.2e-16 ***
bedrooms2   2 3.3228e+10 1.6614e+10  38.728 < 2.2e-16 ***
Residuals 539 2.3123e+11 4.2899e+08                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

qqnorm(residuals(DNM3))
qqline(residuals(DNM3))

par(mfrow=c(2,2))
plot(DNM3)

4. References to the literature

http://www.cbinet.com/sites/default/files/files/Goode_Roberta_pres2_bonus.pdf

http://www.ecs.umass.edu/mie/labs/mda/fea/sankar/chap2.html

5. Appendices

https://cran.r-project.org/web/packages/Ecdat/Ecdat.pdf

Project_4