From the Ecdat package, VietNamH dataset was used for this project. This dataset has a total of 5999 observations describing the household expenses of households in Vietnam. The dataset can be loaded as follows:
library(Ecdat)
## Loading required package: Ecfun
##
## Attaching package: 'Ecfun'
## The following object is masked from 'package:base':
##
## sign
##
## Attaching package: 'Ecdat'
## The following object is masked from 'package:datasets':
##
## Orange
df <- VietNamH
As described above, the VietNamH dataset has data about the household expenses of households in Vietnam. Specifically, we have the following variables in the data set:
sex: Gender of household head (male, female)age: Age of household headeducyr: Schooling year of household headfarm: yes if farm householdurban: yes if urban householdhhsize: Size of the householdlntotal: Natural logarighm of total expenditure of the householdlnmed: Natural logarithm of medical expenditure of the householdlnrlfood: Natural logarithm of food expenditure of the householdlnexp12m: Natural logarithm of total household health care expenditure for 12 monthscommune: CommuneThis dataset has been taken from the Vietnam World Bank Livings Standards Survey.
head(df)
## sex age educyr farm urban hhsize lntotal lnmed lnrlfood
## 1 female 68 4 no yes 6 10.13649 11.233210 8.639339
## 2 female 57 8 no yes 6 10.25206 8.505120 9.345752
## 3 male 42 14 no yes 6 10.93231 8.713418 10.226330
## 4 female 72 9 no yes 6 10.26749 9.291736 9.263722
## 5 female 73 1 no yes 8 10.48811 7.555382 9.592890
## 6 female 66 13 no yes 7 10.52660 9.789702 9.372034
## lnexp12m commune
## 1 11.233210 1
## 2 8.505120 1
## 3 8.713418 1
## 4 9.291736 1
## 5 7.555382 1
## 6 9.789702 1
First of all, we must keep only those variables that will be needed in our analysis. For our analysis, we have two 2-level IVs (sex and urban) and two 3-level IVs (age and hhsize). The response variable is lntotal. Thus, our modified dataset becomes:
df = df[c(1,5,2,6,7)]
for (i in 1:nrow(df))
{
if (df$age[i] <= 35)
{
df$age_factor[i] <- "Young"
}
else if (df$age[i] <= 60)
{
df$age_factor[i] <- "Adult"
}
else
{
df$age_factor[i] <- "Old"
}
if (df$hhsize[i] <= 4)
{
df$hhsize_factor[i] <- "Small"
}
else if (df$hhsize[i] <= 8)
{
df$hhsize_factor[i] <- "Medium"
}
else
{
df$hhsize_factor[i] <- "Large"
}
}
df$sex <- as.factor(df$sex)
df$urban <- as.factor(df$urban)
df$age <- as.factor(df$age_factor)
df$hhsize <- as.factor(df$hhsize_factor)
df <- df[-c(6,7)]
As can be seen above, we have two 2-level IVs (sex and urban) and two 3-level IVs (age and hhsize).
The response variable in our study (lntotal) is the only continuous variable. We converted the continuous factors into categorical variables.
Following commands can be used to get a general idea of how the dataset looks like.
head(df)
## sex urban age hhsize lntotal
## 1 female yes Old Medium 10.13649
## 2 female yes Adult Medium 10.25206
## 3 male yes Adult Medium 10.93231
## 4 female yes Old Medium 10.26749
## 5 female yes Old Medium 10.48811
## 6 female yes Old Medium 10.52660
str(df)
## 'data.frame': 5999 obs. of 5 variables:
## $ sex : Factor w/ 2 levels "male","female": 2 2 1 2 2 2 2 1 1 1 ...
## $ urban : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ age : Factor w/ 3 levels "Adult","Old",..: 2 1 1 2 2 2 2 1 1 1 ...
## $ hhsize : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 1 3 2 3 ...
## $ lntotal: num 10.1 10.3 10.9 10.3 10.5 ...
summary(df)
## sex urban age hhsize lntotal
## male :4375 no :4269 Adult:3497 Large : 233 Min. : 6.543
## female:1624 yes:1730 Old :1283 Medium:2920 1st Qu.: 8.920
## Young:1219 Small :2846 Median : 9.311
## Mean : 9.342
## 3rd Qu.: 9.759
## Max. :12.202
To examine Taguchi design in this project, we have two options: (i) Taguchi design for a \(2^6\) experiment, wherein each of the two 3-level factors has been decomposed into two 2-level factors; or (ii) Taguchi design for a \(2^2 \times 3^2\) experiment, where each of the two 3-level factors stay as they are.
Since we have to compare the Taguchi design to the fractional factorial design, we will use the one which has experimental runs comparable to those of fractional factorial design in Project 3. From the Taguchi design table in Montgomery, we see that the \(2^6\) Taguchi design would require 8 experimental runs, while \(2^2 \times 3^2\) Taguchi design would require 36 experimental runs. Thus, it makes sense to use \(2^6\) Taguchi designs if we have to compare it to \(2^6\) fractional factorial design.
We will need the qualityTools package in R, which can be loaded as follows:
library(qualityTools)
## Loading required package: Rsolnp
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:Ecdat':
##
## SP500
##
## Attaching package: 'qualityTools'
## The following object is masked from 'package:stats':
##
## sigma
From a theoretical point of view, we mentioned above that \(2^6\) Taguchi design will be preferred. But let’s examine in practice what is actually the case. The \(2^6\) Taguchi design can be generated as follows:
taguchiChoose(factors1=6, level1=2)
## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
##
## Possible Designs:
##
## L8_2 L12_2 L16_2 L32_2
##
## Use taguchiDesign("L8_2") or different to create a taguchi design object
taguchiDesign("L8_2")
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## StandOrder RunOrder Replicate A B C D E F G y
## 1 5 1 1 2 1 2 1 2 1 2 NA
## 2 4 2 1 1 2 2 2 2 1 1 NA
## 3 1 3 1 1 1 1 1 1 1 1 NA
## 4 7 4 1 2 2 1 1 2 2 1 NA
## 5 8 5 1 2 2 1 2 1 1 2 NA
## 6 6 6 1 2 1 2 2 1 2 1 NA
## 7 2 7 1 1 1 1 2 2 2 2 NA
## 8 3 8 1 1 2 2 1 1 2 2 NA
The \(2^3 \times 3^2\) Taguchi design can be generated as follows:
taguchiChoose(factors1=2, level1=2, factors2=2, level2=3)
## 2 factors on 2 levels and 2 factors on 3 levels with 0 desired interactions to be estimated
##
## Possible Designs:
##
## L36_2_3_a L36_2_3_b
##
## Use taguchiDesign("L36_2_3_a") or different to create a taguchi design object
taguchiDesign("L36_2_3_a")
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## StandOrder RunOrder Replicate A B C D E F G H J K L M N O P Q R S T U V
## 1 32 1 1 2 2 1 2 1 2 1 1 1 2 2 2 1 1 1 3 1 3 3 2 3
## 2 33 2 1 2 2 1 2 1 2 1 1 1 2 2 3 2 2 2 1 2 1 1 3 1
## 3 26 3 1 2 1 1 2 2 2 1 2 2 1 1 2 1 3 2 3 1 1 2 1 2
## 4 14 4 1 1 2 2 1 2 2 1 2 1 2 1 2 3 1 2 1 3 2 1 1 3
## 5 6 5 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 1 1 1 1 2 2
## 6 8 6 1 1 1 2 2 2 1 1 1 2 2 2 2 2 3 1 2 3 1 1 2 3
## 7 22 7 1 2 1 2 1 2 2 2 1 1 1 2 1 2 2 3 3 1 2 1 1 3
## 8 15 8 1 1 2 2 1 2 2 1 2 1 2 1 3 1 2 3 2 1 3 2 2 1
## 9 30 9 1 2 2 2 1 1 1 1 2 2 1 2 3 2 1 1 1 3 3 2 1 2
## 10 18 10 1 1 2 2 2 1 2 2 1 2 1 1 3 1 2 1 3 3 2 1 2 2
## 11 17 11 1 1 2 2 2 1 2 2 1 2 1 1 2 3 1 3 2 2 1 3 1 1
## 12 36 12 1 2 2 1 1 2 1 2 1 2 2 1 3 2 3 1 2 1 2 3 1 1
## 13 4 13 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 3 3
## 14 7 14 1 1 1 2 2 2 1 1 1 2 2 2 1 1 2 3 1 2 3 3 1 2
## 15 28 15 1 2 2 2 1 1 1 1 2 2 1 2 1 3 2 2 2 1 1 3 2 3
## 16 24 16 1 2 1 2 1 2 2 2 1 1 1 2 3 1 1 2 2 3 1 3 3 2
## 17 21 17 1 2 1 2 2 1 1 2 2 1 2 1 3 1 3 2 2 2 3 1 1 3
## 18 9 18 1 1 1 2 2 2 1 1 1 2 2 2 3 3 1 2 3 1 2 2 3 1
## 19 23 19 1 2 1 2 1 2 2 2 1 1 1 2 2 3 3 1 1 2 3 2 2 1
## 20 19 20 1 2 1 2 2 1 1 2 2 1 2 1 1 2 1 3 3 3 1 2 2 1
## 21 11 21 1 1 2 1 2 2 1 2 2 1 1 2 2 2 1 3 2 1 3 1 3 2
## 22 25 22 1 2 1 1 2 2 2 1 2 2 1 1 1 3 2 1 2 3 3 1 3 1
## 23 31 23 1 2 2 1 2 1 2 1 1 1 2 2 1 3 3 3 2 3 2 2 1 2
## 24 29 24 1 2 2 2 1 1 1 1 2 2 1 2 2 1 3 3 3 2 2 1 3 1
## 25 5 25 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 1 1
## 26 13 26 1 1 2 2 1 2 2 1 2 1 2 1 1 2 3 1 3 2 1 3 3 2
## 27 12 27 1 1 2 1 2 2 1 2 2 1 1 2 3 3 2 1 3 2 1 2 1 3
## 28 34 28 1 2 2 1 1 2 1 2 1 2 2 1 1 3 1 2 3 2 3 1 2 2
## 29 35 29 1 2 2 1 1 2 1 2 1 2 2 1 2 1 2 3 1 3 1 2 3 3
## 30 2 30 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
## 31 3 31 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3
## 32 10 32 1 1 2 1 2 2 1 2 2 1 1 2 1 1 3 2 1 3 2 3 2 1
## 33 1 33 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 34 16 34 1 1 2 2 2 1 2 2 1 2 1 1 1 2 3 2 1 1 3 2 3 3
## 35 20 35 1 2 1 2 2 1 1 2 2 1 2 1 2 3 2 1 1 1 2 3 3 2
## 36 27 36 1 2 1 1 2 2 2 1 2 2 1 1 3 2 1 3 1 2 2 3 2 3
## W X y
## 1 2 2 NA
## 2 3 3 NA
## 3 3 3 NA
## 4 2 3 NA
## 5 2 2 NA
## 6 3 1 NA
## 7 3 2 NA
## 8 3 1 NA
## 9 3 2 NA
## 10 1 3 NA
## 11 3 2 NA
## 12 2 3 NA
## 13 3 3 NA
## 14 2 3 NA
## 15 1 3 NA
## 16 2 1 NA
## 17 1 2 NA
## 18 1 2 NA
## 19 1 3 NA
## 20 2 3 NA
## 21 1 3 NA
## 22 2 2 NA
## 23 1 1 NA
## 24 2 1 NA
## 25 1 1 NA
## 26 1 2 NA
## 27 2 1 NA
## 28 3 1 NA
## 29 1 2 NA
## 30 2 2 NA
## 31 3 3 NA
## 32 3 2 NA
## 33 1 1 NA
## 34 2 1 NA
## 35 3 1 NA
## 36 1 1 NA
Randomization must occur at three levels: Random Selection, Random Assignment and Random Execution. Random Selection refers to selecting a random set of sample from the population. Random Assignment refers to assigning those samples to different groups. Random Execution refers to ordering the experimental trials randomly [1]. The data comes from the Vietnam World Bank Livings Standards Survey. There doesn’t seem to be any information pertaining to random selection. However, random assignment and random execution is ensured in the analysis.
Blocking refers to arranging the experimental runs in blocks that are similar to each other [2]. In this experiment, as mentioned above, we would be blocking on the categorical IV sex in order to eliminate any variablity caused by the levels of this factor.
Replication is the repeatition of an experimental condition so that the variablity associated with the phenomenon under study can be estimated [3]. Repeated measures refers to using the same subjects with every branch of research, including the control group [4]. In this dataset, there are several replications of possible configurations of the input IVs. There is no evidence of repeated measures.
It is important to note that the EDA performed here is only done to obtain a feel for the descriptive statistics of the problem because we are trying to use eight runs for our fractional factorial designs.
summary(df)
## sex urban age hhsize lntotal
## male :4375 no :4269 Adult:3497 Large : 233 Min. : 6.543
## female:1624 yes:1730 Old :1283 Medium:2920 1st Qu.: 8.920
## Young:1219 Small :2846 Median : 9.311
## Mean : 9.342
## 3rd Qu.: 9.759
## Max. :12.202
str(df)
## 'data.frame': 5999 obs. of 5 variables:
## $ sex : Factor w/ 2 levels "male","female": 2 2 1 2 2 2 2 1 1 1 ...
## $ urban : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ age : Factor w/ 3 levels "Adult","Old",..: 2 1 1 2 2 2 2 1 1 1 ...
## $ hhsize : Factor w/ 3 levels "Large","Medium",..: 2 2 2 2 2 2 1 3 2 3 ...
## $ lntotal: num 10.1 10.3 10.9 10.3 10.5 ...
head(df)
## sex urban age hhsize lntotal
## 1 female yes Old Medium 10.13649
## 2 female yes Adult Medium 10.25206
## 3 male yes Adult Medium 10.93231
## 4 female yes Old Medium 10.26749
## 5 female yes Old Medium 10.48811
## 6 female yes Old Medium 10.52660
Boxplots can be generated for the four input variables as follows.
boxplot(df$lntotal~df$sex, xlab="Sex", ylab="Log of Total Expenditure", main="Boxplot of the IV Sex")
boxplot(df$lntotal~df$urban, xlab="Urban?", ylab="Log of Total Expenditure", main="Boxplot of the IV Urban")
boxplot(df$lntotal~df$age, xlab="Age", ylab="Log of Total Expenditure", main="Boxplot of the IV Age")
boxplot(df$lntotal~df$hhsize, xlab="Household Income", ylab="Log of Total Expenditure", main="Boxplot of the IV Household Size")
Let us review the Taguchi design matrix once again:
taguchiChoose(factors1=6, level1=2)
## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
##
## Possible Designs:
##
## L8_2 L12_2 L16_2 L32_2
##
## Use taguchiDesign("L8_2") or different to create a taguchi design object
des <- taguchiDesign("L8_2")
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
## Warning in `[<-`(`*tmp*`, i, value = <S4 object of class
## structure("taguchiFactor", package = "qualityTools")>): implicit list
## embedding of S4 objects is deprecated
Now, we can form the design matrix along with the response variable as follows:
# Input Variables
A <- c(2,1,1,2,2,2,1,1)
B <- c(1,1,2,1,2,2,2,1)
C <- c(2,1,2,2,1,1,2,1)
D <- c(2,1,1,1,1,2,2,2)
E <- c(1,1,1,2,2,1,2,2)
F <- c(2,1,2,1,2,1,1,2)
# Response Variable
R <- c(0,0,0,0,0,0,0,0)
design <- data.frame(A,B,C,D,E,F,R)
c = 0
temp = df[(df$sex == "male") & (df$urban == "no") & (df$age == "Young") & (df$hhsize == "Medium"),]
design$R[1] = temp$lntotal[sample(nrow(temp), size = 1)]
temp = df[(df$sex == "female") & (df$urban == "no") & (df$age == "Adult") & (df$hhsize == "Large"),]
design$R[2] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "female") & (df$urban == "yes") & (df$age == "Old") & (df$hhsize == "Medium"),]
design$R[3] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "male") & (df$urban == "no") & (df$age == "Old") & (df$hhsize == "Large"),]
design$R[4] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "male") & (df$urban == "yes") & (df$age == "Adult") & (df$hhsize == "Small"),]
design$R[5] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "male") & (df$urban == "yes") & (df$age == "Old") & (df$hhsize == "Large"),]
design$R[6] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "female") & (df$urban == "yes") & (df$age == "Young") & (df$hhsize == "Medium"),]
design$R[7] = temp$lntotal[sample(nrow(temp), 1)]
temp = df[(df$sex == "female") & (df$urban == "no") & (df$age == "Old") & (df$hhsize == "Small"),]
design$R[8] = temp$lntotal[sample(nrow(temp), 1)]
design
## A B C D E F R
## 1 2 1 2 2 1 2 9.532887
## 2 1 1 1 1 1 1 9.838478
## 3 1 2 2 1 1 2 9.360664
## 4 2 1 2 1 2 1 9.253464
## 5 2 2 1 1 2 2 8.919670
## 6 2 2 1 2 1 1 11.338720
## 7 1 2 2 2 2 1 9.697331
## 8 1 1 1 2 2 2 7.557857
Now that we have the Taguchi design matrix, we can generate the effect plot as follows:
par(mar=c(1,1,1,1))
response(des) = design$R
effectPlot(des, ppoints = TRUE, col=2, lty = 3, main="Effect Plot for Taguchi Design")
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
## Warning in plot.window(...): "ppoints" is not a graphical parameter
## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter
## Warning in box(...): "ppoints" is not a graphical parameter
## Warning in title(...): "ppoints" is not a graphical parameter
## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter
Now that we have generated the effect plots for our Taguchi Design, we can estimate the main effects of the model as follows.
# Taguchi Design
linear_fit <- lm(R ~ A + B + C + D + E + F, data = design)
anova(linear_fit)
## Analysis of Variance Table
##
## Response: R
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 0.83878 0.83878 29.4102 0.11609
## B 1 1.22751 1.22751 43.0403 0.09630 .
## C 1 0.00449 0.00449 0.1576 0.75942
## D 1 0.07116 0.07116 2.4952 0.35929
## E 1 2.69402 2.69402 94.4606 0.06527 .
## F 1 2.82853 2.82853 99.1771 0.06371 .
## Residuals 1 0.02852 0.02852
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Actual Data
AOVObject = aov(df$lntotal~df$sex+df$urban+df$age+df$hhsize, data = df)
From the results of Taguchi design, it is evident that eight experimental runs are too small a number to form any kind of conclusions about the main effects of the model. Furthermore, they fail to estimate the interaction effects as well. In this particular case, the fractional factorial design (Project 3) seemed to work better and provide much better results.
anova(AOVObject)
## Analysis of Variance Table
##
## Response: df$lntotal
## Df Sum Sq Mean Sq F value Pr(>F)
## df$sex 1 18.87 18.87 62.744 2.785e-15 ***
## df$urban 1 635.53 635.53 2113.101 < 2.2e-16 ***
## df$age 2 113.70 56.85 189.030 < 2.2e-16 ***
## df$hhsize 2 266.78 133.39 443.519 < 2.2e-16 ***
## Residuals 5992 1802.13 0.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
First of all, we can plot the histogram of the response variable
hist(df$lntotal, breaks=10, main="Histogram of logarithm of total household expenditure")
The distribution seems normal from the histogram.
qqnorm(residuals(AOVObject))
qqline(residuals(AOVObject))
plot(fitted(AOVObject), residuals(AOVObject))
The QQ plot and residuals vs fitted plots indeed confirm that the data was normally distributed and randomly selected.