Project #4: Taguchi Designs

1. Setting

System Under Test

In this project, I will again be examining the ‘Health Insurance and Hours Worked by Wives’ data set found in the Ecdat package.

library("Ecdat")

## Loading required package: Ecfun

## Warning: package 'Ecfun' was built under R version 3.2.5

## 
## Attaching package: 'Ecfun'

## The following object is masked from 'package:base':
## 
##     sign

## 
## Attaching package: 'Ecdat'

## The following object is masked from 'package:datasets':
## 
##     Orange

hours <- HI
colnames(hours)

##  [1] "whrswk"     "hhi"        "whi"        "hhi2"       "education" 
##  [6] "race"       "hispanic"   "experience" "kidslt6"    "kids618"   
## [11] "husby"      "region"     "wght"

colnames <- c("WifeWorkHours", "HusbandHI", "WifeHI", "Race", "Region")

project <- data.frame(hours$whrswk, hours$hhi, hours$whi, hours$race, hours$region)
colnames(project) <- colnames

project3 <- subset(project, project$Region == "northcentral" | project$Region == "south" | project$Region == "west")

Factors and Levels

In this experiment, four variables will be examined: two binary factors and two three-level factors.

HusbandHI: is the wife covered by the husband’s health insurance? Yes/No

WifeHI: Wife has health insurance through her job? Yes/No

Race: White/Black/Other (NOTE: it is unclear whether this field denotes the race of the wife, the husband, or both. Here we will assume it denotes the race of the wife.)

Region: Northcentral/South/West

Response Variables

The response variable is continous variable denoting the hours the wife works per week.

The data: How is it organized and what does it look like?

As per the instructions for Project #3, the two three-level variables will be represented by two two-level factors (each). Therefore, we will have a 2^6 experimental design.

head(project3, 10)

##    WifeWorkHours HusbandHI WifeHI  Race       Region
## 1              0        no     no white northcentral
## 2             50        no    yes white northcentral
## 3             40       yes     no white northcentral
## 4             40        no    yes white northcentral
## 5              0       yes     no white northcentral
## 6             40       yes    yes white northcentral
## 7             40       yes     no white northcentral
## 8             25        no     no white northcentral
## 9             45        no    yes white northcentral
## 10            30        no     no white northcentral

2. Experimental Design

How will the experiment be organized and conducted?

As mentioned in class, the Taguchi design will be examined with both a 2^6 factorial design AND a 2^2 * 3^2 design. The Taguchi experimental design with the least number of experimental runs will be chosen for futher analysis.

What is the rationale for this design?

When examining a Taguchi experimenal design for a 2^6 experiment, the table shows that 8 experimental runs are needed. Now, for a 2^2 * 3^2 experiment, 36 experimental runs are required. Because of this, we will choose to analyze the 2^6 experiment (same as in the fractional factorial project), as it requires fewer experimental runs, and therefore saves time and resources.

library("qualityTools")

## Warning: package 'qualityTools' was built under R version 3.2.3

## Loading required package: Rsolnp

## Warning: package 'Rsolnp' was built under R version 3.2.3

## Loading required package: MASS

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:Ecdat':
## 
##     SP500

taguchiChoose(factors1 = 6, level1 = 2)

## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L8_2 L12_2 L16_2 L32_2
## 
## Use taguchiDesign("L8_2") or different to create a taguchi design object

taguchiDesign("L8_2")

##   StandOrder RunOrder Replicate A B C D E F G  y
## 1          6        1         1 2 1 2 2 1 2 1 NA
## 2          5        2         1 2 1 2 1 2 1 2 NA
## 3          8        3         1 2 2 1 2 1 1 2 NA
## 4          4        4         1 1 2 2 2 2 1 1 NA
## 5          2        5         1 1 1 1 2 2 2 2 NA
## 6          3        6         1 1 2 2 1 1 2 2 NA
## 7          7        7         1 2 2 1 1 2 2 1 NA
## 8          1        8         1 1 1 1 1 1 1 1 NA

taguchiChoose(factors1 = 2, level1 = 3, factors2 = 2, level2 = 2)

## 2 factors on 3 levels and 2 factors on 2 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L36_2_3_a L36_2_3_b
## 
## Use taguchiDesign("L36_2_3_a") or different to create a taguchi design object

taguchiDesign("L36_2_3_a")

##    StandOrder RunOrder Replicate A B C D E F G H J K L M N O P Q R S T U V
## 1           9        1         1 1 1 2 2 2 1 1 1 2 2 2 3 3 1 2 3 1 2 2 3 1
## 2          33        2         1 2 2 1 2 1 2 1 1 1 2 2 3 2 2 2 1 2 1 1 3 1
## 3           4        3         1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 3 3
## 4          26        4         1 2 1 1 2 2 2 1 2 2 1 1 2 1 3 2 3 1 1 2 1 2
## 5          22        5         1 2 1 2 1 2 2 2 1 1 1 2 1 2 2 3 3 1 2 1 1 3
## 6          15        6         1 1 2 2 1 2 2 1 2 1 2 1 3 1 2 3 2 1 3 2 2 1
## 7          36        7         1 2 2 1 1 2 1 2 1 2 2 1 3 2 3 1 2 1 2 3 1 1
## 8           8        8         1 1 1 2 2 2 1 1 1 2 2 2 2 2 3 1 2 3 1 1 2 3
## 9           1        9         1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## 10          5       10         1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 1 1
## 11         23       11         1 2 1 2 1 2 2 2 1 1 1 2 2 3 3 1 1 2 3 2 2 1
## 12         17       12         1 1 2 2 2 1 2 2 1 2 1 1 2 3 1 3 2 2 1 3 1 1
## 13         24       13         1 2 1 2 1 2 2 2 1 1 1 2 3 1 1 2 2 3 1 3 3 2
## 14         11       14         1 1 2 1 2 2 1 2 2 1 1 2 2 2 1 3 2 1 3 1 3 2
## 15         10       15         1 1 2 1 2 2 1 2 2 1 1 2 1 1 3 2 1 3 2 3 2 1
## 16         13       16         1 1 2 2 1 2 2 1 2 1 2 1 1 2 3 1 3 2 1 3 3 2
## 17         12       17         1 1 2 1 2 2 1 2 2 1 1 2 3 3 2 1 3 2 1 2 1 3
## 18         16       18         1 1 2 2 2 1 2 2 1 2 1 1 1 2 3 2 1 1 3 2 3 3
## 19         34       19         1 2 2 1 1 2 1 2 1 2 2 1 1 3 1 2 3 2 3 1 2 2
## 20         27       20         1 2 1 1 2 2 2 1 2 2 1 1 3 2 1 3 1 2 2 3 2 3
## 21         21       21         1 2 1 2 2 1 1 2 2 1 2 1 3 1 3 2 2 2 3 1 1 3
## 22          2       22         1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
## 23         31       23         1 2 2 1 2 1 2 1 1 1 2 2 1 3 3 3 2 3 2 2 1 2
## 24         35       24         1 2 2 1 1 2 1 2 1 2 2 1 2 1 2 3 1 3 1 2 3 3
## 25         29       25         1 2 2 2 1 1 1 1 2 2 1 2 2 1 3 3 3 2 2 1 3 1
## 26          3       26         1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3
## 27         20       27         1 2 1 2 2 1 1 2 2 1 2 1 2 3 2 1 1 1 2 3 3 2
## 28         18       28         1 1 2 2 2 1 2 2 1 2 1 1 3 1 2 1 3 3 2 1 2 2
## 29          7       29         1 1 1 2 2 2 1 1 1 2 2 2 1 1 2 3 1 2 3 3 1 2
## 30         30       30         1 2 2 2 1 1 1 1 2 2 1 2 3 2 1 1 1 3 3 2 1 2
## 31         19       31         1 2 1 2 2 1 1 2 2 1 2 1 1 2 1 3 3 3 1 2 2 1
## 32          6       32         1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 1 1 1 1 2 2
## 33         28       33         1 2 2 2 1 1 1 1 2 2 1 2 1 3 2 2 2 1 1 3 2 3
## 34         32       34         1 2 2 1 2 1 2 1 1 1 2 2 2 1 1 1 3 1 3 3 2 3
## 35         25       35         1 2 1 1 2 2 2 1 2 2 1 1 1 3 2 1 2 3 3 1 3 1
## 36         14       36         1 1 2 2 1 2 2 1 2 1 2 1 2 3 1 2 1 3 2 1 1 3
##    W X  y
## 1  1 2 NA
## 2  3 3 NA
## 3  3 3 NA
## 4  3 3 NA
## 5  3 2 NA
## 6  3 1 NA
## 7  2 3 NA
## 8  3 1 NA
## 9  1 1 NA
## 10 1 1 NA
## 11 1 3 NA
## 12 3 2 NA
## 13 2 1 NA
## 14 1 3 NA
## 15 3 2 NA
## 16 1 2 NA
## 17 2 1 NA
## 18 2 1 NA
## 19 3 1 NA
## 20 1 1 NA
## 21 1 2 NA
## 22 2 2 NA
## 23 1 1 NA
## 24 1 2 NA
## 25 2 1 NA
## 26 3 3 NA
## 27 3 1 NA
## 28 1 3 NA
## 29 2 3 NA
## 30 3 2 NA
## 31 2 3 NA
## 32 2 2 NA
## 33 1 3 NA
## 34 2 2 NA
## 35 2 2 NA
## 36 2 3 NA

Randomization, Replication, and Blocking

The data can be assumed to be randomly collected, as it is part of an R package of data sets that are meant to be used in analyses like these. There are no deliberate repeated measures in this data set, as husband/wife pairings can be assumed to be unique data points (no polygamy, probably, or couples who divored & remarried in the span of the data collection). No blocking factors were specified.

3. Statistical Analysis

Exploratory Data Analysis

Before we split the three-level factors into several two-level factors, we will do some exploratory data analysis.

HHI <- as.factor(project3$HusbandHI)
WHI <- as.factor(project3$WifeHI)
WWH <- as.numeric(project3$WifeWorkHours)
Race <- as.factor(project3$Race)
Region <- as.factor(project3$Region)

boxplot(WWH ~ HHI, xlab = "Husband Health Insurance", ylab = "Wife Working Hours/Week")

boxplot(WWH ~ WHI, xlab = "Wife Health Insurance", ylab = "Wife Working Hours/Week")

boxplot(WWH ~ Race, xlab = "Race", ylab = "Wife Working Hours/Week")

boxplot(WWH ~ Region, xlab = "Region", ylab = "Wife Working Hours/Week")

Now that we’ve looked at that, we can split the two three-level factors into two two-level factors (each). These new factors will be WhiteB, BlackB, SouthB, and WestB (where the B stands for ‘binary’). Each factor will be either 1 (true) or 2 (false). The third level of the original factor is when both of the binary factors are two. For example, if the region is neither ‘south’ (SouthB = 2) nor ‘west’ (WestB = 2), then the region must be ‘northcentral.’

Note that 1 and 2 are used instead of 1 and 0, as the taguchi design matrix is composed on 1s and 2s.

White <- matrix(nrow = 17102, ncol = 1)
Black <- matrix(nrow = 17102, ncol = 1)
South <- matrix(nrow = 17102, ncol = 1)
West <- matrix(nrow = 17102, ncol = 1)

project33 <- data.frame(project3, White, Black, South, West)
colnames(project33) <- c("WifeWorkHours", "HusbandHI", "WifeHI", "Race3", "Region3", "WhiteB", "BlackB", "SouthB", "WestB")

project33$WhiteB[project33$Race3 == "white"] <- 1
project33$WhiteB[project33$Race3 == "black"] <- 2
project33$WhiteB[project33$Race3 == "other"] <- 2

project33$BlackB[project33$Race3 == "white"] <- 2
project33$BlackB[project33$Race3 == "other"] <- 2
project33$BlackB[project33$Race3 == "black"] <- 1

project33$WestB[project33$Region3 == "west"] <- 1
project33$WestB[project33$Region3 == "south"] <- 2
project33$WestB[project33$Region3 == "northcentral"] <- 2

project33$SouthB[project33$Region3 == "west"] <- 2
project33$SouthB[project33$Region3 == "south"] <- 1
project33$SouthB[project33$Region3 == "northcentral"] <- 2

levels(project33$WifeHI) <- c(levels(project33$WifeHI), 1, 2)
project33$WifeHI[project33$WifeHI == "yes"] <- 1
project33$WifeHI[project33$WifeHI == "no"] <- 2

levels(project33$HusbandHI) <- c(levels(project33$HusbandHI), 1, 2)
project33$HusbandHI[project33$HusbandHI == "yes"] <- 1
project33$HusbandHI[project33$HusbandHI == "no"] <- 2

project4 <- data.frame(project33$WifeWorkHours, project33$HusbandHI, project33$WifeHI, project33$WhiteB, project33$BlackB, project33$SouthB, project33$WestB)
colnames(project4) <- c("Hours", "HHI", "WHI", "White", "Black", "South", "West")

head(project4)

##   Hours HHI WHI White Black South West
## 1     0   2   2     1     2     2    2
## 2    50   2   1     1     2     2    2
## 3    40   1   2     1     2     2    2
## 4    40   2   1     1     2     2    2
## 5     0   1   2     1     2     2    2
## 6    40   1   1     1     2     2    2

Testing

Now that we’ve done some exploratory data analysis, we form the Taguchi experimental design, as seen above and replicated here for convenience.

taguchiChoose(factors1 = 6, level1 = 2)

## 6 factors on 2 levels and 0 factors on 0 levels with 0 desired interactions to be estimated
## 
## Possible Designs:
## 
## L8_2 L12_2 L16_2 L32_2
## 
## Use taguchiDesign("L8_2") or different to create a taguchi design object

tdesign <- taguchiDesign("L8_2")

Now we’ll assign this design matrix of 1s and 2s to a data frame. (Theoretically, this could also be done using the subset() command, but I kept getting an error when using that, so I did it the long way. )

whi <- matrix(nrow = 8, ncol = 1)
hhi <- matrix(nrow = 8, ncol = 1)
w <- matrix(nrow = 8, ncol = 1)
b <- matrix(nrow = 8, ncol = 1)
we <- matrix(nrow = 8, ncol = 1)
sou <- matrix(nrow = 8, ncol = 1)
hours <- matrix(nrow = 8, ncol = 1)

mydesign <- data.frame(hhi, whi, w, b, we, sou, hours)
mydesign$hhi <- c(2,2,1,2,1,2,1,1)
mydesign$whi <- c(1,2,1,1,2,2,2,1)
mydesign$w <- c(2,1,2,1,1,2,2,1)
mydesign$b <- c(1,1,2,2,2,2,1,1)
mydesign$we <- c(2,2,2,1,2,1,1,1)
mydesign$sou <- c(1,1,1,1,2,2,2,2)

mydesign$hours <- c(70, 48, 40, 50, 40, 0, 20, 38)

mydesign

##   hhi whi w b we sou hours
## 1   2   1 2 1  2   1    70
## 2   2   2 1 1  2   1    48
## 3   1   1 2 2  2   1    40
## 4   2   1 1 2  1   1    50
## 5   1   2 1 2  2   2    40
## 6   2   2 2 2  1   2     0
## 7   1   2 2 1  1   2    20
## 8   1   1 1 1  1   2    38

Now we can create an effect plot using the qualityTools package and the taguchi design:

par(mar=c(1,1,1,1))

response(tdesign) = c(70, 48, 40, 50, 40, 0, 20, 38)

effectPlot(tdesign, ppoints = TRUE, col = 2, lty = 3, main = "Taguchi Design Main Effect Plot")

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

## Warning in plot.window(...): "ppoints" is not a graphical parameter

## Warning in plot.xy(xy, type, ...): "ppoints" is not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in axis(side = side, at = at, labels = labels, ...): "ppoints" is
## not a graphical parameter

## Warning in box(...): "ppoints" is not a graphical parameter

## Warning in title(...): "ppoints" is not a graphical parameter

## Warning in axis(1, x, ...): "ppoints" is not a graphical parameter

Estimation

Now we can estimate the main effects of the model using the data from the eight experimental runs in the taguchi design. For comparison, the anova from the fractional factorial data is also included here.

fit <- lm(hours ~ hhi + whi + w + b + sou + we, data = mydesign)
anova(fit)

## Analysis of Variance Table
## 
## Response: hours
##           Df Sum Sq Mean Sq F value Pr(>F)
## hhi        1  112.5   112.5  0.3086 0.6772
## whi        1 1012.5  1012.5  2.7778 0.3440
## w          1  264.5   264.5  0.7257 0.5508
## b          1  264.5   264.5  0.7257 0.5508
## sou        1  625.0   625.0  1.7147 0.4152
## we         1  400.0   400.0  1.0974 0.4852
## Residuals  1  364.5   364.5

fit2 <- lm(WifeWorkHours ~ HusbandHI + WifeHI + WhiteB + BlackB + SouthB + WestB, data = project33)
anova(fit2)

## Analysis of Variance Table
## 
## Response: WifeWorkHours
##              Df  Sum Sq Mean Sq   F value    Pr(>F)    
## HusbandHI     1   97859   97859  379.9270 < 2.2e-16 ***
## WifeHI        1 1582215 1582215 6142.7741 < 2.2e-16 ***
## WhiteB        1    5625    5625   21.8390 2.988e-06 ***
## BlackB        1     715     715    2.7753 0.0957471 .  
## SouthB        1     147     147    0.5720 0.4494618    
## WestB         1    3229    3229   12.5354 0.0004004 ***
## Residuals 17095 4403217     258                        
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

It is evident that the eight experimental runs do not seem to be enough to estimate the main effects of the model, and also do not address the interaction effects. The frational factorial design seemed to be a better fit, and give more useful results.

Model Adequacy Checking

#plots different residual plots for normality checks
par(mfrow = c(1,2))
qqnorm(residuals(fit))
qqline(residuals(fit))
plot(fitted(fit), residuals(fit))

It is important to note that the Q-Q plot shows that this data probably does not follow a normal distribution, so further analysis could be done on what distribution best fits this data, or potential confounding factors that were left out of the data (husband’s salary perhaps).

4. References to Literature

Sigma Quotient. “Taguchi method - Introduction.” YouTube. October 6, 2016. Available December 12, 2016 https://www.youtube.com/watch?v=VPMhTpr95lM.

5. Appendices

The raw data can be found in the Ecdat package in R. A pdf describing the data sets found in this package can be found here. The description of the ‘Health Insurance and Hours Worked by Wives’ is found on page 54.

The complete code in this project can be seen here:

library("Ecdat")
hours <- HI
colnames(hours)

colnames <- c("WifeWorkHours", "HusbandHI", "WifeHI", "Race", "Region")

#assigns relevant columns to a new dataframe for simpler analysis
project <- data.frame(hours$whrswk, hours$hhi, hours$whi, hours$race, hours$region)
colnames(project) <- colnames

#subsets into new data set to discard values of 'other' in the region factor
project3 <- subset(project, project$Region == "northcentral" | project$Region == "south" | project$Region == "west")

#displays the first 10 rows of the 2^2 * 3^2 data set (project3)
head(project3, 10)

# examines the difference between a 2^6 taguchi design and a 2^2 * 3^2 taguchi design

library("qualityTools")

taguchiChoose(factors1 = 6, level1 = 2)

taguchiDesign("L8_2")

taguchiChoose(factors1 = 2, level1 = 3, factors2 = 2, level2 = 2)

taguchiDesign("L36_2_3_a")


#sets columns as appropriate value types
HHI <- as.factor(project3$HusbandHI)
WHI <- as.factor(project3$WifeHI)
WWH <- as.numeric(project3$WifeWorkHours)
Race <- as.factor(project3$Race)
Region <- as.factor(project3$Region)

#displays a 4x4 grid of plots
par(mfrow = c(2,2))

#boxplots
boxplot(WWH ~ HHI, xlab = "Husband Health Insurance", ylab = "Wife Working Hours/Week")
boxplot(WWH ~ WHI, xlab = "Wife Health Insurance", ylab = "Wife Working Hours/Week")
boxplot(WWH ~ Race, xlab = "Race", ylab = "Wife Working Hours/Week")
boxplot(WWH ~ Region, xlab = "Region", ylab = "Wife Working Hours/Week")

#starts the process to decompose three-level factors into two two-level factors (each) by creating empty matrices to put data into
White <- matrix(nrow = 17102, ncol = 1)
Black <- matrix(nrow = 17102, ncol = 1)
South <- matrix(nrow = 17102, ncol = 1)
West <- matrix(nrow = 17102, ncol = 1)

#creates another new data frame for decomposed three-level factors
project33 <- data.frame(project3, White, Black, South, West)
colnames(project33) <- c("WifeWorkHours", "HusbandHI", "WifeHI", "Race3", "Region3", "WhiteB", "BlackB", "SouthB", "WestB")

#assigns appropriate values to new binary factors
project33$WhiteB[project33$Race3 == "white"] <- 1
project33$WhiteB[project33$Race3 == "black"] <- 2
project33$WhiteB[project33$Race3 == "other"] <- 2

project33$BlackB[project33$Race3 == "white"] <- 2
project33$BlackB[project33$Race3 == "other"] <- 2
project33$BlackB[project33$Race3 == "black"] <- 1

project33$WestB[project33$Region3 == "west"] <- 1
project33$WestB[project33$Region3 == "south"] <- 2
project33$WestB[project33$Region3 == "northcentral"] <- 2

project33$SouthB[project33$Region3 == "west"] <- 2
project33$SouthB[project33$Region3 == "south"] <- 1
project33$SouthB[project33$Region3 == "northcentral"] <- 2

levels(project33$WifeHI) <- c(levels(project33$WifeHI), 1, 2)
project33$WifeHI[project33$WifeHI == "yes"] <- 1
project33$WifeHI[project33$WifeHI == "no"] <- 2

levels(project33$HusbandHI) <- c(levels(project33$HusbandHI), 1, 2)
project33$HusbandHI[project33$HusbandHI == "yes"] <- 1
project33$HusbandHI[project33$HusbandHI == "no"] <- 2


project4 <- data.frame(project33$WifeWorkHours, project33$HusbandHI, project33$WifeHI, project33$WhiteB, project33$BlackB, project33$SouthB, project33$WestB)
colnames(project4) <- c("Hours", "HHI", "WHI", "White", "Black", "South", "West")

# displays the first six rows of the new simplified data set
head(project4)

#assigns taguchi design of choice to variable

taguchiChoose(factors1 = 6, level1 = 2)
tdesign <- taguchiDesign("L8_2")

# creates base matrices for design data frame
whi <- matrix(nrow = 8, ncol = 1)
hhi <- matrix(nrow = 8, ncol = 1)
w <- matrix(nrow = 8, ncol = 1)
b <- matrix(nrow = 8, ncol = 1)
we <- matrix(nrow = 8, ncol = 1)
sou <- matrix(nrow = 8, ncol = 1)
hours <- matrix(nrow = 8, ncol = 1)

#creates data frame for taguchi design
mydesign <- data.frame(hhi, whi, w, b, we, sou, hours)
mydesign$hhi <- c(2,2,1,2,1,2,1,1)
mydesign$whi <- c(1,2,1,1,2,2,2,1)
mydesign$w <- c(2,1,2,1,1,2,2,1)
mydesign$b <- c(1,1,2,2,2,2,1,1)
mydesign$we <- c(2,2,2,1,2,1,1,1)
mydesign$sou <- c(1,1,1,1,2,2,2,2)
# adds hours to mydesign data frame (obtained randomly through filtering the data)
mydesign$hours <- c(70, 48, 40, 50, 40, 0, 20, 38)
mydesign

#creates an effect plot for the main effects of the factors in the taguchi design
par(mar=c(1,1,1,1))
response(tdesign) = c(70, 48, 40, 50, 40, 0, 20, 38)
effectPlot(tdesign, ppoints = TRUE, col = 2, lty = 3, main = "Taguchi Design Main Effect Plot")

# runs an anova models on taguchi design matrix
fit <- lm(hours ~ hhi + whi + w + b + sou + we, data = mydesign)
anova(fit)

# fits a linear model from original data and runs ANOVA on it
fit2 <- lm(WifeWorkHours ~ HusbandHI + WifeHI + WhiteB + BlackB + SouthB + WestB, data = project33)
anova(fit2)

#plots different residual plots to check normality fit
par(mfrow = c(1,2))
qqnorm(residuals(fit))
qqline(residuals(fit))
plot(fitted(fit), residuals(fit))

6020_renaud_project4

Molly Renaud

12/10/2016