IE 5342 | HW1 Design and Analysis of Experiments (8th Edition)
Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling processes can be assumed to be normal, with standard deviations of \(\sigma_1 = 0.015\) and \(\sigma_2 = 0.018\). The quality engineering department suspects that both machines fill to the same net volume, whether or not this volume is 16.0 ounces. An experiment is performed by taking a random sample from the output of each machine.
machine1 <- c(0.03,0.04,0.05,0.05,0.02,0.01,-0.04,-.02,.02,-.01)
machine2 <- c(.02,-.03,-.04,.01,-.01,.03,.04,.02,.01,0)
df <- data.frame(machine1,machine2)
df+16
## machine1 machine2
## 1 16.03 16.02
## 2 16.04 15.97
## 3 16.05 15.96
## 4 16.05 16.01
## 5 16.02 15.99
## 6 16.01 16.03
## 7 15.96 16.04
## 8 15.98 16.02
## 9 16.02 16.01
## 10 15.99 16.00
\[
H_0 : \mu_1 = \mu_2 \\H_1: \mu_1 \neq \mu_2
\]
Test these hypotheses using \(\alpha = 0.05\). What are your conclusions?
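The Welch t-test below gives a p-value of 0.435, which is greater than \(\alpha = 0.05\), so we fail to reject \(H_0\): there is no evidence that the two machines fill to different mean volumes.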
t.test(machine1+16,machine2+16,alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: machine1 + 16 and machine2 + 16
## t = 0.79894, df = 17.493, p-value = 0.435
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.01635123 0.03635123
## sample estimates:
## mean of x mean of y
## 16.015 16.005
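Find a 95 percent confidence interval on the difference in mean fill volume for the two machines.
From the t-test output above, the 95 percent confidence interval on \(\mu_1 - \mu_2\) is (-0.0164, 0.0364) ounces. The interval contains zero, which agrees with failing to reject \(H_0\).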
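Because the problem states the standard deviations as known (\(\sigma_1 = 0.015\), \(\sigma_2 = 0.018\)), a two-sample z-test is an alternative to the Welch t-test. The following is a minimal sketch in base R, assuming the stated sigmas are the true process values; the names `se`, `z0`, `p_value`, and `ci` are illustrative and not part of the original solution.

```r
# Fill volumes in ounces: the listed deviations plus the 16-ounce nominal value
machine1 <- 16 + c(0.03, 0.04, 0.05, 0.05, 0.02, 0.01, -0.04, -0.02, 0.02, -0.01)
machine2 <- 16 + c(0.02, -0.03, -0.04, 0.01, -0.01, 0.03, 0.04, 0.02, 0.01, 0)

sigma1 <- 0.015  # known standard deviations from the problem statement
sigma2 <- 0.018

# Standard error of the difference in sample means when the variances are known
se <- sqrt(sigma1^2 / length(machine1) + sigma2^2 / length(machine2))

diff_means <- mean(machine1) - mean(machine2)
z0 <- diff_means / se                 # z test statistic
p_value <- 2 * pnorm(-abs(z0))        # two-sided P-value

# 95 percent confidence interval on mu1 - mu2
ci <- diff_means + c(-1, 1) * qnorm(0.975) * se

print(c(z0 = z0, p_value = p_value))
print(ci)
```

Under these assumptions \(z_0\) is about 1.35 with a two-sided P-value of about 0.18, and the interval is roughly (-0.0045, 0.0245) ounces, so the conclusion is the same as with the t-test: no evidence at \(\alpha = 0.05\) that the means differ.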
The following are the burning times (in minutes) of chemical flares of two different formulations. The design engineers are interested in both the mean and variance of the burning times.
library(lawstat)
# Sample data
t1 <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70) # Type 1
t2 <- c(64, 71, 83, 59, 65, 56, 69, 74, 82, 79) # Type 2
# Combine the data
dat <- data.frame(BurningTimes = c(t1, t2), type = rep(c(1, 2), each = 10))
levene.test(dat$BurningTimes,dat$type)
##
## Modified robust Brown-Forsythe Levene-type test based on the absolute
## deviations from the median
##
## data: dat$BurningTimes
## Test Statistic = 5.125e-31, p-value = 1
boxplot(t1,t2)
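The Levene test (p-value = 1) gives no evidence against equal variances, and the box plots show similar spread for the two formulations, so a pooled-variance t-test is reasonable.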
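If the burning times are assumed to be normal, the classical F-test on the ratio of the sample variances is another way to compare the variances. A minimal sketch with base R's `var.test` follows; the object name `f_res` is illustrative.

```r
t1 <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70)  # Type 1
t2 <- c(64, 71, 83, 59, 65, 56, 69, 74, 82, 79)  # Type 2

# F-test of H0: sigma1^2 = sigma2^2 against a two-sided alternative
f_res <- var.test(t1, t2)

print(f_res$statistic)  # ratio of sample variances, var(t1) / var(t2)
print(f_res$p.value)    # a large P-value means no evidence the variances differ
print(f_res$conf.int)   # 95 percent confidence interval on sigma1^2 / sigma2^2
```

Unlike the Levene test, the F-test is sensitive to departures from normality, so it is best read together with a normality check of each sample.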
t.test(t1,t2,var.equal = TRUE)
##
## Two Sample t-test
##
## data: t1 and t2
## t = 0.048008, df = 18, p-value = 0.9622
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.552441 8.952441
## sample estimates:
## mean of x mean of y
## 70.4 70.2
The p-value is 0.9622, so we fail to reject the hypothesis that the two formulations have the same mean burning time.
Photoresist is a light-sensitive material applied to semiconductor wafers so that the circuit pattern can be imaged on to the wafer. After application, the coated wafers are baked to remove the solvent in the photoresist mixture and to harden the resist. Here are measurements of photoresist thickness (in kA) for eight wafers baked at two different temperatures. Assume that all of the runs were made in random order.
# Data for each column
temp_95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
temp_100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
# Combine into a data frame (check.names = FALSE keeps the temperature labels as column names)
data <- data.frame(
  `95°C` = temp_95,
  `100°C` = temp_100,
  check.names = FALSE)
t.test(temp_95, temp_100, alternative = "less", var.equal = TRUE)
##
## Two Sample t-test
##
## data: temp_95 and temp_100
## t = 2.6751, df = 14, p-value = 0.9909
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 4.179184
## sample estimates:
## mean of x mean of y
## 9.366625 6.846625
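(a) Is there evidence to support the claim that the higher baking temperature results in wafers with a lower mean photoresist thickness? Yes. The claim corresponds to \(H_1: \mu_{95} > \mu_{100}\). The call above used alternative = "less", which tests the opposite direction. Since the t statistic is positive (t = 2.6751, df = 14), the p-value for the direction of interest is \(1 - 0.9909 \approx 0.0091 < 0.05\), so we reject \(H_0\) and conclude that baking at 100°C gives a lower mean thickness.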
We could also look at box plots.
boxplot(data)
The box plot is consistent with the test: the thicknesses at 100°C sit lower than those at 95°C.
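(b) What is the P-value for the test conducted in part (a)? Approximately 0.0091. The 0.9909 printed above is for the alternative "less"; the P-value for \(H_1: \mu_{95} > \mu_{100}\) is its complement, \(1 - 0.9909\).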
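(c) Find a 95 percent confidence interval on the difference in means. Provide a practical interpretation of this interval.
From the output above, the difference in sample means is 9.366625 - 6.846625 = 2.52 kA, with standard error 2.52/2.6751 ≈ 0.942. Using \(t_{0.025,14} \approx 2.145\), the two-sided 95 percent confidence interval on \(\mu_{95} - \mu_{100}\) is \(2.52 \pm 2.145(0.942)\), or about (0.50, 4.54) kA. In practical terms, we are 95 percent confident that wafers baked at 95°C are on average between about 0.50 and 4.54 kA thicker than wafers baked at 100°C. Because the interval excludes zero, the temperature effect is statistically significant.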
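Parts (a) through (c) can be reproduced in base R with the alternative set to match the claim that the higher temperature gives the lower thickness. This is a sketch assuming equal variances, as in the test above; `one_sided` and `two_sided` are illustrative names.

```r
temp_95  <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
temp_100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)

# (a), (b): H1 is mu_95 - mu_100 > 0, i.e. the higher temperature gives lower thickness
one_sided <- t.test(temp_95, temp_100, alternative = "greater", var.equal = TRUE)
print(one_sided$p.value)

# (c): two-sided 95 percent confidence interval on mu_95 - mu_100
two_sided <- t.test(temp_95, temp_100, var.equal = TRUE)
print(two_sided$conf.int)
```

The order of the arguments fixes the sign of the difference: with `temp_95` first, "greater" means the 95°C mean exceeds the 100°C mean.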
qqnorm(temp_95, main="Q-Q Plot for 95°C")
qqline(temp_95)
qqnorm(temp_100, main="Q-Q Plot for 100°C")
qqline(temp_100)
Both samples look roughly normal, but with only eight observations per temperature the Q-Q plots cannot confirm normality with any certainty.
The diameter of a ball bearing was measured by 12 inspectors, each using two different kinds of calipers. The results were:
# Data for each inspector's measurements
Caliper_1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
Caliper_2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)
caliper_data <- data.frame(Caliper_1, Caliper_2)
t.test(Caliper_1,Caliper_2,paired=TRUE)
##
## Paired t-test
##
## data: Caliper_1 and Caliper_2
## t = 0.43179, df = 11, p-value = 0.6742
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -0.001024344 0.001524344
## sample estimates:
## mean difference
## 0.00025
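With p-value = 0.6742 > 0.05, we fail to reject the hypothesis that the mean difference is zero. The 95% confidence interval on the mean difference, (-0.00102, 0.00152), contains zero, so there is no evidence that the two calipers give different mean measurements. This does not prove that the means are equal.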
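As a worked check of the paired t-test output, let \(d_j\) be the Caliper_1 minus Caliper_2 difference for inspector \(j\), with \(n = 12\). The value \(s_d \approx 0.002006\) is computed here from the listed data and does not appear in the original output.

\[
t_0 = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{0.00025}{0.002006/\sqrt{12}} \approx 0.432
\]

\[
\bar{d} \pm t_{0.025,11}\,\frac{s_d}{\sqrt{n}} = 0.00025 \pm 2.201\,(0.000579) \approx (-0.00102,\ 0.00152)
\]

Both agree with the t statistic and confidence interval reported by t.test.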
An article in the Journal of Strain Analysis (vol. 18, no. 2, 1983) compares several procedures for predicting the shear strength for steel plate girders. Data for nine girders in the form of the ratio of predicted to observed load for two of these procedures, the Karlsruhe and Lehigh methods, are as follows:
Parts (a), (b), and (c):
# Data for Karlsruhe and Lehigh Methods
K <- c(1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559)
L <- c(1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052)
t.test(K, L, paired = TRUE)
##
## Paired t-test
##
## data: K and L
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 0.1700423 0.3777355
## sample estimates:
## mean difference
## 0.2738889
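The paired test is equivalent to a one-sample t-test on the per-girder differences. The sketch below recomputes the statistic, P-value, and confidence interval by hand in base R; the names `d`, `t0`, `p_value`, and `ci` are illustrative.

```r
K <- c(1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559)  # Karlsruhe
L <- c(1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052)  # Lehigh

d <- K - L                                   # per-girder differences
n <- length(d)
se <- sd(d) / sqrt(n)                        # standard error of the mean difference

t0 <- mean(d) / se                           # paired t statistic
p_value <- 2 * pt(-abs(t0), df = n - 1)      # two-sided P-value
ci <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95 percent interval

print(c(t0 = t0, df = n - 1, p_value = p_value))
print(ci)
```

Since the P-value is far below 0.05 and the interval lies entirely above zero, the data indicate that the Karlsruhe method gives a higher mean predicted-to-observed load ratio than the Lehigh method.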
qqnorm(K)
qqline(K)
qqnorm(L)
qqline(L)
dif <- K - L
qqnorm(dif)
qqline(dif)
shapiro_test <- shapiro.test(dif)
print(shapiro_test)
##
## Shapiro-Wilk normality test
##
## data: dif
## W = 0.91678, p-value = 0.3663
The paired t-test assumes that the differences between paired observations, not the individual samples, are normally distributed. This matters most for small samples such as the nine girders here, where departures from normality can distort the p-value. The Shapiro-Wilk test on the differences (W = 0.91678, p-value = 0.3663) gives no evidence against normality, so the paired t-test result above is reasonable to use.
For the photoresist experiment (95°C versus 100°C), find the power of the test for detecting an actual difference in means of 2.5 kA.
library(pwr)
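# Calculate the pooled standard deviation (equal sample sizes)
s1 <- sd(temp_95)
s2 <- sd(temp_100)
s_p <- sqrt((s1^2 + s2^2) / 2)
# Standardized effect size for a 2.5 kA difference
psd <- 2.5 / s_p
pwr.t.test(n = 8, d = psd, sig.level = 0.05, type = "two.sample")
pwr.t.test expects the standardized difference d = 2.5/s_p, which is about 1.327 here because s_p = 1.884034. Passing s_p itself as d reports power = 0.9381252, but that is the power for detecting a difference of s_p^2 ≈ 3.55 kA; the power for a 2.5 kA difference is lower.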
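Base R's `power.t.test` takes the raw difference (`delta`) and the standard deviation (`sd`) separately, so no standardization step is needed. The sketch below is a cross-check under the assumption that the pooled sample standard deviation can stand in for the true \(\sigma\); the name `pw` is illustrative.

```r
temp_95  <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
temp_100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)

# Pooled standard deviation for equal sample sizes
s_p <- sqrt((var(temp_95) + var(temp_100)) / 2)

# Power to detect a true difference of 2.5 kA with n = 8 wafers per temperature
pw <- power.t.test(n = 8, delta = 2.5, sd = s_p, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")
print(pw)
```

This should give essentially the same answer as `pwr.t.test` called with `d = 2.5 / s_p`, since both are based on the noncentral t distribution.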
An article in Solid State Technology, “Orthogonal Design for Process Optimization and Its Application to Plasma Etching” by G. Z. Yin and D. W. Jillie (May 1987) describes an experiment to determine the effect of the C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing. All of the runs were made in random order. Data for two flow rates are as follows:
# Data
flow_125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
flow_200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)
# Perform the Wilcoxon rank-sum test
wilcox.test(flow_125, flow_200)
##
## Wilcoxon rank sum test with continuity correction
##
## data: flow_125 and flow_200
## W = 9.5, p-value = 0.1994
## alternative hypothesis: true location shift is not equal to 0
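The statistic W reported by wilcox.test is the rank sum of the first sample minus its smallest possible value, \(n_1(n_1+1)/2\). A short base R sketch of that calculation follows; `r`, `rank_sum_125`, and `W` are illustrative names.

```r
flow_125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
flow_200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)

# Rank the pooled sample; tied values (4.6 appears in both groups) get midranks
r <- rank(c(flow_125, flow_200))
n1 <- length(flow_125)

rank_sum_125 <- sum(r[seq_len(n1)])      # sum of the ranks of the 125 group
W <- rank_sum_125 - n1 * (n1 + 1) / 2    # statistic as reported by wilcox.test

print(c(rank_sum_125 = rank_sum_125, W = W))
```

The rank sum is 30.5, which reproduces W = 9.5. With p-value = 0.1994 > 0.05, the test gives no evidence that the C2F6 flow rate shifts the etch uniformity. Because of the tie, R uses the normal approximation with continuity correction instead of an exact p-value.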