Homework - Week 3: (Douglas C. Montgomery 8th Edition)

  1. 2.32

  2. 2.34

  3. 2.29 (e,f)

  4. 2.27 (a - Using a Non-Parametric Method)

1 Question 2.32:

The diameter of a ball bearing was measured by 12 inspectors, each using two different kinds of calipers. The results were:

Inspector Caliper 1 Caliper 2
1 0.265 0.264
2 0.265 0.265
3 0.266 0.264
4 0.267 0.266
5 0.267 0.267
6 0.265 0.268
7 0.267 0.264
8 0.267 0.265
9 0.265 0.265
10 0.268 0.267
11 0.268 0.268
12 0.265 0.269
  1. Is there a significant difference between the means of the population of measurements from which the two samples were selected? Use ( \(\alpha\)= 0.05)
  2. Find the P-value for the test in part (a).
  3. Construct a 95 percent confidence interval on the difference in mean diameter measurements for the two types of calipers.

1.1 Solution:

Setting up data:

caliper1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
caliper2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)

Determining the hypothesis test to be used: The data are paired, because each inspector measured the ball bearing with both calipers, so every Caliper 1 reading is matched with a Caliper 2 reading taken by the same inspector.

Checking possibility for Paired T-Test:

Difference <- caliper1-caliper2
qqnorm(Difference,main = "Normal Probability Plot of the Difference", cex =2, bg = "black", lwd = 2, pch = 21)
qqline(Difference, lwd = 2)

#Finding Correlation:
cor(caliper1,caliper2)
## [1] 0.1276307
#NPP's of individual populations i.e. of caliper1 and caliper2
qqnorm(caliper1, main = "Normal Probability Plot of Caliper 1", cex =2, bg = "red", lwd = 2, pch = 21)
qqline(caliper1, lwd = 2)

qqnorm(caliper2, main = "Normal Probability Plot of Caliper 2", cex =2, bg = "blue", lwd = 2, pch = 21)
qqline(caliper2, lwd = 2)

• The paired observations have a weak positive correlation (r ≈ 0.128).

• Both samples are approximately normally distributed, as the points fall close to the straight line.

Hence we will use Paired T-Test:

Stating the Hypothesis:

\[ H_0: D = 0 \space \space OR \space \space \mu_{1}=\mu_{2} \]

\[ H_a : D \ne 0 \space \space OR \space \space \mu_{1}\ne\mu_{2} \]

Where μ1 and μ2 are the mean diameters of the ball bearing measured using Calipers 1 and 2, and D = μ1 - μ2 is the difference between those two means.

Part a) Paired T-Test:

t.test(caliper1,caliper2,paired=TRUE)
## 
##  Paired t-test
## 
## data:  caliper1 and caliper2
## t = 0.43179, df = 11, p-value = 0.6742
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.001024344  0.001524344
## sample estimates:
## mean difference 
##         0.00025

Conclusion:

----> Since the p-value (0.6742) is much greater than \(\alpha\) = 0.05, we fail to reject \(H_0\). There is no significant evidence of a difference between the means of the populations of measurements from which the two samples were selected.

Part b) P-value:

----> P-value = 0.6742

Part c) 95 percent confidence interval:

Lower Bound = -0.001024 and Upper Bound = 0.001524

---->So the confidence interval is -0.001024 \(\leq\) D (\(\mu_{1} - \mu_{2}\)) \(\leq\) 0.001524. The interval contains 0, which is consistent with failing to reject \(H_0\) in part a.
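The t.test output can be cross-checked by computing the paired statistic directly from the differences. The following is a minimal sketch using only base R; the names d_bar, s_d, se, t0, p_value and ci are introduced here for illustration and are not part of the original solution.

```r
# Data from Question 2.32
caliper1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
caliper2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)

d     <- caliper1 - caliper2   # per-inspector differences
n     <- length(d)             # number of pairs
d_bar <- mean(d)               # mean difference
s_d   <- sd(d)                 # standard deviation of the differences
se    <- s_d / sqrt(n)         # standard error of the mean difference

t0      <- d_bar / se                                     # paired t statistic
p_value <- 2 * pt(-abs(t0), df = n - 1)                   # two-sided p-value
ci      <- d_bar + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95 percent confidence interval

print(c(t0 = t0, p_value = p_value, lower = ci[1], upper = ci[2]))
```

These values should agree with the t.test output above (t = 0.43179, p-value = 0.6742).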

1.2 Handwritten Solution:

Solution 2.32 Page 1

Solution 2.32 Page 2

Solution 2.32 Page 3

2 Question 2.34:

An article in the Journal of Strain Analysis (vol. 18, no. 2, 1983) compares several procedures for predicting the shear strength for steel plate girders. Data for nine girders in the form of the ratio of predicted to observed load for two of these procedures, the Karlsruhe and Lehigh methods, are as follows:

Girder Karlsruhe Method Lehigh Method
S1/1 1.186 1.061
S2/1 1.151 0.992
S3/1 1.322 1.063
S4/1 1.339 1.062
S5/1 1.200 1.065
S2/1 1.402 1.178
S2/2 1.365 1.037
S2/3 1.537 1.086
S2/4 1.559 1.052
  1. Is there any evidence to support a claim that there is a difference in mean performance between the two methods? Use ( \(\alpha\)= 0.05)

  2. What is the P-value for the test in part (a)?

  3. Construct a 95 percent confidence interval for the difference in mean predicted to observed load.

  4. Investigate the normality assumption for both samples.

  5. Investigate the normality assumption for the difference in ratios for the two methods.

  6. Discuss the role of the normality assumption in the paired t-test.

2.1 Solution:

Setting up data:

karlsrushe <-c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
lehigh <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
Difference <- (karlsrushe-lehigh)

Determining the hypothesis test to be used: The data are paired because both methods were applied to the same nine girders, so each Karlsruhe ratio is matched with a Lehigh ratio for the same girder.

Stating the Hypothesis:

\[ H_0: \mu_{1}=\mu_{2} \]

\[ H_a :\mu_{1}\ne\mu_{2} \]

Where \(\mu_{1}\) = mean ratio of predicted to observed load for the Karlsruhe method & \(\mu_{2}\) = mean ratio of predicted to observed load for the Lehigh method

Part a) Paired T-Test:

t.test(karlsrushe,lehigh, paired = TRUE)
## 
##  Paired t-test
## 
## data:  karlsrushe and lehigh
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.1700423 0.3777355
## sample estimates:
## mean difference 
##       0.2738889

Conclusion:

----> Since the p-value (0.0002953) is less than \(\alpha\) = 0.05, we reject \(H_0\) and conclude that there is a difference in mean performance between the two methods.

Part b) P-value:

----> P-value = 0.0002953

Part c) 95 percent confidence interval:

Lower Bound = 0.1700423 and Upper Bound = 0.3777355

---->So confidence interval is 0.1700423 \(\leq\) D (\(\mu_{1} - \mu_{2}\)) \(\leq\) 0.3777355

Part d) Normality assumption for both samples:

NPP for Karlsruhe Method:

qqnorm(karlsrushe,main="Normal Plot of Karlsruhe Method Data", cex=2, bg="blue", lwd=2, pch=21)
qqline(karlsrushe, lwd=2)

Remarks:

----> If we ignore a few outliers, the plot shows a generally normal form: the points are aligned around the straight line.

NPP for Lehigh Method:

qqnorm(lehigh, main="Normal Plot of Lehigh Method Data", cex=2, bg="red", lwd=2, pch=21)
qqline(lehigh, lwd=2)

Remarks:

----> The plot shows some signs of heavy tails (S-shape). However, most of the data points lie close to the straight line.

Part e) Normality assumption for the difference in ratios for the two methods:

qqnorm(Difference, main="Normal Probability Plot of Ratio Differences", cex=2, bg="black", lwd=2, pch=21)
qqline(Difference, lwd=2)

Remarks:

----> The points mostly follow the straight line, which supports the assumption that the differences are approximately normally distributed.
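A normal probability plot is a visual judgement. As an optional complement that is not part of the original solution, the Shapiro-Wilk test from base R gives a formal check on the differences. This is only a sketch: with nine observations the test has little power, so a large p-value should not be read as proof of normality. The name sw is introduced here for illustration.

```r
# Data from Question 2.34 (variable names kept as in the solution above)
karlsrushe <- c(1.186, 1.151, 1.322, 1.339, 1.200, 1.402, 1.365, 1.537, 1.559)
lehigh     <- c(1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052)
Difference <- karlsrushe - lehigh

# Shapiro-Wilk test: H0 is that the differences come from a normal distribution
sw <- shapiro.test(Difference)
print(sw)
```

A p-value above 0.05 would mean the test finds no significant departure from normality, which is the same reading as points lying close to the line in the plot.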

Part f) Role of the normality assumption in the paired t-test:

----> The paired t-test statistic \(t_0 = \bar{d} / (S_d / \sqrt{n})\) follows a t-distribution with (n-1) degrees of freedom when \(H_0\) is true.

Where \(\bar{d}\) and \(S_d\) are the sample mean and sample standard deviation of the differences D between the paired observations from populations 1 & 2, and
n = number of pairs that are drawn.

For the above statement to hold, the differences D should be approximately NORMAL

\[ OR \]
n (the number of pairs) should be LARGE.

In the paired t-test, the normality assumption applies to the distribution of the differences “D”. The individual sample measurements do not have to be normally distributed; only their differences do, because the test is carried out on the differences.

Also, as with any t-test, the normality assumption is of moderate importance: the test is fairly robust to mild departures from normality.
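The robustness claim can be explored with a small simulation. The sketch below is an illustration only, under assumptions that are not in the original solution: the differences are drawn from a shifted exponential distribution (skewed, true mean 0), the number of pairs is 9 as in this question, and the seed and the number of replications are arbitrary.

```r
set.seed(1)      # arbitrary seed, only for reproducibility
n_sim <- 2000    # number of simulated data sets (assumed)
n     <- 9       # number of pairs, as in Question 2.34

# For each simulated data set, draw skewed differences whose true mean is 0
# and record whether the one-sample t-test on them rejects H0 at alpha = 0.05.
reject <- replicate(n_sim, {
  d <- rexp(n, rate = 1) - 1
  t.test(d, mu = 0)$p.value < 0.05
})

type1_rate <- mean(reject)   # estimated Type I error rate
print(type1_rate)
```

The closer the estimated rate is to the nominal 0.05, the less the skewness is distorting the test at this sample size.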

2.2 Handwritten Solution:

Solution 2.34 Page 1

Solution 2.34 Page 2

Solution 2.34 Page 3

3 Question 2.29:

Photoresist is a light-sensitive material applied to semiconductor wafers so that the circuit pattern can be imaged on to the wafer. After application, the coated wafers are baked to remove the solvent in the photoresist mixture and to harden the resist. Here are measurements of photoresist thickness (in kA) for eight wafers baked at two different temperatures. Assume that all of the runs were made in random order.

95 C 100 C
11.176 5.263
7.089 6.748
8.097 7.461
11.739 7.015
11.291 8.133
10.759 7.418
6.467 3.772
8.315 8.963
  (e) Check the assumption of normality of the photoresist thickness.

  (f) Find the power of this test for detecting an actual difference in means of 2.5 kA.

3.1 Solution:

Part e) Assumption of normality of the photo resist thickness:

Setting up data:

BakingAt95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
BakingAt100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)

To check the assumption of normality for the Photo resist thickness for 95°C and 100°C baking temperature, we’ll draw their normal probability plots:

NPP of Baking at 95 C:

qqnorm(BakingAt95,main="Normal Probability Plot for Baking at 95 C", ylab = "Photoresist Thickness", cex=2, bg="blue", lwd=2, pch=21)
qqline(BakingAt95, lwd=2)

NPP of Baking at 100 C:

qqnorm(BakingAt100,main="Normal Probability Plot for Baking at 100 C", ylab = "Photoresist Thickness", cex=2, bg="blue", lwd=2, pch=21)
qqline(BakingAt100, lwd=2)

Comment:

----> Although the normal probability plots are not perfectly linear, the points lie close to the straight line, so both samples can reasonably be assumed to be normally distributed.

Part f) Power for detecting an actual difference in means of 2.5 kÅ:

DATA:

Sig level = 0.05
d = effect size = (diff in means/pooled std.dev) = (2.5 / Sd)
n = 8
Type: Power Analysis of 2 Sample T-Test
Power = To be Calculated

Finding pooled standard deviation for the two sample T-Test:

sd95<- c(sd(BakingAt95))
print(sd95)
## [1] 2.099564
sd100<- c(sd(BakingAt100))
print(sd100)
## [1] 1.640427
Sp <- sqrt(((8-1)*sd95^2+(8-1)*sd100^2)/(8+8-2))
print(Sp)
## [1] 1.884034

Now Find Power for detecting difference in means of 2.5kA:

library(pwr)
Power1<- c(pwr.t.test(n=8,d=1.3269,sig.level=0.05,power=NULL,type="two.sample"))
print(Power1)
## $n
## [1] 8
## 
## $d
## [1] 1.3269
## 
## $sig.level
## [1] 0.05
## 
## $power
## [1] 0.6945613
## 
## $alternative
## [1] "two.sided"
## 
## $note
## [1] "n is number in *each* group"
## 
## $method
## [1] "Two-sample t test power calculation"

Answer:

----> So power of this test is Power = 0.6946
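As a cross-check that does not need the pwr package, the same calculation can be sketched with power.t.test from base R's stats package. Passing delta = 2.5 and sd = Sp avoids typing a rounded effect size by hand. Setting strict = TRUE counts both tails of the two-sided test; that choice is an assumption made here to mirror the pwr calculation, and for this effect size the second tail is negligible either way. The names n and pw are introduced for illustration.

```r
# Data from Question 2.29
BakingAt95  <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
BakingAt100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)

n  <- length(BakingAt95)   # observations per group (equal group sizes)

# Pooled standard deviation for two equal-sized samples
Sp <- sqrt(((n - 1) * var(BakingAt95) + (n - 1) * var(BakingAt100)) / (2 * n - 2))

# Power of the two-sample, two-sided t-test to detect a true difference of 2.5
pw <- power.t.test(n = n, delta = 2.5, sd = Sp, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided", strict = TRUE)
print(pw$power)
```

The result should be close to the 0.6946 reported above.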

3.2 Handwritten Solution:

Solution 2.29 Page 1

Solution 2.29 Page 2

4 Question 2.27:

An article in Solid State Technology, “Orthogonal Design for Process Optimization and Its Application to Plasma Etching” by G. Z. Yin and D. W. Jillie (May 1987) describes an experiment to determine the effect of the C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing. All of the runs were made in random order. Data for two flow rates are as follows:

Uniformity Observations
C2F6 Flow
(SCCM)
1 2 3 4 5 6
125 2.7 4.6 2.6 3.0 3.2 3.8
200 4.6 3.4 2.9 3.5 4.1 5.1
  1. Does the C2F6 flow rate affect average etch uniformity? Use \(\alpha\)= 0.05 and a non-parametric method.

4.1 Solution:

Part a) Using Non Parametric Test:

Setting the data:

flow125 <- c(2.7,4.6,2.6,3.0,3.2,3.8)
flow200 <- c(4.6,3.4,2.9,3.5,4.1,5.1)

Stating the Hypothesis:

\[ H_0: \mu_{1} = \mu_{2} \]

\[ H_a: \mu_{1} \neq \mu_{2} \]

where \(\mu_{1}\) = mean etch uniformity at a flow rate of 125 SCCM and \(\mu_{2}\) = mean etch uniformity at a flow rate of 200 SCCM

Before performing the non-parametric test, we assess the normal probability plots:

NPP for flow rate at 125 SCCM:

qqnorm(flow125,main="Flow 125 SCCM Normal Probability Plot", cex=2, bg="red", lwd=2, pch=21)
qqline(flow125, lwd=2)

Comment:

----> The distribution is slightly skewed to the right.

NPP for flow rate at 200 SCCM:

qqnorm(flow200,main="Flow 200 SCCM Normal Probability Plot", cex=2, bg="red", lwd=2, pch=21)
qqline(flow200, lwd=2)

Comment:

----> The distribution is approximately NORMAL.

Analyzing difference in Std. dev by using Box Plots:

boxplot(flow125,flow200,names=c('Flow 125 SCCM','Flow 200 SCCM'),main='Flow 125 SCCM vs Flow 200 SCCM Boxplots')

Overall Comment:

• The box plots show that the two samples have roughly the same spread, so the standard deviations are approximately equal.

• When two populations have the same standard deviation and approximately the same shape, and the sample sizes are equal (here n1 = n2 = 6), the validity of the t-tools is affected moderately by long-tailedness and very little by skewness.
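The visual comparison of spread can be backed up numerically. A short sketch, assuming the sample standard deviation is an adequate summary of spread for six observations per group; sd125, sd200 and sd_ratio are names introduced here for illustration.

```r
# Data from Question 2.27
flow125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
flow200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)

sd125    <- sd(flow125)     # sample standard deviation at 125 SCCM
sd200    <- sd(flow200)     # sample standard deviation at 200 SCCM
sd_ratio <- sd125 / sd200   # a ratio near 1 indicates similar spread

print(round(c(sd125 = sd125, sd200 = sd200, ratio = sd_ratio), 3))
```

A ratio close to 1 is in line with the similar box heights seen in the box plots.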

Performing Non Parametric Test:

wilcox.test(flow125,flow200)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  flow125 and flow200
## W = 9.5, p-value = 0.1994
## alternative hypothesis: true location shift is not equal to 0

Conclusion:

----> Since the p-value (0.1994) is greater than the significance level of 0.05, we fail to reject \(H_0\). In other words, there is not enough evidence to conclude that the C2F6 flow rate affects average etch uniformity.
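The W statistic reported by wilcox.test can be reproduced by hand from the ranks of the combined sample, which makes clear what the test measures. A minimal sketch in base R; r, n1, rank_sum and W are names introduced here. Tied values (4.6 appears in both samples) receive the average of their ranks, which is the default behaviour of rank().

```r
# Data from Question 2.27
flow125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
flow200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)

r        <- rank(c(flow125, flow200))   # ranks in the combined sample, ties averaged
n1       <- length(flow125)
rank_sum <- sum(r[seq_len(n1)])         # sum of the ranks belonging to flow125

# Subtract the smallest possible rank sum for the first sample, n1 * (n1 + 1) / 2
W <- rank_sum - n1 * (n1 + 1) / 2
print(W)
```

The value of W should match the W = 9.5 shown in the wilcox.test output above.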

4.2 Handwritten Solution:

Solution 2.27 Page 1

Solution 2.27 Page 2

5 Source Code:

getwd()
#Question: 2.32::

##Read the Data:
caliper1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
caliper2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)
##Checking possibility for Paired T-Test:
Difference <- caliper1-caliper2
qqnorm(Difference,main = "Normal Probability Plot of the Difference", cex =2, bg = "black", lwd = 2, pch = 21)
qqline(Difference, lwd = 2)
##Correlation:
cor(caliper1,caliper2)
#Weak positive correlation (r is about 0.128)

#NPP's of individual populations i.e. of caliper1 and caliper2
qqnorm(caliper1, main = "Normal Probability Plot of Caliper 1", cex =2, bg = "red", lwd = 2, pch = 21)
qqline(caliper1, lwd = 2)
qqnorm(caliper2, main = "Normal Probability Plot of Caliper 2", cex =2, bg = "blue", lwd = 2, pch = 21)
qqline(caliper2, lwd = 2)

#Using Paired T-Test:
t.test(caliper1,caliper2,paired=TRUE)


#Question: 2.34::

##Read the data:
karlsrushe <-c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
lehigh <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
Difference <- (karlsrushe-lehigh)

##Performing Paired Sample T - Test:
t.test(karlsrushe,lehigh, paired = TRUE)

##normal probability plot for both sets of data:
qqnorm(karlsrushe,main="Normal Plot of Karlsruhe Method Data", cex=2, bg="blue", lwd=2, pch=21)
qqline(karlsrushe, lwd=2)
qqnorm(lehigh, main="Normal Plot of Lehigh Method Data", cex=2, bg="red", lwd=2, pch=21)
qqline(lehigh, lwd=2)

##normal probability for the difference in ratios for the two methods:
qqnorm(Difference, main="Normal Probability Plot of Ratio Differences", cex=2, bg="black", lwd=2, pch=21)
qqline(Difference, lwd=2)

##Question: 2.29:

##Setting up data:
BakingAt95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
BakingAt100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
##Normal Probability Plots:
qqnorm(BakingAt95,main="Normal Probability Plot for Baking at 95 C", ylab = "Photoresist Thickness", cex=2, bg="blue", lwd=2, pch=21)
qqline(BakingAt95, lwd=2)
qqnorm(BakingAt100,main="Normal Probability Plot for Baking at 100 C", ylab = "Photoresist Thickness", cex=2, bg="blue", lwd=2, pch=21)
qqline(BakingAt100, lwd=2)

##power of this test for detecting an actual difference in means of 2.5 kA:
if (!requireNamespace("pwr", quietly = TRUE)) install.packages("pwr")  #install pwr only if it is missing
library(pwr)
sd95<- c(sd(BakingAt95))
print(sd95)
sd100<- c(sd(BakingAt100))
print(sd100)

Sp <- sqrt(((8-1)*sd95^2+(8-1)*sd100^2)/(8+8-2))
pwr.t.test(n=8,d=(2.5/Sp),sig.level=0.05,power=NULL,type="two.sample")


##Question: 2.27:
#Reading the data:
flow125 <- c(2.7,4.6,2.6,3.0,3.2,3.8)
flow200 <- c(4.6,3.4,2.9,3.5,4.1,5.1)
#Normal Probability Plots:
qqnorm(flow125,main="Flow 125 SCCM Normal Probability Plot", cex=2, bg="red", lwd=2, pch=21)
qqline(flow125, lwd=2)
qqnorm(flow200,main="Flow 200 SCCM Normal Probability Plot", cex=2, bg="red", lwd=2, pch=21)
qqline(flow200, lwd=2)

#Boxplots:
boxplot(flow125,flow200,names=c('Flow 125 SCCM','Flow 200 SCCM'),main='Flow 125 SCCM vs Flow 200 SCCM Boxplots')
#Mann-Whitney-U test with alpha=0.05:
wilcox.test(flow125,flow200)