library(lawstat)
library(dplyr)
library(pwr)

1 Problem 2.24

Reading the data

machine_1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
machine_2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.0)

1.1 (a)

The hypotheses to be tested are \[ H_{0} : \mu_{1}=\mu_{2} \Rightarrow \mu_{1}-\mu_{2}=0 \] \[ H_{a} : \mu_{1} \neq \mu_{2} \Rightarrow \mu_{1}-\mu_{2} \neq 0 \]

1.2 (b)

Testing the hypotheses at \(\alpha = 0.05\)

t.test(machine_1,machine_2)
## 
##  Welch Two Sample t-test
## 
## data:  machine_1 and machine_2
## t = 0.79894, df = 17.493, p-value = 0.435
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01635123  0.03635123
## sample estimates:
## mean of x mean of y 
##    16.015    16.005
  • From the Welch two-sample t-test, the p-value (0.435) is greater than \(\alpha = 0.05\).
  • So we do not reject the null hypothesis: the data give no evidence that the two machine means differ.
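
As a cross-check on the output above, the Welch statistic and its Satterthwaite degrees of freedom can be recomputed by hand. This is a minimal sketch in base R; the names `se_diff`, `t_stat`, `df_welch` and `p_value` are illustrative and not part of the original solution.

```r
machine_1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
machine_2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.0)

n1 <- length(machine_1)
n2 <- length(machine_2)

# Squared standard error of each sample mean
v1 <- var(machine_1) / n1
v2 <- var(machine_2) / n2

# Welch statistic: difference in means over the unpooled standard error
se_diff <- sqrt(v1 + v2)
t_stat <- (mean(machine_1) - mean(machine_2)) / se_diff

# Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided p-value
p_value <- 2 * pt(abs(t_stat), df_welch, lower.tail = FALSE)

c(t = t_stat, df = df_welch, p = p_value)
```

The three values should agree with the `t`, `df` and `p-value` lines printed by `t.test(machine_1, machine_2)`.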

1.3 (c)

  • p-value for this test is 0.435

1.4 (d)

  • 95% confidence interval on the difference of the means: \(-0.01635 \leq \mu_{1}-\mu_{2} \leq 0.03635\). The interval contains zero, which agrees with the test result in (b).

2 Problem 2.26

Reading the data

T1 <- c(65,82,81,67,57,59,66,75,82,70)
T2<-c(64,56,71,69,83,74,59,82,65,79)

2.1 (a)

We have to test the hypothesis that the two variances are equal at \(\alpha=0.05\) using Levene’s test \[ H_{0} : \sigma_{1}^{2}=\sigma_{2}^{2} \] \[ H_{a} : \sigma_{1}^{2} \neq \sigma_{2}^{2} \]

dat_1<-data.frame(time=T1, type = factor(rep('t1')))
dat_2<-data.frame(time=T2, type = factor(rep('t2')))
dat_3<-rbind.data.frame(dat_1,dat_2)
levene.test(dat_3$time,dat_3$type)
## 
##  Modified robust Brown-Forsythe Levene-type test based on the absolute
##  deviations from the median
## 
## data:  dat_3$time
## Test Statistic = 5.125e-31, p-value = 1
  • The p-value is 1, so we do not reject the null hypothesis of equal variances.
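
A p-value of exactly 1 can look suspicious, so it helps to see where it comes from. The median-based Levene (Brown-Forsythe) test is a one-way ANOVA on the absolute deviations from each group's median. The sketch below reproduces that idea in base R, without `lawstat`; `abs_dev` and `bf` are illustrative names.

```r
T1 <- c(65,82,81,67,57,59,66,75,82,70)
T2 <- c(64,56,71,69,83,74,59,82,65,79)

time <- c(T1, T2)
type <- factor(rep(c("t1", "t2"), times = c(length(T1), length(T2))))

# Absolute deviation of each observation from its own group median
abs_dev <- abs(time - ave(time, type, FUN = median))

# Mean absolute deviation per group
tapply(abs_dev, type, mean)

# Classical one-way ANOVA F test on the absolute deviations
bf <- oneway.test(abs_dev ~ type, var.equal = TRUE)
bf$statistic
bf$p.value
```

Both groups have the same mean absolute deviation from the median (7.6), so the between-group sum of squares is essentially zero. That is why the test statistic is of order 1e-31 and the p-value is 1.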

2.2 (b)

  • From Levene’s test, the test statistic is essentially zero and the p-value is 1.
  • Therefore we do not reject the null hypothesis of equal variances.
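
The wording of part (b) is not quoted in this report. If it asks for a comparison of the mean burning times that builds on the variance result, one hedged way to proceed is a pooled-variance two-sample t-test, which the equal-variance conclusion permits. The sketch below is an assumption about what is wanted, not the author's solution; `pooled` is an illustrative name.

```r
T1 <- c(65,82,81,67,57,59,66,75,82,70)
T2 <- c(64,56,71,69,83,74,59,82,65,79)

# var.equal = TRUE requests the pooled-variance (Student) t-test
# instead of R's default Welch test
pooled <- t.test(T1, T2, var.equal = TRUE)

pooled$statistic   # t statistic
pooled$parameter   # degrees of freedom, n1 + n2 - 2
pooled$p.value     # two-sided p-value
```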

3 Problem 2.27

Reading the data

sccm_125<-c(2.7,4.6,2.6,3.0,3.2,3.8)
sccm_200<-c(4.6,3.4,2.9,3.5,4.1,5.1)

3.1 (a)

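The hypotheses to be tested are \[ H_{0} : \mu_{1}=\mu_{2} \Rightarrow \mu_{1}-\mu_{2}=0 \] \[ H_{a} : \mu_{1} \neq \mu_{2} \Rightarrow \mu_{1}-\mu_{2} \neq 0 \] They are tested here with the Wilcoxon rank-sum test, which assesses a location shift between the two flow rates without assuming normality.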

wilcox.test(sccm_125,sccm_200)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  sccm_125 and sccm_200
## W = 9.5, p-value = 0.1994
## alternative hypothesis: true location shift is not equal to 0
  • From the test, the p-value is 0.1994, which is greater than \(\alpha = 0.05\).
  • We do not reject the null hypothesis.
  • The data give no evidence that the \(C_{2}F_{6}\) flow rate affects the average etch uniformity.
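
To make the reported `W = 9.5` less opaque, the statistic can be rebuilt from the ranks: it is the rank sum of the first sample minus its smallest possible value. This is a minimal base-R sketch; `r`, `n1` and `W` are illustrative names.

```r
sccm_125 <- c(2.7,4.6,2.6,3.0,3.2,3.8)
sccm_200 <- c(4.6,3.4,2.9,3.5,4.1,5.1)

n1 <- length(sccm_125)

# Rank all 12 observations together; tied values share the average rank
r <- rank(c(sccm_125, sccm_200))

# Rank sum of the first sample, shifted so that the minimum possible value is 0
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2
W
```

The value 4.6 appears in both samples. Because of that tie, `wilcox.test` cannot compute an exact p-value and falls back on the normal approximation with continuity correction, as the output header indicates.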

4 Problem 2.29

Reading the data

x<-c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315)
y<-c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963)

4.1 (a)

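The hypotheses to be tested are, with \(\mu_{1}\) the mean thickness at \(95^{\circ}C\) and \(\mu_{2}\) the mean thickness at \(100^{\circ}C\), \[ H_{0} : \mu_{1}=\mu_{2} \Rightarrow \mu_{1}-\mu_{2}=0 \] \[ H_{a} : \mu_{1} > \mu_{2} \Rightarrow \mu_{1}-\mu_{2} > 0 \]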

t.test(x,y,alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  x and y
## t = 2.6751, df = 13.226, p-value = 0.009423
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.8539293       Inf
## sample estimates:
## mean of x mean of y 
##  9.366625  6.846625
  • From the Welch two-sample t-test, the p-value (0.009423) is well below \(\alpha = 0.05\).
  • Therefore we reject the null hypothesis.
  • The mean thickness at the higher temperature appears to be lower.

4.2 (b)

  • The p-value = 0.009

4.3 (c)

  • From the test result, the one-sided 95% confidence bound is \(0.8539 \leq\mu_{1}-\mu_{2}\)
  • The lower bound is greater than zero, so at 95% confidence the mean photoresist thickness is greater at \(95^{\circ}C\) than at \(100^{\circ}C\).
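
The one-sided bound can be reproduced by hand: it is the observed difference minus the 95th percentile of the t distribution times the Welch standard error. The sketch below assumes the same unpooled (Welch) setup that `t.test` uses by default; `se_diff`, `df_welch` and `lower` are illustrative names.

```r
x <- c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315)
y <- c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963)

n1 <- length(x)
n2 <- length(y)
v1 <- var(x) / n1
v2 <- var(y) / n2

se_diff <- sqrt(v1 + v2)
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# One-sided 95% lower confidence bound on mu1 - mu2
lower <- (mean(x) - mean(y)) - qt(0.95, df_welch) * se_diff
lower
```

Note the use of `qt(0.95, ...)` rather than `qt(0.975, ...)`: all of the 5% error probability sits in one tail, which is why the interval's upper end is `Inf`.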

4.4 (e)

Checking the assumption of normality

qqnorm(x,main='For 95C photo-resist Thickness NPP',col='red')
qqline(x,col='black')

qqnorm(y,main='For 100C photo-resist Thickness NPP',col='blue')
qqline(y,col='green')

  • Neither sample deviates much from the normality assumption.

4.5 (f)

Find the power of this test

c_d=(mean(x)-mean(y))/((((sd(x)^2)+(sd(y)^2))/2)^0.5)
c_d
## [1] 1.337555
pwr.t.test(n=8,d=c_d,sig.level = 0.05,power=NULL,type = c("two.sample"), alternative = c("two.sided"))
## 
##      Two-sample t test power calculation 
## 
##               n = 8
##               d = 1.337555
##       sig.level = 0.05
##           power = 0.701445
##     alternative = two.sided
## 
## NOTE: n is number in *each* group
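  • The power of the test is about 0.70 at the observed effect size \(d = 1.34\). This figure is for a two-sided alternative; the test in (a) was one-sided, so alternative = "greater" would match it and give a higher power.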
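
The same calculation can be sketched with `power.t.test` from base R's `stats` package, which takes the raw difference and standard deviation instead of Cohen's d. Using the pooled standard deviation for `sd` and `strict = TRUE` is an assumption made here so that the two-sided result lines up with the `pwr.t.test` call above; the one-sided call shows the power for the alternative actually tested in (a).

```r
x <- c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315)
y <- c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963)

delta <- mean(x) - mean(y)                 # observed difference in means
pooled_sd <- sqrt((var(x) + var(y)) / 2)   # same denominator as c_d

# Two-sided power; strict = TRUE counts rejections in both tails
two_sided <- power.t.test(n = 8, delta = delta, sd = pooled_sd,
                          sig.level = 0.05, type = "two.sample",
                          alternative = "two.sided", strict = TRUE)

# One-sided power, matching alternative = "greater" in the t-test
one_sided <- power.t.test(n = 8, delta = delta, sd = pooled_sd,
                          sig.level = 0.05, type = "two.sample",
                          alternative = "one.sided")

c(two_sided = two_sided$power, one_sided = one_sided$power)
```

Both figures are post-hoc power at the observed effect size, so they should be read as descriptive rather than as a property of the design.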

5 Problem 2.32

Reading the data

cal_1 <- c(0.265,0.265,0.266,0.267,0.267,0.265,0.267,0.267,0.265,0.268,0.268,0.265)
cal_2 <- c(0.264,0.265,0.264,0.266,0.267,0.268,0.264,0.265,0.265,0.267,0.268,0.269)

The hypotheses to be tested are \[ H_{0} : \mu_{1}=\mu_{2} \Rightarrow \mu_{1}-\mu_{2}=0 \] \[ H_{a} : \mu_{1} \neq \mu_{2} \Rightarrow \mu_{1}-\mu_{2} \neq 0 \]

t.test(cal_1,cal_2,paired=TRUE)
## 
##  Paired t-test
## 
## data:  cal_1 and cal_2
## t = 0.43179, df = 11, p-value = 0.6742
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.001024344  0.001524344
## sample estimates:
## mean difference 
##         0.00025

5.1 (a)

  • The sample mean difference (0.00025) is very small and the p-value (0.6742) is well above \(\alpha = 0.05\), so we do not reject the null hypothesis: the data give no evidence of a difference between the mean measurements of the two calipers.

5.2 (b)

  • The p-value for this test is 0.6742.

5.3 (c)

  • 95% confidence interval for the mean difference between the two calipers is \(-0.00102 \leq\mu_{1}-\mu_{2}\leq 0.00152\)
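
The sign of the lower limit matters here, since an interval that spans zero is what ties the confidence interval to the non-rejection in (a). A minimal sketch of the paired calculation by hand, in base R, with illustrative names `d`, `se_d`, `t_stat` and `ci`:

```r
cal_1 <- c(0.265,0.265,0.266,0.267,0.267,0.265,0.267,0.267,0.265,0.268,0.268,0.265)
cal_2 <- c(0.264,0.265,0.264,0.266,0.267,0.268,0.264,0.265,0.265,0.267,0.268,0.269)

# Work on the within-pair differences
d <- cal_1 - cal_2
n <- length(d)

se_d <- sd(d) / sqrt(n)    # standard error of the mean difference
t_stat <- mean(d) / se_d   # paired t statistic on n - 1 degrees of freedom

# Two-sided 95% confidence interval for the mean difference
ci <- mean(d) + c(-1, 1) * qt(0.975, n - 1) * se_d

t_stat
ci
```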

6 Problem 2.34 (a)

Reading the data

grider <- c("S1/1","S2/1","S3/1","S4/1","S5/1","S2/1","S2/2","S2/3","S2/4")
k_method <- c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l_method <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
grider <- as.factor(grider)
k_method <- as.numeric(k_method)
l_method <- as.numeric(l_method)
df <- data.frame(grider,k_method,l_method)

The hypotheses to be tested are \[ H_{0} : \mu_{1}=\mu_{2} \Rightarrow \mu_{1}-\mu_{2}=0 \] \[ H_{a} : \mu_{1} \neq \mu_{2} \Rightarrow \mu_{1}-\mu_{2} \neq 0 \]

t.test(df$k_method,df$l_method,paired=TRUE,alternative='two.sided')
## 
##  Paired t-test
## 
## data:  df$k_method and df$l_method
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.1700423 0.3777355
## sample estimates:
## mean difference 
##       0.2738889
  • The p-value (0.0002953) is well below \(\alpha = 0.05\), so we reject the null hypothesis.
  • From the t-test we conclude that the two methods differ on average; the estimated mean difference is 0.2739.
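
A paired t-test is the same computation as a one-sample t-test on the within-girder differences, which is also why the normality check in the later parts targets the differences. The sketch below illustrates that equivalence in base R; `paired` and `one_sample` are illustrative names.

```r
k_method <- c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l_method <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)

# Paired test on the two columns
paired <- t.test(k_method, l_method, paired = TRUE)

# One-sample test of H0: mean difference = 0 on the differences
one_sample <- t.test(k_method - l_method, mu = 0)

# Statistic, degrees of freedom and p-value coincide
c(paired = unname(paired$statistic), one_sample = unname(one_sample$statistic))
c(paired = paired$p.value, one_sample = one_sample$p.value)
```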

6.1 (b)

  • The p-value = 0.0002953

6.2 (c)

  • 95% confidence interval: \(0.1700\leq\mu_{1}-\mu_{2}\leq0.3777\)

6.3 (d)

Investigate Normality for both Samples

qqnorm(df$k_method,main="Karlsruhe Method",col='violet')
qqline(df$k_method)

qqnorm(df$l_method,main = "Lehigh Method",col='gold')
qqline(df$l_method)

  • If we ignore a few outliers, especially in the Lehigh method, both samples are approximately normally distributed.

6.4 (e)

Normality assumption for difference in ratios of two methods

qqnorm(df$k_method-df$l_method,main="NPP for Difference in Ratio between two methods",col='purple')
qqline(df$k_method-df$l_method)

  • As with the individual samples, if we ignore a few outliers the difference in ratio between the two methods is approximately normally distributed.

6.5 (f)

  • The paired t-test assumes normality, but it is fairly robust to moderate departures, so the assumption does not need to hold strictly.
  • The assumption applies to the differences between the paired observations, not to the individual samples.


7 Complete R Code

library(lawstat)
library(dplyr)
library(pwr)
#Question 2.24
machine_1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
machine_2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.0)

#2.24(a) The Hypotheses that should be tested are
# H0: u1=u2 => u1-u2=0 and
#the alternative Ha: u1!=u2 => u1-u2!=0


#2.24(b) Testing the hypotheses using alpha =0.05

t.test(machine_1,machine_2)

#From the Welch Two sample t-test we can see that the p-value is greater than alpha(0.05)
#So we do not reject the Null Hypothesis.

#2.24(c) p-value for this test is 0.435

#2.24(d) 95% confidence interval on the difference of the means -0.01635 <= u1-u2 <= 0.03635


#2.26 (a)
T1 <- c(65,82,81,67,57,59,66,75,82,70)
T2<-c(64,56,71,69,83,74,59,82,65,79)

# We have to test the hypothesis that the two variances are equal at alpha=0.05 using Levene's test
# H0: sigma1^2 = sigma2^2
# the alternative Ha: sigma1^2 != sigma2^2

dat_1<-data.frame(time=T1, type = factor(rep('t1')))
dat_2<-data.frame(time=T2, type = factor(rep('t2')))
dat_3<-rbind.data.frame(dat_1,dat_2)
levene.test(dat_3$time,dat_3$type)
# The p-value is 1, so we do not reject the null hypothesis of equal variances.

#2.26 (b)
# From Levene's test, the test statistic is essentially zero and the p-value is 1.
# Therefore we do not reject the null hypothesis of equal variances.


#2.27 (a)
sccm_125<-c(2.7,4.6,2.6,3.0,3.2,3.8)
sccm_200<-c(4.6,3.4,2.9,3.5,4.1,5.1)
# The hypotheses to be tested are
# H0 : u1=u2 =>u1-u2=0
# Ha : u1!=u2 =>u1-u2!=0

wilcox.test(sccm_125,sccm_200)
# From the test, the p-value is 0.1994, which is greater than alpha = 0.05.
# We do not reject the null hypothesis.
# The data give no evidence that the C2F6 flow rate affects the average etch uniformity.


#2.29 (a)
x<-c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315)
y<-c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963)

# The hypotheses to be tested are (u1: mean at 95C, u2: mean at 100C)
# H0 : u1=u2 => u1-u2=0
# Ha : u1>u2 => u1-u2>0

t.test(x,y,alternative = "greater")
# From the Welch two-sample t-test, the p-value (0.009423) is well below alpha=0.05
# Therefore we reject the null hypothesis.
# The mean thickness at the higher temperature appears to be lower.

#2.29 (b) The p-value=0.009

#2.29 (c)
# From the test result, the one-sided 95% confidence bound is 0.8539 <= u1-u2
# The lower bound is greater than zero, so at 95% confidence the mean photoresist thickness is
# greater at 95C than at 100C.

#2.29 (e) Checking the assumption of normality
qqnorm(x,main='For 95C photo-resist Thickness NPP',col='red')
qqline(x,col='black')
qqnorm(y,main='For 100C photo-resist Thickness NPP',col='blue')
qqline(y,col='green')
# Neither sample deviates much from the normality assumption.

#2.29 (f) Find the power of this test
c_d=(mean(x)-mean(y))/((((sd(x)^2)+(sd(y)^2))/2)^0.5)
c_d
pwr.t.test(n=8,d=c_d,sig.level = 0.05,power=NULL,type = c("two.sample"), alternative = c("two.sided"))
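# The power of the test is 0.701445 for a two-sided alternative;
# the test in 2.29 (a) was one-sided, for which the power would be higher.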

#2.32 (a)
cal_1 <- c(0.265,0.265,0.266,0.267,0.267,0.265,0.267,0.267,0.265,0.268,0.268,0.265)
cal_2 <- c(0.264,0.265,0.264,0.266,0.267,0.268,0.264,0.265,0.265,0.267,0.268,0.269)
# We need to test the hypotheses
# H0 : u1=u2 =>u1-u2=0
# Ha : u1!=u2 =>u1-u2!=0
t.test(cal_1,cal_2,paired=TRUE)
# The sample mean difference (0.00025) is very small and the p-value (0.6742) is well above
# alpha = 0.05, so we do not reject the null hypothesis of no difference between the calipers.

#2.32 (b) The p-value for this test is 0.6742.

#2.32 (c) 95% confidence interval for the mean difference between the two calipers is
# -0.00102 <= u1-u2 <= 0.00152


#2.34 (a)
# Testing the hypotheses
# H0 : u1=u2 =>u1-u2=0
# Ha : u1!=u2 =>u1-u2!=0

grider <- c("S1/1","S2/1","S3/1","S4/1","S5/1","S2/1","S2/2","S2/3","S2/4")
k_method <- c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l_method <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
grider <- as.factor(grider)
k_method <- as.numeric(k_method)
l_method <- as.numeric(l_method)
df <- data.frame(grider,k_method,l_method)
t.test(df$k_method,df$l_method,paired=TRUE,alternative='two.sided')

# The p-value (0.0002953) is well below alpha = 0.05, so we reject the null hypothesis.
# From the t-test we conclude that the two methods differ on average (estimated mean difference 0.2739).

#2.34 (b) The p-value = 0.0002953

#2.34 (c) 95% confidence interval
# 0.1700 <= u1-u2 <= 0.3777

#2.34 (d) Investigate Normality for both Samples
qqnorm(df$k_method,main="Karlsruhe Method",col='violet')
qqline(df$k_method)
qqnorm(df$l_method,main = "Lehigh Method",col='gold')
qqline(df$l_method)

# If we ignore a few outliers, especially in the Lehigh method, both samples are
# approximately normally distributed.

#2.34 (e) Normality assumption for difference in ratios of two methods

qqnorm(df$k_method-df$l_method,main="NPP for Difference in Ratio between two methods",col='purple')
qqline(df$k_method-df$l_method)

# As with the individual samples, if we ignore a few outliers the difference in ratio between
# the two methods is approximately normally distributed.

#2.34 (f)
# The paired t-test assumes normality, but it is fairly robust to moderate departures.
# The assumption applies to the differences between the paired observations, not to the individual samples.