The diameter of a ball bearing was measured by 12 inspectors, each using two different kinds of calipers. The results were:
Inspector <- c(1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
Caliper1 <- c( 0.265,0.265 ,0.266, 0.267 ,0.267, 0.265 ,0.267 ,0.267, 0.265 ,0.268,0.268, 0.265)
Caliper2 <- c( 0.264,0.265, 0.264 ,0.266, 0.267, 0.268 ,0.264, 0.265 ,0.265, 0.267, 0.268 ,0.269)
dafr <- data.frame(Inspector,Caliper1,Caliper2)
qqnorm(Caliper1-Caliper2,main='Difference Between Points Q-Q plot')
Because the two samples are paired (each inspector used both calipers), we check normality on the differences between them rather than on each sample separately. The Q-Q plot of the differences does not look normally distributed to me, so we use a Wilcoxon signed-rank test instead of a paired t-test.
Ho: mu1=mu2 or Caliper1’s mean and Caliper2’s mean are not different
Ha: mu1≠mu2 or Caliper1’s mean and Caliper2’s mean are different
wilcox.test(Caliper1,Caliper2,paired=TRUE,conf.int=TRUE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Caliper1 and Caliper2
## V = 21.5, p-value = 0.6721
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.001529272 0.002024144
## sample estimates:
## (pseudo)median
## 0.0009999638
Because our p-value is greater than our alpha, we do not reject the null hypothesis: there is no significant difference between the measurements from the two calipers.
The p-value was given as 0.6721.
The 95% confidence interval for the location shift was given as (-0.0015, 0.0020).
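The signed-rank statistic V reported by wilcox.test can be reproduced by hand, which also shows why R falls back to a normal approximation when the data contain zero differences and ties. This is only a sketch: the names d, nz, r and V are illustrative, and rounding the differences to three decimals is my assumption, made to keep floating-point noise from breaking the ties.

```r
Caliper1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
Caliper2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)

# Paired differences, rounded to the measurement resolution (assumption)
d <- round(Caliper1 - Caliper2, 3)

# Zero differences carry no sign information, so they are dropped
n_zero <- sum(d == 0)
nz <- d[d != 0]

# Rank the absolute differences; tied values share their average rank
r <- rank(abs(nz))

# V is the sum of the ranks that belong to positive differences
V <- sum(r[nz > 0])
print(c(zeros = n_zero, used = length(nz), V = V))
```

With four zero differences removed and several tied ranks among the remaining eight, an exact null distribution is not available, which is why the output is labelled "with continuity correction".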
An article in the Journal of Strain Analysis (vol. 18, no. 2, 1983) compares several procedures for predicting the shear strength for steel plate girders. Data for nine girders in the form of the ratio of predicted to observed load for two of these procedures, the Karlsruhe and Lehigh methods, are as follows:
Girder <- c('S1/1', 'S2/1', 'S3/1', 'S4/1', 'S5/1', 'S2/1', 'S2/2', 'S2/3', 'S2/4')
Karlsruhe <- c(1.186, 1.151 ,1.322, 1.339, 1.2, 1.402, 1.365 ,1.537, 1.559)
Lehigh <- c(1.061 ,0.992, 1.063, 1.062 ,1.065, 1.178, 1.037 ,1.086, 1.052)
dafr <- data.frame(Girder,Karlsruhe,Lehigh)
Testing for equal variances:
var.test(Karlsruhe, Lehigh, alternative = "two.sided")
##
## F test to compare two variances
##
## data: Karlsruhe and Lehigh
## F = 8.7454, num df = 8, denom df = 8, p-value = 0.006008
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.972674 38.770520
## sample estimates:
## ratio of variances
## 8.745375
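The F-test rejects equality of the two variances (p = 0.006). This test assumes normality, which is examined with the Q-Q plots further below.
We therefore set var.equal=FALSE, although for a paired test R works only with the differences, so this argument has no effect on the result.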
Ho: mu1=mu2 or Karlsruhe’s mean and Lehigh’s mean are not different
Ha: mu1≠mu2 or Karlsruhe’s mean and Lehigh’s mean are different
t.test(Karlsruhe,Lehigh,paired=TRUE,var.equal=FALSE)
##
## Paired t-test
##
## data: Karlsruhe and Lehigh
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1700423 0.3777355
## sample estimates:
## mean of the differences
## 0.2738889
Yes, there is enough evidence to state that the two means are not equal, because our p-value is below our alpha.
Our p-value is 0.0002953.
Our 95% confidence interval on the mean difference between the two methods' predicted-to-observed load ratios is (0.1700, 0.3777).
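To make the paired t-test less of a black box, here is a minimal sketch that rebuilds the statistic, p-value and confidence interval from the differences alone. The helper names (d, t_stat, half_width, same) are mine, not part of the original analysis. The last lines also check that var.equal makes no difference once paired=TRUE is set.

```r
Karlsruhe <- c(1.186, 1.151, 1.322, 1.339, 1.2, 1.402, 1.365, 1.537, 1.559)
Lehigh    <- c(1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052)

# A paired t-test is a one-sample t-test on the differences
d <- Karlsruhe - Lehigh
n <- length(d)
se <- sd(d) / sqrt(n)

t_stat <- mean(d) / se
p_val <- 2 * pt(-abs(t_stat), df = n - 1)

# 95% confidence interval for the mean difference
half_width <- qt(0.975, df = n - 1) * se
ci <- mean(d) + c(-1, 1) * half_width

print(round(c(t = t_stat, df = n - 1, p = p_val, lower = ci[1], upper = ci[2]), 4))

# var.equal is only used by the unpaired two-sample test
same <- isTRUE(all.equal(
  t.test(Karlsruhe, Lehigh, paired = TRUE, var.equal = FALSE)$statistic,
  t.test(Karlsruhe, Lehigh, paired = TRUE, var.equal = TRUE)$statistic
))
print(same)
```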
qqnorm(Karlsruhe,main='Karlsruhe Q-Q plot')
qqnorm(Lehigh,main='Lehigh Q-Q plot')
It is a little hard to tell with so few data points, but neither sample appears to follow a straight line; the points drift away from it towards the ends.
qqnorm(Karlsruhe-Lehigh,main='Difference Between both Q-Q Plot')
It may be arguable, but the differences do look normally distributed to me, since the points mostly fall around a straight line.
For the paired t-test, we check whether the differences between the two samples are normally distributed. If they are not, the paired t-test should not be performed, as it relies on that assumption.
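With only nine points a Q-Q plot is hard to read by eye, so a numerical check can be a useful complement. The sketch below adds a reference line to the plot and runs a Shapiro-Wilk test on the differences. Shapiro-Wilk is my addition, not something the original analysis used, and with n = 9 it has little power, so a large p-value is weak reassurance at best.

```r
Karlsruhe <- c(1.186, 1.151, 1.322, 1.339, 1.2, 1.402, 1.365, 1.537, 1.559)
Lehigh    <- c(1.061, 0.992, 1.063, 1.062, 1.065, 1.178, 1.037, 1.086, 1.052)

# The paired t-test assumption concerns the differences, not each sample
d <- Karlsruhe - Lehigh

# Q-Q plot with a reference line through the quartiles
qqnorm(d, main = "Q-Q plot of paired differences")
qqline(d)

# Shapiro-Wilk test: the null hypothesis is that d comes from a normal distribution
sw <- shapiro.test(d)
print(sw)
```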
Photoresist is a light-sensitive material applied to semiconductor wafers so that the circuit pattern can be imaged on to the wafer. After application, the coated wafers are baked to remove the solvent in the photoresist mixture and to harden the resist. Here are measurements of photoresist thickness (in kA) for eight wafers baked at two different temperatures. Assume that all of the runs were made in random order.
data <- c(11.176, 7.089, 8.097, 11.739 ,11.291 ,10.759 ,6.467 ,8.315 ,5.263 ,6.748 ,7.461 ,7.015 ,8.133, 7.418, 3.772, 8.963)
temp <- c(95, 95, 95, 95, 95, 95, 95, 95 ,100 ,100, 100, 100 ,100 ,100 ,100 ,100)
qqnorm(data[1:8],main='Temp 95 Q-Q Plot')
qqline(data[1:8])
qqnorm(data[9:16],main='Temp 100 Q-Q Plot')
qqline(data[9:16])
The data does appear to be normally distributed
library(pwr)
pwr.t.test(n=8,d=(abs(mean(data[1:8])-mean(data[8:16]))/sd(data)),sig.level=0.05,power=NULL,type="two.sample",alternative="two.sided")
##
## Two-sample t test power calculation
##
## n = 8
## d = 1.053342
## sig.level = 0.05
## power = 0.5006574
## alternative = two.sided
##
## NOTE: n is number in *each* group
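The power of this test is about 0.50: if the true standardized difference between the two temperatures equals the d used here, the test would correctly reject the null hypothesis (that the two means are equal) about 50% of the time. Note that the call above takes the second group as data[8:16], which also includes the eighth wafer baked at 95, and divides by sd(data), the standard deviation of all 16 values rather than a pooled within-group value, so d = 1.05 and the resulting power are only approximate.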
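For comparison, here is a sketch of the same power calculation with the groups split by temperature and the effect size standardized by the pooled within-group standard deviation. It uses power.t.test from base R's stats package instead of pwr. The variable names (thickness, g95, g100, sp, d_pooled) are mine, and treating the observed difference as the true effect is the same assumption the original calculation makes.

```r
thickness <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315,
               5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
temp <- c(rep(95, 8), rep(100, 8))

# Split by temperature instead of by hard-coded index ranges
g95  <- thickness[temp == 95]
g100 <- thickness[temp == 100]
n1 <- length(g95)
n2 <- length(g100)

# Observed mean difference and pooled within-group standard deviation
delta <- abs(mean(g95) - mean(g100))
sp <- sqrt(((n1 - 1) * var(g95) + (n2 - 1) * var(g100)) / (n1 + n2 - 2))
d_pooled <- delta / sp

# Power of a two-sided two-sample t-test with 8 wafers per group
pw <- power.t.test(n = 8, delta = delta, sd = sp, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")
print(c(delta = delta, sp = sp, d = d_pooled))
print(pw)
```

Because the pooled standard deviation excludes the between-group shift, it is smaller than sd(data), so the standardized effect and the power both come out larger than in the calculation above.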
An article in Solid State Technology, “Orthogonal Design for Process Optimization and Its Application to Plasma Etching” by G. Z. Yin and D. W. Jillie (May 1987) describes an experiment to determine the effect of the C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing. All of the runs were made in random order. Data for two flow rates are as follows:
cf125 <- c(2.7, 4.6 ,2.6 ,3.0 ,3.2 ,3.8)
cf200 <- c(4.6, 3.4, 2.9, 3.5 ,4.1 ,5.1)
qqnorm(cf125,main='C2F6 Flow Rate 125 Q-Q Plot')
qqnorm(cf200,main='C2F6 Flow Rate 200 Q-Q Plot')
It’s somewhat difficult to tell with only six points each, but neither sample shows a clear departure from normality.
We should then check whether the variances are equal.
boxplot(cf125,cf200,names=c('CF125','CF200'))
The variances appear similar on visual inspection: both boxplots show a comparable spread between quartiles.
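The visual impression can be backed up with the same F-test used for the girder data. This is a sketch rather than part of the original analysis: the 0.05 cutoff and the use_pooled flag are my assumptions, and the F-test itself is sensitive to non-normality, which is hard to rule out with six points per group.

```r
cf125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
cf200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)

# Sample variances of the two flow-rate groups
print(c(var125 = var(cf125), var200 = var(cf200)))

# F-test of the null hypothesis that the variance ratio equals 1
vt <- var.test(cf125, cf200)
print(vt)

# Use a pooled-variance t-test only if equal variances are not rejected
use_pooled <- vt$p.value > 0.05
print(use_pooled)
```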
t.test(cf125,cf200,var.equal=TRUE)
Working from the data above, the sample means are about 3.317 (flow rate 125) and 3.933 (flow rate 200), the pooled standard deviation is about 0.791, and the test statistic is t ≈ -1.35 on 10 degrees of freedom, for a two-sided p-value of about 0.21.
Since our p-value is above our alpha of .05, we do not reject the null hypothesis: these data do not show that C2F6 flow rate affects etch uniformity.
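To check the comparison directly on the etch data, the sketch below computes the pooled two-sample t statistic by hand from cf125 and cf200 and compares it with t.test. Setting var.equal=TRUE is an assumption based on the boxplot, and the intermediate names (sp, t_stat, fit) are illustrative only.

```r
cf125 <- c(2.7, 4.6, 2.6, 3.0, 3.2, 3.8)
cf200 <- c(4.6, 3.4, 2.9, 3.5, 4.1, 5.1)
n1 <- length(cf125)
n2 <- length(cf200)

# Pooled standard deviation across the two flow rates
sp <- sqrt(((n1 - 1) * var(cf125) + (n2 - 1) * var(cf200)) / (n1 + n2 - 2))

# Two-sample t statistic, degrees of freedom, two-sided p-value
t_stat <- (mean(cf125) - mean(cf200)) / (sp * sqrt(1 / n1 + 1 / n2))
df <- n1 + n2 - 2
p_val <- 2 * pt(-abs(t_stat), df)
print(round(c(t = t_stat, df = df, p = p_val), 4))

# The built-in test should agree with the hand calculation
fit <- t.test(cf125, cf200, var.equal = TRUE)
print(fit)
```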
All Code Used:
Inspector <- c(1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
Caliper1 <- c( 0.265,0.265 ,0.266, 0.267 ,0.267, 0.265 ,0.267 ,0.267, 0.265 ,0.268,0.268, 0.265)
Caliper2 <- c( 0.264,0.265, 0.264 ,0.266, 0.267, 0.268 ,0.264, 0.265 ,0.265, 0.267, 0.268 ,0.269)
dafr <- data.frame(Inspector,Caliper1,Caliper2)
qqnorm(Caliper1-Caliper2,main='Difference Between Points Q-Q plot')
wilcox.test(Caliper1,Caliper2,paired=TRUE,conf.int=TRUE)
## Warning in wilcox.test.default(Caliper1, Caliper2, paired = TRUE, conf.int =
## TRUE): cannot compute exact p-value with ties
## Warning in wilcox.test.default(Caliper1, Caliper2, paired = TRUE, conf.int =
## TRUE): cannot compute exact confidence interval with ties
## Warning in wilcox.test.default(Caliper1, Caliper2, paired = TRUE, conf.int =
## TRUE): cannot compute exact p-value with zeroes
## Warning in wilcox.test.default(Caliper1, Caliper2, paired = TRUE, conf.int =
## TRUE): cannot compute exact confidence interval with zeroes
##
## Wilcoxon signed rank test with continuity correction
##
## data: Caliper1 and Caliper2
## V = 21.5, p-value = 0.6721
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
## -0.001529272 0.002024144
## sample estimates:
## (pseudo)median
## 0.0009999638
Girder <- c('S1/1', 'S2/1', 'S3/1', 'S4/1', 'S5/1', 'S2/1', 'S2/2', 'S2/3', 'S2/4')
Karlsruhe <- c(1.186, 1.151 ,1.322, 1.339, 1.2, 1.402, 1.365 ,1.537, 1.559)
Lehigh <- c(1.061 ,0.992, 1.063, 1.062 ,1.065, 1.178, 1.037 ,1.086, 1.052)
dafr <- data.frame(Girder,Karlsruhe,Lehigh)
var.test(Karlsruhe, Lehigh, alternative = "two.sided")
##
## F test to compare two variances
##
## data: Karlsruhe and Lehigh
## F = 8.7454, num df = 8, denom df = 8, p-value = 0.006008
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.972674 38.770520
## sample estimates:
## ratio of variances
## 8.745375
t.test(Karlsruhe,Lehigh,paired=TRUE,var.equal=FALSE)
##
## Paired t-test
##
## data: Karlsruhe and Lehigh
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1700423 0.3777355
## sample estimates:
## mean of the differences
## 0.2738889
qqnorm(Karlsruhe,main='Karlsruhe Q-Q plot')
qqnorm(Lehigh,main='Lehigh Q-Q plot')
qqnorm(Karlsruhe-Lehigh,main='Difference Between Both Q-Q Plot')
data <- c(11.176, 7.089, 8.097, 11.739 ,11.291 ,10.759 ,6.467 ,8.315 ,5.263 ,6.748 ,7.461 ,7.015 ,8.133, 7.418, 3.772, 8.963)
temp <- c(95, 95, 95, 95, 95, 95, 95, 95 ,100 ,100, 100, 100 ,100 ,100 ,100 ,100)
qqnorm(data[1:8],main='Temp 95 Q-Q Plot')
qqline(data[1:8])
qqnorm(data[9:16],main='Temp 100 Q-Q Plot')
qqline(data[9:16])
library(pwr)
pwr.t.test(n=8,d=(abs(mean(data[1:8])-mean(data[8:16]))/sd(data)),sig.level=0.05,power=NULL,type="two.sample",alternative="two.sided")
##
## Two-sample t test power calculation
##
## n = 8
## d = 1.053342
## sig.level = 0.05
## power = 0.5006574
## alternative = two.sided
##
## NOTE: n is number in *each* group
cf125 <- c(2.7, 4.6 ,2.6 ,3.0 ,3.2 ,3.8)
cf200 <- c(4.6, 3.4, 2.9, 3.5 ,4.1 ,5.1)
qqnorm(cf125,main='C2F6 Flow Rate 125 Q-Q Plot')
qqnorm(cf200,main='C2F6 Flow Rate 200 Q-Q Plot')
boxplot(cf125,cf200,names=c('CF125','CF200'))
t.test(cf125,cf200,var.equal=TRUE)