Homework 1

Q 2.24.

2.24 (a) Statement of the hypothesis:

We test the null hypothesis against a two-sided alternative:

\(H_0: \mu_1 = \mu_2\\ H_a: \mu_1 \neq \mu_2\)

where \(\mu_1\) and \(\mu_2\) are the population means for machine 1 and machine 2.

2.24 (b)

# Data given
Mach1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
Mach2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.00)
# Two sample t-test:
t.test(Mach1,Mach2,var.equal = TRUE, alternative= "two.sided")

## 
##  Two Sample t-test
## 
## data:  Mach1 and Mach2
## t = 0.79894, df = 18, p-value = 0.4347
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01629652  0.03629652
## sample estimates:
## mean of x mean of y 
##    16.015    16.005

Since the p-value of 0.4347 is much larger than \(\alpha=0.05\), we fail to reject \(H_0\).
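The statistic reported by t.test can be reproduced from the pooled-variance formula. The sketch below is a hand check under the same equal-variance assumption; the names sp, se, t_stat, dof and p_val are illustrative and not part of the original solution.

```r
# Hand check of the pooled two-sample t-test for 2.24 (illustrative names)
Mach1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
Mach2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.00)
n1 <- length(Mach1)
n2 <- length(Mach2)
dof <- n1 + n2 - 2
# Pooled standard deviation under the equal-variance assumption
sp <- sqrt(((n1 - 1) * var(Mach1) + (n2 - 1) * var(Mach2)) / dof)
# Standard error of the difference in sample means
se <- sp * sqrt(1 / n1 + 1 / n2)
t_stat <- (mean(Mach1) - mean(Mach2)) / se
# Two-sided p-value from the t distribution
p_val <- 2 * pt(abs(t_stat), df = dof, lower.tail = FALSE)
print(c(t = t_stat, df = dof, p = p_val))
```

The printed values should agree with the t, df and p-value in the t.test output above.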

2.24 (c):

p-value=0.4347.

2.24 (d):

The 95% confidence interval from the t-test is

\(-0.016296 \leq \mu_1 -\mu_2 \leq 0.036296\)
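As a check, the same interval can be built directly from the difference in sample means, the pooled standard error, and the t quantile with 18 degrees of freedom. This is a minimal sketch; the names se, t_crit and ci are illustrative.

```r
# Hand check of the 95% confidence interval for mu1 - mu2 in 2.24
Mach1 <- c(16.03,16.04,16.05,16.05,16.02,16.01,15.96,15.98,16.02,15.99)
Mach2 <- c(16.02,15.97,15.96,16.01,15.99,16.03,16.04,16.02,16.01,16.00)
n1 <- length(Mach1)
n2 <- length(Mach2)
# Pooled standard deviation and standard error of the difference
sp <- sqrt(((n1 - 1) * var(Mach1) + (n2 - 1) * var(Mach2)) / (n1 + n2 - 2))
se <- sp * sqrt(1 / n1 + 1 / n2)
# Upper 2.5% point of the t distribution with n1 + n2 - 2 degrees of freedom
t_crit <- qt(0.975, df = n1 + n2 - 2)
ci <- (mean(Mach1) - mean(Mach2)) + c(-1, 1) * t_crit * se
print(ci)
```

Because the interval contains 0, it leads to the same conclusion as the two-sided test at the 5% level.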

Q 2.26 (a)

# 2.26 (a): checking the equal-variance assumption with Levene's test
BT <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70, 64, 71, 83, 59, 65, 56, 69, 74, 82, 79)
B <- c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2")
data <- cbind(BT, B)
data1 <- data.frame(data)
str(data1)

## 'data.frame':    20 obs. of  2 variables:
##  $ BT: chr  "65" "81" "57" "66" ...
##  $ B : chr  "1" "1" "1" "1" ...

Because cbind() coerced both columns to character, we convert BT to numeric and B to a factor.

data1$B <- as.factor(data1$B)
data1$BT <- as.numeric(data1$BT)
str(data1)

## 'data.frame':    20 obs. of  2 variables:
##  $ BT: num  65 81 57 66 82 82 67 59 75 70 ...
##  $ B : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...

#install.packages("lawstat")
library(lawstat)
levene.test(data1$BT,data1$B, location = "mean")

## 
##  Classical Levene's test based on the absolute deviations from the mean
##  ( none not applied because the location is not set to median )
## 
## data:  data1$BT
## Test Statistic = 0.0014598, p-value = 0.9699

Since the p-value of 0.9699 is greater than \(\alpha\), we fail to reject \(H_0\) that the two variances are equal.
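The classical Levene statistic is the one-way ANOVA F statistic computed on the absolute deviations of each observation from its own group mean. The sketch below reproduces it with base R only, so it does not need the lawstat package; the names z, fit, F_stat and p_val are illustrative.

```r
# Classical Levene test by hand: ANOVA on absolute deviations from group means
BT <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70,
        64, 71, 83, 59, 65, 56, 69, 74, 82, 79)
B <- factor(rep(c("1", "2"), each = 10))
# ave() returns the group mean aligned with each observation
z <- abs(BT - ave(BT, B))
# One-way ANOVA of the absolute deviations on the grouping factor
fit <- anova(lm(z ~ B))
F_stat <- fit[["F value"]][1]
p_val <- fit[["Pr(>F)"]][1]
print(c(F = F_stat, p = p_val))
```

The F statistic and p-value should match the test statistic and p-value printed by levene.test above.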

2.26 (b):

Since the data are given to be normal with equal variances, we can use the pooled two-sample t-test as in 2.24.

T1 <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70)
T2 <- c(64, 71, 83, 59, 65, 56, 69, 74, 82, 79)
t.test(T1,T2,var.equal=TRUE, alternative = "two.sided")

## 
##  Two Sample t-test
## 
## data:  T1 and T2
## t = 0.048008, df = 18, p-value = 0.9622
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.552441  8.952441
## sample estimates:
## mean of x mean of y 
##      70.4      70.2

The p-value of 0.9622 is again far greater than \(\alpha\), so we fail to reject \(H_0\).

Q 2.29 (a):

Since we are comparing the means of the two samples, we can use a two-sample t-test, provided the data are approximately normal. We first check normality (a normal probability plot or a boxplot would both work) and then perform the t-test.

B95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
B100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
qqnorm(B95)
qqline(B95)

qqnorm(B100)
qqline(B100)

# The data look reasonably normal, so we proceed with the two-sample t-test
t.test(B95,B100,alternative = "greater", var.equal=TRUE)

## 
##  Two Sample t-test
## 
## data:  B95 and B100
## t = 2.6751, df = 14, p-value = 0.009059
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.8608158       Inf
## sample estimates:
## mean of x mean of y 
##  9.366625  6.846625

Since the p-value of 0.009059 is well below 0.05, we reject \(H_0\).
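For the one-sided alternative, the p-value is the upper-tail area of the t distribution beyond the observed statistic. The following sketch recomputes it by hand under the same equal-variance assumption; the names sp, se, t_stat and p_val are illustrative.

```r
# Hand check of the one-sided pooled t-test for 2.29 (a)
B95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
B100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
n1 <- length(B95)
n2 <- length(B100)
# Pooled standard deviation and standard error of the difference
sp <- sqrt(((n1 - 1) * var(B95) + (n2 - 1) * var(B100)) / (n1 + n2 - 2))
se <- sp * sqrt(1 / n1 + 1 / n2)
t_stat <- (mean(B95) - mean(B100)) / se
# Upper-tail area only, because the alternative is "greater"
p_val <- pt(t_stat, df = n1 + n2 - 2, lower.tail = FALSE)
print(c(t = t_stat, p = p_val))
```

A two-sided test on the same data would double this tail area.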

2.29 (b):

The p-value is 0.009059.

2.29 (c):

The 95% confidence interval is given by the two sample t-test as

\(0.8608158 \leq \mu_1 -\mu_2 < \infty\)
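The lower bound printed by t.test follows from the one-sided interval formula. As a worked check, take the difference in sample means \(9.366625 - 6.846625 = 2.52\), a standard error of about 0.942 (the difference divided by the reported \(t = 2.6751\)), and \(t_{0.05,\,14} \approx 1.761\) from a standard t table:

\[
\mu_1 - \mu_2 \;\geq\; (\bar{y}_1 - \bar{y}_2) - t_{0.05,\,14}\, S_p \sqrt{\tfrac{1}{n_1} + \tfrac{1}{n_2}} \;\approx\; 2.52 - 1.761 \times 0.942 \;\approx\; 0.861
\]

The bound is positive, which agrees with rejecting \(H_0\) in favor of \(\mu_1 > \mu_2\).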

2.29 (e):

qqnorm(B95,main="Normal Probability Plot for PopC at 95 C", ylab = "Photoresist Thickness")
qqline(B95)

qqnorm(B100,main="Normal Probability Plot for PopD at 100 C", ylab = "Photoresist Thickness")
qqline(B100)

2.32 (a):

# Loading the data
c1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
c2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)

Each of the 12 inspectors contributed one measurement to each data set, so the two measurements from the same inspector may be dependent and a paired t-test is appropriate.

t.test(c1,c2,paired=TRUE)

## 
##  Paired t-test
## 
## data:  c1 and c2
## t = 0.43179, df = 11, p-value = 0.6742
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.001024344  0.001524344
## sample estimates:
## mean difference 
##         0.00025

Since the p-value of 0.6742 is greater than 0.05, we fail to reject \(H_0\).
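A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below shows this for the data above; the names d, t_stat, p_val and one_sample are illustrative.

```r
# Paired t-test for 2.32 as a one-sample t-test on the differences
c1 <- c(0.265, 0.265, 0.266, 0.267, 0.267, 0.265, 0.267, 0.267, 0.265, 0.268, 0.268, 0.265)
c2 <- c(0.264, 0.265, 0.264, 0.266, 0.267, 0.268, 0.264, 0.265, 0.265, 0.267, 0.268, 0.269)
d <- c1 - c2            # one difference per inspector
n <- length(d)
# t statistic: mean difference divided by its standard error
t_stat <- mean(d) / (sd(d) / sqrt(n))
p_val <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
# The built-in one-sample test on d gives the same statistic
one_sample <- t.test(d, mu = 0)
print(c(t = t_stat, p = p_val))
```

Working with the differences removes the inspector-to-inspector variation, which is why the degrees of freedom are 11 rather than 22.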

2.32 (b):

p-value is 0.6742.

2.32 (c):

The 95% confidence interval is

\(-0.001024344 \leq \mu_1 - \mu_2 \leq 0.001524344\)

2.34 (a):

k <-c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
t.test(k,l,paired=TRUE)

## 
##  Paired t-test
## 
## data:  k and l
## t = 6.0819, df = 8, p-value = 0.0002953
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.1700423 0.3777355
## sample estimates:
## mean difference 
##       0.2738889

Since the p-value of 0.0002953 is less than 0.05, we reject \(H_0\) and conclude that there is a significant difference between the means.

2.34 (b):

p-value is 0.0002953.

2.34 (c):

The 95% confidence interval is \(0.1700423 \leq \mu_1 - \mu_2 \leq 0.3777355\).
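The interval reported by the paired t-test can be rebuilt from the mean and standard deviation of the nine differences. This is a minimal sketch with illustrative names (d, se, ci).

```r
# Hand check of the 95% CI for the mean difference in 2.34
k <- c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
d <- k - l
n <- length(d)
# Standard error of the mean difference
se <- sd(d) / sqrt(n)
# Two-sided interval with n - 1 = 8 degrees of freedom
ci <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se
print(ci)
```

The interval excludes 0, consistent with rejecting \(H_0\) at the 5% level.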

2.34 (d):

qqnorm(k)
qqline(k)

qqnorm(l)
qqline(l)

The data for sample k appear reasonably normal, while those for sample l do not.

2.34 (e):

Difference <- (k-l)
qqnorm(Difference)
qqline(Difference)

Apart from the last few points, the differences appear reasonably normal.

2.34 (f):

The sample size is 9, which is not large, so the paired t-test depends on the differences being approximately normal. Only the differences need to be normal, not the two individual samples.
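A formal test can complement the normal probability plot of the differences. The sketch below applies the Shapiro-Wilk test from base R to the differences; this test is an assumption added here for illustration and is not part of the original solution, and with only nine observations it has little power to detect non-normality.

```r
# Shapiro-Wilk normality check on the paired differences (illustrative addition)
k <- c(1.186,1.151,1.322,1.339,1.200,1.402,1.365,1.537,1.559)
l <- c(1.061,0.992,1.063,1.062,1.065,1.178,1.037,1.086,1.052)
d <- k - l
# H0: the differences come from a normal distribution
sw <- shapiro.test(d)
print(sw)
```

A large p-value here would mean only that the data do not contradict normality, not that normality has been established.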

2.29 (f):

To compute the power of the test:

# d = difference in means / pooled standard deviation, n = sample size per group

mean_B95 <- mean(B95)
mean_B100 <- mean(B100)

sd_B95 <- sd(B95)
sd_B100 <- sd(B100)

# pooled standard deviation
pooled_sd <- sqrt(((length(B95) - 1) * sd_B95^2 + (length(B100) - 1) * sd_B100^2) / (length(B95) + length(B100) - 2))

# effect size d (Cohen's d)
d <- (mean_B95 - mean_B100) / pooled_sd
print(d)

## [1] 1.337555

#install.packages("pwr")
library(pwr)
pwr.t.test(n=8,d=1.337555,sig.level=0.05,power=NULL,type="two.sample")

## 
##      Two-sample t test power calculation 
## 
##               n = 8
##               d = 1.337555
##       sig.level = 0.05
##           power = 0.7014448
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

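The power is 0.7014448, or roughly 70%. As the output shows, this value is for a two-sided alternative, whereas the test in part (a) was one-sided.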
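Since the test in part (a) used a one-sided alternative, the power can also be computed for that case. The sketch below uses base R's power.t.test instead of the pwr package, with the observed difference and pooled standard deviation treated as if they were the true values; that substitution, and the names delta and pw, are assumptions made for illustration.

```r
# Power for the one-sided two-sample t-test in 2.29 (illustrative sketch)
B95 <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
B100 <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)
n <- length(B95)   # 8 observations in each group
# Pooled standard deviation (equal group sizes)
pooled_sd <- sqrt(((n - 1) * var(B95) + (n - 1) * var(B100)) / (2 * n - 2))
# Observed difference in means, used here as the assumed true difference
delta <- mean(B95) - mean(B100)
pw <- power.t.test(n = n, delta = delta, sd = pooled_sd, sig.level = 0.05,
                   type = "two.sample", alternative = "one.sided")
print(pw)
```

At the same significance level, the one-sided power is higher than the two-sided value of about 70%, because the whole rejection region sits in the direction of the observed effect.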

Q 2.27 (a):

f125 <- c(2.7,4.6,2.6,3.0,3.2,3.8)
f200 <- c(4.6,3.4,2.9,3.5,4.1,5.1)

Since the data set is very small, we first check normality before choosing a test.

qqnorm(f125,main="Flow 125 SCCM Normal Probability Plot")
qqline(f125)

qqnorm(f200,main="Flow 200 SCCM Normal Probability Plot")
qqline(f200)

boxplot(f125,f200,names=c("f125","f200"),col=c("blue","red"))

Since both the normal probability plots and the box plots suggest departures from normality, we use the Wilcoxon rank-sum test, a nonparametric alternative to the two-sample t-test.

wilcox.test(f125,f200)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  f125 and f200
## W = 9.5, p-value = 0.1994
## alternative hypothesis: true location shift is not equal to 0

Since the p-value of 0.1994 is greater than \(\alpha\), we fail to reject \(H_0\).
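The statistic W reported by wilcox.test is the sum of the ranks of the first sample in the combined data, minus its smallest possible value. The sketch below computes it by hand; the names r and W are illustrative.

```r
# Wilcoxon rank-sum statistic for 2.27 computed by hand
f125 <- c(2.7,4.6,2.6,3.0,3.2,3.8)
f200 <- c(4.6,3.4,2.9,3.5,4.1,5.1)
n1 <- length(f125)
# Rank the pooled data; tied values receive the average of their ranks
r <- rank(c(f125, f200))
# Sum of ranks for f125 minus its minimum possible value n1 * (n1 + 1) / 2
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2
print(W)
```

The value 4.6 appears in both samples, which produces the half-integer W = 9.5 and is why R falls back on the normal approximation with continuity correction instead of an exact p-value.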

Yasir Iqbal

Last compiled on September 13, 2024 at 1:18 PM - CDT