Homework 2

Question 2.24

Hypothesis to be tested

H_o: \(\mu_1 = \mu_2\) - Null Hypothesis

H_a: \(\mu_1 \neq \mu_2\) - Alternative Hypothesis

where:

\(\sigma^2_1\) = 0.015, \(\sigma^2_2\) = 0.018, \(n_1\) = 10, \(n_2\) = 10

\(\mu_1\) is the mean of machine 1

\(\mu_2\) is the mean of machine 2

\(\sigma^2_1\) is the variance of machine 1

\(\sigma^2_2\) is the variance of machine 2

Note: the sample sizes are small, and we cannot yet say whether the population variances are equal.

Testing Hypothesis

Before testing the hypothesis, we need to check the assumption that the variances are equal. First we look at boxplots to see how spread out each sample is, which gives an idea of whether the variances are equal.

Mach1<-data.frame("v/ounces"=c(16.03,16.01,16.04,15.96,16.05,15.98,16.05,16.02,16.02,15.99))
Mach2<-data.frame("v/ounces"=c(16.02,16.03,15.97,16.04,15.96,16.02,16.01,16.01,15.99,16.00))

#boxplot to compare variance
boxplot(Mach1$v.ounces, Mach2$v.ounces, names = c("Mach1", "Mach2"), main="Comparing Boxplot for Machine1 and Machine2")

From the boxplot, the variances do not appear equal, as Mach1 looks more spread out than Mach2. To address this, we apply a log transformation to try to make the spreads of the two samples more similar.

LMach1<-log(Mach1)
LMach2<-log(Mach2)
LMach1$v.ounces

##  [1] 2.774462 2.773214 2.775086 2.770086 2.775709 2.771338 2.775709 2.773838
##  [9] 2.773838 2.771964

LMach2$v.ounces

##  [1] 2.773838 2.774462 2.770712 2.775086 2.770086 2.773838 2.773214 2.773214
##  [9] 2.771964 2.772589

boxplot(LMach1$v.ounces, LMach2$v.ounces, names = c("Mach1", "Mach2"), main="Comparing Boxplot for Transformed Machine1 and Machine2")

After the log transformation, the sample variances appear roughly equal. We check this further by running the t-test on both the original and the transformed data.
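As a numeric complement to the boxplots, the sketch below compares the sample variances before and after the log transformation. The variable names (`mach1`, `ratio_raw`, and so on) are illustrative and not part of the original analysis. Because all readings lie in a narrow range around 16, the logarithm is nearly linear there, so one would expect the variance ratio to change very little.

```r
# Fill volumes (ounces) for the two machines, as plain numeric vectors
mach1 <- c(16.03, 16.01, 16.04, 15.96, 16.05, 15.98, 16.05, 16.02, 16.02, 15.99)
mach2 <- c(16.02, 16.03, 15.97, 16.04, 15.96, 16.02, 16.01, 16.01, 15.99, 16.00)

# Sample variances on the original scale and their ratio
var_raw <- c(Mach1 = var(mach1), Mach2 = var(mach2))
ratio_raw <- var_raw[["Mach1"]] / var_raw[["Mach2"]]

# Same quantities after the log transformation
var_log <- c(Mach1 = var(log(mach1)), Mach2 = var(log(mach2)))
ratio_log <- var_log[["Mach1"]] / var_log[["Mach2"]]

print(var_raw)
print(c(raw = ratio_raw, log = ratio_log))
```

If the two ratios are close, the transformation has not materially changed the relative spread of the two samples.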

T-Statistic of original Data with pooled variance

t.test(Mach1,Mach2,var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  Mach1 and Mach2
## t = 0.79894, df = 18, p-value = 0.4347
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01629652  0.03629652
## sample estimates:
## mean of x mean of y 
##    16.015    16.005
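The pooled statistic in the output above can be reproduced by hand, which makes the role of the pooled variance explicit. This is a minimal sketch using plain numeric vectors; the names `sp2`, `se`, and `t0` are illustrative.

```r
# Fill volumes (ounces) for the two machines
mach1 <- c(16.03, 16.01, 16.04, 15.96, 16.05, 15.98, 16.05, 16.02, 16.02, 15.99)
mach2 <- c(16.02, 16.03, 15.97, 16.04, 15.96, 16.02, 16.01, 16.01, 15.99, 16.00)

n1 <- length(mach1)
n2 <- length(mach2)
df <- n1 + n2 - 2

# Pooled variance: degrees-of-freedom weighted average of the sample variances
sp2 <- ((n1 - 1) * var(mach1) + (n2 - 1) * var(mach2)) / df

# Standard error of the difference in sample means
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Test statistic and two-sided p-value
t0 <- (mean(mach1) - mean(mach2)) / se
p_value <- 2 * pt(-abs(t0), df)

print(c(t0 = t0, df = df, p_value = p_value))
```

The printed values should agree with the `t`, `df`, and `p-value` fields reported by `t.test(..., var.equal = TRUE)`.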

T-Statistic for Transformed Data

t.test(LMach1,LMach2,var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  LMach1 and LMach2
## t = 0.79804, df = 18, p-value = 0.4353
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.001018978  0.002267264
## sample estimates:
## mean of x mean of y 
##  2.773524  2.772900

Conclusion

The two t-tests give nearly identical results (t = 0.79894 on the original data and t = 0.79804 on the transformed data), so the conclusion does not depend on the transformation and the pooled-variance assumption appears reasonable. However, Levene’s test is a more formal way to assess equality of variances.

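In conclusion, we fail to reject the null hypothesis H_o: the test statistic is \(|t_0|\) = 0.79894 and the p-value of 0.4347 is greater than \(\alpha\) = 0.05. The p-value is the probability of a statistic at least this extreme when H_o is true; it is not the probability that H_o is true.

The 95% confidence interval for the difference in means is (-0.01629652, 0.03629652), which contains 0.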
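The interval can be rebuilt from the pooled standard error and the t quantile with 18 degrees of freedom. This is a sketch under the same equal-variance assumption; the names `t_crit` and `ci` are illustrative.

```r
# Fill volumes (ounces) for the two machines
mach1 <- c(16.03, 16.01, 16.04, 15.96, 16.05, 15.98, 16.05, 16.02, 16.02, 15.99)
mach2 <- c(16.02, 16.03, 15.97, 16.04, 15.96, 16.02, 16.01, 16.01, 15.99, 16.00)

n1 <- length(mach1)
n2 <- length(mach2)
df <- n1 + n2 - 2

# Pooled variance and standard error of the difference in means
sp2 <- ((n1 - 1) * var(mach1) + (n2 - 1) * var(mach2)) / df
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-sided 95% interval: difference in means +/- t quantile * standard error
t_crit <- qt(0.975, df)
diff_means <- mean(mach1) - mean(mach2)
ci <- diff_means + c(-1, 1) * t_crit * se

print(ci)
```

Because the interval contains 0, it leads to the same decision as the p-value: the data do not show a difference in mean fill volume.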

Question 2.26

Testing Hypothesis for equal variances

H_o: \(\sigma^2_1 = \sigma^2_2\) - Null Hypothesis variances are equal

H_a: \(\sigma^2_1 \neq \sigma^2_2\) - Alternative Hypothesis variances are not equal

\(\alpha\) = 0.05

Type1<-data.frame("t-mins"=c(65,81,57,66,82,82,67,59,75,70))
Type2<-data.frame("t-mins"=c(64,71,83,59,65,56,69,74,82,79))

#mean(Type1$t.mins)
#mean(Type2$t.mins)
#var(Type1$t.mins) 
#var(Type2$t.mins)

Where:

\(\sigma^2_1\) = 85.8222222

\(\sigma^2_2\) = 87.7333333

sample sizes of \(n_1\) = 10, \(n_2\) = 10 and

\(\mu_1\) = 70.4 is the mean of Type1

\(\mu_2\) = 70.2 is the mean of Type2

boxplot(Type1$t.mins, Type2$t.mins, names = c("Type1", "Type2"), main="Comparing Boxplot of Burning Times of Type1 and Type2")

From the boxplot, the burning times of both samples appear to be similarly spread out; however, we run Levene’s test to check more formally whether the variances are equal.

#install.packages("lawstat")
library(lawstat)

group<-factor(rep(c("1", "2"), each = 10))

Type3<-rbind(Type1,Type2)
Type3$group<-group
str(Type3)

## 'data.frame':    20 obs. of  2 variables:
##  $ t.mins: num  65 81 57 66 82 82 67 59 75 70 ...
##  $ group : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...

levene.test(Type3$t.mins, Type3$group, location = "mean")

## 
##  Classical Levene's test based on the absolute deviations from the mean
##  ( none not applied because the location is not set to median )
## 
## data:  Type3$t.mins
## Test Statistic = 0.0014598, p-value = 0.9699

After running Levene’s test, the test statistic of 0.0014598 is very close to 0 and the p-value of 0.9699 is greater than \(\alpha\) = 0.05. Hence we fail to reject the null hypothesis H_o; there is no evidence that the variances differ.
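The classical Levene statistic (location set to the mean) is the one-way ANOVA F statistic computed on the absolute deviations of each observation from its group mean. The sketch below reproduces it in base R without `lawstat`; the names `abs_dev`, `levene_F`, and `levene_p` are illustrative.

```r
# Burning times (minutes) for the two types
type1 <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70)
type2 <- c(64, 71, 83, 59, 65, 56, 69, 74, 82, 79)

times <- c(type1, type2)
group <- factor(rep(c("1", "2"), each = 10))

# Absolute deviation of each observation from its own group mean
# (ave() returns the group mean aligned with each observation)
abs_dev <- abs(times - ave(times, group))

# One-way ANOVA on the absolute deviations
fit <- anova(lm(abs_dev ~ group))
levene_F <- fit[["F value"]][1]
levene_p <- fit[["Pr(>F)"]][1]

print(c(statistic = levene_F, p_value = levene_p))
```

The statistic is an F ratio with 1 and 18 degrees of freedom, so a value near 0 means the average absolute deviations of the two groups are almost identical.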

Question 2.26 (b)

Hypothesis to be tested

H_o: \(\mu_1 = \mu_2\) - Null Hypothesis that the means are equal

H_a: \(\mu_1 \neq \mu_2\) - Alternative Hypothesis that the means are not equal

Because the sample sizes are small, we use the t statistic, and we pool the variances based on our conclusion in 2.26a that they are equal.

t.test(Type1,Type2,var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  Type1 and Type2
## t = 0.048008, df = 18, p-value = 0.9622
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.552441  8.952441
## sample estimates:
## mean of x mean of y 
##      70.4      70.2

In conclusion, we fail to reject the null hypothesis H_o: the test statistic is \(|t_0|\) = 0.048008 and the p-value of 0.9622 is greater than \(\alpha\) = 0.05. The data give no evidence that the mean burning times differ.

The 95% confidence interval for the difference in means is (-8.552441, 8.952441).
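The same decision can be reached with the critical-value approach: reject H_o only when \(|t_0|\) exceeds the upper \(\alpha/2\) quantile of the t distribution. This is a minimal sketch with plain vectors; `t_crit` and `reject` are illustrative names.

```r
# Burning times (minutes) for the two types
type1 <- c(65, 81, 57, 66, 82, 82, 67, 59, 75, 70)
type2 <- c(64, 71, 83, 59, 65, 56, 69, 74, 82, 79)

alpha <- 0.05

# Pooled two-sample t-test, as in the analysis above
res <- t.test(type1, type2, var.equal = TRUE)
t0 <- unname(res$statistic)
df <- unname(res$parameter)

# Two-sided critical value and the resulting decision
t_crit <- qt(1 - alpha / 2, df)
reject <- abs(t0) > t_crit

print(c(t0 = t0, t_crit = t_crit))
print(reject)
```

The p-value rule and the critical-value rule always agree, since both compare the same statistic against the same reference distribution.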

Question 2.29

Temp1<-data.frame("kA95C"=c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315))
Temp2<-data.frame("kA100C"=c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963))

mean(Temp1$kA95C)

## [1] 9.366625

mean(Temp2$kA100C)

## [1] 6.846625

var(Temp1$kA95C)

## [1] 4.40817

var(Temp2$kA100C)

## [1] 2.690999

Hypothesis to be tested

H_o: \(\mu_1 = \mu_2\) - Null Hypothesis: the means are equal

H_a: \(\mu_1 > \mu_2\) - Alternative Hypothesis: the low temp mean is greater than the high temp mean

where:

\(\mu_1\) = 9.366625 is the mean of low temp

\(\mu_2\) = 6.846625 is the mean of high temp

\(\sigma^2_1\) = 4.4081703 is the variance of low temp

\(\sigma^2_2\) = 2.6909991 is the variance of high temp

The sample sizes are \(n_1\) = 8 and \(n_2\) = 8.

Note: the sample sizes are small, and we cannot yet say whether the population variances are equal.

Testing Hypothesis

#boxplot to compare variance
boxplot(Temp1$kA95C, Temp2$kA100C, names = c("low", "high"), main="Comparing Boxplot for low and high temperatures")

From the boxplot, the variances do not appear equal, as the low temp sample looks more spread out than the high temp sample. To address this, we apply a log transformation to try to make the spreads of the two samples more similar.

LTemp<-log(Temp1)
HTemp<-log(Temp2)
LTemp$kA95C

## [1] 2.413769 1.958544 2.091494 2.462917 2.424006 2.375743 1.866712 2.118061

HTemp$kA100C

## [1] 1.660701 1.909246 2.009689 1.948051 2.095930 2.003909 1.327605 2.193105

boxplot(LTemp$kA95C, HTemp$kA100C, names = c("low", "high"), main="Comparing Boxplot for Transformed low and high temperatures")

After the log transformation, the sample variances appear roughly equal. We check this further with the pooled t-test on the transformed data.

T-Statistic of transformed data with pooled variance

t.test(LTemp,HTemp,var.equal = TRUE, alternative = "greater")

## 
##  Two Sample t-test
## 
## data:  LTemp and HTemp
## t = 2.5046, df = 14, p-value = 0.01262
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.09507682        Inf
## sample estimates:
## mean of x mean of y 
##  2.213906  1.893530
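The test above uses the log-transformed data only. For comparison, the sketch below runs the same one-sided pooled test on both scales so the two results can be read side by side. It uses plain numeric vectors, and the names `low`, `high`, and `comparison` are illustrative, not from the original analysis.

```r
# kA values at the low (95C) and high (100C) temperatures
low <- c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315)
high <- c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963)

# One-sided pooled t-test (H_a: low temp mean > high temp mean) on each scale
raw_test <- t.test(low, high, var.equal = TRUE, alternative = "greater")
log_test <- t.test(log(low), log(high), var.equal = TRUE, alternative = "greater")

# Collect the statistics and p-values in one small table
comparison <- data.frame(
  scale = c("original", "log"),
  t = c(unname(raw_test$statistic), unname(log_test$statistic)),
  p_value = c(raw_test$p.value, log_test$p.value)
)

print(comparison)
```

If both rows lead to the same decision at \(\alpha\) = 0.05, the conclusion does not hinge on the choice of scale.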

Conclusion

In conclusion, we reject the null hypothesis H_o: the test statistic is \(t_0\) = 2.5046 and the one-sided p-value of 0.01262 is less than \(\alpha\) = 0.05. The data support the alternative hypothesis H_a that the low temp mean is greater than the high temp mean.

The one-sided 95% confidence interval on the difference in means of the log-transformed data is 0.09507682 to infinity.
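For a one-sided "greater" alternative, only the lower confidence bound is finite: it is the difference in means minus the 0.95 t quantile times the pooled standard error. The sketch below recomputes that bound on the log scale; the names `x`, `y`, and `lower` are illustrative.

```r
# Log-transformed kA values at the low (95C) and high (100C) temperatures
x <- log(c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315))
y <- log(c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963))

n1 <- length(x)
n2 <- length(y)
df <- n1 + n2 - 2

# Pooled variance and standard error of the difference in means
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

# One-sided 95% lower bound; the upper bound is unbounded
lower <- (mean(x) - mean(y)) - qt(0.95, df) * se

print(c(lower = lower, upper = Inf))
```

A lower bound above 0 corresponds to rejecting H_o in the one-sided test at \(\alpha\) = 0.05.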

Testing normality assumption

qqnorm(LTemp$kA95C,main="Log Transformation of low temp", xlab = 'Theoretical quantiles', ylab = 'log(kA)')
qqline(LTemp$kA95C, datax = FALSE, distribution = qnorm,
       probs = c(0.25, 0.75), qtype = 7)

From the plot, the points fall close to a straight line, with the exception of an extreme value. Apart from that, the low temp sample appears approximately normal.

qqnorm(HTemp$kA100C,main="Log Transformation of high temp", xlab = 'Theoretical quantiles', ylab = 'log(kA)')
qqline(HTemp$kA100C, datax = FALSE, distribution = qnorm,
       probs = c(0.25, 0.75), qtype = 7)

From the plot, the points fall close to a straight line, with the exception of a few outlying values. Apart from that, the high temp sample appears approximately normal.

It is also important to note that with small samples (8 observations per group), normal probability plots are highly variable, so some departure from the line can occur even when the underlying distribution is normal.
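A formal test can be read alongside the normal probability plots. The sketch below applies the Shapiro-Wilk test from base R to each log-transformed sample. This test is an addition for illustration and was not part of the original analysis. With only 8 observations per group it has little power, so a large p-value is weak evidence of normality at best.

```r
# Log-transformed kA values at the low (95C) and high (100C) temperatures
log_low <- log(c(11.176, 7.089, 8.097, 11.739, 11.291, 10.759, 6.467, 8.315))
log_high <- log(c(5.263, 6.748, 7.461, 7.015, 8.133, 7.418, 3.772, 8.963))

# Shapiro-Wilk test of normality for each sample (H_o: the sample is normal)
sw_low <- shapiro.test(log_low)
sw_high <- shapiro.test(log_high)

print(c(low = sw_low$p.value, high = sw_high$p.value))
```

A p-value below \(\alpha\) = 0.05 for either sample would suggest a departure from normality that the plot alone might not make obvious.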

### Question 2.24
#**a)**

#***Hypothesis to be tested***
#**H~o~**: $\mu_1 =  \mu_2$ - Null Hypothesis
#**H~a~**: $\mu_1 \neq \mu_2$ - Alternative Hypothesis

***
#**b)**
#***Testing Hypothesis***

Mach1<-data.frame("v/ounces"=c(16.03,16.01,16.04,15.96,16.05,15.98,16.05,16.02,16.02,15.99))
Mach2<-data.frame("v/ounces"=c(16.02,16.03,15.97,16.04,15.96,16.02,16.01,16.01,15.99,16.00))

#boxplot to compare variance
boxplot(Mach1$v.ounces, Mach2$v.ounces, names = c("Mach1", "Mach2"), main="Comparing Boxplot for Machine1 and Machine2")

#log transformation
LMach1<-log(Mach1)
LMach2<-log(Mach2)
LMach1$v.ounces
LMach2$v.ounces

boxplot(LMach1$v.ounces, LMach2$v.ounces, names = c("Mach1", "Mach2"), main="Comparing Boxplot for Transformed Machine1 and Machine2")

#***T-Statistic of original Data with pooled variance***
t.test(Mach1,Mach2,var.equal = TRUE)

#***T-Statistic for Transformed Data***
t.test(LMach1,LMach2,var.equal = TRUE)

***
### Question 2.26
**a)**

#***Testing Hypothesis for equal variances***
#**H~o~**: $\sigma^2_1 = \sigma^2_2$ - Null Hypothesis variances are equal
#**H~a~**: $\sigma^2_1 \neq \sigma^2_2$ - Alternative Hypothesis are not equal
#*$\alpha$* = 0.05
        
Type1<-data.frame("t-mins"=c(65,81,57,66,82,82,67,59,75,70))
Type2<-data.frame("t-mins"=c(64,71,83,59,65,56,69,74,82,79))

mean(Type1$t.mins)
mean(Type2$t.mins)
var(Type1$t.mins) 
var(Type2$t.mins) 

boxplot(Type1$t.mins, Type2$t.mins, names = c("Type1", "Type2"), main="Comparing Boxplot of Burning Times of Type1 and Type2")

#install.packages("lawstat")
library(lawstat)

group<-factor(rep(c("1", "2"), each = 10))

Type3<-rbind(Type1,Type2)
Type3$group<-group
str(Type3)

levene.test(Type3$t.mins, Type3$group, location = "mean")

***
#b)

#Hypothesis to be tested

#**H~o~**: $\mu_1 =  \mu_2$ - Null Hypothesis that the mean are equal
#**H~a~**: $\mu_1 \neq \mu_2$ - Alternative Hypothesis that the mean are not equal
#Calculating T statistics with the assumption that sample size is not large and that the variance are equal from our conclusion in 2.26a
t.test(Type1,Type2,var.equal = TRUE)

***
#Question 2.29
#a)

Temp1<-data.frame("kA95C"=c(11.176,7.089,8.097,11.739,11.291,10.759,6.467,8.315))
Temp2<-data.frame("kA100C"=c(5.263,6.748,7.461,7.015,8.133,7.418,3.772,8.963))

mean(Temp1$kA95C)
mean(Temp2$kA100C)
var(Temp1$kA95C)
var(Temp2$kA100C)

#***Hypothesis to be tested***
#**H~o~**: $\mu_1 =  \mu_2$ - Null Hypothesis is the mean are equal
#**H~a~**: $\mu_1 > \mu_2$ - Alternative Hypothesis is the low temp mean is greater than high temp mean

***
#**b)**

#Testing Hypothesis
#boxplot to compare variance
boxplot(Temp1$kA95C, Temp2$kA100C, names = c("low", "high"), main="Comparing Boxplot for low and high temperatures")

LTemp<-log(Temp1)
HTemp<-log(Temp2)
LTemp$kA95C
HTemp$kA100C

boxplot(LTemp$kA95C, HTemp$kA100C, names = c("low", "high"), main="Comparing Boxplot for Transformed low and high temperatures")

#***T-Statistic of transformed data with pooled variance***
t.test(LTemp,HTemp,var.equal = TRUE, alternative = "greater")

***
#d)
#Testing normality assumption***
qqnorm(LTemp$kA95C,main="Log Transformation of low temp", xlab = 'Theoretical quantiles', ylab = 'log(kA)')
qqline(LTemp$kA95C, datax = FALSE, distribution = qnorm,
       probs = c(0.25, 0.75), qtype = 7)

qqnorm(HTemp$kA100C,main="Log Transformation of high temp", xlab = 'Theoretical quantiles', ylab = 'log(kA)')
qqline(HTemp$kA100C, datax = FALSE, distribution = qnorm,
       probs = c(0.25, 0.75), qtype = 7)

Homework 2

Alex Eghorieta

9/10/2021
