1 1. Question 1

1.1 1.a Sample size to detect 1hr difference between mean lives of the fluids

Use power.anova.test to find the required sample size per group:

power.anova.test(groups = 4, n = NULL, between.var = 1, within.var = 4.5, sig.level = 0.05, power = 0.8)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 17.3624
##     between.var = 1
##      within.var = 4.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

We need 18 samples per fluid type (n = 17.36, rounded up) to detect a 1 hr difference between the mean lives of the fluids, given alpha = 0.05, within-group variance = 4.5, and power = 80%.
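The rounding step can be made explicit in code. The sketch below reruns the same call, stores the result, and rounds the fractional n up to a whole number of units per group. The names `res`, `n_per_group`, and `n_total` are illustrative and not part of the original analysis.

```r
# Same power calculation as above, stored so the result can be reused
res <- power.anova.test(groups = 4, n = NULL, between.var = 1,
                        within.var = 4.5, sig.level = 0.05, power = 0.8)

# n is fractional; round up so the achieved power is at least 80%
n_per_group <- ceiling(res$n)   # 18, from n = 17.3624

# Total number of runs across the four fluid types
n_total <- 4 * n_per_group
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the 80% target.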

1.2 1.b Sample size to detect 0.5hr difference between mean lives of the fluids

As in 1.a, use power.anova.test to find the sample size, now with a between-group variance of 0.5.

power.anova.test(groups = 4, n = NULL, between.var = 0.5, within.var = 4.5, sig.level = 0.05, power = 0.8)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 33.70068
##     between.var = 0.5
##      within.var = 4.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

We need 34 samples per fluid type (n = 33.70, rounded up) to detect a 30-minute (0.5 hr) difference between the mean lives of the fluids, given alpha = 0.05, within-group variance = 4.5, and power = 80%.
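To see how the required sample size reacts to the between-group variance, both settings can be run in one pass. This is a minimal sketch using the same inputs as 1.a and 1.b; the names `between_vars` and `n_needed` are illustrative.

```r
# Between-group variances used in 1.a and 1.b
between_vars <- c(1, 0.5)

# Required n per group for each setting, rounded up to a whole unit
n_needed <- sapply(between_vars, function(bv) {
  ceiling(power.anova.test(groups = 4, between.var = bv, within.var = 4.5,
                           sig.level = 0.05, power = 0.8)$n)
})
names(n_needed) <- paste0("between.var=", between_vars)
n_needed
```

With everything else held fixed, halving the between-group variance roughly doubles the number of samples needed per group (18 versus 34).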

2 2. Question 2

Enter the data into R:

FluidType_1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
FluidType_2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
FluidType_3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
FluidType_4<-c(19.3,21.1,16.9,17.5,18.3,19.8)

dat<-data.frame(FluidType_1,FluidType_2,FluidType_3,FluidType_4)

Reshape the data into long (tidy) format with tidyr, one row per observation.

library(tidyr)
dat<-pivot_longer(dat,c(FluidType_1,FluidType_2,FluidType_3,FluidType_4))

rmarkdown::paged_table(dat)
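For readers without tidyr, a similar long layout can be sketched in base R with stack(). This is an alternative to the pivot_longer() call above, not what the analysis used. stack() names its columns `values` and `ind`, so they are renamed here to match; the rows also come out grouped by fluid type rather than interleaved, which does not affect the ANOVA. The names `wide`, `long`, and `group_means` are illustrative.

```r
# Fluid life data (hours), one vector per fluid type
FluidType_1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
FluidType_2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
FluidType_3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
FluidType_4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
wide <- data.frame(FluidType_1, FluidType_2, FluidType_3, FluidType_4)

# Stack the four columns into one value column plus a factor of column names
long <- stack(wide)
names(long) <- c("value", "name")

# Quick sanity check: mean life per fluid type
group_means <- tapply(long$value, long$name, mean)
group_means
```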

2.1 2.a

Assume the same within-group variance as in Question 1, 4.5.

2.2 2.b ANOVA test with alpha = 0.10

Hypothesis:

Ho: all means are equal

Ha: at least one mean is different

aov.model<-aov(value ~ name,data = dat)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is 0.0525 < alpha = 0.10, so we reject Ho: at least one fluid type has a different mean life.
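The decision can also be checked programmatically instead of being read off the printed table. The sketch below rebuilds the long-format data so it runs on its own, then pulls the F statistic and p-value out of the ANOVA table and compares them with alpha and with the critical F value. The names `dat_long`, `fit`, `tab`, `f_crit`, and `reject_h0` are illustrative.

```r
# Long-format data, rebuilt here so the block is self-contained
dat_long <- data.frame(
  name  = factor(rep(paste0("FluidType_", 1:4), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

alpha <- 0.10
fit <- aov(value ~ name, data = dat_long)

# summary() returns a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]
f_stat <- tab[["F value"]][1]
p_val  <- tab[["Pr(>F)"]][1]

# Critical F value for the treatment and residual degrees of freedom
f_crit <- qf(1 - alpha, df1 = tab[["Df"]][1], df2 = tab[["Df"]][2])

# Rejecting when p < alpha is equivalent to rejecting when F > F critical
reject_h0 <- p_val < alpha
```

Both comparisons lead to the same conclusion, since the p-value and the critical value come from the same F distribution with 3 and 20 degrees of freedom.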

2.3 2.c Check the adequacy of the model

Create diagnostic plots to check the adequacy of the model

plot(aov.model)

Check the normality assumption: in the Normal Q-Q plot the residuals fall close to the straight reference line, so the normality assumption appears reasonable.

Check the equal variance assumption:

Residuals vs Fitted: the residuals show a similar spread at each fitted value, so the variances appear approximately equal.
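The visual checks can be supplemented with formal tests. The sketch below is an addition, not part of the original analysis: it applies shapiro.test() to the model residuals as a normality check and bartlett.test() to the four groups as an equal-variance check. The data are rebuilt so the block runs on its own, and the names `dat_long`, `fit`, `sw`, and `bt` are illustrative.

```r
# Long-format data, rebuilt here so the block is self-contained
dat_long <- data.frame(
  name  = factor(rep(paste0("FluidType_", 1:4), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
fit <- aov(value ~ name, data = dat_long)

# Shapiro-Wilk test on the residuals (null hypothesis: normal residuals)
sw <- shapiro.test(residuals(fit))

# Bartlett test across fluid types (null hypothesis: equal variances)
bt <- bartlett.test(value ~ name, data = dat_long)

sw$p.value
bt$p.value
```

A large p-value in either test means the data give no evidence against the corresponding assumption. With only six observations per group these tests have limited power, so they are best read alongside the plots rather than in place of them.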

3 Complete R Code

It is a good idea to include this at the end of every RMarkdown document

## Read the data into a data frame
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")

## Normal probability plot of mpg of US cars, before transformation
qqnorm(df$USCars, main = "Normal Probability Plot of mpg of US cars", ylab = "Mpg of US cars", col = "blue")
qqline(df$USCars)

## Normal probability plot of mpg of Japanese cars, before transformation
qqnorm(df$JapaneseCars, main = "Normal Probability Plot of mpg of Japanese cars", ylab = "Mpg of Japanese cars", col = "green")
qqline(df$JapaneseCars)

## Side-by-side boxplots, before transformation
boxplot(df$USCars, df$JapaneseCars, names = c("US cars", "Japanese cars"), main = "Box plot mpg of US cars and Japanese cars")

## Log-transform the data
## (data.frame() names the columns log.df.USCars. and log.df.JapaneseCars.)
df_trans <- data.frame(log(df$USCars), log(df$JapaneseCars))

## Normal probability plot of log mpg of US cars, after transformation
qqnorm(df_trans$log.df.USCars., main = "Normal Probability Plot of log mpg of US cars", ylab = "Log mpg of US cars", col = "blue")
qqline(df_trans$log.df.USCars.)

## Normal probability plot of log mpg of Japanese cars, after transformation
qqnorm(df_trans$log.df.JapaneseCars., main = "Normal Probability Plot of log mpg of Japanese cars", ylab = "Log mpg of Japanese cars", col = "green")
qqline(df_trans$log.df.JapaneseCars.)

## Side-by-side boxplots, after transformation
boxplot(df_trans$log.df.USCars., df_trans$log.df.JapaneseCars., names = c("US cars", "Japanese cars"), main = "Box plot log mpg of US cars and Japanese cars")

## Two-sample t-test on the log-transformed mpg
## (t.test() does not assume equal variances by default)
t.test(df_trans$log.df.USCars., df_trans$log.df.JapaneseCars.)