Use power.anova.test to find the sample size per group:
power.anova.test(groups = 4, n = NULL, between.var = 1, within.var = 4.5, sig.level = 0.05, power = 0.8)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 17.3624
## between.var = 1
## within.var = 4.5
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
We need 18 samples per fluid type (n = 17.36, rounded up) to detect a 1 hr difference between the mean lives of the fluids, given alpha = 0.05, within-group variance = 4.5, and power = 80%.
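The rounding step can be done in code rather than by eye. This is a minimal sketch that reuses the same inputs as the call above; the variable names `n_per_group` and `n_total` are introduced here for illustration only.

```r
# Solve for the per-group sample size with the same inputs as above.
res <- power.anova.test(groups = 4, between.var = 1, within.var = 4.5,
                        sig.level = 0.05, power = 0.8)

# n must be a whole number of samples, so always round up.
n_per_group <- ceiling(res$n)

# Total number of samples across the four fluid types.
n_total <- res$groups * n_per_group

n_per_group
n_total
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the 80% target.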
As in 1.a, use power.anova.test to find the sample size, now with a between-group variance of 0.5.
power.anova.test(groups = 4, n = NULL, between.var = 0.5, within.var = 4.5, sig.level = 0.05, power = 0.8)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 33.70068
## between.var = 0.5
## within.var = 4.5
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
We need 34 samples per fluid type (n = 33.70, rounded up) to detect a 30-minute (0.5 hr) difference between the mean lives of the fluids, given alpha = 0.05, within-group variance = 4.5, and power = 80%.
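As a sanity check, the rounded sample sizes can be fed back into power.anova.test with `power` left unset, so that the function solves for power instead of n. The helper name `achieved_power` is hypothetical and exists only for this sketch.

```r
# Hypothetical helper: achieved power for a given per-group n and
# between-group variance, with the other inputs fixed as in the text.
achieved_power <- function(n, between_var) {
  power.anova.test(groups = 4, n = n, between.var = between_var,
                   within.var = 4.5, sig.level = 0.05)$power
}

power_n18 <- achieved_power(18, 1)    # design from the first calculation
power_n34 <- achieved_power(34, 0.5)  # design from the second calculation

power_n18
power_n34
```

Both values should come out slightly above 0.8, because each n was rounded up from the exact solution.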
Input data into R
FluidType_1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
FluidType_2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
FluidType_3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
FluidType_4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(FluidType_1,FluidType_2,FluidType_3,FluidType_4)
Reshape the data to long (tidy) format with tidyr.
library(tidyr)
dat<-pivot_longer(dat,c(FluidType_1,FluidType_2,FluidType_3,FluidType_4))
rmarkdown::paged_table(dat)
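For readers without tidyr, a roughly equivalent reshape can be sketched in base R with `stack()`. This is an alternative, not the method used above: the row order differs from the `pivot_longer()` result, and the column names `value` and `name` are set by hand to match. The names `wide`, `long`, and `group_means` are assumptions made for this sketch.

```r
# Same measurements as entered above.
FluidType_1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
FluidType_2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
FluidType_3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
FluidType_4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
wide <- data.frame(FluidType_1, FluidType_2, FluidType_3, FluidType_4)

# stack() returns two columns, `values` and `ind`; rename them to match
# the `value` and `name` columns that pivot_longer() produces.
long <- stack(wide)
names(long) <- c("value", "name")

# 4 fluid types x 6 observations = 24 rows in long format.
dim(long)

# Mean life per fluid type, a quick check before fitting the model.
group_means <- tapply(long$value, long$name, mean)
group_means
```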
Assume the within-group variance from question 1 (4.5).
Hypothesis:
Ho: all means are equal
Ha: at least one mean is different
aov.model<-aov(value ~ name,data = dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 30.17 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is 0.0525 < alpha = 0.10, so we reject Ho: at least one fluid type has a different mean life.
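The F value and p-value in the ANOVA table can be reproduced by hand from the between-group and within-group sums of squares. The sketch below rebuilds the data in base R so that it runs on its own; all variable names are introduced here for illustration.

```r
# Same 24 measurements, in fluid-type order (6 per type).
value <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
name <- factor(rep(paste0("FluidType_", 1:4), each = 6))

k <- nlevels(name)   # number of groups
N <- length(value)   # total number of observations

# Between-group SS: squared deviations of each observation's group mean
# from the grand mean (ave() repeats the group mean for every observation).
ss_between <- sum((ave(value, name) - mean(value))^2)

# Within-group SS: squared deviations of each observation from its group mean.
ss_within <- sum((value - ave(value, name))^2)

# F statistic = mean square between / mean square within.
f_stat <- (ss_between / (k - 1)) / (ss_within / (N - k))

# Upper-tail probability of the F distribution with (k - 1, N - k) df.
p_value <- pf(f_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```

The result agrees with the table from `summary(aov.model)`: the p-value falls between 0.05 and 0.10, so the decision depends on the chosen alpha.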
Create plots to check the adequacy of the model
plot(aov.model)
Check the normality assumption: Normal Q-Q plot. The residuals fall close to the straight line, so the normality assumption appears reasonable.
Check the equal variance assumption:
Residuals vs Fitted plot. The residuals show no obvious pattern and their spread is almost the same across the fitted values, so the equal variance assumption appears reasonable.
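The visual checks can be supplemented with numbers. The sketch below is an addition, not part of the original analysis: it applies the Shapiro-Wilk test to the residuals, lists the residual standard deviation per fluid type, and runs Bartlett's test for equal variances. With only six observations per group these tests have little power, so they are best read alongside the plots. The object names `dat_long` and `fit` are assumptions made for this sketch.

```r
# Rebuild the long-format data in base R so the sketch is self-contained.
dat_long <- data.frame(
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8),
  name = factor(rep(paste0("FluidType_", 1:4), each = 6))
)
fit <- aov(value ~ name, data = dat_long)
res <- residuals(fit)

# Normality of residuals: a large p-value gives no evidence against normality.
sw <- shapiro.test(res)

# Spread of residuals within each fluid type; similar values support
# the equal variance assumption.
group_sd <- tapply(res, dat_long$name, sd)

# Bartlett's test of equal variances across the four groups.
bt <- bartlett.test(value ~ name, data = dat_long)

sw$p.value
group_sd
bt$p.value
```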
It is a good idea to include the complete code listing at the end of every RMarkdown document:
##Create the data frame
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
##Normal Probability Plot of mpg of US cars before transformation
qqnorm(df$USCars, main = "Normal Probability Plot of mpg of US cars", ylab = "Mpg of US cars ", col = "blue")
qqline(df$USCars)
##Normal Probability Plot before transformation of mpg of Japanese cars
qqnorm(df$JapaneseCars, main = "Normal Probability Plot of mpg of Japanese cars", ylab = "Mpg of Japanese cars ", col = "green")
qqline(df$JapaneseCars)
##Side-by-side boxplots before transformation
boxplot(df$USCars,df$JapaneseCars,names = c("US cars", "Japanese cars"), main = "Box plot mpg of US cars and Japanese cars")
##Transform data to log
df_trans <- data.frame(log(df$USCars),log(df$JapaneseCars))
##Normal Probability Plot of mpg of US cars after transformation
qqnorm(df_trans$log.df.USCars., main = "Normal Probability Plot of log mpg of US cars", ylab = "Log mpg of US cars ", col = "blue")
qqline(df_trans$log.df.USCars.)
##Normal Probability Plot of mpg of Japanese cars after transformation
qqnorm(df_trans$log.df.JapaneseCars., main = "Normal Probability Plot of log mpg of Japanese cars", ylab = "Log mpg of Japanese cars ", col = "green")
qqline(df_trans$log.df.JapaneseCars.)
##Side-by-side boxplots after transformation
boxplot(df_trans$log.df.USCars., df_trans$log.df.JapaneseCars., names = c("US cars", "Japanese cars"), main = "Box plot log mpg of US cars and Japanese cars")
##Welch two-sample t-test on the log-transformed mpg (see ?t.test for options)
t.test(df_trans$log.df.USCars.,df_trans$log.df.JapaneseCars.)
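The column names `log.df.USCars.` and `log.df.JapaneseCars.` used in the listing above are generated automatically: `data.frame()` turns each unnamed expression into a syntactically valid name, replacing `(`, `$`, and `)` with dots. The sketch below shows this and an alternative with explicit names. The numbers are made-up placeholders, NOT the values from the CSV, and the names `log_us` and `log_jp` are assumptions for illustration.

```r
# Placeholder values for illustration only; NOT the real mpg data.
df <- data.frame(USCars = c(18, 15, 16, 21),
                 JapaneseCars = c(24, 27, 31, 35))

# Unnamed arguments: data.frame() derives the names from the expressions.
auto_named <- data.frame(log(df$USCars), log(df$JapaneseCars))
names(auto_named)   # "log.df.USCars." "log.df.JapaneseCars."

# Explicit names are easier to type and harder to misspell.
explicit <- data.frame(log_us = log(df$USCars),
                       log_jp = log(df$JapaneseCars))

# t.test() defaults to var.equal = FALSE, i.e. the Welch two-sample test.
tt <- t.test(explicit$log_us, explicit$log_jp)
tt$method
```

Naming the columns explicitly avoids relying on the trailing dot in the generated names, which is easy to drop by accident.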