Question 1
dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
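qqnorm(dat$USCars, main = "Normal Q-Q Plot: US Cars")
qqline(dat$USCars)
qqnorm(dat$JapaneseCars, main = "Normal Q-Q Plot: Japanese Cars")
qqline(dat$JapaneseCars)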
Question 2
boxplot(dat$USCars, dat$JapaneseCars, names = c("US", "Japanese"), main="Boxplot of Untransformed Data")
The variance does not appear to be constant across the two groups.
Question 3
logUS <- log(dat$USCars)
logJP <- log(dat$JapaneseCars)
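qqnorm(logUS, main = "Normal Q-Q Plot: log(US Cars)")
qqline(logUS)
qqnorm(logJP, main = "Normal Q-Q Plot: log(Japanese Cars)")
qqline(logJP)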
boxplot(logUS, logJP, names = c("US", "Japanese"), main="Boxplot of Transformed Data")
After the log transformation, the variances of the two groups appear to be roughly equal, since their interquartile ranges are similar.
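As a rough numerical companion to the boxplots, the ratio of sample variances can be compared on both scales. The sketch below uses simulated log-normal data, not the course dataset: the vectors `us_sim` and `jp_sim`, their sizes, and their parameters are hypothetical stand-ins chosen only for illustration.

```r
# Hypothetical stand-in data (NOT the real dataset): two log-normal samples
# with different means but the same spread on the log scale.
set.seed(1)
us_sim <- rlnorm(200, meanlog = 2.7, sdlog = 0.2)
jp_sim <- rlnorm(200, meanlog = 3.3, sdlog = 0.2)

# Ratio of sample variances before and after the log transformation.
ratio_raw <- var(us_sim) / var(jp_sim)
ratio_log <- var(log(us_sim)) / var(log(jp_sim))
print(c(raw = ratio_raw, log = ratio_log))

# F test for equality of variances on the log scale.
f_res <- var.test(log(us_sim), log(jp_sim))
print(f_res$p.value)
```

A ratio near 1 on the log scale would support using `var.equal = TRUE`. If a column contains missing values, `var()` needs `na.rm = TRUE`.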
Question 4
H0: μ1 = μ2 (the mean log mpg of US cars and Japanese cars are equal)
Ha: μ1 ≠ μ2
t.test(logUS, logJP, var.equal = TRUE)
##
## Two Sample t-test
##
## data: logUS and logJP
## t = -9.4828, df = 61, p-value = 1.306e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.6417062 -0.4182053
## sample estimates:
## mean of x mean of y
## 2.741001 3.270957
The p-value (1.306e-13) is much less than 0.05, so H0 is rejected. We conclude that the mean log mpg of US cars and Japanese cars differ, i.e. the two groups have different fuel consumption.
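Because the test was run on log-transformed values, the difference in means and its confidence interval can be back-transformed with `exp()` to give a ratio of geometric means on the original scale. A minimal sketch using only the numbers printed in the output above (the variable names are illustrative):

```r
# Values copied from the t.test output above (log scale).
mean_log_us <- 2.741001
mean_log_jp <- 3.270957
ci_log <- c(-0.6417062, -0.4182053)

# exp() of a difference of log-means is a ratio of geometric means.
ratio_est <- exp(mean_log_us - mean_log_jp)
ratio_ci  <- exp(ci_log)

print(round(c(estimate = ratio_est, lower = ratio_ci[1], upper = ratio_ci[2]), 3))
```

Since both interval endpoints are negative on the log scale, the back-transformed interval lies entirely below 1, i.e. the geometric mean for US cars is lower than that for Japanese cars.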
mean(logUS)
## [1] 2.741001
mean(logJP, na.rm=TRUE)
## [1] 3.270957
These sample means match the estimates reported by the t-test. The mean log mpg of Japanese cars and US cars are not the same, which is consistent with the test result.
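The printed t statistic and p-value can also be cross-checked against the other reported numbers. The sketch below is one way to do that, assuming only the values shown in the output; the variable names are illustrative.

```r
# Values copied from the t.test output above.
diff_means <- 2.741001 - 3.270957
ci  <- c(-0.6417062, -0.4182053)
dof <- 61

# The half-width of the 95% CI equals qt(0.975, dof) * standard error.
se <- (ci[2] - ci[1]) / 2 / qt(0.975, dof)

# Recover the t statistic and the two-sided p-value.
t_stat <- diff_means / se
p_val  <- 2 * pt(-abs(t_stat), dof)
print(c(t = t_stat, p = p_val))
```

If the recovered values agree with the printed `t` and `p-value` up to rounding, the pieces of the output are internally consistent.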