d<-read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
qqnorm(d$USCars,main="US Cars MPG",xlab="Expected Normal Value",ylab="MPG")
qqline(d$USCars)
qqnorm(d$JapaneseCars,main="Japanese Cars MPG",xlab="Expected Normal Value",ylab="MPG")
qqline(d$JapaneseCars)
Both samples drift away from the reference line on the normal probability plot as the expected normal value increases, but the US Cars points deviate more than the Japanese Cars points. The Japanese Cars MPG sample therefore appears closer to normally distributed than the US Cars sample.
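As a numeric companion to the visual reading of the Q-Q plots, a Shapiro-Wilk test can be run on each column. The sketch below is a supplement, not part of the original analysis; `normality_p` is a hypothetical helper name, and the call at the end uses simulated values only to show the function running.

```r
# Shapiro-Wilk p-value for one numeric vector.
# normality_p() is a hypothetical helper, not part of the original analysis.
normality_p <- function(x) {
  x <- x[!is.na(x)]          # drop missing values (a shorter column may be padded with NA)
  shapiro.test(x)$p.value    # a small p-value suggests a departure from normality
}

# Illustrative call on simulated data, not the car data
set.seed(1)
normality_p(rnorm(30))
```

With the data frame loaded above, `sapply(d, normality_p)` would return one p-value per column. A formal test with small samples has limited power, so it is best read alongside the plots rather than in place of them.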
boxplot(d)
The variance does not appear to be constant across the two samples: the Japanese Cars box is both much wider (a larger spread between quartiles) and positioned higher than the US Cars box.
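The boxplot comparison can be backed up with the sample variances themselves. The following is a minimal sketch; `variance_ratio` is a hypothetical helper name, and it assumes missing values should simply be dropped.

```r
# Ratio of the larger sample variance to the smaller one.
# variance_ratio() is a hypothetical helper; a value near 1 suggests similar spread.
variance_ratio <- function(x, y) {
  vx <- var(x, na.rm = TRUE)   # sample variance, ignoring NA
  vy <- var(y, na.rm = TRUE)
  max(vx, vy) / min(vx, vy)
}

# Toy example: variances are 2.5 and 10, so the ratio is 4
variance_ratio(c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10))
```

For the car data the call would be `variance_ratio(d$USCars, d$JapaneseCars)`, and the same function can be applied to the log-transformed columns to see how much the transformation narrows the gap.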
d_log<-data.frame(log(d$USCars),log(d$JapaneseCars))
colnames(d_log)<-c("USCars_log","JapaneseCars_log")
qqnorm(d_log$USCars_log,main="Log of US Cars MPG",xlab="Expected Normal Value",ylab="log(MPG)")
qqline(d_log$USCars_log)
qqnorm(d_log$JapaneseCars_log,main="Log of Japanese Cars MPG",xlab="Expected Normal Value",ylab="log(MPG)")
qqline(d_log$JapaneseCars_log)
After the logarithmic transformation, the US Cars MPG still does not align well with the reference line, while the Japanese Cars MPG follows the line more closely than before the transformation. As such, the Japanese Cars MPG data still appears to be more normally distributed than the US Cars data.
boxplot(d_log)
#summary(d_log$USCars_log)
#3rd Qu (2.89) - 1st Qu (2.639) = .251
#summary(d_log$JapaneseCars_log)
#3rd Qu (3.434) - 1st Qu (3.178) = .256
After the logarithmic transformation, the variances of the two samples appear to be much closer together than before. The interquartile range (3rd quartile minus 1st quartile) is 0.251 for the US Cars sample and 0.256 for the Japanese Cars sample.
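The quartile differences quoted above were read off `summary()` by hand. They can also be computed directly; the sketch below uses a hypothetical helper, `log_iqr`, and relies on R's default quantile method, which is the same one `summary()` uses.

```r
# Interquartile range of the log-transformed values, computed directly.
# log_iqr() is a hypothetical helper name, not part of the original analysis.
log_iqr <- function(x) {
  q <- quantile(log(x), probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[2] - q[1]   # 3rd quartile minus 1st quartile
}

# Toy check: the logs are 1, 2, 3, 4, 5, so the quartiles are 2 and 4
log_iqr(exp(1:5))
```

Applied to the data, `log_iqr(d$USCars)` and `log_iqr(d$JapaneseCars)` should come out close to the 0.251 and 0.256 reported above, up to the rounding in the printed summaries.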
\(H_0: \mu_1=\mu_2\) versus \(H_a: \mu_1 \neq \mu_2\), where \(\mu_1\) and \(\mu_2\) are the mean log MPG of US and Japanese cars, respectively.
t.test(d_log$USCars_log,d_log$JapaneseCars_log,var.equal=TRUE)
##
## Two Sample t-test
##
## data: d_log$USCars_log and d_log$JapaneseCars_log
## t = -9.4828, df = 61, p-value = 1.306e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.6417062 -0.4182053
## sample estimates:
## mean of x mean of y
## 2.741001 3.270957
Because the p-value of 1.306e-13 is less than the 0.05 level of significance, there is enough evidence to reject the null hypothesis that the mean log MPG of US and Japanese cars is equal.
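Because the test was run on log MPG, the estimated difference and its confidence interval are on the log scale. Exponentiating them gives a ratio on the original MPG scale, which under the usual interpretation is a ratio of geometric means rather than arithmetic means. The sketch below only reuses the numbers printed in the `t.test()` output above; the variable names are illustrative.

```r
# Back-transform a difference in mean log values into a ratio of geometric means.
# The numbers are copied from the t.test() output; variable names are illustrative.
ci_log   <- c(-0.6417062, -0.4182053)   # 95% CI for mean(log US) - mean(log Japanese)
diff_log <- 2.741001 - 3.270957         # difference of the two sample means of log MPG

ratio    <- exp(diff_log)   # estimated US / Japanese geometric-mean MPG ratio
ci_ratio <- exp(ci_log)     # confidence interval for that ratio

round(c(estimate = ratio, lower = ci_ratio[1], upper = ci_ratio[2]), 3)
```

Read this way, the US cars' typical MPG is estimated at roughly 0.59 times that of the Japanese cars, with an interval of roughly 0.53 to 0.66.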