Importing the data

df<- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")

USCars<-df[,1]
JapaneseCars<-df[1:28,2]
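
The hard-coded range `1:28` presumably drops padding rows from the shorter Japanese column. The following is a minimal sketch of a more general approach, assuming the padding is read in as `NA`. The data frame `toy` and its values are made up for illustration and are not the assignment data.

```r
# Hypothetical toy data frame with the same shape as the CSV:
# the second column is shorter and padded with NA.
toy <- data.frame(
  USCars       = c(18, 15, 18, 16, 17),
  JapaneseCars = c(24, 27, 27, NA, NA)
)

# Drop missing values instead of hard-coding the last valid row.
us_mpg <- toy$USCars[!is.na(toy$USCars)]
jp_mpg <- toy$JapaneseCars[!is.na(toy$JapaneseCars)]

length(us_mpg)  # 5
length(jp_mpg)  # 3
```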

1. Normal Probability Plots

qqnorm(df$USCars, main = "Normal probability plot for US Cars", ylab = "US Cars", col = "blue")

qqnorm(df$JapaneseCars, main = "Normal probability plot for Japanese Cars", ylab = "Japanese Cars", col = "red")

Comment - The MPG of the Japanese cars appears closer to normally distributed than the MPG of the US cars.
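
The visual judgement can be supported with a reference line and a formal test. The sketch below is not part of the original analysis: it adds `qqline()` and a Shapiro-Wilk test (`shapiro.test()`), and it runs on simulated placeholder data rather than the car data.

```r
# Simulated placeholder data (NOT the assignment data).
set.seed(1)
x <- rnorm(30, mean = 20, sd = 3)

# Q-Q plot with a reference line through the first and third quartiles.
qqnorm(x, main = "Normal probability plot (simulated data)")
qqline(x, col = "gray40")

# Shapiro-Wilk test: a small p-value suggests a departure from normality.
sw <- shapiro.test(x)
sw$p.value
```

Points that stay close to the reference line, together with a p-value that is not small, are consistent with normality; neither proves it.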

2. Comparison using Box Plots

boxplot(USCars,JapaneseCars, names = c("US Cars", "Japanese Cars"), main = "Box plot of US Cars vs Japanese Cars")

Comment - From the box plots, the variance of the US cars appears to differ substantially from that of the Japanese cars. We therefore take the log of both samples so that the variances are approximately equal before performing the two-sample t-test.

3. Transformation using Log

UCars <- log(USCars)
JCars<- log(JapaneseCars)

qqnorm(UCars, main = "Normal probability plot for US Cars", ylab = "US Cars", col = "blue")

qqnorm(JCars, main = "Normal probability plot for Japanese Cars", ylab = "Japanese Cars", col = "red")

boxplot(UCars,JCars, names = c("US Cars", "Japanese Cars"), main = "Box plot of US Cars vs Japanese Cars")

Comment - From the box plot illustration, we can conclude that the variances are approximately equal after log transformation.
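
The effect of the transformation on the variances can also be checked numerically. The sketch below compares the ratio of sample variances before and after taking logs and runs an F test (`var.test()`) on the log scale. It is an addition to the original analysis and uses simulated right-skewed placeholder samples, not the car data; the names `a` and `b` are hypothetical.

```r
# Simulated right-skewed placeholder samples (NOT the assignment data).
set.seed(2)
a <- rlnorm(35, meanlog = 2.7, sdlog = 0.3)
b <- rlnorm(28, meanlog = 3.3, sdlog = 0.3)

# Ratio of sample variances before and after the log transformation.
ratio_raw <- var(a) / var(b)
ratio_log <- var(log(a)) / var(log(b))

# F test of equal variances on the log scale (assumes normal samples).
ft <- var.test(log(a), log(b))
c(raw = ratio_raw, log = ratio_log, p_value = ft$p.value)
```

A ratio close to 1 on the log scale, with a p-value that is not small, would support using `var.equal = TRUE` in the t-test.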

4. Two Sample T-Test

mean(UCars)

## [1] 2.741001

mean(JCars)

## [1] 3.270957

Comment - Sample average of the log of the MPG of US cars = 2.741001

Sample average of the log of the MPG of Japanese cars = 3.270957

t.test(UCars, JCars, var.equal = TRUE, alternative = c("less"))

## 
##  Two Sample t-test
## 
## data:  UCars and JCars
## t = -9.4828, df = 61, p-value = 6.528e-14
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##        -Inf -0.4366143
## sample estimates:
## mean of x mean of y 
##  2.741001  3.270957

Comment - The p-value (6.528e-14) is far below the 0.05 significance level, so the difference between the two sample means is statistically significant.

Therefore, we reject \(H_0: \mu_1 - \mu_2 = 0\) and conclude that the mean log MPG of cars manufactured in the US is less than that of cars manufactured in Japan, where \(\mu_1\) and \(\mu_2\) denote the mean log MPG of US and Japanese cars respectively.

\(H_a: \mu_1 - \mu_2 < 0\)
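
For reference, the pooled two-sample t statistic behind the one-sided test can be computed by hand. The sketch below does this and cross-checks the result against `t.test()`. It uses simulated placeholder samples on the log scale, not the car data, with sample sizes chosen only so that the degrees of freedom match the 61 reported above.

```r
# Simulated placeholder samples on the log scale (NOT the assignment data).
set.seed(3)
x <- rnorm(35, mean = 2.7, sd = 0.2)
y <- rnorm(28, mean = 3.3, sd = 0.2)

n1 <- length(x)
n2 <- length(y)

# Pooled variance estimate under the equal-variance assumption.
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)

# t statistic and lower-tailed p-value for H_a: mu_1 - mu_2 < 0.
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
p_val  <- pt(t_stat, df = n1 + n2 - 2)

# Cross-check against the built-in test.
ref <- t.test(x, y, var.equal = TRUE, alternative = "less")
all.equal(unname(ref$statistic), t_stat)
all.equal(ref$p.value, p_val)
```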

5. Complete R-Code

df<- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
USCars<-df[,1]
JapaneseCars<-df[1:28,2]
qqnorm(df$USCars, main = "Normal probability plot for US Cars", ylab = "US Cars", col = "blue")
qqnorm(df$JapaneseCars, main = "Normal probability plot for Japanese Cars", ylab = "Japanese Cars", col = "red")
boxplot(USCars,JapaneseCars, names = c("US Cars", "Japanese Cars"), main = "Box plot of US Cars vs Japanese Cars")
UCars <- log(USCars)
JCars<- log(JapaneseCars)
qqnorm(UCars, main = "Normal probability plot for US Cars", ylab = "US Cars", col = "blue")
qqnorm(JCars, main = "Normal probability plot for Japanese Cars", ylab = "Japanese Cars", col = "red")
boxplot(UCars,JCars, names = c("US Cars", "Japanese Cars"), main = "Box plot of US Cars vs Japanese Cars")
mean(UCars)
mean(JCars)
t.test(UCars, JCars, var.equal = TRUE, alternative = c("less"))

Assignment - 4

Rohit, Abasi-emek Okpokpo, Prateeksha

2025-09-09