Mpg data for US cars and Japanese cars
Create the Normal probability plot of US cars
Only the last two points fall far from the normal line, so the mpg of US cars before transformation appears approximately normally distributed.
Create the Normal probability plot of Japanese cars
As with US cars, only the last two points fall far from the normal line, so the mpg of Japanese cars before transformation appears approximately normally distributed.
The mpg of Japanese cars is more spread out than that of US cars, as shown by the larger IQR on the box plot, which suggests a higher variance.
Data after transformation
rmarkdown::paged_table(df_trans <- data.frame(log(df$USCars),log(df$JapaneseCars)))
Normality of the mpg of US cars after transforming
Only the first three and the last two points fall far from the normal line, so the log-transformed mpg of US cars still appears approximately normally distributed. The transformation changes the plot very little.
Normality of the mpg of Japanese cars after transforming
Only the last two points fall far from the normal line, so the log-transformed mpg of Japanese cars still appears approximately normally distributed. The transformation changes the plot very little.
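The visual reading of the normal probability plots can be complemented with a numerical check. The sketch below is one possible way to do that, using the Shapiro-Wilk test from base R on both the raw and the log scale. The helper name `shapiro_p` is hypothetical and not part of the original report; the column names `USCars` and `JapaneseCars` are the ones used in the code listing.

```r
# Same CSV that the report's code listing reads
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")

# Hypothetical helper (not in the original report):
# Shapiro-Wilk p-values for one sample on the raw and on the log scale.
shapiro_p <- function(x) {
  x <- x[!is.na(x)]  # drop missing values, if any
  c(raw = shapiro.test(x)$p.value,
    log = shapiro.test(log(x))$p.value)
}

# One column of p-values per group; rows are "raw" and "log"
sapply(df[c("USCars", "JapaneseCars")], shapiro_p)
```

A small p-value would be evidence against normality, while a large one is consistent with the "approximately normal" reading of the plots. With samples of this size the test has limited power, so it should be read alongside the plots rather than instead of them.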
After transformation, the spread of mpg for Japanese cars is roughly equal to that of US cars, as shown by the similar IQRs on the box plot.
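The box plot comparison can also be backed by numbers. As a minimal sketch, assuming the same CSV and column names as the code listing, the IQR and sample variance of each group can be computed before and after the log transformation. The helper `spread_summary` is a hypothetical name introduced here for illustration.

```r
# Same CSV that the report's code listing reads
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")

# Hypothetical helper (not in the original report):
# IQR and sample variance of one sample, ignoring missing values.
spread_summary <- function(x) {
  x <- x[!is.na(x)]
  c(IQR = IQR(x), variance = var(x))
}

cols <- c("USCars", "JapaneseCars")

# Spread on the raw mpg scale
raw_spread <- sapply(df[cols], spread_summary)

# Spread on the log mpg scale
log_spread <- sapply(df[cols], function(x) spread_summary(log(x)))

raw_spread
log_spread
```

If the two columns of `log_spread` are closer to each other than the two columns of `raw_spread`, that supports the reading of the box plots above.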
Hypothesis
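$$
H_0: \mu_{US} = \mu_{Japanese} \qquad \text{vs.} \qquad H_a: \mu_{US} \neq \mu_{Japanese}
$$
where $\mu_{US}$ and $\mu_{Japanese}$ are the mean log mpg of US cars and Japanese cars. The alternative is two-sided, matching the default `t.test` call in the code listing.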
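For reference, assuming the test is run with R's default `t.test` settings (Welch's unequal-variance form, which is what the call in the code listing uses), the statistic on the log scale can be written as below. Here $\bar{y}$, $s^2$ and $n$ denote the sample mean, sample variance and sample size of log mpg in each group; this notation is introduced here for illustration and does not come from the original report.

$$
t = \frac{\bar{y}_{US} - \bar{y}_{Japanese}}{\sqrt{\dfrac{s_{US}^2}{n_{US}} + \dfrac{s_{Japanese}^2}{n_{Japanese}}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_{US}^2}{n_{US}} + \dfrac{s_{Japanese}^2}{n_{Japanese}}\right)^2}{\dfrac{\left(s_{US}^2/n_{US}\right)^2}{n_{US}-1} + \dfrac{\left(s_{Japanese}^2/n_{Japanese}\right)^2}{n_{Japanese}-1}}
$$

The statistic is compared against a $t$ distribution with $\nu$ degrees of freedom (the Welch-Satterthwaite approximation), so the test does not require the two groups to have equal variances.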
It is a good idea to include the complete code listing at the end of every RMarkdown document:
##Create the data frame
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
##Normal probability plot of mpg of US cars before transformation
qqnorm(df$USCars, main = "Normal Probability Plot of mpg of US cars", ylab = "Mpg of US cars", col = "blue")
qqline(df$USCars)
##Normal probability plot of mpg of Japanese cars before transformation
qqnorm(df$JapaneseCars, main = "Normal Probability Plot of mpg of Japanese cars", ylab = "Mpg of Japanese cars", col = "green")
qqline(df$JapaneseCars)
##Side-by-side box plots before transformation
boxplot(df$USCars, df$JapaneseCars, names = c("US cars", "Japanese cars"), main = "Box plot mpg of US cars and Japanese cars")
##Log-transform the data
##(data.frame() names the columns log.df.USCars. and log.df.JapaneseCars.)
df_trans <- data.frame(log(df$USCars), log(df$JapaneseCars))
##Normal probability plot of log mpg of US cars (after transformation)
qqnorm(df_trans$log.df.USCars., main = "Normal Probability Plot of log mpg of US cars", ylab = "Log mpg of US cars", col = "blue")
qqline(df_trans$log.df.USCars.)
##Normal probability plot of log mpg of Japanese cars (after transformation)
qqnorm(df_trans$log.df.JapaneseCars., main = "Normal Probability Plot of log mpg of Japanese cars", ylab = "Log mpg of Japanese cars", col = "green")
qqline(df_trans$log.df.JapaneseCars.)
##Side-by-side box plots after transformation
boxplot(df_trans$log.df.USCars., df_trans$log.df.JapaneseCars., names = c("US cars", "Japanese cars"), main = "Box plot log mpg of US cars and Japanese cars")
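##Two-sample t-test on the log-transformed mpg (Welch test, two-sided, by default)
##Run ?t.test interactively to open the help page
t.test(df_trans$log.df.USCars., df_trans$log.df.JapaneseCars.)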
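Because the t-test is run on log mpg, its estimate and confidence interval describe a difference of log means. One possible way to express the result on the original scale, sketched below under the assumption that the same CSV and default `t.test` settings are used, is to exponentiate them, which gives a ratio of geometric means (US over Japanese). The object names `res`, `ratio_estimate` and `ratio_ci` are hypothetical and introduced only for this illustration.

```r
# Same CSV that the report's code listing reads
df <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")

# Welch two-sample t-test on log mpg (same call as in the listing, stored in an object)
res <- t.test(log(df$USCars), log(df$JapaneseCars))

# Pieces of the result: t statistic, approximate degrees of freedom, p-value
res$statistic
res$parameter
res$p.value

# Difference of log means, back-transformed to a ratio of geometric means
ratio_estimate <- exp(unname(res$estimate[1] - res$estimate[2]))

# Confidence interval for that ratio (95% by default)
ratio_ci <- exp(as.numeric(res$conf.int))

ratio_estimate
ratio_ci
```

A ratio interval that excludes 1 corresponds to rejecting the null hypothesis of equal mean log mpg at the matching significance level.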