title: "Flipped Assignment 4"
author: "Luis Araiza, Yashwanth Dommaraju, Mustafa Ahmed Syed"
date: "9/8/2022"
output: html_document
Here the CSV file is read from GitHub and stored in the data frame dat.
dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
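The `ï..` prefix on the first column name used throughout this report is what a UTF-8 byte-order mark at the start of a file typically turns into when `read.csv` does not strip it. The sketch below shows one way it could be avoided, assuming a small made-up CSV written to a temporary file (the values are hypothetical, not the real data). The rest of this report keeps the column name as it was actually read.

```r
# Hypothetical two-row CSV with a UTF-8 byte-order mark (EF BB BF) in front,
# written to a temporary file so the example runs without network access.
tmp <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("USCars,JapaneseCars\n18,24\n15,27\n")), tmp)

# Declaring the encoding as "UTF-8-BOM" strips the mark before the header
# row is parsed, so the first column keeps its plain name.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
print(names(demo))
```

If the same argument were passed when reading the real file, the first column would presumably be available as `dat$USCars`, and every later reference would need to change with it.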
# Normal probability plot of US cars
qqnorm(dat$ï..USCars,
main = "Fuel Efficiency of US Cars")
# Normal probability plot of Japanese cars
qqnorm(dat$JapaneseCars,
main = "Fuel Efficiency of Japanese Cars")
From visual inspection of the normal probability plots, the US cars data are clearly skewed, while the Japanese cars data follow an approximately normal distribution.
# Question 2
boxplot(dat$ï..USCars, dat$JapaneseCars,
names= c("US Cars", "Japanese Cars"),
main = "Fuel Efficiency of Cars")
As seen in the boxplots, the US cars appear to have more variance than the Japanese cars.
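The visual comparison of spread can be backed by numbers. The sketch below is illustrative only: the two vectors are hypothetical mpg values, not the real data set, and the F test shown is one possible check rather than the method used in this assignment.

```r
# Hypothetical mpg values for illustration only (not the real data set).
us <- c(18, 15, 16, 14, 22, 10, 28, 19, 14, 12)
jp <- c(24, 27, 31, 35, 19, 23, 20, 32, 26, 29)

# Sample variances put a number on the spread seen in a boxplot.
vars <- c(US = var(us), Japanese = var(jp))
print(vars)

# F test of the null hypothesis that the two population variances are equal.
# It is sensitive to non-normality, so it should be read as a rough guide.
ftest <- var.test(us, jp)
print(ftest$p.value)
```

With the real data, `us` and `jp` would be replaced by the two columns of `dat`.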
Here the data are log-transformed and the transformed values are stored in the data frame datl.
datl <- dat
datl$JapaneseCars <- log(dat$JapaneseCars)
datl$ï..USCars <- log(dat$ï..USCars)
Here the normal probability plots of the US and Japanese cars are inspected again after the log transform.
qqnorm(datl$ï..USCars,
main = "Log Transformed Fuel Efficiency of US Cars")
# Normal probability plot of Japanese cars
qqnorm(datl$JapaneseCars,
main = "Log Transformed Fuel Efficiency of Japanese Cars")
As seen in the plots, the US cars data appear closer to normally distributed after the transform.
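Why a log transform can straighten a skewed normal probability plot can be illustrated numerically. The sketch below assumes synthetic right-skewed values (evenly spaced log-normal quantiles, not the real data) and a small helper, `skewness`, that is defined here for illustration and is not part of base R.

```r
# Hypothetical helper: sample skewness as the third standardized moment.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Synthetic right-skewed data: 50 evenly spaced quantiles of a log-normal
# distribution (the parameters are arbitrary assumptions).
raw <- qlnorm(ppoints(50), meanlog = 3, sdlog = 0.5)
logged <- log(raw)

# Positive skewness on the raw scale, essentially zero after the log.
print(c(raw = skewness(raw), logged = skewness(logged)))
```

Real data will not become perfectly symmetric, but a skewness closer to zero after the transform agrees with a straighter normal probability plot.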
## Question 3.2
boxplot(datl$ï..USCars, datl$JapaneseCars,
names= c("US Cars", "Japanese Cars"),
main = "Log Transformed Fuel Efficiency of Cars")
As seen above, the two samples still differ somewhat in variance, but with such small sample sizes the variances can reasonably be assumed equal.
To decide whether the null hypothesis should be rejected, a two-sample t-test was performed. The pooled-variance form of the test is used because equal variances are assumed.
t.test(datl$ï..USCars, datl$JapaneseCars, var.equal = TRUE, alternative = "less")
##
## Two Sample t-test
##
## data: datl$ï..USCars and datl$JapaneseCars
## t = -9.4828, df = 61, p-value = 6.528e-14
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf -0.4366143
## sample estimates:
## mean of x mean of y
## 2.741001 3.270957
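The pooled two-sample t statistic reported above can be reproduced by hand, which makes the role of the equal-variance assumption explicit. The sketch below uses hypothetical log-mpg samples, not the real data set, and compares the manual calculation with `t.test`. The degrees of freedom are n1 + n2 - 2, so the df = 61 in the output corresponds to 63 observations in total.

```r
# Hypothetical log-mpg samples for illustration only (not the real data set).
x <- log(c(18, 15, 16, 14, 22, 10, 28, 19, 14, 12))
y <- log(c(24, 27, 31, 35, 19, 23, 20, 32, 26, 29))

n1 <- length(x)
n2 <- length(y)

# Pooled variance: the two sample variances weighted by degrees of freedom.
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)

# t statistic and degrees of freedom for the equal-variance test.
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df <- n1 + n2 - 2

# Lower-tail p-value, matching alternative = "less".
p_val <- pt(t_stat, df)

# The built-in test should agree with the manual calculation.
res <- t.test(x, y, var.equal = TRUE, alternative = "less")
print(c(by_hand = t_stat, t_test = unname(res$statistic)))
```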
The null hypothesis is that the mean mpg of US cars equals that of Japanese cars, while the alternative hypothesis is that the mean for US cars is less than that for Japanese cars.
The t-test gives a p-value of 6.528e-14, which is effectively zero, so the null hypothesis is rejected at any conventional significance level in favor of the alternative.
The data therefore indicate that US cars are less fuel efficient than Japanese cars: the mean log-transformed mpg is 2.74 for US cars and 3.27 for Japanese cars.
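Because the test was run on log-transformed values, the two sample estimates are means of log(mpg) rather than mpg. A small sketch of the back-transformation, using only the two means printed in the t-test output, is given below. Exponentiating a mean of logs gives a geometric mean, which here comes to about 15.5 mpg for US cars and 26.3 mpg for Japanese cars.

```r
# Means of log(mpg) copied from the t-test output in this report.
log_means <- c(US = 2.741001, Japanese = 3.270957)

# exp() of a mean of logs is the geometric mean on the original mpg scale.
geo_means <- exp(log_means)
print(round(geo_means, 1))

# The difference in log means back-transforms to a ratio of geometric means.
ratio <- exp(log_means[["US"]] - log_means[["Japanese"]])
print(round(ratio, 2))
```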
# Complete Code
dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
# Question 1 ______________________________
# Normal probability plot of US cars
qqnorm(dat$ï..USCars,
main = "Fuel Efficiency of US Cars")
# Normal probability plot of Japanese cars
qqnorm(dat$JapaneseCars,
main = "Fuel Efficiency of Japanese Cars")
# From the probability plots, the Japanese cars appear to follow a roughly normal distribution,
# while the US cars are clearly skewed.
# Question 2 ________________________________________________________
boxplot(dat$ï..USCars, dat$JapaneseCars,
names= c("US Cars", "Japanese Cars"),
main = "Fuel Efficiency of Cars")
# The variances of the two datasets are clearly different
# The data are more spread out for Japanese cars, while the US cars show less variance between cars
#Question 3 __________________________________________________
datl <- dat
datl$JapaneseCars <- log(dat$JapaneseCars)
datl$ï..USCars <- log(dat$ï..USCars)
# The assignments above apply the log transform
# Question 3.1
# Normal probability plot of US cars, with a reference line
qqnorm(datl$ï..USCars,
main = "Log Transformed Fuel Efficiency of US Cars")
qqline(datl$ï..USCars)
# Normal probability plot of Japanese cars
qqnorm(datl$JapaneseCars,
main = "Log Transformed Fuel Efficiency of Japanese Cars")
#Question 3.2
boxplot(datl$ï..USCars, datl$JapaneseCars,
names= c("US Cars", "Japanese Cars"),
main = "Log Transformed Fuel Efficiency of Cars")
#Question 4
#?t.test
t.test(datl$ï..USCars, datl$JapaneseCars, var.equal = TRUE, alternative = "less")