title: “Flipped Assignment 4”
author: “Luis Araiza, Yashwanth Dommaraju, Mustafa Ahmed Syed”
date: “9/8/2022”
output: html_document

Set up

Here the CSV file is read into a data frame.

dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")
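The `ï..` prefix on the first column name used throughout this report typically appears when a CSV starts with a UTF-8 byte order mark (BOM) and is read without declaring it. The following is a minimal sketch, assuming the file really does begin with a BOM (not confirmed here), of how `fileEncoding = "UTF-8-BOM"` gives a clean column name. The temporary file and its two rows are made up for illustration.

```r
# Write a tiny CSV that starts with a UTF-8 byte order mark (EF BB BF).
# The values are placeholders, NOT the assignment data.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("USCars,JapaneseCars\n18,24\n15,27\n"), con)
close(con)

# Declaring the BOM lets read.csv strip it before parsing the header.
demo <- read.csv(tmp, fileEncoding = "UTF-8-BOM")
print(names(demo))
unlink(tmp)
```

If the same option were applied to the real file, the column could be referenced as `dat$USCars`. The rest of this report keeps the name `dat$ï..USCars` as it was read.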

Question 1

# Normal probability plot of US cars

qqnorm(dat$ï..USCars,
       main = "Fuel Efficiency of US Cars")

# Normal probability plot of Japanese cars

qqnorm(dat$JapaneseCars,
       main = "Fuel Efficiency of Japanese Cars")

From visual inspection of the normal probability plots (NPPs), the US cars are clearly skewed, while the Japanese cars follow an approximately normal distribution.
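A visual reading of a normal probability plot is easier with a reference line, and it can be backed up with a formal test. The sketch below uses simulated samples (not the assignment data) to show `qqline()` and `shapiro.test()`. The variable names and distribution parameters are assumptions chosen only for illustration.

```r
set.seed(1)
# Simulated stand-ins (NOT the assignment data):
# one right-skewed sample and one roughly normal sample.
skewed_sim <- rlnorm(35, meanlog = 2.7, sdlog = 0.4)
normal_sim <- rnorm(28, mean = 26, sd = 4)

# qqline() draws a line through the first and third quartiles,
# which makes curvature (skew) easier to see.
qqnorm(skewed_sim, main = "Simulated skewed sample")
qqline(skewed_sim)

# Shapiro-Wilk test: small p-values suggest a departure from normality.
sw_skewed <- shapiro.test(skewed_sim)
sw_normal <- shapiro.test(normal_sim)
print(c(skewed = sw_skewed$p.value, normal = sw_normal$p.value))
```

The same two calls could be applied to `dat$ï..USCars` and `dat$JapaneseCars` to supplement the plots above.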

Question 2

boxplot(dat$ï..USCars, dat$JapaneseCars,
        names= c("US Cars", "Japanese Cars"),
        main = "Fuel Efficiency of Cars")

As seen in the boxplots, the US cars appear to have more variance than the Japanese cars.

Question 3

Here the data are log-transformed and the transformed data are stored in the data frame datl.

datl <- dat
datl$JapaneseCars <- log(dat$JapaneseCars)
datl$ï..USCars <- log(dat$ï..USCars)
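The effect of a log transform on right skew can also be checked numerically. This is a rough sketch on simulated data (not the assignment data). The `skewness` helper is a hypothetical function defined here, not part of base R.

```r
set.seed(42)
# Simulated right-skewed values (NOT the assignment data).
x <- rlnorm(500, meanlog = 3, sdlog = 0.6)

# Moment-based sample skewness; values near 0 indicate a symmetric sample.
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

skew_raw <- skewness(x)
skew_log <- skewness(log(x))
print(c(raw = skew_raw, log = skew_log))
```

A skewness closer to zero after the transform is consistent with a more symmetric, more nearly normal sample.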

Question 3.1

Here the NPPs of the US and Japanese cars are inspected again after the log transform.

qqnorm(datl$ï..USCars,
       main = "Log Transformed Fuel Efficiency of US Cars")

# Normal probability plot of Japanese cars

qqnorm(datl$JapaneseCars,
       main = "Log Transformed Fuel Efficiency of Japanese Cars")

As seen, the US cars appear to be closer to normally distributed after the transform.

Question 3.2

boxplot(datl$ï..USCars, datl$JapaneseCars,
        names= c("US Cars", "Japanese Cars"),
        main = "Log Transformed Fuel Efficiency of Cars")

As seen above, the spread of the two groups still differs somewhat, but with such small samples it is reasonable to assume that the variances are equal.
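The equal-variance assumption can be examined with an F test in addition to the boxplots. The sketch below runs `var.test()` on simulated log-scale samples (not the assignment data). The sample sizes and parameters are arbitrary assumptions.

```r
set.seed(7)
# Simulated log-scale samples (NOT the assignment data); sizes are arbitrary.
a_sim <- rnorm(30, mean = 2.7, sd = 0.25)
b_sim <- rnorm(30, mean = 3.3, sd = 0.20)

# F test of H0: the two population variances are equal.
ft <- var.test(a_sim, b_sim)
print(ft$estimate)  # ratio of the two sample variances
print(ft$p.value)   # a large p-value gives no evidence against equal variances
```

The F test is sensitive to non-normality, so it is best read alongside the plots. If equal variances look doubtful, `t.test()` with its default `var.equal = FALSE` (Welch's test) is an alternative.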

Question 4

To decide whether to reject the null hypothesis, a two-sample t-test was performed on the log-transformed data. The pooled (equal-variance) form of the test is used because equal variances are assumed.

t.test(datl$ï..USCars, datl$JapaneseCars, var.equal = TRUE, alternative = "less")
## 
##  Two Sample t-test
## 
## data:  datl$ï..USCars and datl$JapaneseCars
## t = -9.4828, df = 61, p-value = 6.528e-14
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##        -Inf -0.4366143
## sample estimates:
## mean of x mean of y 
##  2.741001  3.270957

The null hypothesis is that the mean mpg of US cars equals that of Japanese cars, while the alternative hypothesis is that the mean for US cars is less than the mean for Japanese cars. Both are stated on the log scale, since the test uses the transformed data.

The t-test gives t = -9.4828 on 61 degrees of freedom with a p-value of 6.528e-14, which is effectively zero. The null hypothesis is therefore rejected in favor of the alternative.
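The one-sided p-value can be reproduced from the reported test statistic and degrees of freedom. This short sketch copies `t` and `df` from the t.test output in this section. The 5% significance level is an assumption, chosen to match the 95 percent confidence interval in that output.

```r
# Values copied from the t.test output in this section.
t_stat <- -9.4828
dof <- 61

# Lower-tail p-value, matching alternative = "less".
p_less <- pt(t_stat, df = dof)
print(p_less)

# Critical value at the 5% level (assumed); H0 is rejected when t_stat is below it.
t_crit <- qt(0.05, df = dof)
print(t_stat < t_crit)
```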

US cars are significantly less fuel efficient than Japanese cars: the mean log mpg is 2.74 for US cars and 3.27 for Japanese cars.
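Because the test was run on log-transformed data, the reported means are on the log scale. As a small sketch, the values from the t.test output can be back-transformed with `exp()` to give geometric means in mpg. The variable names are illustrative only.

```r
# Means of log(mpg) copied from the t.test output.
mean_log_us <- 2.741001
mean_log_jp <- 3.270957

# exp() of a mean of logs is the geometric mean on the original mpg scale.
geo_means <- exp(c(US = mean_log_us, Japan = mean_log_jp))
print(round(geo_means, 1))

# Upper confidence bound for the difference in log means, back-transformed
# into an upper bound on the ratio of geometric means (US / Japan).
ratio_upper <- exp(-0.4366143)
print(round(ratio_upper, 3))
```

On this reading, the geometric mean is roughly 15.5 mpg for US cars and 26.3 mpg for Japanese cars, and the one-sided 95 percent bound puts the US-to-Japanese ratio below about 0.646.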

Complete Code

dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/main/US_Japanese_Cars.csv")



# Question 1 ______________________________
# Normal probability plot of US cars

qqnorm(dat$ï..USCars,
       main = "Fuel Efficiency of US Cars")

# Normal probability plot of Japanese cars

qqnorm(dat$JapaneseCars,
       main = "Fuel Efficiency of Japanese Cars")


# From the probability plots, the Japanese cars appear to follow a roughly normal distribution,
# while the American cars are clearly skewed.



# Question 2 ________________________________________________________


boxplot(dat$ï..USCars, dat$JapaneseCars,
        names= c("US Cars", "Japanese Cars"),
        main = "Fuel Efficiency of Cars")

# The variance clearly differs between the two datasets


# The data is more spread out for Japanese cars while for US cars there is less variance between each car

#Question 3 __________________________________________________

datl <- dat
datl$JapaneseCars <- log(dat$JapaneseCars)
datl$ï..USCars <- log(dat$ï..USCars)

# The lines above apply the log transform

# Question 3.1 

qqnorm(datl$ï..USCars,
       main = "Log Transformed Fuel Efficiency of US Cars")
qqline(datl$ï..USCars)  # reference line through the first and third quartiles

# Normal probability plot of Japanese cars

qqnorm(datl$JapaneseCars,
       main = "Log Transformed Fuel Efficiency of Japanese Cars")



#Question 3.2


boxplot(datl$ï..USCars, datl$JapaneseCars,
        names= c("US Cars", "Japanese Cars"),
        main = "Log Transformed Fuel Efficiency of Cars")


#Question 4 
#?t.test
t.test(datl$ï..USCars, datl$JapaneseCars, var.equal = TRUE, alternative = "less")