# Motor Trend Car Road Tests ####
# The data was extracted from the 1974 Motor Trend US magazine, and 
# comprises fuel consumption and 10 aspects of automobile design and 
# performance for 32 automobiles (1973-74 models)

data("mtcars")
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

A: Plot your data for all independent variable relationships to the dependent variable (as always). Be sure to include informative figure captions.

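plot(mpg ~ wt, data = mtcars, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon") # mpg vs. weight

plot(mpg ~ hp, data = mtcars, xlab = "Gross horsepower", ylab = "Miles per gallon") # mpg vs. horsepower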

B: Are your continuous variables displaying normal distribution? Provide two pieces of evidence that display your normality assessment for each variable.

hist(mtcars$mpg) # histogram of continuous variable mpg

qqnorm(mtcars$mpg) # Q-Q plot of continuous variable mpg
qqline(mtcars$mpg)

hist(mtcars$wt) # histogram of continuous variable wt

qqnorm(mtcars$wt) # Q-Q plot of continuous variable wt
qqline(mtcars$wt)

hist(mtcars$hp) # histogram of continuous variable hp

qqnorm(mtcars$hp) # Q-Q plot of continuous variable hp
qqline(mtcars$hp)

# Based on the histograms and Q-Q plots, mpg, hp, and wt do not appear to be normally distributed.
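
A numeric check can sit alongside the two plots for each variable. The sketch below runs the Shapiro-Wilk test (`shapiro.test()` from base R's stats package) on the three continuous variables. It is an addition to the plots above, not part of the original analysis, and `vars` and `p_values` are illustrative names.

```r
# Shapiro-Wilk test for each continuous variable.
# Null hypothesis: the values come from a normal distribution.
data("mtcars")
vars <- c("mpg", "wt", "hp")

# One p-value per variable; sapply() names the result after `vars`.
p_values <- sapply(vars, function(v) shapiro.test(mtcars[[v]])$p.value)
round(p_values, 4)
```

A p-value below the chosen threshold (commonly 0.05) is evidence against normality. With only 32 cars the test has limited power, so it is best read together with the histograms and Q-Q plots.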

C: Did you transform one or more of your variables? If so, state which transformation you used. Provide two pieces of evidence that your data more closely approximates a normal distribution. If not, state why you did not transform the data.

# mpg, wt, and hp need to be transformed; I used the square-root (sqrt) transformation for all of them.

mtcars$mpgsqur <- sqrt(mtcars$mpg) # square-root-transformed mpg
hist(mtcars$mpgsqur)

qqnorm(mtcars$mpgsqur)
qqline(mtcars$mpgsqur)

mtcars$wtsqur <- sqrt(mtcars$wt) # square-root-transformed wt
hist(mtcars$wtsqur)

qqnorm(mtcars$wtsqur)
qqline(mtcars$wtsqur)

mtcars$hpsqur <- sqrt(mtcars$hp) # square-root-transformed hp
hist(mtcars$hpsqur)

qqnorm(mtcars$hpsqur)
qqline(mtcars$hpsqur)
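
The histogram and Q-Q plot pairs above repeat the same three calls for every variable. A small helper can draw both pieces of evidence side by side, which makes the raw and transformed versions easier to compare. This is only a sketch: `check_normality()` is a hypothetical function name introduced here, not something from the original code or from a package.

```r
# Hypothetical helper: histogram and normal Q-Q plot in one row.
check_normality <- function(x, label) {
  old_par <- par(mfrow = c(1, 2))  # two panels side by side
  on.exit(par(old_par))            # restore the previous layout on exit
  hist(x, main = paste("Histogram of", label), xlab = label)
  qqnorm(x, main = paste("Normal Q-Q plot of", label))
  qqline(x)
  invisible(NULL)
}

data("mtcars")
for (v in c("mpg", "wt", "hp")) {
  check_normality(mtcars[[v]], v)                              # raw values
  check_normality(sqrt(mtcars[[v]]), paste0("sqrt(", v, ")"))  # transformed
}
```

Each call produces one figure, so the loop yields six figures: raw and square-root versions of mpg, wt, and hp.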

D: Create the final multiple linear regression object with the appropriate data (raw or transformed) including the interaction. Be sure to place the variables on the correct axis. Present your code.

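mtcars.LM <- lm(mpg ~ hp + wt, data = mtcars) # model on the raw variables
plot(mtcars.LM) # diagnostic plots

mtcars.LM2 <- lm(mtcars$mpgsqur ~ mtcars$hpsqur + mtcars$wtsqur, data = mtcars) # model on the square-root-transformed variables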

summary(mtcars.LM2)
## 
## Call:
## lm(formula = mtcars$mpgsqur ~ mtcars$hpsqur + mtcars$wtsqur, 
##     data = mtcars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.37385 -0.15388 -0.06524  0.14563  0.52218 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    8.19701    0.27714  29.577  < 2e-16 ***
## mtcars$hpsqur -0.09180    0.02145  -4.279 0.000187 ***
## mtcars$wtsqur -1.51082    0.21733  -6.952 1.22e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2353 on 29 degrees of freedom
## Multiple R-squared:  0.8815, Adjusted R-squared:  0.8733 
## F-statistic: 107.9 on 2 and 29 DF,  p-value: 3.699e-14
plot(mtcars.LM2) # diagnostic plots
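
The model above refers to its columns as `mtcars$...` inside the formula while also passing `data = mtcars`. A more common way to write it, sketched below, uses bare column names or applies `sqrt()` directly in the formula. The names `fit_cols` and `fit_inline` are illustrative, and the fitted coefficients should be identical to those of mtcars.LM2 because the same numbers go into the fit.

```r
data("mtcars")
mtcars$mpgsqur <- sqrt(mtcars$mpg)
mtcars$hpsqur  <- sqrt(mtcars$hp)
mtcars$wtsqur  <- sqrt(mtcars$wt)

# Same model written two ways: precomputed columns vs. inline transformations.
fit_cols   <- lm(mpgsqur ~ hpsqur + wtsqur, data = mtcars)
fit_inline <- lm(sqrt(mpg) ~ sqrt(hp) + sqrt(wt), data = mtcars)

# The coefficient names differ, so compare the values only.
all.equal(unname(coef(fit_cols)), unname(coef(fit_inline)))

# Bare names let predict() use new data; squaring back-transforms to mpg.
predict(fit_inline, newdata = data.frame(hp = 110, wt = 2.62))^2
```

The inline form also keeps the transformation visible in the `summary()` output, so a reader does not have to look up what `hpsqur` means.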

E: Did you accept or reject the null hypothesis for the interaction? Are the results statistically significant? Provide and interpret two evidence graphs that the residuals meet the assumptions of the linear model.

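#I reject the null hypothesis for the interaction: its p-value (0.0255 in summary(mtcars.LM3)) is below 0.05, so the interaction is statistically significant. The overall model p-value is below 0.0001.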
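plot(mpg ~ wt, data = mtcars) # original X-Y plot
abline(lm(mpg ~ wt, data = mtcars)) # regression line for mpg on wt alone; abline() would use only the first two coefficients of the two-predictor mtcars.LM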
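
The two evidence graphs this question asks for can be taken from the standard `plot()` method for `lm` objects, which accepts a `which` argument to select individual panels. The sketch below assumes a model with the same terms as the interaction model fitted in part F, written with inline `sqrt()` calls; `fit` is an illustrative name.

```r
data("mtcars")

# sqrt(hp) * wt expands to sqrt(hp) + wt + sqrt(hp):wt
fit <- lm(sqrt(mpg) ~ sqrt(hp) * wt, data = mtcars)

old_par <- par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs. fitted: look for a flat, even band
plot(fit, which = 2)  # normal Q-Q of residuals: points should follow the line
par(old_par)
```

A residuals-vs-fitted panel without a curve or funnel shape supports linearity and homogeneity of variance, and a Q-Q panel that stays close to the reference line supports normally distributed residuals.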

F: Did you drop the interaction term? Why or why not? If so, provide and interpret the output and two evidence graphs that residuals meet the assumptions of the linear model.

#I did not drop the interaction term because its p-value is significant (p-value = 0.0255), and the model that includes it has an overall p-value < 0.0001 and multiple R-squared = 0.8901

mtcars.LM3 <- lm(mtcars$mpgsqur ~ mtcars$hpsqur + wt + mtcars$hpsqur * wt, data = mtcars) # interaction model; note that wt enters untransformed here
plot(mtcars.LM3) # diagnostic plots

summary(mtcars.LM3)
## 
## Call:
## lm(formula = mtcars$mpgsqur ~ mtcars$hpsqur + wt + mtcars$hpsqur * 
##     wt, data = mtcars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.32662 -0.19962 -0.04218  0.15953  0.40776 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       8.45067    0.68606  12.318 8.02e-13 ***
## mtcars$hpsqur    -0.22963    0.05949  -3.860 0.000611 ***
## wt               -0.94580    0.23738  -3.984 0.000438 ***
## mtcars$hpsqur:wt  0.04364    0.01850   2.359 0.025549 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2306 on 28 degrees of freedom
## Multiple R-squared:  0.8901, Adjusted R-squared:  0.8783 
## F-statistic: 75.58 on 3 and 28 DF,  p-value: 1.542e-13
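
One way to check whether the interaction term earns its place is an F-test between the nested models with and without it. The sketch below uses inline `sqrt()` terms in place of the `mpgsqur` and `hpsqur` columns; `fit_main`, `fit_int`, and `cmp` are illustrative names, not objects from the code above.

```r
data("mtcars")

fit_main <- lm(sqrt(mpg) ~ sqrt(hp) + wt, data = mtcars)  # main effects only
fit_int  <- lm(sqrt(mpg) ~ sqrt(hp) * wt, data = mtcars)  # adds sqrt(hp):wt

# F-test of the smaller model against the larger one.
cmp <- anova(fit_main, fit_int)
cmp

# p-value for the added interaction term (second row of the table)
cmp$`Pr(>F)`[2]
```

Because the two models differ by a single term, this F-test gives the same p-value as the t-test on the interaction row of the `summary()` output.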

G: Summarize your results in a paragraph similar to the example in the Reporting Your Results section.

#The overall multiple linear regression model for mpg, hp, and wt from the Motor Trend Car Road Tests was significant (p-value < 0.0001; multiple R2 = 0.8901, adjusted R2 = 0.8783). A significant negative relationship exists between miles per gallon and gross horsepower (p-value < 0.001), and also between miles per gallon and weight (p-value < 0.001). The interaction term was retained because its p-value was significant (p-value = 0.026). The mpg and hp data were square-root transformed in this model to better approximate normality and homogeneity of variance of the residuals; wt was square-root transformed only in the model without the interaction.
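
The numbers quoted in a results paragraph can be pulled from the fitted model directly, which avoids copying them by hand from the printed summary. This is a sketch that assumes the interaction model written with inline `sqrt()` terms; `fit`, `s`, `r2`, `adj_r2`, `f`, and `overall_p` are illustrative names.

```r
data("mtcars")
fit <- lm(sqrt(mpg) ~ sqrt(hp) * wt, data = mtcars)
s <- summary(fit)

r2     <- s$r.squared      # multiple R-squared
adj_r2 <- s$adj.r.squared  # adjusted R-squared

# Overall model p-value from the F statistic (value, numdf, dendf).
f <- s$fstatistic
overall_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

coef(s)  # estimates, standard errors, t values, and p-values per term
```

The sign of each estimate in `coef(s)` gives the direction of the relationship, and the last column gives the per-term p-values to report.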

Please turn in your homework via Sakai by saving and submitting an R Markdown PDF or HTML file from R Pubs!