# Motor Trend Car Road Tests ####
# The data was extracted from the 1974 Motor Trend US magazine, and 
# comprises fuel consumption and 10 aspects of automobile design and 
# performance for 32 automobiles (1973-74 models)

data("mtcars")
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

A: Plot your data for all independent variable relationships to the dependent variable (as always). Be sure to include informative figure captions.

plot(mpg~wt,data = mtcars)
Figure 1: Plot of mpg versus weight of the cars.

plot(mpg~hp, data = mtcars)
Figure 2: Plot of mpg versus gross horsepower of the cars.
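
The two scatterplots above can also be drawn side by side with labelled axes. The following is a minimal sketch, not part of the original answer: the axis labels follow the `mtcars` help page, and the names `old_par`, `cor_wt`, and `cor_hp` are introduced here purely for illustration.

```r
# Sketch: labelled scatterplots of the response (mpg) against each predictor.
data("mtcars")

old_par <- par(mfrow = c(1, 2))  # two panels side by side
plot(mpg ~ wt, data = mtcars,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
plot(mpg ~ hp, data = mtcars,
     xlab = "Gross horsepower", ylab = "Miles per gallon")
par(old_par)  # restore the previous graphics settings

# Pearson correlations give a one-number summary of each relationship
cor_wt <- cor(mtcars$mpg, mtcars$wt)
cor_hp <- cor(mtcars$mpg, mtcars$hp)
c(wt = cor_wt, hp = cor_hp)
```

Both correlations come out negative, which matches the downward trend visible in the two plots.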

B: Are your continuous variables displaying normal distribution? Provide two pieces of evidence that display your normality assessment for each variable.

I think that gross horsepower and weight need to be transformed, and that miles per gallon is roughly normal, but I will test in the next question whether the data need to be transformed.

hist(mtcars$hp)
Figure 3: Histogram of the gross horse power of the cars.

qqnorm(mtcars$hp)
qqline(mtcars$hp)
Figure 4: Q-Q Plot of the gross horse power of the cars.

hist(mtcars$wt)
Figure 5: Histogram of the weight of the cars.

qqnorm(mtcars$wt)
qqline(mtcars$wt)
Figure 6: Q-Q Plot of the weight of the cars.

hist(mtcars$mpg)
Figure 7: Histogram of the miles per gallon of the cars.

qqnorm(mtcars$mpg)
qqline(mtcars$mpg)
Figure 8: Q-Q Plot of the miles per gallon of the cars.
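
As a numeric complement to the histograms and Q-Q plots, a Shapiro-Wilk test can be run on each variable. This is a sketch of one possible extra check, not something the original answer used; `vars` and `shapiro_p` are names introduced here for illustration.

```r
# Sketch: Shapiro-Wilk normality test for each continuous variable.
# A small p-value (for example below 0.05) suggests departure from normality.
data("mtcars")

vars <- c("hp", "wt", "mpg")
shapiro_p <- sapply(vars, function(v) shapiro.test(mtcars[[v]])$p.value)

round(shapiro_p, 4)  # named vector of p-values, one per variable
```

With only 32 cars the test has limited power, so it is best read alongside the plots rather than instead of them.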

C: Did you transform one or more of your variables? If so, state which transformation you used. Provide two pieces of evidence that your data more closely approximates a normal distribution. If not, state why you did not transform the data.

We transformed two variables. Gross horsepower is right-skewed in the raw data (its histogram looks more like a Poisson distribution), and the square root function brings it closer to a normal distribution. Miles per gallon was log10 transformed for the same reason. Weight was left untransformed.

hpSqrt<-sqrt(mtcars$hp)
hist(hpSqrt)
Figure 9: Square root transformation test via a histogram and a Q-Q plot for horsepower.

qqnorm(hpSqrt)
qqline(hpSqrt)
Figure 9: Square root transformation test via a histogram and a Q-Q plot for horsepower.

mpgLog<-log10(mtcars$mpg+0.0001) 
hist(mpgLog)
Figure 10: Log transformation test via a histogram and a Q-Q plot for miles per gallon.

qqnorm(mpgLog)
qqline(mpgLog)
Figure 10: Log transformation test via a histogram and a Q-Q plot for miles per gallon.
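
One way to put a number on how much a transformation helps is to compare sample skewness before and after. The sketch below uses a hand-written `skewness()` helper (an assumption made here, not a base R function and not part of the original answer); values closer to zero indicate a more symmetric distribution.

```r
# Sketch: moment-based sample skewness, raw versus transformed.
data("mtcars")

# Hypothetical helper: mean of the cubed z-scores
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

skew <- c(
  hp_raw    = skewness(mtcars$hp),
  hp_sqrt   = skewness(sqrt(mtcars$hp)),
  mpg_raw   = skewness(mtcars$mpg),
  mpg_log10 = skewness(log10(mtcars$mpg))  # mpg > 0, so no offset is needed here
)

round(skew, 3)
```

Both the square root and the logarithm are concave, so they pull in the long right tail and lower the skewness relative to the raw variable.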

D: Create the final multiple linear regression object with the appropriate data (raw or transformed) including the interaction. Be sure to place the variables on the correct axis. Present your code.

cars.LM<-lm(mpg~hp+wt, data = mtcars)
plot(cars.LM)
Figure 11: Linear Model of the raw data which shows a bow-shaped curve.
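
`plot()` on an `lm` object draws four diagnostic plots one after another. A common way to see them together is to set a 2 x 2 layout first. This is a sketch under the assumption that a single combined figure is wanted; `old_par` is a name introduced here.

```r
# Sketch: show the four lm diagnostic plots in one 2 x 2 figure.
data("mtcars")
cars.LM <- lm(mpg ~ hp + wt, data = mtcars)

old_par <- par(mfrow = c(2, 2))  # 2 rows, 2 columns of panels
plot(cars.LM)                    # residuals vs fitted, Q-Q, scale-location, leverage
par(old_par)                     # restore the previous layout
```

The bow shape described in the caption shows up in the first panel (residuals versus fitted values).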

cars.LM2<-lm(mpgLog~hpSqrt+wt, data = mtcars)
summary(cars.LM2)
## 
## Call:
## lm(formula = mpgLog ~ hpSqrt + wt, data = mtcars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.07654 -0.03460 -0.01095  0.02994  0.11870 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.764760   0.036694  48.094  < 2e-16 ***
## hpSqrt      -0.018448   0.004191  -4.402 0.000133 ***
## wt          -0.081639   0.011893  -6.864 1.53e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04658 on 29 degrees of freedom
## Multiple R-squared:  0.8786, Adjusted R-squared:  0.8703 
## F-statistic:   105 on 2 and 29 DF,  p-value: 5.232e-14
plot(cars.LM2)
Figure 12: Linear Model of the transformed data.

E: Did you accept or reject the null hypothesis for the interaction? Are the results statistically significant? Provide and interpret two evidence graphs that the residuals meet the assumptions of the linear model.

For the model without the interaction (cars.LM2), we reject the null hypothesis. The overall p-value is 5.232e-14, which is far below the 0.05 threshold, so the results are statistically significant. I could not produce the evidence graphs on my Mac.
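
The two evidence graphs asked for here can be produced with base R graphics, which are not specific to any operating system. The following is a sketch that rebuilds the transformed variables exactly as above and draws only the residuals-versus-fitted and normal Q-Q panels; `old_par` and `res_test` are names introduced here, and the Shapiro-Wilk test on the residuals is an optional extra check, not part of the original answer.

```r
# Sketch: two residual diagnostics for the transformed additive model.
data("mtcars")
mpgLog <- log10(mtcars$mpg + 0.0001)
hpSqrt <- sqrt(mtcars$hp)
cars.LM2 <- lm(mpgLog ~ hpSqrt + wt, data = mtcars)

old_par <- par(mfrow = c(1, 2))
plot(cars.LM2, which = 1)  # residuals vs fitted: look for an even, patternless band
plot(cars.LM2, which = 2)  # normal Q-Q: points should follow the line
par(old_par)

# Optional numeric check of residual normality
res_test <- shapiro.test(residuals(cars.LM2))
res_test$p.value
```

An even scatter around zero in the first panel supports homogeneity of variance, and points close to the line in the second support normality of the residuals.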

F: Did you drop the interaction term? Why or why not? If so, provide and interpret the output and two evidence graphs that residuals meet the assumptions of the linear model.

I dropped the interaction term because its p-value is 0.22508, which is above the 0.05 threshold.

cars.LM3<-lm(mpgLog~hpSqrt+wt+hpSqrt*wt, data=mtcars)
plot(cars.LM3)
Figure 13: Linear Model checking interactions for significance.

summary(cars.LM3)
## 
## Call:
## lm(formula = mpgLog ~ hpSqrt + wt + hpSqrt * wt, data = mtcars)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.068373 -0.035845 -0.005321  0.033176  0.097228 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.928979   0.137283  14.051 3.31e-14 ***
## hpSqrt      -0.032288   0.011904  -2.712  0.01129 *  
## wt          -0.138722   0.047501  -2.920  0.00683 ** 
## hpSqrt:wt    0.004592   0.003702   1.241  0.22508    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04615 on 28 degrees of freedom
## Multiple R-squared:  0.885,  Adjusted R-squared:  0.8726 
## F-statistic: 71.81 on 3 and 28 DF,  p-value: 2.908e-13
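
The decision to drop the interaction can also be framed as a comparison of two nested models. The sketch below uses `anova()` for that; `fit_add`, `fit_int`, `cmp`, `p_F`, and `p_t` are names introduced here for illustration. Because the interaction adds a single coefficient, the F test from `anova()` and the t test in the summary table give the same p-value.

```r
# Sketch: nested-model F test for the interaction term.
data("mtcars")
mpgLog <- log10(mtcars$mpg + 0.0001)
hpSqrt <- sqrt(mtcars$hp)

fit_add <- lm(mpgLog ~ hpSqrt + wt, data = mtcars)  # no interaction
fit_int <- lm(mpgLog ~ hpSqrt * wt, data = mtcars)  # main effects + interaction

cmp <- anova(fit_add, fit_int)
cmp

# p-value of the F test versus the t test on the interaction coefficient
p_F <- cmp[["Pr(>F)"]][2]
p_t <- summary(fit_int)$coefficients["hpSqrt:wt", "Pr(>|t|)"]
c(anova = p_F, t_test = p_t)
```

Note that `hpSqrt * wt` already expands to both main effects plus the interaction, so it fits the same model as `hpSqrt + wt + hpSqrt * wt`.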

G: Summarize your results in a paragraph similar to the example in the Reporting Your Results section.

The multiple linear regression model for car miles per gallon versus horsepower and weight was significant (p-value < 0.0001; multiple R^2 = 0.8786). There is a significant negative relationship between miles per gallon (MPG) and horsepower (p-value < 0.001). There is also a significant negative relationship between miles per gallon (MPG) and weight (p-value < 0.001). The interaction term was dropped because it was not significant (p = 0.225). Miles per gallon was log transformed and horsepower was square root transformed to approximate normality and homogeneity of variance of the residuals.
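
Because the response is on a log10 scale, coefficients and predictions need back-transforming before they can be read in miles per gallon. The sketch below refits the final model with the transformations written inside the formula (an assumption made here so that `predict()` can work from raw `hp` and `wt`; it also omits the 0.0001 offset, so estimates may differ very slightly from the output above). The names `fit`, `wt_factor`, `new_car`, and `pred_mpg` are illustrative.

```r
# Sketch: interpreting the final model on the original mpg scale.
data("mtcars")
fit <- lm(log10(mpg) ~ sqrt(hp) + wt, data = mtcars)

# Multiplicative change in mpg for each extra 1000 lbs, holding hp fixed
wt_factor <- 10^coef(fit)[["wt"]]
wt_factor

# Back-transformed prediction for a car like the Mazda RX4 (hp = 110, wt = 2.62)
new_car <- data.frame(hp = 110, wt = 2.62)
pred_mpg <- 10^predict(fit, newdata = new_car)
pred_mpg
```

A `wt_factor` below 1 corresponds to the negative relationship reported in the summary: heavier cars are predicted to travel fewer miles per gallon.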

Please turn in your homework via Sakai by saving and submitting an R Markdown PDF or HTML file from R Pubs!