Using R, build a multiple regression model for data that interests you. Include in this model at least one quadratic term, one dichotomous term, and one dichotomous vs. quantitative interaction term. Interpret all coefficients. Conduct residual analysis. Was the linear model appropriate? Why or why not?

mtcars <- read.csv("https://raw.githubusercontent.com/johnpannyc/data605_wk11_discussion/master/mtcars.csv")
str(mtcars)
## 'data.frame':    32 obs. of  12 variables:
##  $ model: Factor w/ 32 levels "AMC Javelin",..: 18 19 5 13 14 31 7 21 20 22 ...
##  $ mpg  : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl  : int  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp : num  160 160 108 258 360 ...
##  $ hp   : int  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt   : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec : num  16.5 17 18.6 19.4 17 ...
##  $ vs   : int  0 0 1 1 0 1 0 1 1 1 ...
##  $ am   : int  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear : int  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb : int  4 4 1 1 2 1 4 2 2 4 ...

First, create a quadratic variable, hp2 = hp^2.

mtcars$hp2 <- mtcars$hp^2

Check the new variable hp2

head(mtcars)
##               model  mpg cyl disp  hp drat    wt  qsec vs am gear carb
## 1         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## 2     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## 3        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## 4    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## 5 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## 6           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
##     hp2
## 1 12100
## 2 12100
## 3  8649
## 4 12100
## 5 30625
## 6 11025

Build a linear model of qsec (the time in seconds to cover 1/4 mile) on a quadratic term (hp2) and a dichotomous term (am: 0 = automatic transmission, 1 = manual transmission), together with cyl, hp, and wt.
Hypothesis: qsec is positively correlated with cyl, hp, hp2, and am, and negatively correlated with wt of the car.

model_qsec <- lm(qsec~cyl+hp+hp2+wt+am, mtcars)
model_qsec
## 
## Call:
## lm(formula = qsec ~ cyl + hp + hp2 + wt + am, data = mtcars)
## 
## Coefficients:
## (Intercept)          cyl           hp          hp2           wt  
##   2.372e+01   -6.296e-01   -3.725e-02    6.311e-05    7.999e-01  
##          am  
##  -1.792e+00
summary(model_qsec)
## 
## Call:
## lm(formula = qsec ~ cyl + hp + hp2 + wt + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.6845 -0.3252 -0.0276  0.2686  2.1500 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.372e+01  1.120e+00  21.182  < 2e-16 ***
## cyl         -6.296e-01  2.096e-01  -3.004 0.005835 ** 
## hp          -3.725e-02  1.511e-02  -2.466 0.020567 *  
## hp2          6.311e-05  3.462e-05   1.823 0.079884 .  
## wt           7.999e-01  2.945e-01   2.716 0.011596 *  
## am          -1.792e+00  4.615e-01  -3.883 0.000634 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7982 on 26 degrees of freedom
## Multiple R-squared:  0.8327, Adjusted R-squared:  0.8005 
## F-statistic: 25.88 on 5 and 26 DF,  p-value: 2.533e-09

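The summary shows that cyl, hp, wt, and am are statistically significant at the 5% level, while hp2 is only marginally significant (p = 0.08). Holding the other predictors fixed, each additional cylinder is associated with a qsec about 0.63 seconds shorter, each additional 1,000 lb of weight (wt is measured in 1,000 lb) with a qsec about 0.80 seconds longer, and a manual transmission (am = 1) with a qsec about 1.79 seconds shorter than an automatic. Because horsepower enters through both hp and hp2, its slope depends on the level of hp: it is approximately -0.0373 + 2(0.0000631)hp, so additional horsepower shortens qsec, but by less and less as hp grows. The intercept (23.72) is the predicted qsec when every predictor is 0, which lies outside the range of the data and has no practical meaning. Only the sign of hp2 matches the hypothesis; cyl, hp, wt, and am all have the opposite sign. The residual standard error is about 0.80 seconds. The residual quantiles (min -1.68, median -0.03, max 2.15) are roughly symmetric with a slightly longer right tail. The adjusted R-squared is 0.80 and the p-value of the overall F-test (2.5e-09) is almost 0, so the linear model fits the data reasonably well.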
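The model above has no dichotomous vs. quantitative interaction term. As a minimal sketch of how one could be added, the code below refits the model with a wt:am interaction, which lets the effect of weight differ between automatic and manual cars. It assumes that R's built-in mtcars data set is equivalent to the CSV used here; the names cars, model_int, and wt_slopes, and the choice of wt as the interacting variable, are illustrative and not part of the original analysis.

```r
# Assumption: the built-in mtcars data set matches the CSV loaded above.
cars <- datasets::mtcars

# I(hp^2) builds the quadratic term inside the formula.
# wt * am expands to wt + am + wt:am, where wt:am is the
# dichotomous vs. quantitative interaction term.
model_int <- lm(qsec ~ cyl + hp + I(hp^2) + wt * am, data = cars)
summary(model_int)

# Slope of wt for automatic cars (am = 0) and for manual cars (am = 1).
b <- coef(model_int)
wt_slopes <- c(automatic = unname(b["wt"]),
               manual    = unname(b["wt"] + b["wt:am"]))
wt_slopes
```

The wt:am coefficient is the difference in the wt slope between manual and automatic cars. With only 32 observations, its estimate may be imprecise, so its standard error deserves as much attention as its sign.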

plot(model_qsec$fitted.values, model_qsec$residuals, xlab="Fitted Values", ylab="Residuals", main="Residuals vs. Fitted")
abline(h=0)

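The residual plot shows the observations scattered roughly evenly around 0, with no obvious pattern, which supports the linearity and constant-variance assumptions. This plot alone does not show whether the residuals are normally distributed.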
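The visual check can be supplemented with a numeric one. The sketch below regresses the squared residuals on the fitted values, in the spirit of a Breusch-Pagan test for non-constant variance; a small p-value would suggest that the residual spread changes with the fitted value. The model is refitted on R's built-in mtcars (assumed equivalent to the CSV) so the block runs on its own, and the names fit, aux, bp_stat, and bp_p are illustrative.

```r
# Assumption: the built-in mtcars data set matches the CSV loaded above.
cars <- datasets::mtcars
cars$hp2 <- cars$hp^2
fit <- lm(qsec ~ cyl + hp + hp2 + wt + am, data = cars)

# Auxiliary regression: squared residuals on fitted values.
e2   <- resid(fit)^2
yhat <- fitted(fit)
aux  <- lm(e2 ~ yhat)

# n * R^2 of the auxiliary regression is compared against a
# chi-squared distribution with 1 degree of freedom.
bp_stat <- nobs(fit) * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
c(statistic = bp_stat, p.value = bp_p)
```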

qqnorm(model_qsec$residuals)
qqline(model_qsec$residuals)

The Q-Q plot shows the points curving in an S shape around the Q-Q line, so the residuals depart from normality in the tails. They are only approximately normally distributed.
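The impression from the Q-Q plot can be complemented by a formal test. As a sketch, the Shapiro-Wilk test from base R tests the null hypothesis that the residuals come from a normal distribution. The model is refitted on the built-in mtcars (assumed equivalent to the CSV) so the block runs on its own; fit and sw are illustrative names.

```r
# Assumption: the built-in mtcars data set matches the CSV loaded above.
cars <- datasets::mtcars
cars$hp2 <- cars$hp^2
fit <- lm(qsec ~ cyl + hp + hp2 + wt + am, data = cars)

# Shapiro-Wilk test of normality on the model residuals.
sw <- shapiro.test(resid(fit))
sw
```

A p-value below 0.05 would be evidence against normality. With only 32 observations the test has limited power, so it is best read alongside the Q-Q plot rather than in place of it.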