ASSIGNMENT 11 - LINEAR REGRESSION IN R
IS 605 FUNDAMENTALS OF COMPUTATIONAL MATHEMATICS - 2014

Using R's lm function, perform regression analysis and measure the significance of the independent variables for the following two data sets. In the first case, you are evaluating the statement that we hear that Maximum Heart Rate of a person is related to their age by the following equation:

MaxHR = 220 - Age

You have been given the following sample:

Age    18  23  25  35  65  54  34  56  72  19  23  42  18  39  37
MaxHR 202 186 187 180 156 169 174 172 153 199 193 174 198 183 178

Perform a linear regression analysis fitting the Max Heart Rate to Age using the lm function in R. What is the resulting equation? Is the effect of Age on Max HR significant? What is the significance level? Please also plot the fitted relationship between Max HR and Age.
# Read data
age <- c(18, 23, 25, 35, 65, 54, 34, 56, 72, 19, 23, 42, 18, 39, 37)
hr <- c(202, 186, 187, 180, 156, 169, 174, 172, 153, 199, 193, 174, 198, 183, 178)
Is the effect of Age on Max HR significant?

For the linear regression model y = b0 + b1*x + e, where y is Max HR and x is Age:
H0: b1 = 0 (Age has no effect on Max HR)
H1: b1 != 0 (Age has an effect on Max HR)
lm.r <- lm(hr ~ age)
summary(lm.r)
##
## Call:
## lm(formula = hr ~ age)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.9258 -2.5383 0.3879 3.1867 6.6242
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 210.04846 2.86694 73.27 < 2e-16 ***
## age -0.79773 0.06996 -11.40 3.85e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.578 on 13 degrees of freedom
## Multiple R-squared: 0.9091, Adjusted R-squared: 0.9021
## F-statistic: 130 on 1 and 13 DF, p-value: 3.848e-08
Based on this output, the fitted regression equation for the above data is MaxHR = 210.048 - 0.798*Age.
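The fitted relationship between Max HR and Age can be plotted with base R graphics. The following is a minimal sketch; the dashed reference line for the rule of thumb MaxHR = 220 - Age is an addition for comparison, and the colors, labels, and legend placement are arbitrary choices.

```r
# Data from the assignment
age <- c(18, 23, 25, 35, 65, 54, 34, 56, 72, 19, 23, 42, 18, 39, 37)
hr  <- c(202, 186, 187, 180, 156, 169, 174, 172, 153, 199, 193, 174, 198, 183, 178)

lm.r <- lm(hr ~ age)

# Scatter plot of the sample with the fitted regression line
plot(age, hr, xlab = "Age", ylab = "Max HR",
     main = "Max heart rate vs. age")
abline(lm.r, col = "blue", lwd = 2)

# Reference line for the rule of thumb MaxHR = 220 - Age (added for comparison)
abline(a = 220, b = -1, lty = 2)

legend("topright",
       legend = c("Fitted: lm(hr ~ age)", "220 - Age"),
       col = c("blue", "black"), lty = c(1, 2), lwd = c(2, 1))
```

Here abline() accepts the fitted lm object directly and draws the line using its intercept and slope.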
The p-value for Age is 3.85e-08, which is far below 0.001 (significance code '***'), so I reject the null hypothesis. There is a statistically significant relationship between Age and Max HR in the linear regression model of the above data set.
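As a check on the reported significance, the t statistic and two-sided p-value for the age coefficient can be recomputed from the estimate and standard error printed in the summary. This sketch uses the rounded values copied from that output, so it agrees with the table only up to rounding.

```r
# Values copied from the summary(lm.r) output
estimate  <- -0.79773   # slope for age
std_error <- 0.06996    # standard error of the slope
df_resid  <- 13         # residual degrees of freedom (15 observations - 2 coefficients)

# t statistic for H0: slope = 0
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

t_value   # about -11.40, as in the summary table
p_value   # on the order of 1e-08, as in the summary table
```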
Using the Auto data set from Assignment 5 (also attached here) perform a Linear Regression analysis using mpg as the dependent variable and the other 4 (displacement, horsepower, weight, acceleration) as independent variables. What is the final linear regression fit equation? Which of the 4 independent variables have a significant impact on mpg? What are their corresponding significance levels? What are the standard errors on each of the coefficients?

Please perform this experiment in two ways. First take any random 40 data points from the entire auto data sample and perform the linear regression fit and measure the 95% confidence intervals. Then, take the entire data set (all 392 points) and perform linear regression and measure the 95% confidence intervals. Please report the resulting fit equation, their significance values and confidence intervals for each of the two runs.

Please submit an R-markdown file documenting your experiments. Your submission should include the final linear fits, and their corresponding significance levels. In addition, you should clearly state what you concluded from looking at the fit and their significance levels.
autodata <- read.table('auto-mpg.data',
col.names = c('displacement', 'horsepower', 'weight', 'acceleration', 'mpg'))
head(autodata)
## displacement horsepower weight acceleration mpg
## 1 307 130 3504 12.0 18
## 2 350 165 3693 11.5 15
## 3 318 150 3436 11.0 18
## 4 304 150 3433 12.0 16
## 5 302 140 3449 10.5 17
## 6 429 198 4341 10.0 15
set.seed(10)
random40 <- autodata[sample(nrow(autodata), 40),]
head(random40)
## displacement horsepower weight acceleration mpg
## 199 250 78 3574 21.0 18.0
## 120 121 112 2868 15.5 19.0
## 167 140 83 2639 17.0 23.0
## 270 156 105 2745 16.7 23.2
## 34 225 105 3439 15.5 16.0
## 88 302 137 4042 14.5 14.0
(lm.r.auto <- lm(mpg ~ .,data=random40))
##
## Call:
## lm(formula = mpg ~ ., data = random40)
##
## Coefficients:
## (Intercept) displacement horsepower weight acceleration
## 44.117698 -0.023242 -0.006429 -0.005408 0.076267
summary(lm.r.auto)
##
## Call:
## lm(formula = mpg ~ ., data = random40)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.2563 -2.6450 -0.3425 2.2191 12.2042
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 44.117698 10.879547 4.055 0.000266 ***
## displacement -0.023242 0.025681 -0.905 0.371646
## horsepower -0.006429 0.075464 -0.085 0.932590
## weight -0.005408 0.003029 -1.785 0.082860 .
## acceleration 0.076267 0.502775 0.152 0.880301
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.502 on 35 degrees of freedom
## Multiple R-squared: 0.7566, Adjusted R-squared: 0.7288
## F-statistic: 27.2 on 4 and 35 DF, p-value: 2.597e-10
lm.r.auto$coefficients
## (Intercept) displacement horsepower weight acceleration
## 44.117697855 -0.023241502 -0.006429309 -0.005408394 0.076266690
The resulting fit equation for the 40-point sample is mpg = 44.117697855 - 0.023241502*displacement - 0.006429309*horsepower - 0.005408394*weight + 0.076266690*acceleration
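Writing the equation out by hand makes it easy to flip a sign, so it can help to build the string from coef() directly. The sketch below uses a hypothetical helper, fit_equation (not part of base R or of this report), and R's built-in mtcars data as a stand-in because auto-mpg.data is not bundled with R.

```r
# Hypothetical helper: format "y = b0 + b1*x1 - b2*x2 ..." from a fitted lm object
fit_equation <- function(fit, digits = 6) {
  b <- coef(fit)
  response <- all.vars(formula(fit))[1]      # name of the dependent variable
  fmt <- paste0("%.", digits, "g")           # significant-digit format
  slopes <- b[-1]                            # everything except the intercept
  terms <- paste0(ifelse(slopes < 0, " - ", " + "),
                  sprintf(fmt, abs(slopes)), "*", names(slopes))
  paste0(response, " = ", sprintf(fmt, b[1]), paste(terms, collapse = ""))
}

# Stand-in data (assumption): mtcars instead of auto-mpg.data
fit <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
fit_equation(fit)
```

With the report's objects, the same call would be fit_equation(lm.r.auto) or fit_equation(lm.r.autodata).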
95% confidence intervals for the 40-point sample:
confint(lm.r.auto, level=0.95)
## 2.5 % 97.5 %
## (Intercept) 22.03104408 6.620435e+01
## displacement -0.07537641 2.889340e-02
## horsepower -0.15962850 1.467699e-01
## weight -0.01155797 7.411781e-04
## acceleration -0.94442052 1.096954e+00
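Each interval reported by confint() for an lm fit is the estimate plus or minus a t critical value times the standard error. A short sketch reproducing the weight row from the rounded summary values (so agreement is only up to rounding):

```r
# Values copied from the summary(lm.r.auto) output (weight row)
estimate  <- -0.005408
std_error <- 0.003029
df_resid  <- 35          # 40 observations - 5 coefficients

# Critical value of the t distribution for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

ci <- estimate + c(-1, 1) * t_crit * std_error
ci  # roughly (-0.01156, 0.00074), matching the weight row of confint()
```

Because this interval contains zero, weight is not significant at the 0.05 level in the 40-point sample, which agrees with its p-value of 0.082860.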
#entire data set
(lm.r.autodata <- lm(mpg ~ .,data=autodata))
##
## Call:
## lm(formula = mpg ~ ., data = autodata)
##
## Coefficients:
## (Intercept) displacement horsepower weight acceleration
## 45.251140 -0.006001 -0.043608 -0.005281 -0.023148
summary(lm.r.autodata)
##
## Call:
## lm(formula = mpg ~ ., data = autodata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.378 -2.793 -0.333 2.193 16.256
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 45.2511397 2.4560447 18.424 < 2e-16 ***
## displacement -0.0060009 0.0067093 -0.894 0.37166
## horsepower -0.0436077 0.0165735 -2.631 0.00885 **
## weight -0.0052805 0.0008109 -6.512 2.3e-10 ***
## acceleration -0.0231480 0.1256012 -0.184 0.85388
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.247 on 387 degrees of freedom
## Multiple R-squared: 0.707, Adjusted R-squared: 0.704
## F-statistic: 233.4 on 4 and 387 DF, p-value: < 2.2e-16
lm.r.autodata$coefficients
## (Intercept) displacement horsepower weight acceleration
## 45.251139699 -0.006000871 -0.043607731 -0.005280508 -0.023147999
The resulting fit equation for the entire data set is mpg = 45.251139699 - 0.006000871*displacement - 0.043607731*horsepower - 0.005280508*weight - 0.023147999*acceleration
95% confidence intervals for the entire data set:
confint(lm.r.autodata, level=0.95)
## 2.5 % 97.5 %
## (Intercept) 40.422278855 50.080000544
## displacement -0.019192122 0.007190380
## horsepower -0.076193029 -0.011022433
## weight -0.006874738 -0.003686277
## acceleration -0.270094049 0.223798050
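From the full-data fit, weight (p = 2.3e-10, significant at the 0.001 level) and horsepower (p = 0.00885, significant at the 0.01 level) have a significant relationship with mpg, and their 95% confidence intervals exclude zero. Displacement (p = 0.37166) and acceleration (p = 0.85388) are not significant, and their intervals include zero. In the 40-point sample, no predictor is significant at the 0.05 level (weight comes closest, with p = 0.082860), and the standard errors are roughly four times larger than with all 392 points, so the confidence intervals are correspondingly wider.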
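The estimates, standard errors, p-values, and confidence intervals can also be collected into one table instead of being read off separate printouts. The sketch below is one way to do it; it uses R's built-in mtcars data as a stand-in (an assumption, since auto-mpg.data is not bundled with R), and the names report and alpha are illustrative.

```r
# Stand-in data (assumption): mtcars instead of auto-mpg.data
fit <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# Coefficient matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- summary(fit)$coefficients
ci <- confint(fit, level = 0.95)

alpha <- 0.05
report <- data.frame(
  estimate  = coef_table[, "Estimate"],
  std_error = coef_table[, "Std. Error"],
  p_value   = coef_table[, "Pr(>|t|)"],
  ci_lower  = ci[, 1],
  ci_upper  = ci[, 2]
)

# A coefficient is flagged when its p-value is below alpha; for a t-test this
# is equivalent to its 95% confidence interval excluding zero
report$significant <- report$p_value < alpha
report
```

Replacing fit with lm.r.auto or lm.r.autodata would produce the same table for the two runs in this report.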