Loading the dataset from the Desktop

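# Pick the tab-delimited data file interactively
LungCapData <- read.delim(file.choose(), header = TRUE)
# attach() puts the columns on the search path so later calls can use bare names
attach(LungCapData)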
names(LungCapData)
## [1] "LungCap"   "Age"       "Height"    "Smoke"     "Gender"    "Caesarean"
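
For a reproducible run, the interactive file picker can be swapped for an explicit path. The sketch below is an assumption rather than the original workflow: it writes a tiny tab-delimited file with made-up rows to a temporary location (the names `demo_path`, `demo_rows` and `demo_data` are hypothetical, and only four of the six columns are mimicked) and reads it back with `read.delim()`.

```r
# Hypothetical stand-in for the real data file: three made-up rows,
# written to a temporary tab-delimited file.
demo_path <- tempfile(fileext = ".txt")
demo_rows <- data.frame(
  LungCap = c(6.5, 10.1, 9.6),
  Age     = c(6, 18, 16),
  Height  = c(62.1, 74.7, 69.7),
  Smoke   = c("no", "yes", "no")
)
write.table(demo_rows, demo_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim() already defaults to sep = "\t" and header = TRUE,
# so only the path is needed.
demo_data <- read.delim(demo_path)
names(demo_data)
str(demo_data)
```

With the real file, the path passed to `read.delim()` would replace `demo_path`.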





Linear Regression Model with 2 explanatory variables

mmod <- lm(LungCap ~ Age + Height)
summary(mmod)
## 
## Call:
## lm(formula = LungCap ~ Age + Height)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4080 -0.7097 -0.0078  0.7167  3.1679 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -11.747065   0.476899 -24.632  < 2e-16 ***
## Age           0.126368   0.017851   7.079 3.45e-12 ***
## Height        0.278432   0.009926  28.051  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.056 on 722 degrees of freedom
## Multiple R-squared:  0.843,  Adjusted R-squared:  0.8425 
## F-statistic:  1938 on 2 and 722 DF,  p-value: < 2.2e-16
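
To make the coefficient table concrete, the fitted equation can be evaluated by hand. The sketch below copies the three estimates from the output above and applies them to a hypothetical child; the values `age = 10` and `height = 60` are illustrative assumptions, not a row from the dataset.

```r
# Coefficients copied from the summary(mmod) output above.
b0       <- -11.747065  # intercept
b_age    <-   0.126368  # change in LungCap per extra year of age, height held fixed
b_height <-   0.278432  # change in LungCap per extra unit of height, age held fixed

# Hypothetical child (illustrative values only).
age    <- 10
height <- 60

# Fitted equation: LungCap = b0 + b_age * Age + b_height * Height
pred <- b0 + b_age * age + b_height * height
pred
```

With `mmod` still in the session, `predict(mmod, newdata = data.frame(Age = 10, Height = 60))` should give the same value up to rounding of the printed coefficients.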





Examining the residual plots

Checking linear regression assumptions

par(mfrow = c(2, 2))
plot(mmod)
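
The four panels drawn by `plot()` on an `lm` object are built from the residuals, fitted values, standardized residuals and leverages. The sketch below rebuilds rough equivalents by hand. It runs on simulated data (an assumption made so the example is self-contained; the names `sim_age`, `sim_height`, `sim_lung` and `fit` are hypothetical), so the pictures will not match those for `mmod`.

```r
# Simulated data for illustration only; this is not LungCapData.
set.seed(1)
n          <- 200
sim_age    <- runif(n, min = 3, max = 19)
sim_height <- 45 + 1.5 * sim_age + rnorm(n, sd = 3)
sim_lung   <- -11.7 + 0.13 * sim_age + 0.28 * sim_height + rnorm(n, sd = 1)

fit <- lm(sim_lung ~ sim_age + sim_height)

res     <- residuals(fit)   # observed minus fitted
fit_val <- fitted(fit)      # fitted values
std_res <- rstandard(fit)   # standardized residuals
lev     <- hatvalues(fit)   # leverage of each observation

# Hand-built versions of the four diagnostic panels.
old_par <- par(mfrow = c(2, 2))
plot(fit_val, res, xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs Fitted")        # linearity: no trend expected
abline(h = 0, lty = 2)
qqnorm(std_res)                            # normality of the errors
qqline(std_res)
plot(fit_val, sqrt(abs(std_res)), xlab = "Fitted values",
     ylab = "sqrt(|standardized residuals|)",
     main = "Scale-Location")              # constant variance: flat band expected
plot(lev, std_res, xlab = "Leverage", ylab = "Standardized residuals",
     main = "Residuals vs Leverage")       # influential points
par(old_par)                               # restore the previous layout
```

Saving the value returned by `par()` and restoring it afterwards keeps the 2 x 2 layout from leaking into later plots.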





Linear Regression Model with all explanatory variables

mmod1 <- lm(LungCap ~ Age + Height + Smoke + Gender + Caesarean)
summary(mmod1)
## 
## Call:
## lm(formula = LungCap ~ Age + Height + Smoke + Gender + Caesarean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3388 -0.7200  0.0444  0.7093  3.0172 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -11.32249    0.47097 -24.041  < 2e-16 ***
## Age            0.16053    0.01801   8.915  < 2e-16 ***
## Height         0.26411    0.01006  26.248  < 2e-16 ***
## Smokeyes      -0.60956    0.12598  -4.839 1.60e-06 ***
## Gendermale     0.38701    0.07966   4.858 1.45e-06 ***
## Caesareanyes  -0.21422    0.09074  -2.361   0.0185 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.02 on 719 degrees of freedom
## Multiple R-squared:  0.8542, Adjusted R-squared:  0.8532 
## F-statistic: 842.8 on 5 and 719 DF,  p-value: < 2.2e-16
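
The adjusted R-squared in the output above can be checked from the other reported numbers. The sketch below uses the standard adjustment formula with values copied from the `summary(mmod1)` output; the sample size is inferred from the residual degrees of freedom plus the six estimated coefficients, which is an assumption that no rows were dropped for other reasons.

```r
r2       <- 0.8542          # Multiple R-squared, as printed (rounded)
df_resid <- 719             # residual degrees of freedom
p        <- 5               # number of predictors (excluding the intercept)
n        <- df_resid + p + 1  # inferred number of observations

# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)
```

Because the input is the rounded R-squared, the result should agree with the printed 0.8532 only to about four decimal places.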





Examining the residual plots

Checking linear regression assumptions

par(mfrow = c(2, 2))
plot(mmod1)