Problem 1.

lowbirthweight <- read.csv("C:/Users/Kajal/Downloads/lowbirthwt.csv")
headcirc_gestageLM<- lm(headcirc ~ gestage, data=lowbirthweight)
summary(headcirc_gestageLM)
## 
## Call:
## lm(formula = headcirc ~ gestage, data = lowbirthweight)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5358 -0.8760 -0.1458  0.9041  6.9041 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.91426    1.82915    2.14   0.0348 *  
## gestage      0.78005    0.06307   12.37   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.59 on 98 degrees of freedom
## Multiple R-squared:  0.6095, Adjusted R-squared:  0.6055 
## F-statistic: 152.9 on 1 and 98 DF,  p-value: < 2.2e-16
anova(headcirc_gestageLM)
## Analysis of Variance Table
## 
## Response: headcirc
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## gestage    1 386.87  386.87  152.95 < 2.2e-16 ***
## Residuals 98 247.88    2.53                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(headcirc_gestageLM)$r.squared 
## [1] 0.6094799

Problem 2.

headcirc_birthwtLM<- lm(headcirc ~ birthwt, data=lowbirthweight)
summary(headcirc_birthwtLM)
## 
## Call:
## lm(formula = headcirc ~ birthwt, data = lowbirthweight)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1622 -0.9399 -0.3071  0.5471 10.0398 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.822e+01  6.447e-01   28.26   <2e-16 ***
## birthwt     7.492e-03  5.699e-04   13.15   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.531 on 98 degrees of freedom
## Multiple R-squared:  0.6381, Adjusted R-squared:  0.6344 
## F-statistic: 172.8 on 1 and 98 DF,  p-value: < 2.2e-16
anova(headcirc_birthwtLM)
## Analysis of Variance Table
## 
## Response: headcirc
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## birthwt    1 405.06  405.06  172.82 < 2.2e-16 ***
## Residuals 98 229.69    2.34                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(headcirc_birthwtLM)$r.squared 
## [1] 0.6381409

Problem 3.

headcirc_lengthLM<- lm(headcirc ~ length, data=lowbirthweight)
summary(headcirc_lengthLM)
## 
## Call:
## lm(formula = headcirc ~ length, data = lowbirthweight)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.0143 -1.0383 -0.2883  0.6013  8.9644 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.84406    1.85829   4.221 5.44e-05 ***
## length       0.50532    0.05024  10.059  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.785 on 98 degrees of freedom
## Multiple R-squared:  0.508,  Adjusted R-squared:  0.503 
## F-statistic: 101.2 on 1 and 98 DF,  p-value: < 2.2e-16
anova(headcirc_lengthLM)
## Analysis of Variance Table
## 
## Response: headcirc
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## length     1 322.45  322.45  101.18 < 2.2e-16 ***
## Residuals 98 312.30    3.19                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(headcirc_lengthLM)$r.squared 
## [1] 0.5079886

Problem 4.

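Birth weight has the highest R-squared of the three simple regressions (0.638, versus 0.609 for gestational age and 0.508 for length), so it has the strongest linear association with head circumference. In a simple linear regression, R-squared is the square of the sample correlation, so the largest R-squared also means the largest \(|r|\). Of the three, birth weight is therefore the best single predictor of head circumference.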
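As a quick check on this comparison, the sketch below recomputes each R-squared from the sums of squares printed in the three `anova()` tables. The variable names (`ss_reg`, `ss_res`, `r_squared`, `abs_r`) are illustrative and not part of the original analysis, and the inputs are the rounded table values, so results agree with `summary()` only to about four decimals.

```r
# Sums of squares copied from the three anova() tables above
# (regression SS and residual SS for each simple model; rounded values).
ss_reg <- c(gestage = 386.87, birthwt = 405.06, length = 322.45)
ss_res <- c(gestage = 247.88, birthwt = 229.69, length = 312.30)

# R-squared = SSR / (SSR + SSE)
r_squared <- ss_reg / (ss_reg + ss_res)

# With a single predictor, |r| = sqrt(R-squared).
abs_r <- sqrt(r_squared)

round(r_squared, 4)
round(abs_r, 4)

# Name of the predictor with the largest R-squared
names(which.max(r_squared))
```

The total sum of squares is the same in all three models because the response is the same, so ranking by R-squared is equivalent to ranking by regression sum of squares.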

Problem 5.

multLM<-lm(headcirc ~ gestage + birthwt, data=lowbirthweight)
summary(multLM)
## 
## Call:
## lm(formula = headcirc ~ gestage + birthwt, data = lowbirthweight)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0350 -0.7271 -0.0765  0.3472  8.5402 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 8.3080154  1.5789429   5.262 8.54e-07 ***
## gestage     0.4487328  0.0672460   6.673 1.56e-09 ***
## birthwt     0.0047123  0.0006312   7.466 3.60e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.274 on 97 degrees of freedom
## Multiple R-squared:  0.752,  Adjusted R-squared:  0.7469 
## F-statistic: 147.1 on 2 and 97 DF,  p-value: < 2.2e-16
anova(multLM)
## Analysis of Variance Table
## 
## Response: headcirc
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## gestage    1 386.87  386.87 238.378 < 2.2e-16 ***
## birthwt    1  90.46   90.46  55.739 3.597e-11 ***
## Residuals 97 157.42    1.62                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
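# Regression sum of squares for the full model: SS(gestage) + SS(birthwt | gestage)
SSRfull <- 386.87 + 90.46
SSRfull
## [1] 477.33
# Overall F = (SSR / 2) / MSE, using the rounded MSE of 1.62 from the anova table
(SSRfull / 2) / 1.62
## [1] 147.3241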

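The overall F-test gives \(F = 147.1\) on 2 and 97 degrees of freedom with \(p < 2.2 \times 10^{-16}\), which is below \(\alpha = 0.05\), so we reject the null hypothesis \(H_0: \beta_{ga} = \beta_{bw} = 0\). The hand-computed value of 147.32 differs slightly from R's 147.1 only because the mean squared error was rounded to 1.62. Rejecting \(H_0\) means that at least one of the two slopes is nonzero. The individual t-tests (\(p = 1.56 \times 10^{-9}\) for gestational age and \(p = 3.60 \times 10^{-11}\) for birth weight) further show that each is a significant predictor of head circumference after adjusting for the other.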
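The overall F-statistic can be reproduced more closely by dividing the residual sum of squares by its degrees of freedom instead of using the rounded mean square. This is a minimal sketch that uses only the numbers printed in the `anova()` table; the variable names are illustrative.

```r
# Values copied from the anova(multLM) table above (rounded as printed).
ss_reg <- 386.87 + 90.46   # SS(gestage) + SS(birthwt | gestage)
ss_res <- 157.42           # residual sum of squares
df_reg <- 2                # number of slopes tested
df_res <- 97               # residual degrees of freedom

# Unrounded mean squared error, then the overall F-statistic
ms_res <- ss_res / df_res
f_stat <- (ss_reg / df_reg) / ms_res

# Upper-tail probability of the F distribution with (2, 97) df
p_value <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

f_stat
p_value
```

Using `lower.tail = FALSE` avoids computing `1 - pf(...)`, which would lose precision for a p-value this small.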

confint(multLM, level=0.95)
##                   2.5 %       97.5 %
## (Intercept) 5.174250734 11.441780042
## gestage     0.315268189  0.582197507
## birthwt     0.003459568  0.005964999
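The intervals reported by `confint()` follow the usual form estimate ± t critical value × standard error, with the t quantile taken on the residual degrees of freedom. As a sketch, the interval for `gestage` can be rebuilt from the coefficient table; the inputs are the printed (rounded) values and the variable names are illustrative.

```r
# Estimate and standard error for gestage from summary(multLM)
estimate  <- 0.4487328
std_error <- 0.0672460
df_res    <- 97            # residual degrees of freedom

# Two-sided 95% interval: estimate +/- t_{0.975, 97} * SE
t_crit <- qt(0.975, df_res)
ci <- estimate + c(-1, 1) * t_crit * std_error

ci
```

The result should agree with the `gestage` row of the `confint()` output up to rounding. Because the interval excludes 0, it is consistent with the significant t-test for gestational age.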

Problem 6.

multLM2<-lm(headcirc ~ gestage + birthwt + length, data=lowbirthweight)
summary(multLM2)
## 
## Call:
## lm(formula = headcirc ~ gestage + birthwt + length, data = lowbirthweight)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0359 -0.7278 -0.0755  0.3469  8.5403 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.3200783  1.9555657   4.255 4.87e-05 ***
## gestage      0.4489698  0.0712246   6.304 8.83e-09 ***
## birthwt      0.0047183  0.0008521   5.537 2.67e-07 ***
## length      -0.0006928  0.0656161  -0.011    0.992    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.281 on 96 degrees of freedom
## Multiple R-squared:  0.752,  Adjusted R-squared:  0.7442 
## F-statistic: 97.03 on 3 and 96 DF,  p-value: < 2.2e-16
anova(multLM2)
## Analysis of Variance Table
## 
## Response: headcirc
##           Df Sum Sq Mean Sq  F value    Pr(>F)    
## gestage    1 386.87  386.87 235.9203 < 2.2e-16 ***
## birthwt    1  90.46   90.46  55.1642 4.536e-11 ***
## length     1   0.00    0.00   0.0001    0.9916    
## Residuals 96 157.42    1.64                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
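# Regression sum of squares for the three-predictor model (length adds about 0)
SSRfull <- 386.87 + 90.46 + 0.0
SSRfull
## [1] 477.33
# Overall F = (SSR / 3) / MSE, using the rounded MSE of 1.64 from the anova table
(SSRfull / 3) / 1.64
## [1] 97.01829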

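The overall F-test gives \(F = 97.03\) on 3 and 96 degrees of freedom with \(p < 2.2 \times 10^{-16}\), which is below \(\alpha = 0.05\), so we reject the null hypothesis \(H_0: \beta_{ga} = \beta_{bw} = \beta_{l} = 0\). This means that at least one of the three slopes is nonzero; it does not mean that all three predictors are significant. The individual t-tests show that gestational age (\(p = 8.83 \times 10^{-9}\)) and birth weight (\(p = 2.67 \times 10^{-7}\)) remain significant, but length does not (\(p = 0.992\)) once the other two are in the model. Adding length leaves the multiple R-squared at 0.752 and lowers the adjusted R-squared from 0.7469 to 0.7442, so length contributes essentially nothing beyond gestational age and birth weight.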
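The contribution of length can be checked from the coefficient table alone. For a single term entered last, the partial F-statistic equals the square of that term's t-statistic. The sketch below uses the printed (rounded) estimate and standard error; the variable names are illustrative.

```r
# Estimate and standard error for length from summary(multLM2)
estimate  <- -0.0006928
std_error <- 0.0656161
df_res    <- 96            # residual degrees of freedom

# t-statistic and the equivalent partial F (1 and 96 df)
t_stat    <- estimate / std_error
f_partial <- t_stat^2

# Two-sided p-value from the t distribution
p_value <- 2 * pt(abs(t_stat), df_res, lower.tail = FALSE)

t_stat
f_partial
p_value
```

If the fitted objects are available, `anova(multLM, multLM2)` performs the same nested-model comparison directly.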

confint(multLM2, level=0.95)
##                   2.5 %       97.5 %
## (Intercept)  4.43831099 12.201845545
## gestage      0.30759008  0.590349620
## birthwt      0.00302685  0.006409729
## length      -0.13093973  0.129554084