Estimate a multiple regression model in which reading achievement scores (read) are regressed on the highest grade completed by the child’s mother (medu), whether the child was breastfed (breastfed), and whether the child was born at low birth weight (bthwht).
Sys.setlocale("LC_ALL","English")
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
setwd("C:/Users/Qiu J/Desktop/MSSP+DA 2021FALL/MSSP 897-002 Applied Linear Modeling/Assignment/Lab Assignment 3")
NLSY <- read.csv("NLSY-2.csv")  # path is relative to the working directory set by setwd() above
Mod1 <- lm(read ~ medu + breastfed + bthwht, data = NLSY)
summary(Mod1)
Call:
lm(formula = read ~ medu + breastfed + bthwht, data = NLSY)
Residuals:
    Min      1Q  Median      3Q     Max
-50.093  -8.618   0.382   9.148  39.360
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  80.0022     1.3612  58.773  < 2e-16 ***
medu          1.7375     0.1065  16.308  < 2e-16 ***
breastfed     3.8161     0.5305   7.193 7.99e-13 ***
bthwht       -1.9803     0.9705  -2.041   0.0414 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 13.95 on 2972 degrees of freedom
Multiple R-squared: 0.1217, Adjusted R-squared: 0.1208
F-statistic: 137.3 on 3 and 2972 DF, p-value: < 2.2e-16
Q1: What proportion of the variation in the dependent variable is explained by the independent variables (R²)?
A1: About 12% (Multiple R-squared = 0.1217; Adjusted R-squared = 0.1208).
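As a quick check, the adjusted R-squared can be reproduced from the multiple R-squared and the degrees of freedom. The sketch below uses only numbers printed in the summary above. The variable names are illustrative, and n is inferred as the residual degrees of freedom plus the number of estimated coefficients (2972 + 4).

```r
# Values copied from summary(Mod1)
r2       <- 0.1217   # Multiple R-squared
df_resid <- 2972     # residual degrees of freedom
p        <- 3        # number of predictors (medu, breastfed, bthwht)

# Sample size implied by the residual df: n - p - 1 = df_resid
n <- df_resid + p + 1

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)   # 0.1208
```

With nearly 3,000 observations and only three predictors, the adjustment is very small, which is why the two values differ only in the third decimal place.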
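Q2: What is the estimated mean reading achievement score when all independent variables are equal to 0 (the intercept)?
A2: About 80.0 (intercept = 80.0022). This is the predicted reading score when medu, breastfed, and bthwht are all 0.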
Q3: Interpret the unstandardized coefficient for the breastfed variable.
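A3: Holding medu and bthwht constant, a 1-unit increase in breastfed is associated, on average, with a reading score (read) that is 3.82 points higher. If breastfed is coded 0/1, this is the adjusted difference in mean reading scores between children who were breastfed and children who were not.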
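To make the interpretation concrete, the sketch below plugs the estimated coefficients into the regression equation for two hypothetical children who differ only in breastfed. It assumes breastfed and bthwht are coded 0/1, and the value medu = 12 is chosen purely for illustration. The helper predict_read() is not part of the lab; it is defined here only for this example.

```r
# Coefficients copied from summary(Mod1)
b <- c(intercept = 80.0022, medu = 1.7375, breastfed = 3.8161, bthwht = -1.9803)

# Hypothetical helper: predicted reading score from the fitted equation
predict_read <- function(medu, breastfed, bthwht) {
  unname(b["intercept"] + b["medu"] * medu +
           b["breastfed"] * breastfed + b["bthwht"] * bthwht)
}

# Two children with the same medu and bthwht, differing only in breastfed
not_bf <- predict_read(medu = 12, breastfed = 0, bthwht = 0)
bf     <- predict_read(medu = 12, breastfed = 1, bthwht = 0)

# The gap equals the breastfed coefficient whatever values are held fixed
bf - not_bf
```

With the data loaded, the same comparison could be made with predict(Mod1, newdata = ...) instead of typing the coefficients by hand.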
Q4: Using the modelEffectSizes() function in the lmSupport package, produce a table of squared semipartial (dR-sqr) and squared partial (pEta-sqr) correlations.
install.packages("lmSupport")
library(lmSupport)
modelEffectSizes(Mod1)
lm(formula = read ~ medu + breastfed + bthwht, data = NLSY)
Coefficients
                    SSR df pEta-sqr dR-sqr
(Intercept) 672341.5861  1   0.5375     NA
medu         51762.2586  1   0.0821 0.0786
breastfed    10070.5752  1   0.0171 0.0153
bthwht         810.4742  1   0.0014 0.0012
Sum of squared errors (SSE): 578476.1
Sum of squared total (SST): 658638.8
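The two effect-size columns can be recovered from the sums of squares in this table. The sketch below is an illustration of the arithmetic, not a description of the internals of modelEffectSizes(). It assumes dR-sqr is each predictor's SSR divided by SST (the squared semipartial correlation) and pEta-sqr is SSR divided by SSR plus SSE (the squared partial correlation). These formulas reproduce the printed values.

```r
# Values copied from the modelEffectSizes(Mod1) output
sse <- 578476.1   # sum of squared errors
sst <- 658638.8   # total sum of squares
ssr <- c(medu = 51762.2586, breastfed = 10070.5752, bthwht = 810.4742)

# Squared semipartial correlation: share of TOTAL variance uniquely explained
delta_r2 <- ssr / sst

# Squared partial correlation: share of variance NOT explained by the
# other predictors that this predictor explains
partial_eta2 <- ssr / (ssr + sse)

round(cbind(pEta_sqr = partial_eta2, dR_sqr = delta_r2), 4)

# Unsquared semipartial correlations, if those are needed
sqrt(delta_r2)

# The model R-squared follows from the same two totals
model_r2 <- 1 - sse / sst
round(model_r2, 4)   # 0.1217
```

The dR-sqr values sum to less than the model R-squared because part of the explained variance is shared among the predictors and is not attributed uniquely to any one of them.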
Rerun the model after standardizing all of the variables with the scale() function.
NLSY$read_scale <- scale(NLSY$read)
NLSY$medu_scale <- scale(NLSY$medu)
NLSY$breastfed_scale <- scale(NLSY$breastfed)
NLSY$bthwht_scale <- scale(NLSY$bthwht)
Mod1_scale <- lm(read_scale ~ medu_scale + breastfed_scale + bthwht_scale, data = NLSY)
summary(Mod1_scale)
Call:
lm(formula = read_scale ~ medu_scale + breastfed_scale + bthwht_scale,
    data = NLSY)
Residuals:
    Min      1Q  Median      3Q     Max
-3.3667 -0.5792  0.0256  0.6148  2.6453
Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)      1.070e-16  1.719e-02   0.000   1.0000
medu_scale       2.892e-01  1.774e-02  16.308  < 2e-16 ***
breastfed_scale  1.282e-01  1.783e-02   7.193 7.99e-13 ***
bthwht_scale    -3.533e-02  1.732e-02  -2.041   0.0414 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9376 on 2972 degrees of freedom
Multiple R-squared: 0.1217, Adjusted R-squared: 0.1208
F-statistic: 137.3 on 3 and 2972 DF, p-value: < 2.2e-16
Q5: Interpret the standardized coefficient for the medu variable.
A5: Holding breastfed and bthwht constant, a 1-standard-deviation increase in medu is associated, on average, with a 0.289-standard-deviation increase in the dependent variable (read).
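A standardized coefficient is the unstandardized coefficient rescaled by the ratio of standard deviations, beta = b * sd(x) / sd(y). That is why the t values and p-values are identical in the two models. The sketch below checks this identity on simulated data. The data set sim and its variables are invented for the example and are not the NLSY data.

```r
# Simulated data (hypothetical; chosen only to resemble the lab's setup)
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 12, sd = 2)          # continuous predictor
x2 <- rbinom(n, size = 1, prob = 0.5)      # 0/1 predictor
y  <- 80 + 1.7 * x1 + 3.8 * x2 + rnorm(n, sd = 14)
sim <- data.frame(y, x1, x2)

# Unstandardized model and model with every variable z-scored
raw_fit <- lm(y ~ x1 + x2, data = sim)
std_fit <- lm(scale(y) ~ scale(x1) + scale(x2), data = sim)

# Rescale the raw slopes: beta = b * sd(x) / sd(y)
beta_from_raw <- coef(raw_fit)[c("x1", "x2")] *
  c(sd(sim$x1), sd(sim$x2)) / sd(sim$y)

# Slopes from the standardized model (drop the intercept)
beta_from_std <- coef(std_fit)[-1]

# The two sets of standardized slopes agree up to numerical tolerance
same <- isTRUE(all.equal(unname(beta_from_raw), unname(beta_from_std)))
same
```

Applied to the output above, the ratio 0.2892 / 1.7375 (about 0.166) is therefore the implied value of sd(medu) / sd(read) in the NLSY sample.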