Appendix E Homework Hint

Multiple Models

Note: This problem originally comes from Chapter 5.

Use the dplyr::do() function and the HELPrct data frame from the mosaicData package to fit a regression model predicting cesd as a function of age separately for each of the levels of the substance variable. Generate a table of results (estimates and confidence intervals) for each level of the grouping variable.

Note: The table will have six rows: an intercept row and a slope row for each of the three possible values of the substance variable.

The output should look like this:

substance    estimate       lower       upper
alcohol    22.1792409  13.2178084  31.1406735
alcohol     0.3192241   0.0891633   0.5492849
cocaine    40.0209001  28.8095265  51.2322737
cocaine    -0.3073006  -0.6264170   0.0118159
heroin     41.6192873  33.0700875  50.1684871
heroin     -0.2017824  -0.4504766   0.0469118

Hint

When given a subgroup of the data, the helper function for dplyr::do() should return a data frame with:

  • two rows (one for the intercept and another for the slope)
  • three columns (one for the estimate of the coefficient, another for the lower bound of the confidence interval, a third for the upper bound)

A skeleton for the helper function could look like this:

estConf <- function(group) {
  model <- lm(cesd ~ age, data = group)
  data.frame(
    estimate = ## your code here ##,
    lower =    ## your code here ##,
    upper =    ## your code here ##
  )
}

Notice that the estimate column holds the coefficients of the model fitted to each substance. For example, for cocaine:

library(dplyr)       # provides %>% and filter()
library(mosaicData)  # provides the HELPrct data frame
mod <- lm(cesd ~ age, data = HELPrct %>% filter(substance == "cocaine"))
coef(mod)
## (Intercept)         age 
##  40.0209001  -0.3073006

Find a way to put these values into the estimate column inside the estConf function.

Notice that the lower column holds the 2.5% bounds of the confidence intervals, and the upper column holds the 97.5% bounds:

confint(mod)
##                 2.5 %      97.5 %
## (Intercept) 28.809527 51.23227374
## age         -0.626417  0.01181587

To get only the lower bounds, select the first column of this matrix; the upper bounds can be extracted in a similar way:

confint(mod)[, 1]
## (Intercept)         age 
##   28.809527   -0.626417
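
A column of this matrix can also be selected by name rather than by position, which can make the intent easier to read. The following is a minimal sketch on the built-in mtcars data, not the HELPrct model above; the model mpg ~ wt and the names toy_mod, ci, and first_col are illustrative assumptions.

```r
# Illustrative only: a toy model on the built-in mtcars data,
# not the homework model.
toy_mod <- lm(mpg ~ wt, data = mtcars)
ci <- confint(toy_mod)  # a matrix with one row per coefficient

# The columns of the matrix are named after the interval bounds.
colnames(ci)

# Selecting a column by name returns the same named vector
# as selecting it by position.
first_col <- ci[, "2.5 %"]
identical(first_col, ci[, 1])
```

Note the space inside the column name ("2.5 %"), which matches the header printed by confint().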

Find a way to plug these into the function; then you will be able to use it with dplyr::do().
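
To see how a helper function and dplyr::do() fit together, here is a minimal sketch on the built-in mtcars data. It is deliberately not the homework solution: the helper rangeSummary and its columns stat and value are hypothetical names chosen for illustration, and the helper reports a minimum and maximum instead of regression results.

```r
library(dplyr)

# Hypothetical helper: receives the rows of one group and
# returns a data frame with two rows for that group.
rangeSummary <- function(group) {
  data.frame(
    stat  = c("min", "max"),
    value = c(min(group$mpg), max(group$mpg))
  )
}

# do() calls the helper once per group; inside do(), the dot
# stands for the current group's rows.
result <- mtcars %>%
  group_by(cyl) %>%
  do(rangeSummary(.))

result
```

Because cyl has three distinct values and the helper returns two rows per group, result has six rows, with the grouping column added in front of the helper's columns. The homework table has the same shape, with substance in place of cyl.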

Hunter Nosek

29 October, 2018