Appendix E Homework Hint

Multiple Models

Note: This problem originally comes from Chapter 5.

Use the dplyr::do() function and the HELPrct data frame from the mosaicData package to fit a regression model predicting cesd as a function of age separately for each of the levels of the substance variable. Generate a table of results (estimates and confidence intervals) for each level of the grouping variable.

Note: The table will have six rows: an intercept row and a slope row for each of the three possible values of the substance variable.

The output should look like this:

substance	estimate	lower	upper
alcohol	22.1792409	13.2178084	31.1406735
alcohol	0.3192241	0.0891633	0.5492849
cocaine	40.0209001	28.8095265	51.2322737
cocaine	-0.3073006	-0.6264170	0.0118159
heroin	41.6192873	33.0700875	50.1684871
heroin	-0.2017824	-0.4504766	0.0469118

Hint

When given a subgroup of the data, the helper function for dplyr::do() should return a data frame with:

- two rows (one for the intercept and another for the slope)
- three columns (one for the coefficient estimate, another for the lower bound of the confidence interval, and a third for the upper bound)

A skeleton for the helper function could look like this:

estConf <- function(group) {
  model <- lm(cesd ~ age, data = group)
  data.frame(
    estimate = ## your code here ##,
    lower =    ## your code here ##,
    upper =    ## your code here ##
  )
}

Notice that the estimate column holds the coefficients of the model fitted for each substance. For example, for cocaine:

library(dplyr)       # provides %>% and filter()
library(mosaicData)  # provides the HELPrct data frame
mod <- lm(cesd ~ age, data = HELPrct %>% filter(substance == "cocaine"))
coef(mod)

## (Intercept)         age 
##  40.0209001  -0.3073006

You should find a way to put these coefficients into the estimate column of the estConf function.
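As a minimal sketch of the idea, the block below uses the built-in mtcars data rather than HELPrct, so it does not solve the exercise directly. The names toy_mod and toy_est are hypothetical. It shows that a named coefficient vector can be passed straight to data.frame() as a column, giving one row per coefficient.

```r
# Illustration on the built-in mtcars data (not HELPrct).
# toy_mod and toy_est are hypothetical names used only for this sketch.
toy_mod <- lm(mpg ~ wt, data = mtcars)

# coef() returns a named numeric vector: one value per model term.
# Used as a column, it yields one row for the intercept and one for the slope.
toy_est <- data.frame(estimate = coef(toy_mod))
toy_est
```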

Notice that the lower column holds the 2.5% bound of the confidence interval and the upper column holds the 97.5% bound:

confint(mod)

##                 2.5 %      97.5 %
## (Intercept) 28.809527 51.23227374
## age         -0.626417  0.01181587

To get only the lower bounds, select the first column of this matrix; the upper bounds can be extracted in the same way from the second column:

confint(mod)[, 1]

## (Intercept)         age 
##   28.809527   -0.626417
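The column selection can also be done by name instead of by position. The following sketch again uses the built-in mtcars data as a stand-in, and the names ci_toy, lower_toy and upper_toy are hypothetical. It assumes the default 95% level, for which confint() labels its columns "2.5 %" and "97.5 %".

```r
# Illustration on the built-in mtcars data (not HELPrct).
# ci_toy, lower_toy and upper_toy are hypothetical names for this sketch.
ci_toy <- confint(lm(mpg ~ wt, data = mtcars))

# confint() returns a matrix with one row per term and two columns.
lower_toy <- ci_toy[, "2.5 %"]    # same values as ci_toy[, 1]
upper_toy <- ci_toy[, "97.5 %"]   # same values as ci_toy[, 2]

lower_toy
upper_toy
```

Selecting by name is slightly more verbose, but it keeps working as intended if the column order is ever not what you expect.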

Find a way to plug these into the function; you will then be able to use it with dplyr::do().
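To see how a helper of this shape combines with dplyr::do(), here is a sketch of the same pattern on a different problem: regressing mpg on wt separately for each value of cyl in the built-in mtcars data. The helper fit_one and the result toy_results are hypothetical names, and the sketch assumes the dplyr package is installed. It is an analogy to the exercise under those assumptions, not the intended solution.

```r
library(dplyr)

# Hypothetical helper with the same shape as estConf:
# takes one group's rows, returns a two-row, three-column data frame.
fit_one <- function(group) {
  model <- lm(mpg ~ wt, data = group)
  ci <- confint(model)           # matrix: one row per term, lower/upper columns
  data.frame(
    estimate = coef(model),
    lower    = ci[, 1],
    upper    = ci[, 2]
  )
}

# group_by() splits the data by cyl; do() calls fit_one() on each group
# (available as `.`) and stacks the returned data frames.
toy_results <- mtcars %>%
  group_by(cyl) %>%
  do(fit_one(.))

toy_results
```

The result has one intercept row and one slope row per group, with the grouping variable added as the first column, which mirrors the layout of the expected output for the substance variable.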