Starting point/premise:
Over the course of the semester, we use ESM to ping students every time (or nearly every time) they view the calendar or view the syllabus or view assignment instruction. What we find is that 95% of the time whenever students “view the calendar” they report via ESM that they are “planning”, whereas 90% of the time whenever they “view the syllabus” they report via ESM that they are “planning”, and only 50% of the time when they “view assignment instructions” they are “planning”. It turns out that the “view assignment instructions” ESM data systematically varies: if a student does this prior to the assignment due date, they report “planning” but if they do this on the day the assignment is due, they report “doing assignment”. To me, the ESM data are our “ground truth” – I think in most cases we would “believe” what a student reports via ESM unless their report is uninterpretable (e.g., “I was chillin’”) or really implausible (e.g., they viewed the syllabus but reported they were “submitting a quiz response”). I would interpret this to mean that “view the calendar” and “view the syllabus” can be inferred as evidence of “planning” regardless of when they enact that behavior whereas the “view assignment instructions” digital trace can be interpreted with high validity when we take into account timing (i.e., either “planning” or “doing assignment”). ## Loading, setting up
library(tidyverse)
library(palmerpenguins)
library(margins)
library(prediction)
library(corrr)
library(sjPlot)
library(lme4)
library(brms)
Let’s make this a relevant-appearing situation. While we’re using data on penguins, let’s imagine that we’re modeling the presence of a planning reported via ESM as a function of a certain behavior measured through log data. For now, let us hold aside whether ESM is really ground truth (and/or how to account for measurement error in those responses. To simplify further, let’s consider any independent variables to be standardized with M = 0, SD = 1. For now, we’ll consider planning to be a binary variable (1 = planning, 0 = not planning)
penguins <- penguins %>%
mutate(large_flipper = if_else(flipper_length_mm < mean(flipper_length_mm, na.rm = TRUE), 0, 1))
penguins <- penguins %>%
mutate(body_mass_g = as.numeric(scale(body_mass_g))) %>%
rename(planning_esm = large_flipper,
planning_behavior = body_mass_g)
penguins %>%
select(planning_esm, planning_behavior) %>%
correlate() %>%
knitr::kable()
term | planning_esm | planning_behavior |
---|---|---|
planning_esm | NA | 0.7781774 |
planning_behavior | 0.7781774 | NA |
Let’s predict planning_esm
on the basis of planning_behavior
.
model_logit <- glm(planning_esm ~ planning_behavior,
data = penguins,
family = binomial(link = "logit"))
it’s a simple model:
equatiomatic::extract_eq(model_logit)
\[ \log\left[ \frac { P( \operatorname{planning\_esm} = \operatorname{0} ) }{ 1 - P( \operatorname{planning\_esm} = \operatorname{0} ) } \right] = \alpha + \beta_{1}(\operatorname{planning\_behavior}) \]
these estimates are in log-odds units; they can be exponentiated to obtain an odds ratio,
summary(model_logit)
##
## Call:
## glm(formula = planning_esm ~ planning_behavior, family = binomial(link = "logit"),
## data = penguins)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1978 -0.3788 -0.1192 0.2624 2.8251
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.3022 0.1909 -1.583 0.113
## planning_behavior 3.6707 0.3974 9.237 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 467.91 on 341 degrees of freedom
## Residual deviance: 180.96 on 340 degrees of freedom
## (2 observations deleted due to missingness)
## AIC: 184.96
##
## Number of Fisher Scoring iterations: 6
https://cran.r-project.org/web/packages/margins/vignettes/TechnicalDetails.pdf
For a logistic regression model, for example, we may want to interpret marginal effects (sometimes “partial effects”) on the scale of the observed outcome, so that we can understand the marginal effects as changes in the predicted probability of the outcome. By default, margins sets type = “response”. This can, however, be modified. For GLMs, this could be set to type = “link” in order to calculate true marginal effects on the scale of the linear predictor.
model_logit_margins <- margins(model_logit)
summary(model_logit_margins)
## factor AME SE z p lower upper
## planning_behavior 0.2946 0.0062 47.2972 0.0000 0.2824 0.3068
Using the estimates, let’s obtain predictions for all of the cases. The predicted values are on a probabiity scale (0-1). Let’s just look at 30 predictions.
Generally, to the extent that this is a predictive model, when fitted
is higher, planning_esm
should be higher, too,
model_logit_predictions <- prediction::prediction(model_logit, penguins)
as_tibble(model_logit_predictions) %>%
select(planning_esm, fitted, se.fitted, planning_behavior, ) %>%
arrange(fitted) %>%
sample_n(30) %>%
knitr::kable()
planning_esm | fitted | se.fitted | planning_behavior |
---|---|---|---|
0 | 0.0402744 | 0.0139911 | -0.7815336 |
0 | 0.0769623 | 0.0214096 | -0.5944905 |
0 | 0.0074848 | 0.0039299 | -1.2491411 |
0 | 0.2695759 | 0.0401826 | -0.1892307 |
1 | 0.9195391 | 0.0262747 | 0.7459845 |
0 | 0.0131873 | 0.0061394 | -1.0932719 |
0 | 0.1051689 | 0.0257511 | -0.5009690 |
0 | 0.1893112 | 0.0347451 | -0.3139260 |
0 | 0.0323021 | 0.0119798 | -0.8438812 |
1 | 0.9415518 | 0.0212964 | 0.8395060 |
0 | 0.0206838 | 0.0086479 | -0.9685766 |
1 | 0.8206367 | 0.0407325 | 0.4965938 |
1 | 0.9475400 | 0.0197741 | 0.8706798 |
1 | 0.9999088 | 0.0000967 | 2.6164147 |
1 | 0.9475400 | 0.0197741 | 0.8706798 |
1 | 0.9929911 | 0.0041968 | 1.4318089 |
1 | 0.9349270 | 0.0228908 | 0.8083321 |
0 | 0.6468527 | 0.0493587 | 0.2472031 |
0 | 0.1051689 | 0.0257511 | -0.5009690 |
1 | 0.9977573 | 0.0016147 | 1.7435472 |
0 | 0.0231333 | 0.0093982 | -0.9374027 |
1 | 0.8785088 | 0.0335638 | 0.6212891 |
1 | 0.9912045 | 0.0050527 | 1.3694612 |
1 | 0.8206367 | 0.0407325 | 0.4965938 |
1 | 0.8785088 | 0.0335638 | 0.6212891 |
0 | 0.0692167 | 0.0200405 | -0.6256644 |
1 | 0.8518929 | 0.0372412 | 0.5589414 |
1 | 0.9944168 | 0.0034787 | 1.4941565 |
0 | 0.0692167 | 0.0200405 | -0.6256644 |
0 | 0.8785088 | 0.0335638 | 0.6212891 |
The predictions on a probability scale are valuable. For instance, this code shows us that students are reporting planning around 43% (+- 4%) of the time.
my_subset_predictions <- model_logit_predictions %>%
select(planning_esm, planning_behavior, sex, fitted, se.fitted) %>%
as_tibble()
# overall predicted probability and SE (in probability units)
my_subset_predictions %>%
summarize(mean_fitted = mean(fitted, na.rm = TRUE),
mean_se.fitted = mean(se.fitted, na.rm = TRUE))
## # A tibble: 1 x 2
## mean_fitted mean_se.fitted
## <dbl> <dbl>
## 1 0.433 0.0207
We can also examine differences in (in this case) sex:
# sex predicted probability and SE (in probability units)
my_subset_predictions %>%
filter(!is.na(sex)) %>%
group_by(sex) %>%
summarize(mean_fitted = mean(fitted, na.rm = TRUE),
mean_se.fitted = mean(se.fitted, na.rm = TRUE))
## # A tibble: 2 x 3
## sex mean_fitted mean_se.fitted
## <fct> <dbl> <dbl>
## 1 female 0.310 0.0186
## 2 male 0.555 0.0227
Females were planning around 31% of the time (+- ~4%) and males around 56% of the time (+- 4.5%).
Let’s predict planning_esm
on the basis of planning_behavior
. We can add time_point
(year in the original data) and student
(species in the original data), estimating a model that considers responses to be nested within both time points and classes. We see that much of the variation is within students - around 74%. 8% is within time point.
penguins <- penguins %>%
rename(time_point = year,
student = species)
library(lme4)
model_logit_nested <- glmer(planning_esm ~ planning_behavior + (1|time_point) + (1|student),
data = penguins,
family = binomial(link = "logit"))
summary(model_logit_nested)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: planning_esm ~ planning_behavior + (1 | time_point) + (1 | student)
## Data: penguins
##
## AIC BIC logLik deviance df.resid
## 115.4 130.7 -53.7 107.4 338
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4182 -0.1630 -0.0311 0.0149 12.0711
##
## Random effects:
## Groups Name Variance Std.Dev.
## time_point (Intercept) 1.486 1.219
## student (Intercept) 13.881 3.726
## Number of obs: 342, groups: time_point, 3; student, 3
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.256103 0.002258 556.4 <2e-16 ***
## planning_behavior 3.297524 0.002258 1460.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## plnnng_bhvr 0.000
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.0313688 (tol = 0.002, component 1)
performance::icc(model_logit_nested, by_group = TRUE)
## # ICC by Group
##
## Group | ICC
## ------------------
## time_point | 0.080
## student | 0.744
equatiomatic::extract_eq(model_logit_nested)
\[ \begin{aligned} \operatorname{planning\_esm}_{i} &\sim \operatorname{Binomial}(n = 1, \operatorname{prob}_{\operatorname{planning\_esm} = 1} = \widehat{P}) \\ \log\left[\frac{\hat{P}}{1 - \hat{P}} \right] &=\alpha_{j[i],k[i]} + \beta_{1}(\operatorname{planning\_behavior}) \\ \alpha_{j} &\sim N \left(\mu_{\alpha_{j}}, \sigma^2_{\alpha_{j}} \right) \text{, for time_point j = 1,} \dots \text{,J} \\ \alpha_{k} &\sim N \left(\mu_{\alpha_{k}}, \sigma^2_{\alpha_{k}} \right) \text{, for student k = 1,} \dots \text{,K} \end{aligned} \]
We can again examine differences in (in this case) sex:
model_logit_nested_predictions <- prediction::prediction(model_logit_nested, penguins)
my_subset_nested_predictions <- model_logit_nested_predictions %>%
select(planning_esm, planning_behavior, sex, fitted, se.fitted) %>%
as_tibble()
# sex predicted probability and SE (in probability units)
my_subset_nested_predictions %>%
filter(!is.na(sex)) %>%
group_by(sex) %>%
summarize(mean_fitted = mean(fitted, na.rm = TRUE),
mean_se.fitted = mean(se.fitted, na.rm = TRUE))
## # A tibble: 2 x 3
## sex mean_fitted mean_se.fitted
## <fct> <dbl> <dbl>
## 1 female 0.381 NaN
## 2 male 0.483 NaN
It looks like we may have to fit a model with MCMC to propogate uncertainty with our multi-level model. There seems to be much more uncertainty; let’s explore that further next.
model_logit_nested_brms <- brm(planning_esm ~ planning_behavior + (1|time_point) + (1|student),
data = penguins,
family = bernoulli(),
cores = 4,
chains = 4,
iter = 1000)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -DSTAN_THREADS -DBOOST_NO_AUTO_PTR -include '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp' -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1 -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
model_logit_nested_brms
## Family: bernoulli
## Links: mu = logit
## Formula: planning_esm ~ planning_behavior + (1 | time_point) + (1 | student)
## Data: penguins (Number of observations: 342)
## Samples: 4 chains, each with iter = 1000; warmup = 500; thin = 1;
## total post-warmup samples = 2000
##
## Group-Level Effects:
## ~student (Number of levels: 3)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 4.25 2.04 1.73 9.96 1.00 1150 1006
##
## ~time_point (Number of levels: 3)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.85 1.18 0.47 5.07 1.00 666 582
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.62 1.87 -3.05 4.51 1.00 1101 890
## planning_behavior 3.39 0.70 2.05 4.79 1.00 1932 1345
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
model_logit_nested_predictions_brms <- predict(model_logit_nested_brms, penguins)
model_logit_nested_predictions_brms <- bind_cols(as_tibble(model_logit_nested_predictions_brms), penguins)
my_subset_nested_predictions_brms <- model_logit_nested_predictions_brms %>%
select(planning_esm, planning_behavior, sex, Estimate, Est.Error) %>%
as_tibble()
# sex predicted probability and SE (in probability units)
my_subset_nested_predictions_brms %>%
filter(!is.na(sex)) %>%
group_by(sex) %>%
summarize(mean_fitted = mean(Estimate, na.rm = TRUE),
mean_se.fitted = mean(Est.Error, na.rm = TRUE))
## # A tibble: 2 x 3
## sex mean_fitted mean_se.fitted
## <fct> <dbl> <dbl>
## 1 female 0.381 0.111
## 2 male 0.483 0.175
Why so much more uncertainty? Maybe it’s the multi-level structure, somehow? Let’s fit a simpler model. It’s even more uncertain (which makes sense since we removed degrees of freedom when we added the multi-level structure). Not sure why.
model_logit_nested_brms_1 <- brm(planning_esm ~ planning_behavior,
data = penguins,
family = bernoulli(),
cores = 4,
chains = 4,
iter = 1000)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -DSTAN_THREADS -DBOOST_NO_AUTO_PTR -include '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp' -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1 -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
model_logit_nested_brms_1
## Family: bernoulli
## Links: mu = logit
## Formula: planning_esm ~ planning_behavior
## Data: penguins (Number of observations: 342)
## Samples: 4 chains, each with iter = 1000; warmup = 500; thin = 1;
## total post-warmup samples = 2000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.30 0.18 -0.68 0.06 1.00 1656 1369
## planning_behavior 3.75 0.39 3.05 4.56 1.00 1287 1179
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
model_logit_nested_predictions_brms_1 <- predict(model_logit_nested_brms_1, penguins)
model_logit_nested_predictions_brms_1 <- bind_cols(as_tibble(model_logit_nested_predictions_brms_1), penguins)
my_subset_nested_predictions_brms_1 <- model_logit_nested_predictions_brms_1 %>%
select(planning_esm, planning_behavior, sex, Estimate, Est.Error) %>%
as_tibble()
# sex predicted probability and SE (in probability units)
my_subset_nested_predictions_brms_1 %>%
filter(!is.na(sex)) %>%
group_by(sex) %>%
summarize(mean_fitted = mean(Estimate, na.rm = TRUE),
mean_se.fitted = mean(Est.Error, na.rm = TRUE))
## # A tibble: 2 x 3
## sex mean_fitted mean_se.fitted
## <fct> <dbl> <dbl>
## 1 female 0.310 0.220
## 2 male 0.554 0.258
Without a prior, the log-odds coefficient for the relationship between planning_behavior
and planning_esm
was 3.74 (SE = 0.40). Let’s say we think a priori that it’s very high. I’m getting a bit lost in the marginal effects on the response scale and calculating the probability from the log-odds and odds, so this will just be a very very rough start.
exp(3.74) # odds
## [1] 42.09799
exp(3.74)/(exp(3.74)+1) # probability - this seems too high, especially given the marginal effects
## [1] 0.9767971
Let’s say our prior is a bit lower:
x_to_p <- function(x) {
exp(x)/(exp(x)+1)
}
seq(0, 10, by = .10) %>%
map_dbl(x_to_p) # oh, it's 0
## [1] 0.5000000 0.5249792 0.5498340 0.5744425 0.5986877 0.6224593 0.6456563
## [8] 0.6681878 0.6899745 0.7109495 0.7310586 0.7502601 0.7685248 0.7858350
## [15] 0.8021839 0.8175745 0.8320184 0.8455347 0.8581489 0.8698915 0.8807971
## [22] 0.8909032 0.9002495 0.9088770 0.9168273 0.9241418 0.9308616 0.9370266
## [29] 0.9426758 0.9478464 0.9525741 0.9568927 0.9608343 0.9644288 0.9677045
## [36] 0.9706878 0.9734030 0.9758730 0.9781187 0.9801597 0.9820138 0.9836975
## [43] 0.9852260 0.9866131 0.9878716 0.9890131 0.9900482 0.9909867 0.9918374
## [50] 0.9926085 0.9933071 0.9939402 0.9945137 0.9950332 0.9955037 0.9959299
## [57] 0.9963158 0.9966652 0.9969816 0.9972680 0.9975274 0.9977622 0.9979747
## [64] 0.9981671 0.9983412 0.9984988 0.9986415 0.9987706 0.9988875 0.9989932
## [71] 0.9990889 0.9991756 0.9992540 0.9993249 0.9993891 0.9994472 0.9994998
## [78] 0.9995474 0.9995904 0.9996294 0.9996646 0.9996966 0.9997254 0.9997515
## [85] 0.9997752 0.9997966 0.9998159 0.9998334 0.9998493 0.9998636 0.9998766
## [92] 0.9998883 0.9998990 0.9999086 0.9999173 0.9999252 0.9999323 0.9999387
## [99] 0.9999446 0.9999498 0.9999546
We can specify a prior of 0 with a SD of, let’s say, .50. This should lead to a smaller estimated relationship between planning_behavior
and planning_esm
.
model_logit_nested_brms_2 <- brm(planning_esm ~ planning_behavior,
prior = c(prior(normal(0, .50),
coef = planning_behavior)),
data = penguins,
family = bernoulli(),
cores = 4,
chains = 4,
iter = 1000)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel/include/" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -DSTAN_THREADS -DBOOST_NO_AUTO_PTR -include '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp' -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1 -I/usr/local/include -fPIC -Wall -g -O2 -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
## ^
## ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
## ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
model_logit_nested_brms_2
## Family: bernoulli
## Links: mu = logit
## Formula: planning_esm ~ planning_behavior
## Data: penguins (Number of observations: 342)
## Samples: 4 chains, each with iter = 1000; warmup = 500; thin = 1;
## total post-warmup samples = 2000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.31 0.16 -0.62 0.00 1.00 1875 1446
## planning_behavior 2.62 0.23 2.18 3.11 1.00 1767 1237
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
model_logit_nested_predictions_brms_2 <- predict(model_logit_nested_brms_2, penguins)
model_logit_nested_predictions_brms_2 <- bind_cols(as_tibble(model_logit_nested_predictions_brms_2), penguins)
my_subset_nested_predictions_brms_2 <- model_logit_nested_predictions_brms_2 %>%
select(planning_esm, planning_behavior, sex, Estimate, Est.Error) %>%
as_tibble()
# sex predicted probability and SE (in probability units)
my_subset_nested_predictions_brms_2 %>%
filter(!is.na(sex)) %>%
group_by(sex) %>%
summarize(mean_fitted = mean(Estimate, na.rm = TRUE),
mean_se.fitted = mean(Est.Error, na.rm = TRUE))
## # A tibble: 2 x 3
## sex mean_fitted mean_se.fitted
## <fct> <dbl> <dbl>
## 1 female 0.310 0.296
## 2 male 0.557 0.318