Lecture 9 Examples

Count Data and the Poisson Distribution

Count data are non-negative integers (e.g., number of birds visiting a tree). A common likelihood for count data is the Poisson distribution, parameterized by \(\lambda\), the expected count.


Poisson likelihood

## Poisson likelihood examples
dpois(10, 8.9)
## [1] 0.1171969
dpois(1, 8.9)
## [1] 0.001213861
dpois(100, 8.9)
## [1] 1.269919e-67

Counts near \(\lambda = 8.9\) are much more likely than extremely small or large counts.


Key Takeaways

  • Poisson GLMs model count data with a log link to enforce positivity.
  • Coefficients are estimated on the log scale and interpreted via exp().
  • fixef() summarizes posterior beliefs about coefficients.
  • conditional_effects() visualizes expected outcomes with uncertainty.

Poisson GLM: 🦜

A generalized linear model combines: - a linear predictor (intercept + slope × covariate) - a link function (log) - a likelihood (Poisson)

This allows nonlinear relationships on the data scale.


Load and plot the data

fig <- read.csv("fig_visitors.csv")

plot(fig$FIG.DBH, fig$Bird,
     xlab = "Fig DBH",
     ylab = "Number of Birds")


Fit the Poisson model

bird.model <- brm(
  Bird ~ FIG.DBH,
  family = poisson,
  data = fig)
## Compiling Stan program...
## WARNING: Rtools is required to build R packages, but is not currently installed.
## 
## Please download and install the appropriate version of Rtools for 4.5.2 from
## https://cran.r-project.org/bin/windows/Rtools/.
## Trying to compile a simple C file
## Running "C:/PROGRA~1/R/R-45~1.2/bin/x64/Rcmd.exe" SHLIB foo.c
## using C compiler: 'gcc.exe (GCC) 14.3.0'
## gcc  -I"C:/PROGRA~1/R/R-45~1.2/include" -DNDEBUG   -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/Rcpp/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/unsupported"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/BH/include" -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/src/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppParallel/include/" -DRCPP_PARALLEL_USE_TBB=1 -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include "C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp"  -std=c++1y    -I"C:/rtools45/x86_64-w64-mingw32.static.posix/include"      -O2 -Wall -std=gnu2x  -mfpmath=sse -msse2 -mstackrealign   -c foo.c -o foo.o
## cc1.exe: warning: command-line option '-std=c++14' is valid for C++/ObjC++ but not for C
## In file included from C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Core:19,
##                  from C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Dense:1,
##                  from C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
##                  from <command-line>:
## C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
##   679 | #include <cmath>
##       |          ^~~~~~~
## compilation terminated.
## make: *** [C:/PROGRA~1/R/R-45~1.2/etc/x64/Makeconf:289: foo.o] Error 1
## WARNING: Rtools is required to build R packages, but is not currently installed.
## 
## Please download and install the appropriate version of Rtools for 4.5.2 from
## https://cran.r-project.org/bin/windows/Rtools/.
## Start sampling
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 3.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.31 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.071 seconds (Warm-up)
## Chain 1:                0.054 seconds (Sampling)
## Chain 1:                0.125 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 5e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.059 seconds (Warm-up)
## Chain 2:                0.051 seconds (Sampling)
## Chain 2:                0.11 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: Rejecting initial value:
## Chain 3:   Log probability evaluates to log(0), i.e. negative infinity.
## Chain 3:   Stan can't start sampling from this initial value.
## Chain 3: 
## Chain 3: Gradient evaluation took 6e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.072 seconds (Warm-up)
## Chain 3:                0.046 seconds (Sampling)
## Chain 3:                0.118 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 6e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.057 seconds (Warm-up)
## Chain 4:                0.043 seconds (Sampling)
## Chain 4:                0.1 seconds (Total)
## Chain 4:

Inspect posterior draws

bird_draws <- as.data.frame(bird.model)

median(bird_draws$b_Intercept)
## [1] 2.283618
median(bird_draws$b_FIG.DBH)
## [1] 0.002394414

Fixed effects with fixef()

fixef() summarizes the posterior distribution of the model coefficients (on the log scale).

fixef(bird.model)
##              Estimate    Est.Error        Q2.5       Q97.5
## Intercept 2.282875218 0.0842697907 2.114327161 2.446091485
## FIG.DBH   0.002386717 0.0003597236 0.001665182 0.003054481

Extract posterior mean coefficients:

a <- fixef(bird.model)["Intercept", "Estimate"]
b <- fixef(bird.model)["FIG.DBH", "Estimate"]

Plot the model-implied mean

plot(fig$FIG.DBH, fig$Bird,
     xlab = "Fig DBH",
     ylab = "Number of Birds")

curve(exp(a + b * x),
      from = min(fig$FIG.DBH, na.rm = TRUE),
      to   = max(fig$FIG.DBH, na.rm = TRUE),
      add  = TRUE,
      lwd  = 2)

The curve shows the expected number of birds as a function of tree diameter.


Interpreting the intercept

The intercept is the expected log count when FIG.DBH = 0. Exponentiating moves back to the count scale.

b_int <- bird_draws$b_Intercept
baseline_lambda <- exp(b_int)

plot(baseline_lambda,
     ylab = "Expected birds",
     main = "Posterior baseline (FIG.DBH = 0)")

This represents plausible baseline bird visitation rates.


Conditional effects

conditional_effects() shows the expected outcome as a function of a predictor, with posterior uncertainty.

conditional_effects(bird.model)
## Ignoring unknown labels:
## • fill : "NA"
## • colour : "NA"
## Ignoring unknown labels:
## • fill : "NA"
## • colour : "NA"

Categorical Predictor: 🌱

Load shrub data and fit model

shrubs <- read.csv("shrubs.csv")

shrub.model <- brm(
  Shrub.Count ~ Burned.or.unburned,
  family = poisson,
  data = shrubs
)
## Compiling Stan program...
## WARNING: Rtools is required to build R packages, but is not currently installed.
## 
## Please download and install the appropriate version of Rtools for 4.5.2 from
## https://cran.r-project.org/bin/windows/Rtools/.
## Trying to compile a simple C file
## Running "C:/PROGRA~1/R/R-45~1.2/bin/x64/Rcmd.exe" SHLIB foo.c
## using C compiler: 'gcc.exe (GCC) 14.3.0'
## gcc  -I"C:/PROGRA~1/R/R-45~1.2/include" -DNDEBUG   -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/Rcpp/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/unsupported"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/BH/include" -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/src/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/"  -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppParallel/include/" -DRCPP_PARALLEL_USE_TBB=1 -I"C:/Users/jessi/AppData/Local/R/win-library/4.5/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include "C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp"  -std=c++1y    -I"C:/rtools45/x86_64-w64-mingw32.static.posix/include"      -O2 -Wall -std=gnu2x  -mfpmath=sse -msse2 -mstackrealign   -c foo.c -o foo.o
## cc1.exe: warning: command-line option '-std=c++14' is valid for C++/ObjC++ but not for C
## In file included from C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Core:19,
##                  from C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Dense:1,
##                  from C:/Users/jessi/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
##                  from <command-line>:
## C:/Users/jessi/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
##   679 | #include <cmath>
##       |          ^~~~~~~
## compilation terminated.
## make: *** [C:/PROGRA~1/R/R-45~1.2/etc/x64/Makeconf:289: foo.o] Error 1
## WARNING: Rtools is required to build R packages, but is not currently installed.
## 
## Please download and install the appropriate version of Rtools for 4.5.2 from
## https://cran.r-project.org/bin/windows/Rtools/.
## Start sampling
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.017 seconds (Warm-up)
## Chain 1:                0.018 seconds (Sampling)
## Chain 1:                0.035 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 5e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.017 seconds (Warm-up)
## Chain 2:                0.015 seconds (Sampling)
## Chain 2:                0.032 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.016 seconds (Warm-up)
## Chain 3:                0.015 seconds (Sampling)
## Chain 3:                0.031 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.017 seconds (Warm-up)
## Chain 4:                0.014 seconds (Sampling)
## Chain 4:                0.031 seconds (Total)
## Chain 4:

Conditional effects for categories

conditional_effects(shrub.model)
## Ignoring unknown labels:
## • fill : "NA"
## • colour : "NA"
## Ignoring unknown labels:
## • fill : "NA"
## • colour : "NA"


Interpreting shrub model coefficients

shrub_draws <- as.data.frame(shrub.model)

## Baseline expected shrubs (alphabetical baseline category)
b_int <- exp(shrub_draws$b_Intercept)

## Fixed-effect summary
fixef(shrub.model)
##                      Estimate Est.Error      Q2.5    Q97.5
## Intercept           0.9173769  0.288937 0.3236328 1.459327
## Burned.or.unburnedU 1.9658459  0.306268 1.4028865 2.594884

Expected shrub count in the non-baseline category:

exp(0.9314528 + 1.9560232)
## [1] 17.94795

Probability of direction

pos <- as.data.frame(shrub.model)

mean(pos$b_Burned.or.unburnedU > 0)
## [1] 1

There is extremely high posterior certainty that unburned plots have higher shrub counts.