Easy

10E1

p_to_logodds <- function(p){log(p/(1-p))}
p_to_logodds(0.35)

## [1] -0.6190392

10E2

logodds_to_p <- function(logodds){exp(logodds)/(1+exp(logodds))}
logodds_to_p(-0.62)

## [1] 0.3497815

10E3

Coefficients on the log-odds scale, so for the odds this means a relative change of:

exp(1.7)

## [1] 5.473947

10E4

Because samples may have been counted on different scales, ie counting birds for 4 or 5 hours. The offset can then normalize this difference without needing to scale the original counts and throwing out data.

Medium

10M1

The binomial distribution in aggregated form has an extra coefficient, the multiplicity, the number of ways that the sequence could be reordered. In non-aggregated form the sequence is known.

10M2

The log-odds increase, what this means for outcome on the actual outcome scale depends on the odds with and without the variable with this coefficient.

10M3

In binomial models, we model a proportion, p, with a linear model. The original scale, p, runs from 0 to 1. The logit link scales a linear line to values between 0 and 1, where the 0-line represents a p=0.5 . As logit functions become very large or small, they have increasingly less influence on the original scale.

10M4

In poisson models, we are modeling a count rate with a linear model. Using the exponential ensures that the counts are always positive, as they should be.

10M5

The rate would need to be constrained between 0 and 1.

curve(dpois(x, 0.2), from = 0, to = 10)

Cannot think of situation where this is appropriate

10M6

Hard

10H1

library(rethinking)
data(chimpanzees)
d <- chimpanzees
d2 <- d
d2$recipient <- NULL

m10.4 <- map2stan(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
    a[actor] ~ dnorm(0,10),
    bp ~ dnorm(0,10),
    bpC ~ dnorm(0,10)
  ) ,
  data=d2 , chains=2 , iter=2500 , warmup=500 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file29684f14632d.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'pulled_left ~ dbinom(1, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2500 [  0%]  (Warmup)
## Iteration:  250 / 2500 [ 10%]  (Warmup)
## Iteration:  500 / 2500 [ 20%]  (Warmup)
## Iteration:  501 / 2500 [ 20%]  (Sampling)
## Iteration:  750 / 2500 [ 30%]  (Sampling)
## Iteration: 1000 / 2500 [ 40%]  (Sampling)
## Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Iteration: 2500 / 2500 [100%]  (Sampling)
## 
##  Elapsed Time: 2.94 seconds (Warm-up)
##                9.615 seconds (Sampling)
##                12.555 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'pulled_left ~ dbinom(1, p)' NOW (CHAIN 2).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2500 [  0%]  (Warmup)
## Iteration:  250 / 2500 [ 10%]  (Warmup)
## Iteration:  500 / 2500 [ 20%]  (Warmup)
## Iteration:  501 / 2500 [ 20%]  (Sampling)
## Iteration:  750 / 2500 [ 30%]  (Sampling)
## Iteration: 1000 / 2500 [ 40%]  (Sampling)
## Iteration: 1250 / 2500 [ 50%]  (Sampling)
## Iteration: 1500 / 2500 [ 60%]  (Sampling)
## Iteration: 1750 / 2500 [ 70%]  (Sampling)
## Iteration: 2000 / 2500 [ 80%]  (Sampling)
## Iteration: 2250 / 2500 [ 90%]  (Sampling)
## Iteration: 2500 / 2500 [100%]  (Sampling)
## 
##  Elapsed Time: 2.788 seconds (Warm-up)
##                8.854 seconds (Sampling)
##                11.642 seconds (Total)
## 
## 
## SAMPLING FOR MODEL 'pulled_left ~ dbinom(1, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0.001 seconds (Sampling)
##                0.001 seconds (Total)
## 
## [ 400 / 4000 ]
[ 800 / 4000 ]
[ 1200 / 4000 ]
[ 1600 / 4000 ]
[ 2000 / 4000 ]
[ 2400 / 4000 ]
[ 2800 / 4000 ]
[ 3200 / 4000 ]
[ 3600 / 4000 ]
[ 4000 / 4000 ]

precis(m10.4)

##      Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
## bp   0.84   0.26       0.43       1.24  2220    1
## bpC -0.13   0.29      -0.58       0.34  2875    1

m10.4.map <- map(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a[actor] + (bp + bpC*condition)*prosoc_left ,
    a[actor] ~ dnorm(0,10),
    bp ~ dnorm(0,10),
    bpC ~ dnorm(0,10)
  ) ,
  data=d2)

precis(m10.4.map)

##      Mean StdDev  5.5% 94.5%
## bp   0.82   0.26  0.40  1.24
## bpC -0.13   0.30 -0.61  0.34

post.mcmc <- extract.samples(m10.4)
post.map  <- extract.samples(m10.4.map)

par(mfrow=c(3,3))
for(i in 1:7){
  dens(post.mcmc$a[,i], add=F, main=i)
  dens(post.map$a[,i], add=T, col="red", lty=2)
}

Estimates for all chimps are more normally distributed when using map, as expected. This is fine, except in the case of chimp#2, who pulled left in all cases, making the parameters difficult to estimate. Note that the scale for this chimp is also much larger.

10H2

m10.2 <- map(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a + bp*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10)
  ) ,
  data=d2 )


m10.3 <- map(
  alist(
    pulled_left ~ dbinom( 1 , p ) ,
    logit(p) <- a + (bp + bpC*condition)*prosoc_left ,
    a ~ dnorm(0,10) ,
    bp ~ dnorm(0,10) ,
    bpC ~ dnorm(0,10)
  ) ,
  data=d2 )

compare(m10.2, m10.3, m10.4.map)

##            WAIC pWAIC dWAIC weight    SE   dSE
## m10.4.map 556.6  18.5   0.0      1 18.29    NA
## m10.2     680.4   1.9 123.8      0  9.20 17.77
## m10.3     682.3   3.0 125.8      0  9.35 17.73

The model including the actors is the best by far.

10H3

A

library(MASS)
data(eagles)
#construct readable dummies
eagles$P_large = as.numeric(eagles$P=="L")
eagles$A_adult = as.numeric(eagles$A=="A")
eagles$V_large = as.numeric(eagles$V=="L")

eagles$P <- eagles$A <- eagles$V <- NULL


m10H3_stan <- map2stan(
  alist(
    y ~ dbinom( n , p ) ,
    logit(p) <- a + bp*P_large + ba*A_adult + bv*V_large ,
    a ~ dnorm(0,10),
    c(bp,ba,bv) ~ dnorm(0,5)
  ) ,
  data=eagles , chains=3, cores=3 , iter=2500 , warmup=500 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file29681761270d.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'y ~ dbinom(n, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0 seconds (Sampling)
##                0 seconds (Total)
## 
## [ 600 / 6000 ]
[ 1200 / 6000 ]
[ 1800 / 6000 ]
[ 2400 / 6000 ]
[ 3000 / 6000 ]
[ 3600 / 6000 ]
[ 4200 / 6000 ]
[ 4800 / 6000 ]
[ 5400 / 6000 ]
[ 6000 / 6000 ]

precis(m10H3_stan)

##     Mean StdDev lower 0.89 upper 0.89 n_eff Rhat
## a   0.65   0.68      -0.42       1.72  3230    1
## bp  4.68   1.05       2.95       6.12  1964    1
## ba  1.13   0.55       0.27       2.02  3219    1
## bv -5.08   1.11      -6.72      -3.30  2060    1

m10H3_map <- map(
  alist(
    y ~ dbinom( n , p ) ,
    logit(p) <- a + bp*P_large + ba*A_adult + bv*V_large ,
    a ~ dnorm(0,10),
    c(bp,ba,bv) ~ dnorm(0,5)
  ) ,
  data=eagles)

precis(m10H3_map)

##     Mean StdDev  5.5% 94.5%
## a   0.59   0.66 -0.47  1.65
## bp  4.24   0.90  2.81  5.67
## ba  1.08   0.53  0.23  1.93
## bv -4.59   0.96 -6.13 -3.06

post.stan <- extract.samples(m10H3_stan)
post.map  <- extract.samples(m10H3_map)

par(mfrow=c(2,2))
for(i in names(post.stan)){
  dens(post.stan[[i]], add=F, main=i)
  dens(post.map[[i]], add=T, col="red", lty=2)
}

Pretty close, but how close is ok?

B

p <- link(m10H3_stan)

## [ 100 / 1000 ]
[ 200 / 1000 ]
[ 300 / 1000 ]
[ 400 / 1000 ]
[ 500 / 1000 ]
[ 600 / 1000 ]
[ 700 / 1000 ]
[ 800 / 1000 ]
[ 900 / 1000 ]
[ 1000 / 1000 ]

y <- sim(m10H3_stan)

## [ 100 / 1000 ]
[ 200 / 1000 ]
[ 300 / 1000 ]
[ 400 / 1000 ]
[ 500 / 1000 ]
[ 600 / 1000 ]
[ 700 / 1000 ]
[ 800 / 1000 ]
[ 900 / 1000 ]
[ 1000 / 1000 ]

p.mean <- colMeans(p)
p.PI <- apply(p, 2, PI)
y.mean <- colMeans(y)
y.PI <- apply(y, 2, PI)

par(mfrow=c(2,1))

# plot the model predictions for `p` vs. the actual proportion of successes for each case
eagles$success <- eagles$y / eagles$n
plot(eagles$success, col=rangi2, ylab="successful proportion", xlab="case", xaxt="n",  ylim = c(0, 1), pch=16)
axis(1, at=1:8, labels=c( "LAL","LAS","LIL","LIS","SAL","SAS","SIL","SIS" ))
points( 1:8 , p.mean )
for(i in 1:8){ lines( c(i, i), p.PI[,i] ) }

# plot the model predictions for `y` vs. the actual number of successes for each case
plot(eagles$y, col=rangi2, ylab="number of successes successful", xlab="case", xaxt="n", xlim=c(0.75,8.25) , ylim = c(0, 30), pch=16)
axis(1, at=1:8, labels=c( "LAL","LAS","LIL","LIS","SAL","SAS","SIL","SIS" ))
points( 1:8 , y.mean )
for(i in 1:8){ lines(c(i, i), y.PI[,i]) }

C

m10H3_stan_2 <- map2stan(
  alist(
    y ~ dbinom( n , p ) ,
    logit(p) <- a + bp*P_large + ba*A_adult + bpa*P_large*A_adult + bv*V_large ,
    a ~ dnorm(0,10),
    c(bp,ba,bv, bpa) ~ dnorm(0,5)
  ) ,
  data=eagles , chains=3, cores=3 , iter=2500 , warmup=500 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file2968470dc8d.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'y ~ dbinom(n, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0 seconds (Sampling)
##                0 seconds (Total)
## 
## [ 600 / 6000 ]
[ 1200 / 6000 ]
[ 1800 / 6000 ]
[ 2400 / 6000 ]
[ 3000 / 6000 ]
[ 3600 / 6000 ]
[ 4200 / 6000 ]
[ 4800 / 6000 ]
[ 5400 / 6000 ]
[ 6000 / 6000 ]

m10H3_stan_3 <- map2stan(
  alist(
    y ~ dbinom( n , p ) ,
    logit(p) <- a + bp*P_large + ba*A_adult + bpv*P_large*V_large + bv*V_large ,
    a ~ dnorm(0,10),
    c(bp,ba,bv, bpv) ~ dnorm(0,5)
  ) ,
  data=eagles , chains=3, cores=3 , iter=2500 , warmup=500 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file2968385b656.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'y ~ dbinom(n, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0 seconds (Sampling)
##                0 seconds (Total)
## 
## [ 600 / 6000 ]
[ 1200 / 6000 ]
[ 1800 / 6000 ]
[ 2400 / 6000 ]
[ 3000 / 6000 ]
[ 3600 / 6000 ]
[ 4200 / 6000 ]
[ 4800 / 6000 ]
[ 5400 / 6000 ]
[ 6000 / 6000 ]

m10H3_stan_4 <- map2stan(
  alist(
    y ~ dbinom( n , p ) ,
    logit(p) <- a + bp*P_large + bpa*P_large*A_adult + ba*A_adult + bpv*P_large*V_large + bv*V_large ,
    a ~ dnorm(0,10),
    c(bp,ba,bv, bpa, bpv) ~ dnorm(0,5)
  ) ,
  data=eagles , chains=3, cores=3 , iter=2500 , warmup=500 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file296851185dd3.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'y ~ dbinom(n, p)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0 seconds (Sampling)
##                0 seconds (Total)
## 
## [ 600 / 6000 ]
[ 1200 / 6000 ]
[ 1800 / 6000 ]
[ 2400 / 6000 ]
[ 3000 / 6000 ]
[ 3600 / 6000 ]
[ 4200 / 6000 ]
[ 4800 / 6000 ]
[ 5400 / 6000 ]
[ 6000 / 6000 ]

compare(m10H3_stan, m10H3_stan_2, m10H3_stan_3, m10H3_stan_4)

##              WAIC pWAIC dWAIC weight    SE  dSE
## m10H3_stan_2 93.7   4.6   0.0   0.57 12.71   NA
## m10H3_stan_4 94.7   5.0   1.0   0.35 13.07 0.67
## m10H3_stan   98.7   4.1   5.0   0.05 13.42 4.85
## m10H3_stan_3 99.4   4.4   5.6   0.03 13.56 4.96

par(mfrow=c(4,1), mar=c(2,2,1,1))
plot(precis(m10H3_stan))
plot(precis(m10H3_stan_2))
plot(precis(m10H3_stan_3))
plot(precis(m10H3_stan_4))

So it seems that the interaction adult*large is negative, meaning that being either large or adult gives the eagles an advantage, but these effects are not additive.

Including interaction between sizes of pirate and victim (big pirate has better chances on small victim for example) does not really improve the model.

mens <- ensemble(m10H3_stan, m10H3_stan_2, m10H3_stan_3, m10H3_stan_4)

p <- mens$link
y <- mens$sim

p.mean <- colMeans(p)
p.PI <- apply(p, 2, PI)
y.mean <- colMeans(y)
y.PI <- apply(y, 2, PI)

par(mfrow=c(2,1))

# plot the model predictions for `p` vs. the actual proportion of successes for each case
eagles$success <- eagles$y / eagles$n
plot(eagles$success, col=rangi2, ylab="successful proportion", xlab="case", xaxt="n",  ylim = c(0, 1), pch=16)
axis(1, at=1:8, labels=c( "LAL","LAS","LIL","LIS","SAL","SAS","SIL","SIS" ))
points( 1:8 , p.mean )
for(i in 1:8){ lines( c(i, i), p.PI[,i] ) }

# plot the model predictions for `y` vs. the actual number of successes for each case
plot(eagles$y, col=rangi2, ylab="number of successes successful", xlab="case", xaxt="n", xlim=c(0.75,8.25) , ylim = c(0, 30), pch=16)
axis(1, at=1:8, labels=c( "LAL","LAS","LIL","LIS","SAL","SAS","SIL","SIS" ))
points( 1:8 , y.mean )
for(i in 1:8){ lines(c(i, i), y.PI[,i]) }

The model fit is now a lot better.

10H4

data("salamanders")

sal_map <- map(
  alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*PCTCOVER,
    a ~ dnorm(0,10),
    bC ~ dnorm(0,5)
  ),
  data=salamanders )


sal_stan <- map2stan(
  alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*PCTCOVER,
    a ~ dnorm(0,10),
    bC ~ dnorm(0,5)
  ),
  data=salamanders , chains=3, cores=3 , iter=4000 , warmup=1000 )

## In file included from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config.hpp:39:0,
##                  from C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/math/tools/config.hpp:13,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/var.hpp:7,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/core.hpp:12,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math/rev/mat.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/stan/math.hpp:4,
##                  from C:/Users/paule/Documents/R/win-library/3.4/StanHeaders/include/src/stan/model/model_header.hpp:4,
##                  from file29684030637a.cpp:8:
## C:/Users/paule/Documents/R/win-library/3.4/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
##  #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##  ^
## <command-line>:0:0: note: this is the location of the previous definition
## cc1plus.exe: warning: unrecognized command line option "-Wno-ignored-attributes"
## 
## SAMPLING FOR MODEL 'SALAMAN ~ dpois(lambda)' NOW (CHAIN 1).
## 
## Gradient evaluation took 0 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
## Adjust your expectations accordingly!
## 
## 
## WARNING: No variance estimation is
##          performed for num_warmup < 20
## 
## Iteration: 1 / 1 [100%]  (Sampling)
## 
##  Elapsed Time: 0 seconds (Warm-up)
##                0 seconds (Sampling)
##                0 seconds (Total)
## 
## [ 900 / 9000 ]
[ 1800 / 9000 ]
[ 2700 / 9000 ]
[ 3600 / 9000 ]
[ 4500 / 9000 ]
[ 5400 / 9000 ]
[ 6300 / 9000 ]
[ 7200 / 9000 ]
[ 8100 / 9000 ]
[ 9000 / 9000 ]

post.stan <- extract.samples(sal_stan)
post.map  <- extract.samples(sal_map)

par(mfrow=c(2,2))
for(i in names(post.stan)){
  dens(post.stan[[i]], add=F, main=i)
  dens(post.map[[i]], add=T, col="red", lty=2)
}

map approximation is fine

y <- sim(sal_map)

## [ 100 / 1000 ]
[ 200 / 1000 ]
[ 300 / 1000 ]
[ 400 / 1000 ]
[ 500 / 1000 ]
[ 600 / 1000 ]
[ 700 / 1000 ]
[ 800 / 1000 ]
[ 900 / 1000 ]
[ 1000 / 1000 ]

y.mean <- colMeans(y)
y.PI <- apply(y, 2, PI)


# plot the model predictions for `y` vs. the actual number of successes for each case
plot(y=salamanders$SALAMAN, x=salamanders$PCTCOVER, col=rangi2, ylab="number of salamanders", xlab="forest cover perc", pch=16)
lines(y= y.mean[order(salamanders$PCTCOVER)],  x=sort(salamanders$PCTCOVER))
lines( y.PI[1,order(salamanders$PCTCOVER)],  x=sort(salamanders$PCTCOVER), lty=2 )
lines( y.PI[2,order(salamanders$PCTCOVER)],  x=sort(salamanders$PCTCOVER), lty=2 )

It accurately predicts that there are more salamanders in more densely covered areas, but it cannot predict in the 80-100 density range, where most of the data is.

B

after some initial trouble decided so scale predictors

data("salamanders")
plot(salamanders$PCTCOVER~salamanders$FORESTAGE)

salamanders$PCTCOVER <- scale(salamanders$PCTCOVER)
salamanders$FORESTAGE <- scale(salamanders$FORESTAGE)

sal_map_1 <- map(
  alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*PCTCOVER,
    a ~ dnorm(0,10),
    c(bC) ~ dnorm(0,5)
  ),
  data=salamanders, start=list(bC=0) )


sal_map_2 <- map(
  alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bA*FORESTAGE,
    a ~ dnorm(0,10),
    c(bA) ~ dnorm(0,5)
  ),
  data=salamanders, start=list(bA=0) )

sal_map_3 <- map(
  alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*PCTCOVER + bA*FORESTAGE,
    a ~ dnorm(0,10),
    c(bA, bC) ~ dnorm(0,5)
  ),
  data=salamanders, start=list(bA=0, bC=0) )

# sal_map_4 <- map(
#   alist(
#     SALAMAN ~ dpois( lambda ),
#     log(lambda) <- a + bC*PCTCOVER + bA*FORESTAGE + bCA*PCTCOVER*FORESTAGE,
#     a ~ dnorm(0,10),
#     c(bA, bC, bCA=0) ~ dnorm(0,5)
#   ),
#   data=salamanders, start=list(bA=0, bC=0, bCA=0) )

compare(sal_map_1,sal_map_2,sal_map_3)

##            WAIC pWAIC dWAIC weight    SE   dSE
## sal_map_1 213.4   4.8   0.0   0.88 26.23    NA
## sal_map_3 217.4   7.5   4.0   0.12 26.89  1.18
## sal_map_2 265.6   7.4  52.2   0.00 35.27 22.42

The model including interaction is exactly singular, how is that possible?

The model containing only the PCTCOVER gets most of the weight, but maybe the ensemble model would add a little precision. There is not much evidence that forestage matters much.

Chapter10