Stan Tutorial

June 25, 2017

Module objectives

In this module, we will run a linear model (standard regression) using Stan.

This will allow us to:
- Understand how to do Bayesian inference with Stan without getting bogged down in modeling details.

Running a linear model in Stan isn't something we would normally do in our own work, but it gives us a simple starting point that most people are familar with. We will add details and complexity in subsequent modules.

Let's get started!

Open the file lm.R from the folder 1_lm folder.
In that file, you will find the code for this module and can run the code to follow along.
Nearly all the code in lm.R is contained in these slides, as well.

Computer conjoint analysis with a linear model

Computer conjoint data

Our first data set is a ratings-based conjoint study reported in Lenk, Desarbo, Green and Young (1996). Thanks, Peter!
In this study, respondents were shown computer profiles (combinations of features) and rated them on a 0-10 scale.
Our goal is to analyze this data using a linear model (i.e. standard regression) relating the respondents' ratings to the features of the computers.
By interpreting the coeficients, we will know how much people value the features of computers (or at least how they valued features back in the early 90's.)

Read the data into R

cc.df <- read.csv("ComputerConjointData.csv")
head(cc.df)

##   Question LIKE id HotLine RAM Screen    CPU HardDisk CD Cache Color
## 1        1    6  1     yes 8MB   17in 100MHz    340MB no 256MB beige
## 2        1    5  2     yes 8MB   17in 100MHz    340MB no 256MB beige
## 3        1   10  3     yes 8MB   17in 100MHz    340MB no 256MB beige
## 4        1   10  4     yes 8MB   17in 100MHz    340MB no 256MB beige
## 5        1    8  5     yes 8MB   17in 100MHz    340MB no 256MB beige
## 6        1    3  6     yes 8MB   17in 100MHz    340MB no 256MB beige
##      Channel Warranty Software Guarantee Price
## 1 mail-order      3yr  bundled    30days $2000
## 2 mail-order      3yr  bundled    30days $2000
## 3 mail-order      3yr  bundled    30days $2000
## 4 mail-order      3yr  bundled    30days $2000
## 5 mail-order      3yr  bundled    30days $2000
## 6 mail-order      3yr  bundled    30days $2000

Inspect the data in R

summary(cc.df)

##     Question          LIKE              id      HotLine      RAM      
##  Min.   : 1.00   Min.   : 0.000   Min.   :  1   no :2010   16MB:2211  
##  1st Qu.: 5.75   1st Qu.: 3.000   1st Qu.: 51   yes:2010   8MB :1809  
##  Median :10.50   Median : 5.000   Median :101                         
##  Mean   :10.50   Mean   : 4.773   Mean   :101                         
##  3rd Qu.:15.25   3rd Qu.: 7.000   3rd Qu.:151                         
##  Max.   :20.00   Max.   :10.000   Max.   :201                         
##                  NA's   :22                                           
##   Screen         CPU        HardDisk      CD         Cache     
##  14in:2211   100MHz:2010   340MB:2010   no :1809   128MB:2010  
##  17in:1809   50MHz :2010   730MB:2010   yes:2211   256MB:2010  
##                                                                
##                                                                
##                                                                
##                                                                
##                                                                
##    Color            Channel     Warranty          Software   
##  beige:1809   mail-order:1809   3yr:1809   bundled    :2211  
##  black:2211   store     :2211   5yr:2211   not_bundled:1809  
##                                                              
##                                                              
##                                                              
##                                                              
##                                                              
##   Guarantee      Price     
##  30days:2010   $2000:2010  
##  none  :2010   $3500:2010  
##                            
##                            
##                            
##                            
##

Plot the data in R

plot(LIKE ~ Price, data = cc.df, ylab = "Rating (0-10)")

Have data. Still need model.

Stan model specification

In most statistical software (e.g., SPSS, JMP, Stata, SAS, base R, MCMCpack), you are provided with a menu of different models that you can run such as standard regression, logistic regression, poisson regression, etc.
In Stan, you specify the model you want to estimate using a special syntax.
This gives you the freedom to specify any model you want.
When we want to estimate a standard regression, we still need to specify the model, which helps us remember to think carefully about the model.

Stan model components

data declares the structure of the data that you observe
parameters declares the parameters of your model
model describes the model you want to estimate
- And, optionally, your priors on the parameters of that model

Stan `data`

The data structures and their dimensions must all be declared in the data block.
Don't type this into R!

data {
  int N;  
  int K;
  vector[N] y;
  matrix[N,K] X;
}

For this model, the data is a matrix describing the attributes of the products that were shown to respondents (X) and the vector of ratings that the respondents gave those products (y).
There are N evaluations and K product attributes. Note that dimensions are declared explicitly.

Stan `model`

The model block we specify says that the ratings y are normally distributed with mean beta0 + X*beta and standard deviation sigma.
Don't type this into R!

model {
  y ~ normal(beta0 + X*beta, sigma);
}

That's pretty straightforward!

Stan `parameters`

intercept beta0
a vector of K parameters that multiplies the product attributes in X which is called beta
a variance parameter sigma
Don't type this into R!

parameters {
  real beta0;
  vector[K] beta;
  real<lower=0> sigma;
}

The parameters we are really after are those in the beta vector, which are sometimes called the part-worths of the product attributes.

Stan specification as a text string in R

lm.stan <- "
data {
  int N;  
  int K;
  vector[N] y;
  matrix[N,K] X;
}

parameters {
  real beta0;
  vector[K] beta;
  real<lower=0> sigma;
}

model {
  y ~ normal(beta0 + X*beta, sigma);
}
"

Preparing the computer data for Stan

Preparing data for Stan: `y`

When you pass data to Stan, it needs to be in the format that we have declared in our Stan code. To create the y variable in R:

y <- cc.df[!is.na(cc.df$LIKE),"LIKE"]
str(y)

##  int [1:3998] 6 5 10 10 8 3 8 9 9 5 ...

Preparing data for Stan: `X` (wrong!)

For X, we want to include all of the product features, which are:

X <- cc.df[!is.na(cc.df$LIKE), 4:16]
head(X)

##   HotLine RAM Screen    CPU HardDisk CD Cache Color    Channel Warranty
## 1     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
## 2     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
## 3     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
## 4     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
## 5     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
## 6     yes 8MB   17in 100MHz    340MB no 256MB beige mail-order      3yr
##   Software Guarantee Price
## 1  bundled    30days $2000
## 2  bundled    30days $2000
## 3  bundled    30days $2000
## 4  bundled    30days $2000
## 5  bundled    30days $2000
## 6  bundled    30days $2000

Preparing data for Stan: `X` (right!)

However, in the original data, the values of the attributes are text labels like 8MB while Stan is expecting a matrix of real numbers that it can use in multiplication. We can code these attributes as numbers using the R function model.matrix.

X <- model.matrix(~ HotLine + RAM + Screen + CPU + HardDisk + 
                    CD + Cache + Color + Channel + Warranty + 
                    Software + Guarantee + Price, 
                  data=cc.df[!is.na(cc.df$LIKE),])
X <- X[,2:ncol(X)] ## remove intercept
head(X)

##   HotLineyes RAM8MB Screen17in CPU50MHz HardDisk730MB CDyes Cache256MB
## 1          1      1          1        0             0     0          1
## 2          1      1          1        0             0     0          1
## 3          1      1          1        0             0     0          1
## 4          1      1          1        0             0     0          1
## 5          1      1          1        0             0     0          1
## 6          1      1          1        0             0     0          1
##   Colorblack Channelstore Warranty5yr Softwarenot_bundled Guaranteenone
## 1          0            0           0                   0             0
## 2          0            0           0                   0             0
## 3          0            0           0                   0             0
## 4          0            0           0                   0             0
## 5          0            0           0                   0             0
## 6          0            0           0                   0             0
##   Price$3500
## 1          0
## 2          0
## 3          0
## 4          0
## 5          0
## 6          0

Preparing data for Stan: bundling it up

Finally, to pass the data to Stan, it needs to be bundled up as a list object in R.

cc.standata <- list(N=length(y), K=ncol(X), y=y, X=X)
str(cc.standata)

## List of 4
##  $ N: int 3998
##  $ K: int 13
##  $ y: int [1:3998] 6 5 10 10 8 3 8 9 9 5 ...
##  $ X: num [1:3998, 1:13] 1 1 1 1 1 1 1 1 1 1 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:3998] "1" "2" "3" "4" ...
##   .. ..$ : chr [1:13] "HotLineyes" "RAM8MB" "Screen17in" "CPU50MHz" ...

Tidy up the R workspace by removing `y` and `X`

rm(y, X)

Bayesian statistical inference

Running the model

Now that we have a Stan model and data in Stan format, we can ask Stan to produce random draws from the posterior for us.

library(rstan)  ## load interface to Stan
m.stan <- stan(model_code = lm.stan, data = cc.standata, 
               iter = 1000, chains=1)

## In file included from file7669288d013.cpp:8:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/mat.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core.hpp:12:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core/var.hpp:7:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/math/tools/config.hpp:13:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/config.hpp:39:
## /Users/emf75/Library/R/3.3/library/BH/include/boost/config/compiler/clang.hpp:196:11: warning: 'BOOST_NO_CXX11_RVALUE_REFERENCES' macro redefined [-Wmacro-redefined]
## #  define BOOST_NO_CXX11_RVALUE_REFERENCES
##           ^
## <command line>:6:9: note: previous definition is here
## #define BOOST_NO_CXX11_RVALUE_REFERENCES 1
##         ^
## In file included from file7669288d013.cpp:8:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/mat.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core.hpp:42:
## /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints.hpp:14:17: warning: unused function 'set_zero_all_adjoints' [-Wunused-function]
##     static void set_zero_all_adjoints() {
##                 ^
## In file included from file7669288d013.cpp:8:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/mat.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core.hpp:43:
## /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/core/set_zero_all_adjoints_nested.hpp:17:17: warning: 'static' function 'set_zero_all_adjoints_nested' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
##     static void set_zero_all_adjoints_nested() {
##                 ^
## In file included from file7669288d013.cpp:8:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/mat.hpp:11:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/prim/mat.hpp:59:
## /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/prim/mat/fun/autocorrelation.hpp:17:14: warning: function 'fft_next_good_size' is not needed and will not be emitted [-Wunneeded-internal-declaration]
##       size_t fft_next_good_size(size_t N) {
##              ^
## In file included from file7669288d013.cpp:8:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/src/stan/model/model_header.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math.hpp:4:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/rev/mat.hpp:11:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/prim/mat.hpp:298:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/prim/arr.hpp:39:
## In file included from /Users/emf75/Library/R/3.3/library/StanHeaders/include/stan/math/prim/arr/functor/integrate_ode_rk45.hpp:13:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/numeric/odeint.hpp:61:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/numeric/odeint/util/multi_array_adaption.hpp:29:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array.hpp:21:
## In file included from /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array/base.hpp:28:
## /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array/concept_checks.hpp:42:43: warning: unused typedef 'index_range' [-Wunused-local-typedef]
##       typedef typename Array::index_range index_range;
##                                           ^
## /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array/concept_checks.hpp:43:37: warning: unused typedef 'index' [-Wunused-local-typedef]
##       typedef typename Array::index index;
##                                     ^
## /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array/concept_checks.hpp:53:43: warning: unused typedef 'index_range' [-Wunused-local-typedef]
##       typedef typename Array::index_range index_range;
##                                           ^
## /Users/emf75/Library/R/3.3/library/BH/include/boost/multi_array/concept_checks.hpp:54:37: warning: unused typedef 'index' [-Wunused-local-typedef]
##       typedef typename Array::index index;
##                                     ^
## 8 warnings generated.
## 
## SAMPLING FOR MODEL 'e688d3cae2a15b3fb91186bfe3de5542' NOW (CHAIN 1).
## 
## Gradient evaluation took 0.000574 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 5.74 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:   1 / 1000 [  0%]  (Warmup)
## Iteration: 100 / 1000 [ 10%]  (Warmup)
## Iteration: 200 / 1000 [ 20%]  (Warmup)
## Iteration: 300 / 1000 [ 30%]  (Warmup)
## Iteration: 400 / 1000 [ 40%]  (Warmup)
## Iteration: 500 / 1000 [ 50%]  (Warmup)
## Iteration: 501 / 1000 [ 50%]  (Sampling)
## Iteration: 600 / 1000 [ 60%]  (Sampling)
## Iteration: 700 / 1000 [ 70%]  (Sampling)
## Iteration: 800 / 1000 [ 80%]  (Sampling)
## Iteration: 900 / 1000 [ 90%]  (Sampling)
## Iteration: 1000 / 1000 [100%]  (Sampling)
## 
##  Elapsed Time: 4.56133 seconds (Warm-up)
##                3.9461 seconds (Sampling)
##                8.50743 seconds (Total)

This part was easy for us, but it may take some time for the computer to produce the posterior draws.

What's `stan()` doing?

stan() is producing a sample of draws from the posterior distribution.
When you call stan(), it first translates the model and data into a program for producing the posterior draws, then runs that program and feeds the output back to R.
While much Bayesian software uses an MCMC algorithm called Gibbs sampling to produce the posterior draws, Stan defaults to a different MCMC algorithm called Hamiltonian Monte Carlo.
One of the great things about Stan is you don't need to know too much about the algorithm to use the software (but knowing a little bit helps.)

`stan()` output

The stanfit object that is returned to R from stan() stores everything about the run: the model text, the posterior draws, the names of the parameters, etc.

str(m.stan)

## Formal class 'stanfit' [package "rstan"] with 10 slots
##   ..@ model_name: chr "e688d3cae2a15b3fb91186bfe3de5542"
##   ..@ model_pars: chr [1:4] "beta0" "beta" "sigma" "lp__"
##   ..@ par_dims  :List of 4
##   .. ..$ beta0: num(0) 
##   .. ..$ beta : num 13
##   .. ..$ sigma: num(0) 
##   .. ..$ lp__ : num(0) 
##   ..@ mode      : int 0
##   ..@ sim       :List of 12
##   .. ..$ samples    :List of 1
##   .. .. ..$ :List of 16
##   .. .. .. ..$ beta0   : num [1:1000] -1.25 -1.25 -1.25 -1.25 -1.25 ...
##   .. .. .. ..$ beta[1] : num [1:1000] -1.12 -1.12 -1.12 -1.12 -1.12 ...
##   .. .. .. ..$ beta[2] : num [1:1000] -1.49 -1.49 -1.49 -1.49 -1.49 ...
##   .. .. .. ..$ beta[3] : num [1:1000] -1.49 -1.49 -1.49 -1.49 -1.49 ...
##   .. .. .. ..$ beta[4] : num [1:1000] -0.455 -0.455 -0.455 -0.455 -0.455 ...
##   .. .. .. ..$ beta[5] : num [1:1000] 1.11 1.11 1.11 1.11 1.11 ...
##   .. .. .. ..$ beta[6] : num [1:1000] 0.0258 0.0258 0.0258 0.0258 0.0258 ...
##   .. .. .. ..$ beta[7] : num [1:1000] 0.272 0.272 0.272 0.272 0.272 ...
##   .. .. .. ..$ beta[8] : num [1:1000] -1.52 -1.52 -1.52 -1.52 -1.52 ...
##   .. .. .. ..$ beta[9] : num [1:1000] 1.14 1.14 1.14 1.14 1.14 ...
##   .. .. .. ..$ beta[10]: num [1:1000] 1.02 1.02 1.02 1.02 1.02 ...
##   .. .. .. ..$ beta[11]: num [1:1000] -1.63 -1.63 -1.63 -1.63 -1.63 ...
##   .. .. .. ..$ beta[12]: num [1:1000] -1.66 -1.66 -1.66 -1.66 -1.66 ...
##   .. .. .. ..$ beta[13]: num [1:1000] 1.81 1.81 1.81 1.81 1.81 ...
##   .. .. .. ..$ sigma   : num [1:1000] 1.12 1.12 1.12 1.12 1.12 ...
##   .. .. .. ..$ lp__    : num [1:1000] -118912 -118912 -118912 -118912 -118912 ...
##   .. .. .. ..- attr(*, "test_grad")= logi FALSE
##   .. .. .. ..- attr(*, "args")=List of 16
##   .. .. .. .. ..$ append_samples    : logi FALSE
##   .. .. .. .. ..$ chain_id          : num 1
##   .. .. .. .. ..$ control           :List of 12
##   .. .. .. .. .. ..$ adapt_delta      : num 0.8
##   .. .. .. .. .. ..$ adapt_engaged    : logi TRUE
##   .. .. .. .. .. ..$ adapt_gamma      : num 0.05
##   .. .. .. .. .. ..$ adapt_init_buffer: num 75
##   .. .. .. .. .. ..$ adapt_kappa      : num 0.75
##   .. .. .. .. .. ..$ adapt_t0         : num 10
##   .. .. .. .. .. ..$ adapt_term_buffer: num 50
##   .. .. .. .. .. ..$ adapt_window     : num 25
##   .. .. .. .. .. ..$ max_treedepth    : int 10
##   .. .. .. .. .. ..$ metric           : chr "diag_e"
##   .. .. .. .. .. ..$ stepsize         : num 1
##   .. .. .. .. .. ..$ stepsize_jitter  : num 0
##   .. .. .. .. ..$ enable_random_init: logi TRUE
##   .. .. .. .. ..$ init              : chr "random"
##   .. .. .. .. ..$ init_list         : NULL
##   .. .. .. .. ..$ init_radius       : num 2
##   .. .. .. .. ..$ iter              : int 1000
##   .. .. .. .. ..$ method            : chr "sampling"
##   .. .. .. .. ..$ random_seed       : chr "97682506"
##   .. .. .. .. ..$ refresh           : int 100
##   .. .. .. .. ..$ sampler_t         : chr "NUTS(diag_e)"
##   .. .. .. .. ..$ save_warmup       : logi TRUE
##   .. .. .. .. ..$ test_grad         : logi FALSE
##   .. .. .. .. ..$ thin              : int 1
##   .. .. .. .. ..$ warmup            : int 500
##   .. .. .. ..- attr(*, "inits")= num [1:15] -1.361 -1.179 -1.538 -1.545 -0.505 ...
##   .. .. .. ..- attr(*, "mean_pars")= num [1:15] 5.938 0.208 -0.595 0.367 -0.849 ...
##   .. .. .. ..- attr(*, "mean_lp__")= num -5569
##   .. .. .. ..- attr(*, "adaptation_info")= chr "# Adaptation terminated\n# Step size = 0.193709\n# Diagonal elements of inverse mass matrix:\n# 0.0257637, 0.00684503, 0.006396"| __truncated__
##   .. .. .. ..- attr(*, "elapsed_time")= Named num [1:2] 4.56 3.95
##   .. .. .. .. ..- attr(*, "names")= chr [1:2] "warmup" "sample"
##   .. .. .. ..- attr(*, "sampler_params")=List of 6
##   .. .. .. .. ..$ accept_stat__: num [1:1000] 1 0 0 0 0 1 1 1 1 1 ...
##   .. .. .. .. ..$ stepsize__   : num [1:1000] 9.77e-04 1.44e+01 2.43 2.40e-01 1.86e-02 ...
##   .. .. .. .. ..$ treedepth__  : num [1:1000] 1 0 0 0 0 7 2 2 4 3 ...
##   .. .. .. .. ..$ n_leapfrog__ : num [1:1000] 1 1 1 1 1 127 3 3 15 7 ...
##   .. .. .. .. ..$ divergent__  : num [1:1000] 0 1 1 1 1 0 0 0 0 0 ...
##   .. .. .. .. ..$ energy__     : num [1:1000] 900988 118920 118924 118919 118923 ...
##   .. .. .. ..- attr(*, "return_code")= int 0
##   .. ..$ iter       : num 1000
##   .. ..$ thin       : num 1
##   .. ..$ warmup     : num 500
##   .. ..$ chains     : num 1
##   .. ..$ n_save     : num 1000
##   .. ..$ warmup2    : num 500
##   .. ..$ permutation:List of 1
##   .. .. ..$ : int [1:500] 16 212 6 481 354 191 180 98 314 69 ...
##   .. ..$ pars_oi    : chr [1:4] "beta0" "beta" "sigma" "lp__"
##   .. ..$ dims_oi    :List of 4
##   .. .. ..$ beta0: num(0) 
##   .. .. ..$ beta : num 13
##   .. .. ..$ sigma: num(0) 
##   .. .. ..$ lp__ : num(0) 
##   .. ..$ fnames_oi  : chr [1:16] "beta0" "beta[1]" "beta[2]" "beta[3]" ...
##   .. ..$ n_flatnames: int 16
##   ..@ inits     :List of 1
##   .. ..$ :List of 3
##   .. .. ..$ beta0: num -1.36
##   .. .. ..$ beta : num [1:13(1d)] -1.179 -1.538 -1.545 -0.505 1.061 ...
##   .. .. ..$ sigma: num 0.38
##   ..@ stan_args :List of 1
##   .. ..$ :List of 9
##   .. .. ..$ chain_id          : int 1
##   .. .. ..$ iter              : int 1000
##   .. .. ..$ thin              : int 1
##   .. .. ..$ seed              : int 97682506
##   .. .. ..$ warmup            : num 500
##   .. .. ..$ init              : chr "random"
##   .. .. ..$ algorithm         : chr "NUTS"
##   .. .. ..$ check_unknown_args: logi FALSE
##   .. .. ..$ method            : chr "sampling"
##   ..@ stanmodel :Formal class 'stanmodel' [package "rstan"] with 5 slots
##   .. .. ..@ model_name  : chr "e688d3cae2a15b3fb91186bfe3de5542"
##   .. .. ..@ model_code  : atomic [1:1] 
## data {
##   int N;  
##   int K;
##   vector[N] y;
##   matrix[N,K] X;
## }
## 
## parameters {
##   real beta0;
##   vector[K] beta;
##   real<lower=0> sigma;
## }
## 
## model {
##   y ~ normal(beta0 + X*beta, sigma);
## }
## 
##   .. .. .. ..- attr(*, "model_name2")= chr "e688d3cae2a15b3fb91186bfe3de5542"
##   .. .. ..@ model_cpp   :List of 2
##   .. .. .. ..$ model_cppname: chr "model7669ef1476f_e688d3cae2a15b3fb91186bfe3de5542"
##   .. .. .. ..$ model_cppcode: chr "// Code generated by Stan version 2.15.0\n\n#include <stan/model/model_header.hpp>\n\nnamespace model7669ef1476f_e688d3cae2a15b"| __truncated__
##   .. .. ..@ mk_cppmodule:function (object)  
##   .. .. ..@ dso         :Formal class 'cxxdso' [package "rstan"] with 7 slots
##   .. .. .. .. ..@ sig         :List of 1
##   .. .. .. .. .. ..$ file7669288d013: chr(0) 
##   .. .. .. .. ..@ dso_saved   : logi TRUE
##   .. .. .. .. ..@ dso_filename: chr "file7669288d013"
##   .. .. .. .. ..@ modulename  : chr "stan_fit4model7669ef1476f_e688d3cae2a15b3fb91186bfe3de5542_mod"
##   .. .. .. .. ..@ system      : chr "x86_64, darwin13.4.0"
##   .. .. .. .. ..@ cxxflags    : chr "CXXFLAGS = -Wall -mtune=core2 -g -O2 $(LTO)"
##   .. .. .. .. ..@ .CXXDSOMISC :<environment: 0x7fb64af58898> 
##   ..@ date      : chr "Wed Jul  5 12:00:25 2017"
##   ..@ .MISC     :<environment: 0x7fb64d749ce8>

Getting the posterior draws

The core output of Stan is a matrix of posterior draws for all the parameters. We can get the posterior draws using the extract() function from the rstan package.

mydraws <- extract(m.stan, pars=c("beta"))
head(mydraws$beta)

##           
## iterations       [,1]       [,2]      [,3]       [,4]       [,5]      [,6]
##       [1,] 0.19740999 -0.5914161 0.5160827 -0.9157734 0.38511068 1.0928047
##       [2,] 0.17958350 -0.5887120 0.2596891 -0.7541913 0.26844179 0.9057757
##       [3,] 0.29554013 -0.6444533 0.4057966 -0.8746189 0.28431914 0.9818464
##       [4,] 0.01303628 -0.5696108 0.2517150 -0.8109542 0.18863116 0.9243068
##       [5,] 0.12542269 -0.5536330 0.2371954 -0.8766540 0.09941147 0.7988320
##       [6,] 0.08730884 -0.5045019 0.3613823 -0.5757361 0.09866310 0.9485528
##           
## iterations         [,7]        [,8]      [,9]      [,10]      [,11]
##       [1,] -0.079499488 -0.04482239 0.2878329 -0.2217906 -0.3054568
##       [2,]  0.091061457  0.07472608 0.1917749 -0.1633398 -0.3943135
##       [3,]  0.062147386 -0.04212758 0.2099535 -0.4134927 -0.4354544
##       [4,]  0.084839798  0.01052008 0.3221497 -0.4698140 -0.5690210
##       [5,]  0.005786222  0.04973867 0.1696677 -0.1911747 -0.3236274
##       [6,]  0.021987891  0.08600439 0.2974813 -0.2051077 -0.3477360
##           
## iterations      [,12]     [,13]
##       [1,] -0.2217020 -2.327877
##       [2,] -0.3272110 -2.473883
##       [3,] -0.2417738 -2.237272
##       [4,] -0.2615193 -2.192958
##       [5,] -0.3223195 -2.310287
##       [6,] -0.1250320 -2.426316

Summarizing the draws

Summarizing the posterior draws

We usually summarize the draws to get a sense of what the parameters might be.
Some Bayesian software like Sawtooth CBC-HB does this step more automatically for the user, but the this two stage process is typical and allows the analyst to summarize the draws as she likes.

Posterior summaries

summary(m.stan)$summary

##                   mean     se_mean         sd          2.5%           25%
## beta0     5.937791e+00 0.008490918 0.15078226  5.624641e+00  5.839899e+00
## beta[1]   2.075533e-01 0.003708977 0.08293526  5.347726e-02  1.480525e-01
## beta[2]  -5.952007e-01 0.003862098 0.07816434 -7.533202e-01 -6.441456e-01
## beta[3]   3.665858e-01 0.004101629 0.08460820  2.084793e-01  3.053227e-01
## beta[4]  -8.490565e-01 0.003557864 0.07955627 -1.008483e+00 -9.007051e-01
## beta[5]   2.926251e-01 0.003988549 0.08232436  1.308773e-01  2.365633e-01
## beta[6]   9.664002e-01 0.003747642 0.08379983  7.987131e-01  9.084722e-01
## beta[7]   8.396496e-02 0.003435028 0.07680957 -7.041231e-02  2.987844e-02
## beta[8]  -1.785856e-02 0.004173688 0.08260435 -1.688624e-01 -7.427799e-02
## beta[9]   1.993357e-01 0.003618253 0.08090660  4.814137e-02  1.458052e-01
## beta[10] -2.566432e-01 0.003559898 0.07960174 -4.196135e-01 -3.132709e-01
## beta[11] -3.731614e-01 0.004064660 0.08278861 -5.385465e-01 -4.293853e-01
## beta[12] -2.267082e-01 0.003570451 0.07983772 -3.734021e-01 -2.798337e-01
## beta[13] -2.282146e+00 0.003637171 0.08132961 -2.441979e+00 -2.333102e+00
## sigma     2.441880e+00 0.001181119 0.02641062  2.390490e+00  2.423509e+00
## lp__     -5.568736e+03 0.160663582 2.55315740 -5.574566e+03 -5.570040e+03
##                    50%           75%         97.5%    n_eff      Rhat
## beta0     5.938239e+00     6.0429544  6.235283e+00 315.3488 1.0002139
## beta[1]   2.046890e-01     0.2663660  3.654613e-01 500.0000 0.9985215
## beta[2]  -5.950164e-01    -0.5456524 -4.427470e-01 409.6103 0.9988887
## beta[3]   3.649812e-01     0.4230947  5.284629e-01 425.5124 1.0008381
## beta[4]  -8.485787e-01    -0.7937569 -6.954011e-01 500.0000 1.0032261
## beta[5]   2.916160e-01     0.3490446  4.521311e-01 426.0169 0.9979984
## beta[6]   9.675414e-01     1.0264737  1.123612e+00 500.0000 0.9999406
## beta[7]   8.477490e-02     0.1309846  2.351080e-01 500.0000 0.9983135
## beta[8]  -2.044278e-02     0.0362006  1.433732e-01 391.7111 1.0000978
## beta[9]   2.005728e-01     0.2542461  3.557986e-01 500.0000 0.9999962
## beta[10] -2.502777e-01    -0.1999898 -1.182839e-01 500.0000 0.9980719
## beta[11] -3.748073e-01    -0.3148992 -2.180157e-01 414.8516 0.9985667
## beta[12] -2.260158e-01    -0.1703362 -7.554753e-02 500.0000 0.9985976
## beta[13] -2.278980e+00    -2.2248738 -2.124541e+00 500.0000 1.0009022
## sigma     2.442024e+00     2.4587435  2.498223e+00 500.0000 0.9983750
## lp__     -5.568470e+03 -5566.8499317 -5.564539e+03 252.5342 0.9994241

Adding labels to the posterior summary

cc.lab <- c("Intercept", dimnames(cc.standata$X)[[2]], "sigma", "lp__")
data.frame(label=cc.lab, summary(m.stan)$summary)

##                        label          mean     se_mean         sd
## beta0              Intercept  5.937791e+00 0.008490918 0.15078226
## beta[1]           HotLineyes  2.075533e-01 0.003708977 0.08293526
## beta[2]               RAM8MB -5.952007e-01 0.003862098 0.07816434
## beta[3]           Screen17in  3.665858e-01 0.004101629 0.08460820
## beta[4]             CPU50MHz -8.490565e-01 0.003557864 0.07955627
## beta[5]        HardDisk730MB  2.926251e-01 0.003988549 0.08232436
## beta[6]                CDyes  9.664002e-01 0.003747642 0.08379983
## beta[7]           Cache256MB  8.396496e-02 0.003435028 0.07680957
## beta[8]           Colorblack -1.785856e-02 0.004173688 0.08260435
## beta[9]         Channelstore  1.993357e-01 0.003618253 0.08090660
## beta[10]         Warranty5yr -2.566432e-01 0.003559898 0.07960174
## beta[11] Softwarenot_bundled -3.731614e-01 0.004064660 0.08278861
## beta[12]       Guaranteenone -2.267082e-01 0.003570451 0.07983772
## beta[13]          Price$3500 -2.282146e+00 0.003637171 0.08132961
## sigma                  sigma  2.441880e+00 0.001181119 0.02641062
## lp__                    lp__ -5.568736e+03 0.160663582 2.55315740
##                  X2.5.          X25.          X50.          X75.
## beta0     5.624641e+00  5.839899e+00  5.938239e+00     6.0429544
## beta[1]   5.347726e-02  1.480525e-01  2.046890e-01     0.2663660
## beta[2]  -7.533202e-01 -6.441456e-01 -5.950164e-01    -0.5456524
## beta[3]   2.084793e-01  3.053227e-01  3.649812e-01     0.4230947
## beta[4]  -1.008483e+00 -9.007051e-01 -8.485787e-01    -0.7937569
## beta[5]   1.308773e-01  2.365633e-01  2.916160e-01     0.3490446
## beta[6]   7.987131e-01  9.084722e-01  9.675414e-01     1.0264737
## beta[7]  -7.041231e-02  2.987844e-02  8.477490e-02     0.1309846
## beta[8]  -1.688624e-01 -7.427799e-02 -2.044278e-02     0.0362006
## beta[9]   4.814137e-02  1.458052e-01  2.005728e-01     0.2542461
## beta[10] -4.196135e-01 -3.132709e-01 -2.502777e-01    -0.1999898
## beta[11] -5.385465e-01 -4.293853e-01 -3.748073e-01    -0.3148992
## beta[12] -3.734021e-01 -2.798337e-01 -2.260158e-01    -0.1703362
## beta[13] -2.441979e+00 -2.333102e+00 -2.278980e+00    -2.2248738
## sigma     2.390490e+00  2.423509e+00  2.442024e+00     2.4587435
## lp__     -5.574566e+03 -5.570040e+03 -5.568470e+03 -5566.8499317
##                 X97.5.    n_eff      Rhat
## beta0     6.235283e+00 315.3488 1.0002139
## beta[1]   3.654613e-01 500.0000 0.9985215
## beta[2]  -4.427470e-01 409.6103 0.9988887
## beta[3]   5.284629e-01 425.5124 1.0008381
## beta[4]  -6.954011e-01 500.0000 1.0032261
## beta[5]   4.521311e-01 426.0169 0.9979984
## beta[6]   1.123612e+00 500.0000 0.9999406
## beta[7]   2.351080e-01 500.0000 0.9983135
## beta[8]   1.433732e-01 391.7111 1.0000978
## beta[9]   3.557986e-01 500.0000 0.9999962
## beta[10] -1.182839e-01 500.0000 0.9980719
## beta[11] -2.180157e-01 414.8516 0.9985667
## beta[12] -7.554753e-02 500.0000 0.9985976
## beta[13] -2.124541e+00 500.0000 1.0009022
## sigma     2.498223e+00 500.0000 0.9983750
## lp__     -5.564539e+03 252.5342 0.9994241

Bayesian hypothesis testing

We can compute the probability that a parameter is greater than some threshold like zero just by summarizing the draws.

mean(mydraws$beta[,6]>0)

## [1] 1

mean(mydraws$beta[,8]>0)

## [1] 0.414

If more than 95% of draws are greater than 0, then it is very likley that the parameter is greater than zero.

Plotting the parameters

plot(m.stan, pars=c("beta0", "beta"))

Interpretation

Having a CD drive increases the rating by almost a point (beta[6]).
Including the faster processor increasing the rating by almost a point (beta[4]).
The higher price reduces the rating by more than 2 points (beta[13]).
The color has almost no effect on the rating (beta[8]).

Assessing convergence with traceplots

There is some risk that the Stan sampler did not reach the proper posterior. If this happens, we can usually detect it by ploting the draws in order over time (called a traceplot.)

plot(m.stan, plotfun="trace", pars=c("beta0", "beta", "sigma"))

Fuzzy caterpillar

Traceplots should look like fuzzy caterpillars.

Histograms of posterior draws

Histograms of the draws approximate posterior densities.

plot(m.stan, plotfun="hist", pars=c("beta0", "beta", "sigma"))

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Density plots of posterior draws

Or you could use a density plot, if you prefer.

plot(m.stan, plotfun="dens", pars=c("beta0", "beta", "sigma"))

Combining plots

We happen to like this plot from another R package:

library(bayesplot)
mcmc_combo(As.mcmc.list(m.stan))

But it doesn't work very well with 15 parameters.

Other things you can get from the `stanfit` object

get_stanmodel(m.stan)

## S4 class stanmodel 'e688d3cae2a15b3fb91186bfe3de5542' coded as follows:
## 
## data {
##   int N;  
##   int K;
##   vector[N] y;
##   matrix[N,K] X;
## }
## 
## parameters {
##   real beta0;
##   vector[K] beta;
##   real<lower=0> sigma;
## }
## 
## model {
##   y ~ normal(beta0 + X*beta, sigma);
## }
##

Summary of Module 1

What have we accomplished?

We've reviewed Bayesian inference.
We've fit linear model to the computer conjoint data using Stan.
- More precisely, we have computed a posterior sample for the parameters of the linear model using the computer conjoint data.
The model suggests that CD and CPU are the most important attributes with customers prefering computers that have a CD drive and have a faster CPU.

Stan modeling steps

Read data into into R and inspect
Create model specifiction using the Stan language
- data
- parameters
- model
Munge data into Stan format (list of vectors and matricies)
Call stan() to create the posterior draws
Assess convergence
Summarize and visualize the draws to make inferences

Advanced techniques

In the file lm_advanced.R you will find code for several advanced topics
- Testing the model using synthetic data
- Comparing inference from stan() to lm()
  - OLS regression in R using lm()
  - Bayesian regression using MCMCregress() from MCMCPack

A note on hierarchical models

Lenk et al. (1996) used this data set to illustrate a hierarchical linear model and that is a more approrpriate model for this data.
However, we are going to switch to choice models before we get to hierarchical models.

Module objectives

Let's get started!

Computer conjoint analysis with a linear model

Computer conjoint data

Read the data into R

Inspect the data in R

Plot the data in R

Have data. Still need model.

Stan model specification

Stan model components

Stan data

Stan model

Stan parameters

Stan specification as a text string in R

Preparing the computer data for Stan

Preparing data for Stan: y

Preparing data for Stan: X (wrong!)

Preparing data for Stan: X (right!)

Preparing data for Stan: bundling it up

Tidy up the R workspace by removing y and X

Bayesian statistical inference

Running the model

Running the model

What's stan() doing?

stan() output

Getting the posterior draws

Summarizing the draws

Summarizing the posterior draws

Posterior summaries

Adding labels to the posterior summary

Bayesian hypothesis testing

Plotting the parameters

Interpretation

Assessing convergence with traceplots

Fuzzy caterpillar

Histograms of posterior draws

Density plots of posterior draws

Combining plots

Other things you can get from the stanfit object

Summary of Module 1

What have we accomplished?

Stan modeling steps

Advanced techniques

A note on hierarchical models

Stan `data`

Stan `model`

Stan `parameters`

Preparing data for Stan: `y`

Preparing data for Stan: `X` (wrong!)

Preparing data for Stan: `X` (right!)

Tidy up the R workspace by removing `y` and `X`

What's `stan()` doing?

`stan()` output

Other things you can get from the `stanfit` object