Bayesian Ideal Points with Stan in R

Creating Legislator Ideal Points in RStudio with Stan

If you don’t have the following packages (readr and rstan), you’ll have to install them, and with Stan, you’ll need a C++ compiler (see here). The model we’ll be estimating is:

\[ y_{ij} = \beta{j}\boldsymbol x_i - \alpha_j,\]

where $y_{ij}$ is our data matrix of observed votes (1=yes, 0=no, absent); $\boldsymbol x_i$ are the legislator ideal points, and $\beta_j$ and $\alpha_j$ are the discrimination and difficulty parameters, respectively (slope and intercept). See Clinton, Jackman & Rivers (2004) for more details.

First, we’ll load the packages and import the data from my github repo using read_csv, which is roll-call data from the 53rd legislature of the Brazilian Federal Senate.

library(readr)
library(rstan)
data <- read_csv("https://raw.githubusercontent.com/RobertMyles/Bayesian-Ideal-Point-IRT-Models/master/Senate_Example.csv")
unique(data$Vote)
data$Vote[data$Vote=="S"] <- 1
data$Vote[data$Vote=="N"] <- 0
data$Vote[data$Vote  %in% c(NA,"O","A")] <- NA
data$Vote <- as.numeric(data$Vote)
data$FullID <- paste(data$SenatorUpper, data$Party, sep=":")

Next, we’ll build the vote matrix that we will use to estimate the ideal points, and place two specific legislators in the first two rows of the matrix, as we’ll use these later as constraints.

NameID <- unique(data$FullID)
J <- length(unique(NameID))
M <- length(unique(data$VoteNumber))
grep("JOSE AGRIPINO:PFL", NameID)    # 34th place; right-wing; positive ideal point
grep("EDUARDO SUPLICY:PT", NameID)   # 12th; left-wing; negative
NameID <- NameID[c(34, 12, 1:11, 13:33, 35:J)]
y <- matrix(NA,J,M)
Rows <- match(data$FullID, NameID)
Cols <- unique(data$VoteNumber)
Columns <- match(data$VoteNumber, Cols)

for(i in 1:dim(data)[1]){
  y[Rows[i],Columns[i]] <- data$Vote[i]
}

dimnames(y) <- list(unique(NameID), unique(data$VoteNumber))

We’ll also create two dataframes, one of legislator characteristics and the other of vote variables. These are optional, but we can use them to make the plots more informative later on.

ldata <- data.frame(FullID=unique(NameID), 
        Party=data$Party[match(unique(NameID), data$FullID)], 
        GovCoalition=data$GovCoalition[match(unique(NameID), data$FullID)],
        Name=data$SenatorUpper[match(unique(NameID), data$FullID)], 
        State=data$State[match(unique(NameID), data$FullID)], 
        row.names=NULL, stringsAsFactors=FALSE)

vdata <- data.frame(VoteNumber=unique(data$VoteNumber), 
        VoteType=data$VoteType[match(unique(data$VoteNumber), data$VoteNumber)],
        SenNumber=data$SenNumber[match(unique(data$VoteNumber), data$VoteNumber)],
        Origin=data$Origin[match(unique(data$VoteNumber), data$VoteNumber)],
        Contentious=data$Contentious[match(unique(data$VoteNumber), 
                    data$VoteNumber)], 
        PercentYes=data$PercentYes[match(unique(data$VoteNumber), data$VoteNumber)],
        IndGov=data$IndGov[match(unique(data$VoteNumber), data$VoteNumber)],
        Content=data$Content[match(unique(data$VoteNumber), data$VoteNumber)],
        Round=data$Round[match(unique(data$VoteNumber), data$VoteNumber)],
        stringsAsFactors=F)

Using Stan, we need to delete out the missing data that is present in the matrix.

N <- length(y)
j <- rep(1:J, times=M)
m <- rep(1:M, each=J)

miss <- which(is.na(y))
N <- N - length(miss)
j <- j[-miss]
m <- m[-miss]
y <- y[-miss]

For initial values, you can let Stan choose them, or set them yourself. Since we’re dealing with vote data, in which I know already that certain parties vote in different ways to others, I can exploit this fact to give the parties different starting values (which will speed things up a little). This is simply placing right-wing parties on the right of the scale, and the inverse for left-wing parties. For the beta and alpha parameters of the model, we’re sampling randomly from a normal distribution. If you don’t know anything about the behaviour of the legislators, you can set starting values in a simiar fashion to those for beta & alpha.

ldata$ThetaStart <- rnorm(J, 0, 1)
ldata$ThetaStart[ldata$Party=="PFL" | ldata$Party=="PTB" | ldata$Party=="PSDB" | ldata$Party=="PPB"] <- 2
ldata$ThetaStart[ldata$Party=="PT" | ldata$Party=="PSOL" | ldata$Party=="PCdoB"] <- -2
ThetaStart <- ldata$ThetaStart

initF <- function() {
  list(theta=ThetaStart, beta=rnorm(M, 0, 2), alpha=rnorm(M, 0, 2))
}

Doing this, I have right-wing parties starting at 2, left-wing parties at -2, and the rest (including independents) at 0.

The model code is the following, in which we need to declare the variables and parameters. We’ll also set different prior distributions for our two constrained legislators, theta[1] and theta[2] (the first two rows of the data matrix).

###  Model
stan.code <- "
data {
  int<lower=1> J; //Senators
  int<lower=1> M; //Proposals
  int<lower=1> N; //no. of observations
  int<lower=1, upper=J> j[N]; //Senator for observation n
  int<lower=1, upper=M> m[N]; //Proposal for observation n
  int<lower=0, upper=1> y[N]; //Vote of observation n
}
parameters {
  real alpha[M];
  real beta[M];
  real theta[J];
}
model {
  alpha ~ normal(0,5); 
  beta ~ normal(0,5); 
  theta ~ normal(0,1); 
  theta[1] ~ normal(1, .01);
  theta[2] ~ normal(-1, .01);  
  for (n in 1:N)
  y[n] ~ bernoulli_logit(theta[j[n]] * beta[m[n]] - alpha[m[n]]);
}"

Then we just send all this to stan. Here, I’m using 1000 iterations, but you can increase this to get better estimates. For these type of data, I usually play it safe and go for 5000 or so. A quick examination of the Rhat values shows we can be confident that the chains are converging. Some more iterations would be useful, but Rhat is less than 1.03, so we’re good. Note that with comparatively few iterations, you’ll probably get slightly different results than mine, and may have to slightly adjust the plots below.

stan.data <- list(J=J, M=M, N=N, j=j, m=m, y=y, ThetaStart=ThetaStart)

stan.fit <- stan(model_code=stan.code, data=stan.data, iter=1000, warmup=500, chains=4, thin=5, init=initF, verbose=TRUE, cores=4, seed=1234)

stan_rhat(stan.fit, bins=60)

Graphing

I prefer to work with an mcmc.list object, so I’ll transform the stan.fit object first.

MS <- As.mcmc.list(stan.fit)
sMS <- summary(MS)

We can extract the ideal points now from the mcmc summary (sMS). You’ll find the ideal points in sMS$statistics[,1], the mean of the posterior.

Theta <- sMS$statistics[grep("theta", row.names(sMS$statistics)),1]
ThetaQ <- sMS$quantiles[grep("theta", row.names(sMS$statistics)),c(1,5)]
Theta <- as.data.frame(cbind(Theta, ThetaQ))
rm(ThetaQ)
Theta$FullID <- ldata$FullID
row.names(Theta) <- NULL
colnames(Theta)[1:3] <- c("Mean", "Lower", "Upper")
Theta <- merge(Theta, ldata, by="FullID")
Theta <- Theta[order(Theta$Mean),]

Now it’s simple to plot with ggplot2. I’m going to colour the plot by government membership (blue=government) and add a few things to make it clearer, but you can easily change all of this in ggplot2 to get something that’s more to your taste. The sizes can be easily adjusted too, for example, the names here could probably be a bit smaller (just change geom_text(size=)).

Y <- seq(from=1, to=length(Theta$Mean), by=1)

ggplot(Theta, aes(x=Mean, y=Y)) + geom_point(aes(colour=GovCoalition),
    shape=19, size=3) + geom_errorbarh(aes(xmin=Lower, xmax=Upper,colour=GovCoalition), height=0) + geom_text(aes(x=Upper, label=FullID, colour=GovCoalition), size=2.5, hjust=-.05)+ scale_colour_manual(values=c("red", "blue")) + theme(axis.text.y=element_blank(), axis.ticks.y=element_blank(), axis.title=element_blank(), legend.position="none", panel.grid.major.y = element_blank(), panel.grid.major.x=element_line(linetype=1, colour="grey"), panel.grid.minor=element_blank(), panel.background=element_rect(fill="white"), panel.border = element_rect(colour="black", fill=NA, size=.4)) + scale_x_continuous(limits=c(-2.7, 4))

Of course, that’s not all the information we have in our ldata dataframe. We could plot things by party or by state. Let’s plot something by region (since there are a lot of states):

St <- Theta[is.na(Theta$State)==FALSE,]  # take out president
St$Region <- NA
SE <- c("SP", "RJ", "ES", "MG")
S <- c("RS", "PR", "SC")
N <- c("AM", "RO", "RR", "TO", "PA", "AC", "AP")
CW <- c("DF", "GO", "MT", "MS")
NE <- c("CE", "MA", "AL", "RN", "PB", "SE", "PI", "BA", "PE")
St$Region[St$State %in% SE] <- "South-East"
St$Region[St$State %in% S] <- "South"
St$Region[St$State %in% NE] <- "North-East"
St$Region[St$State %in% CW] <- "Centre-West"
St$Region[St$State %in% N] <- "North"

nameorder <- St$FullID[order(St$Region, St$Mean)]
St$FullID <- factor(St$FullID, levels=nameorder)

ggplot(St, aes(x=Mean, y=FullID)) + geom_point(size=3, aes(colour=Region)) + geom_errorbarh(aes(xmin=Lower, xmax=Upper, colour=Region), height=0) + facet_grid(Region ~ ., scales="free_y") + scale_colour_manual(values=c("orange", "black", "red", "blue", "darkgreen")) + theme_bw()

This image might need a little work (the names are illegible for some regions). Regardless, it shows what we can do with ideal points, and something similar with the party is easy as well.

Other parameters of interest

Usually the ideal points are plotted when doing this type of analysis, but beta and alpha are useful too (well, alpha not so much), and can be plotted in the same manner as above.

# Beta 
Beta <- sMS$statistics[grep("beta", row.names(sMS$statistics)),1]
BetaQ <- sMS$quantiles[grep("beta", row.names(sMS$quantiles)), c(1,5)]
Beta <- as.data.frame(cbind(Beta, BetaQ))
colnames(Beta) <- c("Mean", "Lower", "Upper")
Beta$VoteNumber <- vdata$VoteNumber
row.names(Beta) <- NULL
rm(BetaQ)
Beta <- merge(Beta, vdata, by="VoteNumber")
# Alpha 
Alpha <- sMS$statistics[grep("alpha", row.names(sMS$statistics)),1]
AlphaQ <- sMS$quantiles[grep("alpha", row.names(sMS$quantiles)), c(1,5)]
Alpha <- as.data.frame(cbind(Alpha, AlphaQ))
colnames(Alpha) <- c("Mean", "Lower", "Upper")
Alpha$VoteNumber <- vdata$VoteNumber
row.names(Alpha) <- NULL
rm(AlphaQ)

The discrimination parameter $\beta$ is an interesting parameter in its own right. We can use $\beta$ to see how the senators respond to different proposals. We can also analyse the dimensionality of the policy space using $\beta$. Values of $\beta$ where the 95% credible interval crosses zero do not discriminate on this (first) dimension. Votes that have values of $\beta$ where both ends of the 95% C.I. lie on the same side of zero discriminate on the first dimension. These votes tell us something about the substantive content of the first dimension; the non-discriminating votes may tell us about higher dimensions.

Beta$Disc[Beta$Lower<0 & Beta$Upper<0] <- 1
Beta$Disc[Beta$Lower>0 & Beta$Upper>0] <- 1
Beta$Disc[Beta$Lower<0 & Beta$Upper>0] <- 0
Beta$Disc[Beta$Lower>0 & Beta$Upper<0] <- 0
unique(Beta$Content[Beta$Disc==1])
unique(Beta$Content[Beta$Disc==0])

In Brazil, voting patterns are highly correlated with the government preference for each vote. In the following graph, the y-axis is the percentage that vote ‘yes’ on each vote, while the colour relates to the government preference (blue=‘yes’). Circles indicate a non-discriminating vote, while triangles indicate a vote that discriminates.

ggplot(Beta, aes(y=PercentYes, x=Mean)) + geom_point(aes(colour=as.factor(IndGov), shape=as.factor(Disc)), size=2) + geom_text(aes(label=SenNumber, colour=as.factor(IndGov)), size=4, family="Courier", vjust=2) + ylab("Percentage voting 'Yes'") + xlab("Mean of Beta")  + theme_bw() + theme(legend.position="none") + scale_colour_manual(values=c("firebrick", "blue4")) + geom_vline(xintercept=0, linetype="longdash") + scale_shape_manual(values=c(1,2)) + scale_x_continuous(limits=c(-10, 10))

This is certainly not the most beautiful plot in the world! But it sheds light on some aspects of voting behaviour in the Brazilian Federal Senate. The cluster of dark blue in the top middle of the plot shows that a lot of votes pass with high percentages and low absolute values of $\beta$. This suggests a higher dimensional aspect to these data, however, in this specific case, it is more likely that these senators simply delegate to the Executive. The correlation is also far from perfect: plenty of votes receive high percentages of support and a ‘No’ preference from the Executive.

Beta can also be informative when it comes to the passage of a bill, as can be seen in the (South Africa-themed) plot below. PLC0007/10 changes radically in terms of $\beta$ over two rounds of voting, most likely meaning that amendments were made to the original bill that changed how the Senators viewed it.

ggplot(Beta, aes(x=Round, y=Mean))  + geom_text(aes(label=SenNumber, colour=as.factor(IndGov)), size=4, family="Courier") + ylab("Beta Mean") + xlab("No. of times voted on") + geom_line(aes(group = SenNumber), alpha=0.45) + theme_bw() + theme(legend.position="none") + scale_colour_manual(values=c("darkgreen", "goldenrod2"))

We can also use these beta, and alpha, to calculate the Percentage Correctly Predicted, a gauge of model fit. (The code for this is adapted from the source code of pscl.)

# Percentage Correctly Predicted
Tally <- function(x){
  sum(x,na.rm=T)/sum(!is.na(x))
}
x1 <- as.matrix(Theta$Mean)
x1 <- cbind(x1,-1)
b <- as.matrix(cbind(Beta$Mean, Alpha$Mean))
mu <- tcrossprod(x1,b)      
pred.probs <- pnorm(mu)
prediction <- pred.probs >= 0.5 
correct <- y==prediction
legis.percent <- apply(correct,1,Tally)*100   # legislator-specific
party <- ldata$Party
party.percent <- tapply(legis.percent,party,mean) # party-specific
vote.percent=apply(correct,2,Tally)*100
yes.percent=(sum(correct[y==1],na.rm=T)/sum(!is.na(correct[y==1])))*100
no.percent=(sum(correct[y==0],na.rm=T)/sum(!is.na(correct[y==0])))*100
overall.percent=(sum(correct,na.rm=T)/sum(!is.na(correct)))*100
overall.percent
[1] 88.35165

We’re not doing too badly here, with 88% correctly predicted. If this figure was much lower, let’s say 60%, you would want to consider running a two-dimensional model. The code for this is pretty similar, just don’t forget to include D in your stan.data list. In this model, I’m constraining the same two legislators at their positions in the first dimension–they’re free to ‘roam’ in the second. I’ve followed Jackman’s idea of identifying the model by using $\beta$: two roll-call votes, opposite in content (for example, a protectionist bill that seeks to cut foreign competition for domestic goods and a free-trade bill that liberalises foreign trade) can be used to create a scale for the second dimension. These bills are given the following priors: \[\pi(\boldsymbol\beta_1) = {\cal N}\left(\left[0 \atop -4 \right], \left[{.01 \atop 0}\hskip1em {0 \atop 4} \right] \right) \mathinner{\text{and}} \pi(\boldsymbol\beta_2) = {\cal N}\left(\left[0 \atop 4 \right], \left[{.01 \atop 0}\hskip1em {0 \atop 4} \right] \right) \mathinner{\text{(Jackman, 2001, p.235)}}.\] You can identify the model using different methods, indeed, using two points (x,y) for three legislators is the most straight-forward way, if you have good a priori reasons to believe the legislators would occupy certain positions in a second dimension.

###  Two- D Model
stan.code <- "
data {
 int<lower=1> J; //Senators
 int<lower=1> M; //Proposals
 int<lower=1> N; //no. of observations
 int<lower=1, upper=J> j[N]; //Senator for observation n
 int<lower=1, upper=M> m[N]; //proposal for observation n
 int<lower=0, upper=1> y[N]; //vote of observation n
 int<lower=1> D; //no. of dimensions
}
parameters {
  real alpha[M];
  matrix[M,D] beta;
  matrix[J,D] theta;
}
model {
  alpha ~ normal(0,10); 
  to_vector(beta) ~ normal(0,10); 
  to_vector(theta) ~ normal(0,1); 
  theta[1,1] ~  normal(1, .01);
  theta[2,1] ~ normal(-1, .01);  
  beta[1,2] ~ normal(-4, 2); 
  beta[2,2] ~ normal(4, 2); 
  beta[1,1] ~ normal(0, .1); 
  beta[2,1] ~ normal(0, .1); 
for (n in 1:N)
 y[n] ~ bernoulli_logit(theta[j[n],1] * beta[m[n],1] + theta[j[n],2] * beta[m[n],2] - alpha[m[n]]);
}"

D <- 2
stan.data <- list(J=J, M=M, N=N, j=j, m=m, y=y, D=D)

In general, this takes quite a bit longer, with more iterations necessary in order to get good convergence statistics. Instead of running this, let’s load some data from a two-dimensional model and work with that. This will be a new sMS summary mcmc object.

load(url("https://github.com/RobertMyles/Bayesian-Ideal-Point-IRT-Models/blob/master/2DModelSenate53.Rda?raw=true"))

Extracting the ideal points and other parameters is ever so slightly different, given the two dimensions, so I’ll put the code here:

library(plyr)
library(dplyr)
fft <- first(grep("theta", row.names(sMS$statistics)))
llt <- last(grep("theta", row.names(sMS$statistics)))
Theta <- sMS$statistics[fft:llt,1]
ThetaQ <- sMS$quantiles[fft:llt,c(1,5)]
Theta <- as.data.frame(cbind(Theta, ThetaQ))
Theta1 <- Theta[seq(1, length(row.names(Theta)), by=2),]
Theta2 <- Theta[seq(2, length(row.names(Theta)), by=2),]
colnames(Theta1) <- c("MeanD1", "LowerD1", "UpperD1")
colnames(Theta2) <- c("MeanD2", "LowerD2", "UpperD2")
Theta1$FullID <- ldata$FullID
Theta2$FullID <- ldata$FullID
row.names(Theta1) <- NULL
row.names(Theta2) <- NULL
Theta <- merge(Theta1, Theta2, by="FullID")
rm(ThetaQ, Theta1, Theta2)
Theta <- merge(Theta, ldata, by="FullID")

Two-D plots can be a little ugly to look at, so one solution I found is to draw a hull around the points, defining it by party or government coalition. Here we’ll use the latter, since it’s visually cleaner.

find_hull <- function(Theta) Theta[chull(Theta$MeanD1, Theta$MeanD2), ]
hulls <- ddply(Theta, "GovCoalition", find_hull)

ggplot(Theta, aes(x=MeanD1, y=MeanD2, colour=GovCoalition, fill=GovCoalition)) + geom_point(shape=19, size=3) + geom_polygon(data=hulls, alpha=.18) + geom_text(aes(y=MeanD2, label=FullID, colour=GovCoalition), size=2.5, vjust=-0.75) + scale_colour_manual(values=c("red", "blue")) + scale_fill_manual(values=c("red", "blue")) + theme_bw() + theme(axis.text.y=element_blank(), axis.ticks.y=element_blank(), axis.title=element_blank(), legend.position="none") + scale_x_continuous(limits=c(-1.5, 1.5))

That plot looks a little dodgy to me (regarding the specific positions of a few senators), so most likely our strategy of identification through $\beta$ wasn’t a great one for these data.

We can calculate PCP statistics in the same way as before, and plot $\beta$ like we did earlier. If you’re interested in this work, have a look here.

Bayesian Ideal Points with Stan in R

Robert Myles McDonnell

May 14, 2016

Creating Legislator Ideal Points in RStudio with Stan

Graphing

Other parameters of interest