Monte Carlo Simulations in R

Yunkyu Sohn
November 9, 2017

Research Associate, Department of Politics

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Workshop Preliminaries

  1. Workshop Requirements
  2. Research Questions
  3. Contents for This Week

1. Workshop Requirements

Before You Begin

  1. You have access to a laptop computer and an internet connection.
  2. You have downloaded and installed R with RStudio.
  3. You have opened the link for this week's slides at the workshop website: https://compass-workshops.github.io/info/
  4. You have downloaded the dataset listed under "Monte Carlo Simulations in R Data".

2. Research Questions

Party Polarization in the United States Congress

Simulate legislative votes using the spatial model of voting, one of the best-known models in political science:

  1. Simulate deterministic voting outcomes
  2. Simulate stochastic (probabilistic) voting outcomes
  3. Apply the model to empirical ideal points of US Senators

3. Contents for This Week

Contents

  • Objective: Combine what you have learned this semester to conduct simulations.
  • Part 1: Deterministic voting outcomes
  • Part 2: Stochastic voting outcomes using Monte Carlo simulation
  • Part 3: Generate counterfactual outcomes using empirical ideal points

Contents

  • Monte Carlo simulation
  • First proposed by Stanislaw Ulam and John von Neumann
  • Gambling as an analogy to real life (nature and societies)
    • Think of your future income as a function of your education level
  • Gamble (probabilistic/stochastic) \( \neq \) exact calculation (deterministic)

Part 1: Deterministic Voting Outcomes

Question 1: Spatial Model of Voting

  • Given the policy preference locations of legislators and policy proposals, can we predict the legislators' voting profiles?
    • Who will vote Yea, and who will vote Nay?
    • Which bills will pass, and which will not?

Spatial model of voting

  • A legislator's (i.e., a voter's) utility function, with its peak at her bliss point

Spatial model of voting

  • Policy options: blue and green (e.g. proposal and status quo)
  • Which one would she choose?

First Challenge

  • How can we operationalize the utility curve into a mathematical function?

First Challenge: Rigorous Formulation

  • Legislator (i.e. voter) \( i \)'s bliss point: \( x_i \)
  • Bill proposal location: \( b_j \)
    • Utility function \( U(x_i,b_j)=-(x_i-b_j)^2 \)
  • Corresponding status quo location: \( s_j \)
    • Utility function \( U(x_i,s_j)=-(x_i-s_j)^2 \)
  • Vote Yea (choosing \( b_j \)) or Nay (choosing \( s_j \))

\[ \text{Vote } \begin{cases} \text{Yea }, & \text{if } U(x_i,b_j) > U(x_i,s_j)\\ \text{Nay }, & \text{if } U(x_i,b_j) < U(x_i,s_j) \end{cases} \]
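
The Yea condition can also be read as a cutpoint rule, which is a handy sanity check for any code implementing it. The short derivation below uses only the definitions above and expands the difference of the two quadratic utilities:

\[ U(x_i,b_j) - U(x_i,s_j) = (x_i-s_j)^2 - (x_i-b_j)^2 = 2\,(b_j-s_j)\left(x_i - \frac{b_j+s_j}{2}\right) \]

The difference is positive exactly when \( x_i \) lies on the same side of the midpoint \( (b_j+s_j)/2 \) as the proposal \( b_j \). In other words, legislator \( i \) votes Yea when her bliss point is closer to \( b_j \) than to \( s_j \).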

Question 1: Deterministic Voting Outcome

  • Create a function with inputs \( \{x,b,s\} \) and a logical output (TRUE for Yea, FALSE for Nay).
  • Open a new R script
vote1 <- function(x, b, s) {
  Ub <- -(x - b)^2     ## Utility of choosing the bill b
  Us <- -(x - s)^2     ## Utility of choosing the status quo s
  outcome <- (Ub > Us) ## TRUE if the bill is strictly preferred
  return(outcome)      ## Output: whether b is selected instead of s
}

vote1(0.2, 0.3, 0.4) ## TRUE: the bill is closer to the bliss point
vote1(0.2, 0.5, 0.4) ## FALSE: the status quo is closer
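
Because the utility function is quadratic, comparing utilities should be equivalent to asking whether the bliss point is closer to the bill than to the status quo. The sketch below checks this on a grid of bliss points. The helper `closer_to_b` and the grid values are illustrative assumptions, not part of the workshop code.

```r
## Deterministic vote under quadratic utility (same logic as vote1)
vote1 <- function(x, b, s) {
  -(x - b)^2 > -(x - s)^2
}

## Hypothetical helper: is x closer to the bill than to the status quo?
closer_to_b <- function(x, b, s) {
  abs(x - b) < abs(x - s)
}

x_grid <- seq(0, 1, by = 0.05)                     # candidate bliss points
by_utility  <- vote1(x_grid, b = 0.3, s = 0.55)    # utility comparison
by_distance <- closer_to_b(x_grid, b = 0.3, s = 0.55)

all(by_utility == by_distance)   # do the two rules agree on this grid?
x_grid[by_utility]               # bliss points that vote Yea
```

The Yea voters are the grid points to the left of the midpoint (0.3 + 0.55) / 2 = 0.425, since the bill lies to the left of the status quo.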

Question 1.5: Deterministic Voting Outcome for Vectors

  • Create a function for vectors (multiple observations) \( \mathbf{x} \), \( \mathbf{b} \) and \( \mathbf{s} \).
  • Open a new R script
  • Define variables
vote2 <- function(x, b, s) {
  N <- length(x)              ## Number of legislators
  J <- length(b)              ## Number of bills
  Ub <- Us <- matrix(0, N, J) ## N by J utility matrices
  outcome <- matrix(0, N, J)  ## N by J vote matrix
  #### ACTUAL COMPUTATIONS GO HERE ####
  return(outcome)             ## Output
}
  • Actual computations (these replace the placeholder comment above)
for (i in seq_len(N)) {   ## Loop through legislators
  for (j in seq_len(J)) { ## Loop through bills
    Ub[i,j] <- -(x[i] - b[j])^2         ## U for choosing b
    Us[i,j] <- -(x[i] - s[j])^2         ## U for choosing s
    outcome[i,j] <- (Ub[i,j] > Us[i,j]) ## Check if Ub > Us
  }
}

Question 1.5: Deterministic Voting Outcome for Vectors

  • Create a function for vectors \( \mathbf{x} \), \( \mathbf{b} \) and \( \mathbf{s} \).
  • Open a new R script
vote2 <- function(x, b, s) {
  N <- length(x)              ## Number of legislators
  J <- length(b)              ## Number of bills
  Ub <- Us <- matrix(0, N, J) ## N by J utility matrices
  outcome <- matrix(0, N, J)  ## N by J vote matrix
  for (i in seq_len(N)) {
    for (j in seq_len(J)) {
      Ub[i,j] <- -(x[i] - b[j])^2         ## U for choosing b
      Us[i,j] <- -(x[i] - s[j])^2         ## U for choosing s
      outcome[i,j] <- (Ub[i,j] > Us[i,j]) ## Check if Ub > Us (1 = Yea, 0 = Nay)
    }
  }
  return(outcome) ## Output
}

## Example: 3 legislators, 5 bills, status quo fixed at 0.3
## The model is deterministic, so repeated runs return the same matrix
vote2(c(2:4)/10, c(1:5)/10, c(3,3,3,3,3)/10)
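
The double loop can also be written without loops. The following is a minimal sketch, assuming the same quadratic utilities, that builds the N by J utility matrices with `outer()`. The name `vote2_vec`, the example inputs, and the simple-majority passage rule are assumptions made for illustration.

```r
## Loop-free sketch of the deterministic model (vote2_vec is a hypothetical name)
vote2_vec <- function(x, b, s) {
  stopifnot(length(b) == length(s))   # one status quo per bill
  Ub <- -outer(x, b, "-")^2           # N by J: utility of each bill
  Us <- -outer(x, s, "-")^2           # N by J: utility of each status quo
  Ub > Us                             # logical N by J matrix, TRUE = Yea
}

x <- c(0.1, 0.4, 0.6, 0.9)            # four illustrative bliss points
b <- c(0.2, 0.5, 0.8)                 # three illustrative bills
s <- c(0.5, 0.2, 0.45)                # matching status quos
votes <- vote2_vec(x, b, s)
votes
colSums(votes)                        # Yea votes per bill
colMeans(votes) > 0.5                 # assumed rule: a bill passes with a strict majority
```

Unlike vote2, this version returns a logical matrix rather than a 0/1 numeric matrix. Wrap the result in `1 * (...)` if you need numbers for matrix algebra.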

Check-In 1: Simulation of Deterministic Outcomes

  • At this point you should have:

    • created user-defined functions for deterministic outcome simulation

Question 2: Stochastic Voting Outcome

  • How can we incorporate idiosyncratic shocks and unobserved factors?

Deterministic Voting Model

  • Deterministic Voting Model

\[ \text{Vote } \begin{cases} \text{Yea }, & \text{if } U(x_i,b_j) - U(x_i,s_j) > 0\\ \text{Nay }, & \text{if } U(x_i,b_j) - U(x_i,s_j) < 0 \end{cases} \]

Probabilistic Voting Model

  • How can we incorporate idiosyncratic shocks and unobserved factors?
  • How can we reflect the probabilistic nature of voting?
  • Probabilistic Voting Model

\[ \text{Vote } \begin{cases} \text{Yea }, & \text{with Probability } F[U(x_i,b_j) - U(x_i,s_j)]\\ \text{Nay }, & \text{with Probability } 1 - F[U(x_i,b_j) - U(x_i,s_j)] \end{cases} \]

Probabilistic Voting Model

  • Specification of \( F(x) \)

    • Let \( F(x) \) be the cumulative distribution function of the standard normal (Gaussian) distribution:
    pnorm(q, mean = 0, sd = 1)
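
To get a feel for how `pnorm` turns a utility difference into a probability of voting Yea, the short sketch below evaluates it at a few illustrative values. The vector `diff_u` is an assumption chosen for the demonstration.

```r
## Utility differences U(x,b) - U(x,s): from strongly against to strongly in favor
diff_u <- c(-2, -0.5, 0, 0.5, 2)
p_yea  <- pnorm(diff_u, mean = 0, sd = 1)   # F(diff_u) = probability of Yea
round(p_yea, 3)

## A legislator who is indifferent (difference of 0) votes Yea half of the time
pnorm(0)

## Symmetry of the normal CDF: F(d) + F(-d) = 1
p_yea + rev(p_yea)
```

The probability rises monotonically with the utility difference, so the deterministic model is recovered in the limit of very large differences.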

Probabilistic Voting Model

  • How would you REALIZE the probabilistic events?

    • Vector of n uniform random variables in the range [min, max]:
    • Default: min = 0, max = 1
    runif(n, min = 0, max = 1)
    
    • Use runif to realize voting outcomes
    ## Compare how likely you are to get TRUE
    (pnorm(rep(10, 10)) > runif(10))  ## P(TRUE) is essentially 1
    (pnorm(rep(0, 10)) > runif(10))   ## P(TRUE) = 0.5
    (pnorm(rep(-10, 10)) > runif(10)) ## P(TRUE) is essentially 0
    
    • Remember: the larger the value on the left-hand side, the more likely you are to get TRUE
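
Comparing a probability with a uniform draw produces TRUE with exactly that probability, which is the core of the Monte Carlo approach. The sketch below checks this by simulation. The utility gap of 0.5, the number of draws, and the seed are arbitrary choices for illustration.

```r
set.seed(2017)                     # arbitrary seed, only for reproducibility
p_true <- pnorm(0.5)               # theoretical P(Yea) when the utility gap is 0.5
draws  <- p_true > runif(100000)   # 100,000 simulated votes
p_hat  <- mean(draws)              # share of TRUE = Monte Carlo estimate
c(theory = p_true, simulated = p_hat)
```

With more draws the simulated share settles closer to the theoretical probability.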

Question 2: Stochastic Voting

  • Create a function with inputs \( \{x,b,s\} \) and a logical output (TRUE for Yea, FALSE for Nay).
  • Recap

    • Probabilistic Voting Model
      \[ \text{Vote } \begin{cases} \text{Yea }, & \text{with Probability } F[U(x_i,b_j) - U(x_i,s_j)]\\ \text{Nay }, & \text{with Probability } 1 - F[U(x_i,b_j) - U(x_i,s_j)] \end{cases} \]
    • Deterministic Voting Model
      \[ \text{Vote } \begin{cases} \text{Yea }, & \text{if } U(x_i,b_j) - U(x_i,s_j) > 0\\ \text{Nay }, & \text{if } U(x_i,b_j) - U(x_i,s_j) < 0 \end{cases} \]

Question 2: Stochastic Voting

  • Create a function with inputs \( \{x,b,s\} \) and a logical output (TRUE for Yea, FALSE for Nay).
  • Start from the deterministic version vote2(x,b,s): only the line marked ONLY THIS LINE SHOULD BE MODIFIED!!! needs to change
vote2 <- function(x, b, s) {
  N <- length(x)              ## Number of legislators
  J <- length(b)              ## Number of bills
  Ub <- Us <- matrix(0, N, J) ## N by J utility matrices
  outcome <- matrix(0, N, J)  ## N by J vote matrix
  for (i in seq_len(N)) {
    for (j in seq_len(J)) {
      Ub[i,j] <- -(x[i] - b[j])^2 ## U for choosing b
      Us[i,j] <- -(x[i] - s[j])^2 ## U for choosing s
      ###### ONLY THIS LINE SHOULD BE MODIFIED!!! ######
      outcome[i,j] <- (Ub[i,j] > Us[i,j]) ## Check if Ub > Us
      ###### ###### ###### ###### ###### ###### ######
    }
  }
  return(outcome) ## Output
}

Question 2: Stochastic Voting

  • Create a function with inputs \( \{x,b,s\} \) and a logical output (TRUE for Yea, FALSE for Nay).
vote3 <- function(x, b, s) {
  N <- length(x)              ## Number of legislators
  J <- length(b)              ## Number of bills
  Ub <- Us <- matrix(0, N, J) ## N by J utility matrices
  outcome <- matrix(0, N, J)  ## N by J vote matrix
  for (i in seq_len(N)) {
    for (j in seq_len(J)) {
      Ub[i,j] <- -(x[i] - b[j])^2 ## U for choosing b
      Us[i,j] <- -(x[i] - s[j])^2 ## U for choosing s
      ###### MODIFIED LINE: Yea with probability F(Ub - Us) ######
      outcome[i,j] <- (pnorm(Ub[i,j] - Us[i,j]) > runif(1))
      ###### ###### ###### ###### ###### ###### ######
    }
  }
  return(outcome) ## Output
}
## Compare vote2 and vote3 by running them multiple times
vote3(c(1:3)/10, c(1:5)/10, c(3,3,3,3,3)/10)
vote2(c(1:3)/10, c(1:5)/10, c(3,3,3,3,3)/10)
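
Running the stochastic model many times and averaging shows how often each legislator votes Yea on each bill. The sketch below does this with a loop-free variant that draws all uniform random numbers at once. The name `vote3_vec`, the seed, and the number of replications are assumptions made for illustration.

```r
## Loop-free sketch of the stochastic model (vote3_vec is a hypothetical name)
vote3_vec <- function(x, b, s) {
  stopifnot(length(b) == length(s))
  N <- length(x)
  J <- length(b)
  gap   <- outer(x, s, "-")^2 - outer(x, b, "-")^2  # Ub - Us, N by J
  p_yea <- pnorm(gap)                               # P(Yea) for every pair
  shock <- matrix(runif(N * J), N, J)               # one uniform draw per vote
  p_yea > shock                                     # TRUE = Yea
}

set.seed(1)                                   # arbitrary seed for reproducibility
x <- c(1:3) / 10
b <- c(1:5) / 10
s <- rep(0.3, 5)
sims <- replicate(2000, vote3_vec(x, b, s))   # 3 x 5 x 2000 array of votes
freq   <- apply(sims, c(1, 2), mean)          # simulated Yea frequencies
theory <- pnorm(outer(x, s, "-")^2 - outer(x, b, "-")^2)
round(freq, 2)
round(theory, 2)
```

In this small example the utility gaps are tiny, so every theoretical probability lies roughly between 0.45 and 0.52 and the votes behave almost like coin flips. Larger distances between ideal points, bills, and status quos produce more decisive probabilities.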

Check-In 2: Simulation of Stochastic Outcomes

  • At this point you should have:

    • learned how to create stochastic outcomes using the uniform random variable function runif()
    • created a function for probabilistic voting, and simulated votes

Feedback Survey:

Part 3: Generate Counterfactual Outcomes Using Empirical Ideal Points

Question 3: Simulate Your Own Congress Using Senate Voting Dataset


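Loading the dataset

  • Ideal points of senators in the 113th Congress

    RCV <- read.csv("113RCV.csv")
    View(RCV)
    
    ## as.factor() keeps the party colors working when party is read as character (the default since R 4.0)
    plot(RCV$ideology1, RCV$ideology2, col = as.factor(RCV$party))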
    

Question 3: Simulate Your Own Congress Using Senate Ideal Points

  • Write a function bill(J,DM,RM) using the leading dimensional estimates (ideology1)
    • J: number of {bill, status quo} pairs
    • DM (RM): mean ideal point of Democrats (Republicans)
bill <- function(J, DM, RM) {
  ###### ACTUAL COMPUTATIONS GO HERE ######
  result <- list(b = b, s = s)
  return(result)
}
  • Assume each bill proposal is given as a function of party mean ideology
b <- c(rep(DM, J/2), rep(RM, J/2)) ## Bill locations (J must be even)
b <- c(rep(DM, J/2), rep(RM, J/2)) + rnorm(J, mean = 0, sd = 0.1) ## Add Gaussian noise
  • Assume each status quo is given as a function of the bill proposal
s <- b + rnorm(J, mean = 0, sd = 0.4) ## Status quos

Question 3: Simulate Your Own Congress Using Senate Ideal Points

  • Write a function bill(J,DM,RM)
    • J: number of {bill, status quo} pairs
    • DM (RM): mean ideal point of Democrats (Republicans)
    • Assume each proposal is given as a function of party mean ideology
    • Only use the leading dimensional estimates (ideology1)
bill <- function(J, DM, RM) {
  ## First half of the bills centered on DM, second half on RM (J must be even)
  b <- c(rep(DM, J/2), rep(RM, J/2)) + rnorm(J, mean = 0, sd = 0.1) ## Bill locations
  s <- b + rnorm(J, mean = 0, sd = 0.4) ## Status quos
  result <- list(b = b, s = s)
  return(result)
}

DM <- mean(RCV$ideology1[which(RCV$party == 'D')])
RM <- mean(RCV$ideology1[which(RCV$party == 'R')])
proposals <- bill(2000, DM, RM)
out3 <- vote3(RCV$ideology1, proposals$b, proposals$s)
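
Once a vote matrix exists, the question of which bills pass can be answered by counting Yea votes per column. The sketch below is self-contained and therefore uses synthetic ideal points instead of the 113RCV.csv estimates. The party sizes, the ideal-point distributions, the seed, and the simple-majority rule are all assumptions made for illustration.

```r
set.seed(113)                                   # arbitrary seed
## ASSUMPTION: synthetic ideal points, not the estimates in 113RCV.csv
party <- rep(c("D", "R"), times = c(55, 45))
ideal <- c(rnorm(55, mean = -0.4, sd = 0.15), rnorm(45, mean = 0.5, sd = 0.15))
DM <- mean(ideal[party == "D"])
RM <- mean(ideal[party == "R"])

J <- 2000                                       # must be even
b <- c(rep(DM, J/2), rep(RM, J/2)) + rnorm(J, mean = 0, sd = 0.1)
s <- b + rnorm(J, mean = 0, sd = 0.4)

## Stochastic votes, computed for all legislator-bill pairs at once
p_yea <- pnorm(outer(ideal, s, "-")^2 - outer(ideal, b, "-")^2)
out   <- p_yea > matrix(runif(length(p_yea)), nrow(p_yea), ncol(p_yea))

yea_share <- colMeans(out)          # share of Yea votes per bill
passed    <- yea_share > 0.5        # assumed rule: simple majority
mean(passed)                        # fraction of simulated bills that pass
sponsor <- rep(c("D-centered", "R-centered"), each = J/2)
tapply(passed, sponsor, mean)       # passage rate by the party the bill is centered on
```

With the real data, replace `ideal` and `party` by `RCV$ideology1` and `RCV$party`.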

Question 3.5: Verify Party Polarization

  • How likely are two legislators from the same party to cast the same vote?
  • How likely are two legislators from different parties to cast the same vote?
same <- out3 %*% t(out3) + (1 - out3) %*% t(1 - out3) ## Number of identical votes for each pair of legislators
dim(same)
  • Subset the same matrix by party labels
Dsame <- same[which(RCV$party == 'D'), which(RCV$party == 'D')]
Dsame <- Dsame[lower.tri(Dsame, diag = FALSE)] ## Keep each pair once
Rsame <- same[which(RCV$party == 'R'), which(RCV$party == 'R')]
Rsame <- Rsame[lower.tri(Rsame, diag = FALSE)] ## Keep each pair once
DRsame <- same[which(RCV$party == 'R'), which(RCV$party == 'D')]
par(mfrow = c(1, 3)) ## Three histograms side by side
hist(Dsame, prob = TRUE, main = "Dem", xlab = "# Votes", xlim = c(900, 1200))
abline(v = median(Dsame), col = "red", lwd = 2)
hist(Rsame, prob = TRUE, main = "Rep", xlab = "# Votes", xlim = c(900, 1200))
abline(v = median(Rsame), col = "red", lwd = 2)
hist(DRsame, prob = TRUE, main = "{Dem, Rep}", xlab = "# Votes", xlim = c(900, 1200))
abline(v = median(DRsame), col = "red", lwd = 2)
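
The matrix product used to count identical votes is easier to trust after checking it on a toy example. The sketch below uses a made-up 3 by 4 roll-call matrix. The legislator labels A, B, C and the votes are purely illustrative.

```r
## Toy roll-call matrix: 3 legislators (rows) by 4 votes (columns), 1 = Yea
out <- rbind(A = c(1, 1, 0, 1),
             B = c(1, 1, 0, 0),
             C = c(0, 0, 1, 0))

## Shared Yea votes plus shared Nay votes for every pair of legislators
same <- out %*% t(out) + (1 - out) %*% t(1 - out)
same

## Direct count for one pair, as a check on the matrix formula
sum(out["A", ] == out["B", ])        # A and B agree on 3 of the 4 votes

## Each pair once, in the order (B,A), (C,A), (C,B)
same[lower.tri(same, diag = FALSE)]  # 3 0 1
```

The diagonal equals the number of votes, because every legislator agrees with herself on all of them. That is why the workshop code drops the diagonal with `diag = FALSE`.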

Recap of the Workshop

  • At this point you should have:
    • learned the basics of Monte Carlo simulation using the canonical voting model
    • learned the difference between deterministic and stochastic simulations
    • learned how to incorporate empirical datasets for simulation
    • learned a way to demonstrate simulation outcomes using plots

For Advanced Students

  • Think about how you can speed up your code
    • Replace loops with linear algebraic computations
    • Generate a vector (or matrix) of random variables at once
  • For those interested in American Congressional datasets:
  • Check Grolemund & Wickham's new book:

For More Information: