#library(ggplot2)
# Many thanks indeed to Rasmus Bååth
# for his Bayesian First Aid Alternative to the Proportion Test!
#http://www.sumsar.net/blog/2014/06/bayesian-first-aid-prop-test/
library(rjags)
## Loading required package: coda
## Linked to JAGS 4.1.0
## Loaded modules: basemod,bugs
library(mcmc)
library(stringr)
source(file = "bayes_prp_tst.R")
source(file = "r_jags.R")
source(file = "generic.R")

Presidential Election Process

An election for President of the United States occurs every four years on Election Day, held the first Tuesday after the first Monday in November. The 2016 Presidential election will be held on November 8, 2016.

The election process begins with the primary elections and caucuses and moves to nominating conventions, during which political parties each select a nominee to unite behind. The nominee also announces a Vice Presidential running mate at this time. The candidates then campaign across the country to explain their views and plans to voters and participate in debates with candidates from other parties. See: https://www.usa.gov/election

2016 Delegate Count and Primary Results

According to the Associated Press, Donald J. Trump and Hillary Clinton have each won enough delegates to claim their party’s nomination for president. Delegate totals include unpledged delegates, also known as superdelegates, who are free to support any candidate at the party conventions.

Primary results as of 7 June 2016; source: http://www.nytimes.com/.

Proportion test

So far we have two nominees, for the Republicans and the Democrats: Donald J. Trump and Hillary Clinton. Although H. Clinton has the advantage in the absolute number of pledged delegates won (1812) compared to D. Trump (1144), the proportions for the two candidates (pledged delegates/total delegates) tell a different story. We see \({P}_{DT}=1144/1239=0.92\) for D. Trump and \({P}_{HC}=1812/2382=0.76\) for H. Clinton. Let’s try a proportion test with both frequentist (https://rpubs.com/alex-lev/111354) and Bayesian (https://en.wikipedia.org/wiki/Bayesian_statistics) models:

Frequentist view

prop.test(c(1144,1812),c(1239,2382))
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(1144, 1812) out of c(1239, 2382)
## X-squared = 142.69, df = 1, p-value < 2.2e-16
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.1393556 0.1858843
## sample estimates:
##    prop 1    prop 2 
## 0.9233253 0.7607053

The two proportions differ significantly (p-value < 2.2e-16); the 95% confidence interval for the difference runs from 0.139 to 0.186.
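The interval reported above can be reproduced by hand, which makes explicit what the test computes. The following is a minimal sketch in base R, assuming the usual unpooled Wald interval for a difference of proportions, widened by a Yates-style continuity correction; the variable names are illustrative only.

```r
# Pledged delegates won and total delegates, as used in the test above
x <- c(1144, 1812)
n <- c(1239, 2382)

p_hat  <- x / n                   # sample proportions
p_diff <- p_hat[1] - p_hat[2]     # observed difference

# Unpooled standard error of the difference
se <- sqrt(sum(p_hat * (1 - p_hat) / n))

# Continuity correction added to the interval half-width
cc <- 0.5 * sum(1 / n)

z  <- qnorm(0.975)                # two-sided 95% critical value
ci <- p_diff + c(-1, 1) * (z * se + cc)

round(p_hat, 4)
round(ci, 4)
```

The resulting `ci` should agree with the confidence interval printed by `prop.test`, which is a quick check that the inputs were typed in correctly.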

Bayesian view

fit<-bayes.prop.test(c(1144,1812),c(1239,2382))
summary(fit)
##   Data
## number of successes:  1144, 1812
## number of trials:     1239, 2382
## 
##   Model parameters and generated quantities
## theta[i]: the relative frequency of success for Group i
## x_pred[i]: predicted number of successes in a replication for Group i
## theta_diff[i,j]: the difference between two groups (theta[i] - theta[j])
## 
##   Measures
##                     mean     sd    HDIlo    HDIup %<comp %>comp
## theta[1]           0.923  0.008    0.908    0.937      0      1
## theta[2]           0.761  0.009    0.743    0.778      0      1
## x_pred[1]       1143.193 13.378 1115.000 1167.000      0      1
## x_pred[2]       1811.513 29.334 1751.000 1866.000      0      1
## theta_diff[1,2]    0.162  0.012    0.140    0.185      0      1
## 
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## '%<comp' and '%>comp' are the probabilities of the respective parameter being
## smaller or larger than 0.5 (except for the theta_diff parameters where
## the comparison value comp is 0.0).
## 
##   Quantiles
##                    q2.5%     q25%   median     q75%   q97.5%
## theta[1]           0.907    0.918    0.923    0.928    0.937
## theta[2]           0.744    0.755    0.760    0.767    0.778
## x_pred[1]       1116.000 1134.000 1144.000 1152.000 1168.000
## x_pred[2]       1753.000 1792.000 1812.000 1832.000 1868.000
## theta_diff[1,2]    0.139    0.154    0.162    0.170    0.185
plot(fit)

diagnostics(fit) # MCMC convergence diagnostics
## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 3 
## Sample size per chain = 5000 
## 
##   Diagnostic measures
##                     mean     sd mcmc_se n_eff  Rhat
## theta[1]           0.923  0.008   0.000 15587 1.001
## theta[2]           0.761  0.009   0.000 15466 1.000
## x_pred[1]       1143.193 13.378   0.110 14719 1.000
## x_pred[2]       1811.513 29.334   0.235 15572 1.000
## theta_diff[1,2]    0.162  0.012      NA    NA    NA
## 
## mcmc_se: the estimated standard error of the MCMC approximation of the mean.
## n_eff: a crude measure of effective MCMC sample size.
## Rhat: the potential scale reduction factor (at convergence, Rhat=1).
## 
##   Model parameters and generated quantities
## theta: The relative frequency of success
## x_pred: Predicted number of successes in a replication
## theta_diff[i,j]: the difference between two groups (theta[i] - theta[j])

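The Bayesian test agrees: the 95% HDI for theta_diff[1,2] (0.140 to 0.185) excludes zero, and the diagnostics look adequate (Rhat is close to 1 and the effective sample sizes are around 15,000).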
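For intuition about where the posterior for theta comes from, the same comparison can be approximated without JAGS. This is a minimal sketch, not the code behind `bayes.prop.test`: it assumes independent uniform Beta(1, 1) priors on each proportion (an assumption made here for illustration), so that each posterior is a Beta distribution that can be sampled directly. It reports equal-tailed intervals rather than HDIs, and the names `theta1`, `theta2` and `n_draws` are illustrative.

```r
set.seed(2016)                    # for reproducible draws
x <- c(1144, 1812)                # pledged delegates won
n <- c(1239, 2382)                # total delegates
n_draws <- 15000

# With a Beta(1, 1) prior, the posterior is Beta(x + 1, n - x + 1)
theta1 <- rbeta(n_draws, x[1] + 1, n[1] - x[1] + 1)
theta2 <- rbeta(n_draws, x[2] + 1, n[2] - x[2] + 1)
theta_diff <- theta1 - theta2

# Posterior means and an equal-tailed 95% interval for the difference
post_mean <- c(theta1 = mean(theta1),
               theta2 = mean(theta2),
               theta_diff = mean(theta_diff))
post_ci <- quantile(theta_diff, c(0.025, 0.975))

round(post_mean, 3)
round(post_ci, 3)
mean(theta_diff > 0)              # posterior probability that theta1 > theta2
```

With samples this large the prior has little influence, so the summaries should land close to the table above.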

Sanders and Clinton against Trump

What if Sanders’s pledged delegates were added to Clinton’s? Ask the frequentist:

prop.test(c(1144,1812+1521),c(1239,2382+1569))
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(1144, 1812 + 1521) out of c(1239, 2382 + 1569)
## X-squared = 49.939, df = 1, p-value = 1.586e-12
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.06056212 0.09892060
## sample estimates:
##    prop 1    prop 2 
## 0.9233253 0.8435839

And what about Bayes?

fit2<-bayes.prop.test(c(1144,1812+1521),c(1239,2382+1569))
plot(fit2)

Well, with the support of Sanders’s pledged delegates, H. Clinton’s proportion would grow to \({P}_{HC+S}=(1812+1521)/(2382+1569)=0.84\), but it would still be significantly below D. Trump’s 0.92.
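The pooled proportion is the sum of pledged delegates divided by the sum of totals, so both sums must be formed before dividing. A minimal sketch of the arithmetic in base R, using the counts from the calls above; the variable names are illustrative:

```r
clinton <- c(won = 1812, total = 2382)
sanders <- c(won = 1521, total = 1569)

# Sum the numerators and the denominators first, then divide
p_pooled <- (clinton[["won"]] + sanders[["won"]]) /
  (clinton[["total"]] + sanders[["total"]])

p_trump <- 1144 / 1239

round(p_pooled, 4)
round(p_trump - p_pooled, 4)      # remaining gap in favour of Trump
```

Without the parentheses, R would divide 1521 by 2382 first and return a number far above 1, which is an easy slip to make when typing the formula.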

Conclusions

  1. We see a significant difference between the Republican and Democratic nominees in the proportion of pledged delegates to total delegates.
  2. Donald J. Trump has more support among Republican pledged delegates (91%-94%) than H. Clinton has among Democratic ones (74%-78%) so far, in terms of proportion (pledged delegates/total delegates).
  3. If Sanders’s pledged delegates were added to Clinton’s, her support would grow to (83%-85%) but would still be less than Trump’s support so far.
  4. Unfortunately, we can’t predict whom the pledged delegates of Cruz, Rubio and Kasich will choose in the near future (Trump or Clinton). So cherchez la femme and keep watching http://markets.ft.com/data!
  5. Both the Republican and Democratic nominees will do their best to beat each other, promising their voters as much as they can within the budget constraints and deficit legislated by the US Congress (https://rpubs.com/alex-lev/162165).