#library(ggplot2)
# Many thanks indeed to Rasmus Bååth
#for his Bayesian First Aid Alternative to the Proportion Test!
#http://www.sumsar.net/blog/2014/06/bayesian-first-aid-prop-test/
library(rjags)
## Loading required package: coda
## Linked to JAGS 4.1.0
## Loaded modules: basemod,bugs
library(mcmc)
library(stringr)
## Warning: package 'stringr' was built under R version 3.2.5
source(file = "bayes_prp_tst.R")
source(file = "r_jags.R")
source(file = "generic.R")
According to the New York Times, Donald J. Trump has won the election, beating his rival Hillary Clinton by 279 to 228 electoral votes so far. See http://www.nytimes.com/elections/results/president
Republicans have also won majorities in the Senate (51 to 48 seats) and in the House of Representatives (239 to 193 seats).
Let’s run a proportion test on the current results using both frequentist and Bayesian (https://en.wikipedia.org/wiki/Bayesian_statistics) approaches:
prop.test(c(228,48,193),c(228+279,48+51,193+239)) # for Democrats
##
## 3-sample test for equality of proportions without continuity
## correction
##
## data: c(228, 48, 193) out of c(228 + 279, 48 + 51, 193 + 239)
## X-squared = 0.48987, df = 2, p-value = 0.7828
## alternative hypothesis: two.sided
## sample estimates:
## prop 1 prop 2 prop 3
## 0.4497041 0.4848485 0.4467593
prop.test(c(279,51,239),c(228+279,48+51,193+239)) # for Republicans
##
## 3-sample test for equality of proportions without continuity
## correction
##
## data: c(279, 51, 239) out of c(228 + 279, 48 + 51, 193 + 239)
## X-squared = 0.48987, df = 2, p-value = 0.7828
## alternative hypothesis: two.sided
## sample estimates:
## prop 1 prop 2 prop 3
## 0.5502959 0.5151515 0.5532407
The test is not significant (p-value = 0.7828) for either party, so we cannot reject the hypothesis that each party's share is the same across the three contests. The two calls return the same statistic and p-value because the Republican proportions are simply the complements of the Democratic ones. Each party's shares for President and House are very close: Democrats (0.4497041, 0.4467593) and Republicans (0.5502959, 0.5532407). The Senate shares sit a little further from these but are close to each other (0.4848485, 0.5151515).
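The symmetry between the two tests can be checked directly. The following is a minimal sketch using only base R and the counts quoted above; the variable names (`dem_wins`, `rep_wins`, `totals`) are chosen here for illustration and do not come from the sourced helper files.

```r
# Counts from the text, in the order President, Senate, House
dem_wins <- c(228, 48, 193)
rep_wins <- c(279, 51, 239)
totals   <- dem_wins + rep_wins

# Sample proportions, as reported under "sample estimates" by prop.test
dem_share <- dem_wins / totals
rep_share <- rep_wins / totals
round(dem_share, 7)
round(rep_share, 7)

# Swapping "successes" for "failures" does not change the test:
# both calls give the same chi-squared statistic and p-value
test_dem <- prop.test(dem_wins, totals)
test_rep <- prop.test(rep_wins, totals)
all.equal(unname(test_dem$statistic), unname(test_rep$statistic))
all.equal(test_dem$p.value, test_rep$p.value)
```

Because the two parties' shares add up to one in every contest, running the test for one party is enough.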
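# Note: the House total is entered here as 191+239 = 430, whereas the counts quoted above give 193+239 = 432
fit.d<-bayes.prop.test(c(228,193),c(228+279,191+239)) # for Democrats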
summary(fit.d)
## Data
## number of successes: 228, 193
## number of trials: 507, 430
##
## Model parameters and generated quantities
## theta[i]: the relative frequency of success for Group i
## x_pred[i]: predicted number of successes in a replication for Group i
## theta_diff[i,j]: the difference between two groups (theta[i] - theta[j])
##
## Measures
## mean sd HDIlo HDIup %<comp %>comp
## theta[1] 0.450 0.022 0.408 0.494 0.988 0.012
## theta[2] 0.449 0.024 0.401 0.496 0.983 0.017
## x_pred[1] 228.016 15.677 196.000 257.000 0.000 1.000
## x_pred[2] 192.880 14.666 163.000 220.000 0.000 1.000
## theta_diff[1,2] 0.001 0.033 -0.062 0.065 0.485 0.515
##
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## '%<comp' and '%>comp' are the probabilities of the respective parameter being
## smaller or larger than 0.5 (except for the theta_diff parameters where
## the comparison value comp is 0.0).
##
## Quantiles
## q2.5% q25% median q75% q97.5%
## theta[1] 0.407 0.435 0.450 0.464 0.493
## theta[2] 0.402 0.433 0.449 0.465 0.496
## x_pred[1] 197.000 217.000 228.000 239.000 258.000
## x_pred[2] 165.000 183.000 193.000 203.000 222.000
## theta_diff[1,2] -0.063 -0.021 0.001 0.023 0.065
plot(fit.d)
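# Note: the House total is entered here as 193+236 = 429, whereas the counts quoted above give 193+239 = 432
fit.r<-bayes.prop.test(c(279,239),c(228+279,193+236)) # for Republicans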
summary(fit.r)
## Data
## number of successes: 279, 239
## number of trials: 507, 429
##
## Model parameters and generated quantities
## theta[i]: the relative frequency of success for Group i
## x_pred[i]: predicted number of successes in a replication for Group i
## theta_diff[i,j]: the difference between two groups (theta[i] - theta[j])
##
## Measures
## mean sd HDIlo HDIup %<comp %>comp
## theta[1] 0.550 0.022 0.506 0.592 0.012 0.988
## theta[2] 0.557 0.024 0.510 0.604 0.009 0.991
## x_pred[1] 278.959 15.702 247.000 308.000 0.000 1.000
## x_pred[2] 239.100 14.610 211.000 267.000 0.000 1.000
## theta_diff[1,2] -0.007 0.033 -0.070 0.057 0.585 0.415
##
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## '%<comp' and '%>comp' are the probabilities of the respective parameter being
## smaller or larger than 0.5 (except for the theta_diff parameters where
## the comparison value comp is 0.0).
##
## Quantiles
## q2.5% q25% median q75% q97.5%
## theta[1] 0.507 0.536 0.550 0.565 0.592
## theta[2] 0.510 0.541 0.557 0.573 0.604
## x_pred[1] 248.000 268.000 279.000 290.000 310.000
## x_pred[2] 210.000 229.000 239.000 249.000 267.000
## theta_diff[1,2] -0.070 -0.029 -0.007 0.015 0.057
plot(fit.r)
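As a rough cross-check that needs neither JAGS nor the sourced helper files, the posterior for a single proportion has a closed form when the prior is a Beta distribution. The sketch below assumes a flat Beta(1, 1) prior on each theta (an assumption made here; the model actually used is defined in bayes_prp_tst.R) and reports equal-tailed intervals from `qbeta` rather than HDIs, so its numbers should be close to, but not identical with, the MCMC summaries above. The helper name `beta_posterior` is made up for this illustration.

```r
# Closed-form Beta posterior for a binomial proportion.
# Assumption: flat Beta(1, 1) prior, so the posterior is Beta(1 + x, 1 + n - x).
beta_posterior <- function(x, n, prior_a = 1, prior_b = 1) {
  a <- prior_a + x
  b <- prior_b + n - x
  c(mean         = a / (a + b),
    lower        = qbeta(0.025, a, b),  # equal-tailed 95% interval, not an HDI
    upper        = qbeta(0.975, a, b),
    p_below_half = pbeta(0.5, a, b))    # posterior probability that theta < 0.5
}

# Democratic share of electoral votes: 228 of 507
dem_president <- beta_posterior(228, 228 + 279)
# Republican share of electoral votes: 279 of 507
rep_president <- beta_posterior(279, 228 + 279)

round(dem_president, 3)
round(rep_president, 3)
```

Under this assumed prior the posterior mean for the Democratic share is 229/509, which agrees with the theta[1] mean of about 0.450 in the summary for `fit.d`.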
Now we use a chi-squared test (https://en.wikipedia.org/wiki/Chi-squared_test) of the null hypothesis that there is nothing unusual in the two parties' results for President, Senate and House, i.e. that each party's share is the same across the three contests. In other words, if the null hypothesis is not rejected, the data are consistent with homogeneous results.
us_election_2016<-as.table(rbind(c(228,48,193),c(279,51,239)))
dimnames(us_election_2016) <- list(party = c("Democrat", "Republican"),event = c("President","Senate", "House"))
res<-chisq.test(us_election_2016)
res
##
## Pearson's Chi-squared test
##
## data: us_election_2016
## X-squared = 0.48987, df = 2, p-value = 0.7828
res$expected
## event
## party President Senate House
## Democrat 229.078 44.73121 195.1908
## Republican 277.922 54.26879 236.8092
res$residuals
## event
## party President Senate House
## Democrat -0.0712264 0.4887437 -0.1568063
## Republican 0.0646653 -0.4437225 0.1423619
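The expected counts, residuals and test statistic reported by `chisq.test` can be reproduced by hand, which makes it clearer what the test measures. This is a minimal sketch in base R using the same counts; the object names (`observed`, `expected`, `pearson_resid`, `dof`) are chosen here for illustration.

```r
# Observed 2 x 3 table of counts, as in the text
observed <- rbind(Democrat   = c(228, 48, 193),
                  Republican = c(279, 51, 239))
colnames(observed) <- c("President", "Senate", "House")

# Expected count in each cell under homogeneity:
# (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson residuals; their squares sum to the chi-squared statistic
pearson_resid <- (observed - expected) / sqrt(expected)
x_squared <- sum(pearson_resid^2)

# Degrees of freedom: (rows - 1) * (columns - 1)
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(x_squared, df = dof, lower.tail = FALSE)

round(expected, 3)
round(c(x_squared = x_squared, dof = dof, p_value = p_value), 4)
```

Every residual is well below 1 in absolute value, which is why the statistic is so small relative to its 2 degrees of freedom.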
So we can’t reject the null hypothesis: the \(p\text{-value} = 0.7828\) gives no evidence of unusual proportions, and the results are consistent with homogeneity in how the two parties' votes are split across the three contests. That’s the point!