US presidential elections is an old story about Republican’s and Democrat’s ambitions, voters, taxes and budget deficit. The 2016 election campaign in spite of all scandals and rhetoric will not be something odd.
I found some interesting data about US government budget deficit: http://goliards.us/adelphi/deficits/index.html. The author of this web page claims “In my lifetime, every Democratic President has left office with a smaller deficit than he inherited, and every Republican President except Nixon has left office with a larger deficit than he inherited. This may be because Republican Presidents have placed high priority on cutting taxes, and placed lower priority on (or had less success at) cutting spending. Democratic Presidents have perhaps had equal success at cutting spending (I haven’t researched those numbers), but have not been bound by promises to cut taxes.” Let’s explore the data.
uspresdef<-read.csv(file = "uspresdef.csv",header = T)
uspresdef
## X President Party def.change def.avg.change years
## 1 1 B.H.Obama Democrat -804494 -134082 6.00
## 2 2 G.W.Bush Republican 942085 117761 8.00
## 3 3 W.J.Clinton Democrat -173244 -21655 8.00
## 4 4 G.H.W.Bush Republican 50010 12503 4.00
## 5 5 R.Reagan Republican 123646 15456 8.00
## 6 6 J.Carter Democrat -46628 -11657 4.00
## 7 7 G.Ford Republican 103701 31908 3.25
## 8 8 R.M.Nixon Republican -31792 -6358 5.00
## 9 9 L.B.Johnson Democrat -31792 -6358 5.00
## 10 10 J.F.Kennedy Democrat 3617 1206 3.00
## 11 11 D.D.Eisenhower Republican -12680 -1585 8.00
## 12 12 H.S.Truman Democrat -272072 -34009 8.00
## 13 13 F.D.Roosevelt Democrat 252936 21078 12.00
## 14 14 H.Hoover Republican 38120 9530 4.00
## 15 15 C.Coolidge Republican 2100 351 6.00
## 16 16 W.G.Harding Republican -18097 -9049 2.00
## 17 17 W.Wilson Democrat 11573 1447 8.00
summary(uspresdef)
## X President Party def.change
## Min. : 1 B.H.Obama : 1 Democrat :8 Min. :-804494
## 1st Qu.: 5 C.Coolidge : 1 Republican:9 1st Qu.: -31792
## Median : 9 D.D.Eisenhower: 1 Median : 2100
## Mean : 9 F.D.Roosevelt : 1 Mean : 8058
## 3rd Qu.:13 G.Ford : 1 3rd Qu.: 50010
## Max. :17 G.H.W.Bush : 1 Max. : 942085
## (Other) :11
## def.avg.change years
## Min. :-134082.0 Min. : 2.000
## 1st Qu.: -9049.0 1st Qu.: 4.000
## Median : 351.0 Median : 6.000
## Mean : -794.9 Mean : 6.015
## 3rd Qu.: 12503.0 3rd Qu.: 8.000
## Max. : 117761.0 Max. :12.000
##
boxplot(def.change~Party,col=c("blue","red"),data = uspresdef,main="US budget deficit change by party", ylab="Millions of dollars")
grid()
cols <- c("blue", "red")[(uspresdef$Party=="Republican")+1]
barplot(uspresdef$def.change,names.arg=uspresdef$President,col=cols,main = "US budget deficit change by president",cex.names = 0.7,ylab = "Million dollars",legend.text = c("Democrat", "Republican"),las=2)
These diagrams show that one day, maybe hundreds years later, US budget deficit would be equal to zero provided democrats will keep reducing it in the current tempo.
Our goal is to check out if there is any difference between Republican’s and Democrat’s presidents as far as changes in budget deficit are concerned. So we got null hypothesis H0 : there is no difference between Republicans and Democrats. The alternative hypothesis H1 : there is difference between Republicans and Democrats.
library(fitdistrplus)
## Warning: package 'fitdistrplus' was built under R version 3.2.4
## Loading required package: MASS
plotdist(uspresdef$def.change, histo = TRUE, demp = TRUE)
shapiro.test(uspresdef$def.change)
##
## Shapiro-Wilk normality test
##
## data: uspresdef$def.change
## W = 0.78534, p-value = 0.001286
According to the obtained results distribution failed to pass the test for normality. So we can’t apply classical Student test. Instead we’ll use Wilcoxon signed-rank test (see https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test).
we have two independent samples (D, P) for budget deficit if Democrats and Republicans are independent in their policy. In fact we don’t know the future that is why our data is a sample not a population.
wilcox.test(def.change~Party,data = uspresdef,exact=F)
##
## Wilcoxon rank sum test with continuity correction
##
## data: def.change by Party
## W = 16.5, p-value = 0.06734
## alternative hypothesis: true location shift is not equal to 0
So we can’t reject null hypothesis. Let’s try robust Bayesian Student test (see http://www.indiana.edu/~kruschke/BEST/BEST.pdf).
library(BayesianFirstAid)
## Loading required package: rjags
## Loading required package: coda
## Linked to JAGS 4.1.0
## Loaded modules: basemod,bugs
tb<-bayes.t.test(def.change~Party,data = uspresdef,set.seed(12345))
plot(tb)
We see the proof of alternative hypothesis H1 which can be estimated in the aspect of posterior probability distribution of means, their difference and effect size using MCMC methods (see https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo).
See: https://en.wikipedia.org/wiki/Linear_regression.
fit.lm<-lm(data = uspresdef,def.change~Party+years)
summary(fit.lm)
##
## Call:
## lm(formula = def.change ~ Party + years, data = uspresdef)
##
## Residuals:
## Min 1Q Median 3Q Max
## -640933 -149853 -26654 168112 699831
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -411946 230904 -1.784 0.0961 .
## PartyRepublican 323020 152262 2.121 0.0522 .
## years 41398 30356 1.364 0.1942
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 301100 on 14 degrees of freedom
## Multiple R-squared: 0.2691, Adjusted R-squared: 0.1646
## F-statistic: 2.577 on 2 and 14 DF, p-value: 0.1115
par(mfrow=c(2,3))
plot(fit.lm,which = 1:6,labels.id = uspresdef$President)
library(purrr)
## Warning: package 'purrr' was built under R version 3.2.4
uspresdef %>% split (.$Party) %>%
map(~lm(def.change~years, data = .x)) %>%
map(summary) %>% map_dbl("r.squared")
## Democrat Republican
## 0.0623692 0.2033796
We can see that our linear model has p-value=0.1115 (there is no significant relationship between variables - https://en.wikipedia.org/wiki/P-value) and can’t be used although tests for residuals seem to be consistent with ideal linear model except outliers: G.W.Bush, B.H.Obama and J.F.Kennedy. R2 for linear models separated by parties demonstrate 20% value for Republican’s presidents and 6% for Democrat’s presidents.
So we try Bayesian linear regression. See: https://en.wikipedia.org/wiki/Bayesian_linear_regression.
par(mfrow=c(1,1))
library(bayesboot)
set.seed(12345)
lm.coefs <- function(d, w) {
coef( lm(def.change~Party+years, data = d, weights = w) )
}
blm <- bayesboot(uspresdef, lm.coefs, R = 10000, use.weights = TRUE)
plot(blm)
summary(blm)
## Bayesian bootstrap
##
## Number of posterior draws: 10000
##
## Summary of the posterior (with 95% Highest Density Intervals):
## statistic mean sd hdi.low hdi.high
## (Intercept) -420927.10 225591.59 -881432.200 -48730.17
## PartyRepublican 332084.44 152557.84 83081.999 643408.55
## years 41059.92 26635.42 -7582.108 97152.70
##
## Quantiles:
## statistic q2.5% q25% median q75% q97.5%
## (Intercept) -969018.259 -549329.3 -382124.74 -255040.66 -90666.02
## PartyRepublican 99781.786 215571.7 311918.49 424734.35 673868.97
## years -4763.723 22862.6 38215.63 57428.72 100578.14
##
## Call:
## bayesboot(data = uspresdef, statistic = lm.coefs, R = 10000, use.weights = TRUE)
We see some interesting result for Bayesian linear model especially in Republican party deficit leverage expressed in posterior probability distribution, while timeline in years is tenfold lower for any party.