Budget deficit: case study for Republicans and Democrats

Historical background
The problem
Research question
Normal distribution test
Wilcoxon signed-rank test
Byaesian Student test
Classical linear regression
Byaesian linear regression
Conclusion

Historical background

US presidential elections is an old story about Republican’s and Democrat’s ambitions, voters, taxes and budget deficit. The 2016 election campaign in spite of all scandals and rhetoric will not be something odd.

The problem

I found some interesting data about US government budget deficit: http://goliards.us/adelphi/deficits/index.html. The author of this web page claims “In my lifetime, every Democratic President has left office with a smaller deficit than he inherited, and every Republican President except Nixon has left office with a larger deficit than he inherited. This may be because Republican Presidents have placed high priority on cutting taxes, and placed lower priority on (or had less success at) cutting spending. Democratic Presidents have perhaps had equal success at cutting spending (I haven’t researched those numbers), but have not been bound by promises to cut taxes.” Let’s explore the data.

uspresdef<-read.csv(file = "uspresdef.csv",header = T)
uspresdef

##     X      President      Party def.change def.avg.change years
## 1   1      B.H.Obama   Democrat    -804494        -134082  6.00
## 2   2       G.W.Bush Republican     942085         117761  8.00
## 3   3    W.J.Clinton   Democrat    -173244         -21655  8.00
## 4   4     G.H.W.Bush Republican      50010          12503  4.00
## 5   5       R.Reagan Republican     123646          15456  8.00
## 6   6       J.Carter   Democrat     -46628         -11657  4.00
## 7   7        G.Ford  Republican     103701          31908  3.25
## 8   8      R.M.Nixon Republican     -31792          -6358  5.00
## 9   9    L.B.Johnson   Democrat     -31792          -6358  5.00
## 10 10    J.F.Kennedy   Democrat       3617           1206  3.00
## 11 11 D.D.Eisenhower Republican     -12680          -1585  8.00
## 12 12     H.S.Truman   Democrat    -272072         -34009  8.00
## 13 13  F.D.Roosevelt   Democrat     252936          21078 12.00
## 14 14       H.Hoover Republican      38120           9530  4.00
## 15 15     C.Coolidge Republican       2100            351  6.00
## 16 16    W.G.Harding Republican     -18097          -9049  2.00
## 17 17       W.Wilson   Democrat      11573           1447  8.00

summary(uspresdef)

##        X               President         Party     def.change     
##  Min.   : 1   B.H.Obama     : 1   Democrat  :8   Min.   :-804494  
##  1st Qu.: 5   C.Coolidge    : 1   Republican:9   1st Qu.: -31792  
##  Median : 9   D.D.Eisenhower: 1                  Median :   2100  
##  Mean   : 9   F.D.Roosevelt : 1                  Mean   :   8058  
##  3rd Qu.:13   G.Ford        : 1                  3rd Qu.:  50010  
##  Max.   :17   G.H.W.Bush    : 1                  Max.   : 942085  
##               (Other)       :11                                   
##  def.avg.change          years       
##  Min.   :-134082.0   Min.   : 2.000  
##  1st Qu.:  -9049.0   1st Qu.: 4.000  
##  Median :    351.0   Median : 6.000  
##  Mean   :   -794.9   Mean   : 6.015  
##  3rd Qu.:  12503.0   3rd Qu.: 8.000  
##  Max.   : 117761.0   Max.   :12.000  
##

boxplot(def.change~Party,col=c("blue","red"),data = uspresdef,main="US budget deficit change by party", ylab="Millions of dollars")
grid()

cols <- c("blue", "red")[(uspresdef$Party=="Republican")+1]  
barplot(uspresdef$def.change,names.arg=uspresdef$President,col=cols,main = "US budget deficit change by president",cex.names = 0.7,ylab = "Million dollars",legend.text = c("Democrat", "Republican"),las=2)

These diagrams show that one day, maybe hundreds years later, US budget deficit would be equal to zero provided democrats will keep reducing it in the current tempo.

Research question

Our goal is to check out if there is any difference between Republican’s and Democrat’s presidents as far as changes in budget deficit are concerned. So we got null hypothesis H₀ : there is no difference between Republicans and Democrats. The alternative hypothesis H₁ : there is difference between Republicans and Democrats.

Normal distribution test

library(fitdistrplus)

## Warning: package 'fitdistrplus' was built under R version 3.2.4

## Loading required package: MASS

plotdist(uspresdef$def.change, histo = TRUE, demp = TRUE)

shapiro.test(uspresdef$def.change)

## 
##  Shapiro-Wilk normality test
## 
## data:  uspresdef$def.change
## W = 0.78534, p-value = 0.001286

According to the obtained results distribution failed to pass the test for normality. So we can’t apply classical Student test. Instead we’ll use Wilcoxon signed-rank test (see https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test).

Wilcoxon signed-rank test

we have two independent samples (D, P) for budget deficit if Democrats and Republicans are independent in their policy. In fact we don’t know the future that is why our data is a sample not a population.

wilcox.test(def.change~Party,data = uspresdef,exact=F)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  def.change by Party
## W = 16.5, p-value = 0.06734
## alternative hypothesis: true location shift is not equal to 0

So we can’t reject null hypothesis. Let’s try robust Bayesian Student test (see http://www.indiana.edu/~kruschke/BEST/BEST.pdf).

Byaesian Student test

library(BayesianFirstAid)

## Loading required package: rjags

## Loading required package: coda

## Linked to JAGS 4.1.0

## Loaded modules: basemod,bugs

tb<-bayes.t.test(def.change~Party,data = uspresdef,set.seed(12345))
plot(tb)

We see the proof of alternative hypothesis H₁ which can be estimated in the aspect of posterior probability distribution of means, their difference and effect size using MCMC methods (see https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo).

Classical linear regression

See: https://en.wikipedia.org/wiki/Linear_regression.

fit.lm<-lm(data = uspresdef,def.change~Party+years)
summary(fit.lm)

## 
## Call:
## lm(formula = def.change ~ Party + years, data = uspresdef)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -640933 -149853  -26654  168112  699831 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      -411946     230904  -1.784   0.0961 .
## PartyRepublican   323020     152262   2.121   0.0522 .
## years              41398      30356   1.364   0.1942  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 301100 on 14 degrees of freedom
## Multiple R-squared:  0.2691, Adjusted R-squared:  0.1646 
## F-statistic: 2.577 on 2 and 14 DF,  p-value: 0.1115

par(mfrow=c(2,3))
plot(fit.lm,which = 1:6,labels.id = uspresdef$President)

library(purrr)

## Warning: package 'purrr' was built under R version 3.2.4

uspresdef %>% split (.$Party) %>% 
          map(~lm(def.change~years, data = .x)) %>% 
          map(summary) %>% map_dbl("r.squared")

##   Democrat Republican 
##  0.0623692  0.2033796

We can see that our linear model has p-value=0.1115 (there is no significant relationship between variables - https://en.wikipedia.org/wiki/P-value) and can’t be used although tests for residuals seem to be consistent with ideal linear model except outliers: G.W.Bush, B.H.Obama and J.F.Kennedy. R² for linear models separated by parties demonstrate 20% value for Republican’s presidents and 6% for Democrat’s presidents.

Byaesian linear regression

So we try Bayesian linear regression. See: https://en.wikipedia.org/wiki/Bayesian_linear_regression.

par(mfrow=c(1,1))
library(bayesboot)
set.seed(12345)
lm.coefs <- function(d, w) {
  coef( lm(def.change~Party+years, data = d, weights = w) )
}

blm <- bayesboot(uspresdef, lm.coefs, R = 10000, use.weights = TRUE)
plot(blm)

summary(blm)

## Bayesian bootstrap
## 
## Number of posterior draws: 10000 
## 
## Summary of the posterior (with 95% Highest Density Intervals):
##        statistic       mean        sd     hdi.low  hdi.high
##      (Intercept) -420927.10 225591.59 -881432.200 -48730.17
##  PartyRepublican  332084.44 152557.84   83081.999 643408.55
##            years   41059.92  26635.42   -7582.108  97152.70
## 
## Quantiles:
##        statistic       q2.5%      q25%     median       q75%    q97.5%
##      (Intercept) -969018.259 -549329.3 -382124.74 -255040.66 -90666.02
##  PartyRepublican   99781.786  215571.7  311918.49  424734.35 673868.97
##            years   -4763.723   22862.6   38215.63   57428.72 100578.14
## 
## Call:
##  bayesboot(data = uspresdef, statistic = lm.coefs, R = 10000,      use.weights = TRUE)

We see some interesting result for Bayesian linear model especially in Republican party deficit leverage expressed in posterior probability distribution, while timeline in years is tenfold lower for any party.

Conclusion

Bayesian methods give us more robust estimation both for Student test and for linear regression.
G.W.Bush and B.H.Obama are outliers in deficit budget statistics in comparison with their predecessors.
The mean increase in deficit produced by Republicans is lower than the mean decrease in deficit produced by Democrats with the probability about 87% for median value about 100000 million dollars.
Republican’s presidents have significant leverage in budget deficit with the median value about 312000 million dollars.
No matter what party wins the 2016 presidential election US national debt will grow with the probability of 99.99% (joke).