Summary

  • A House candidate’s “personal vote” is that portion of their vote that is due to the candidate’s personal attributes (popularity, name recognition, reputation, history of service and performance as an incumbent or in non-political endeavours) as opposed to their party affiliation.

  • A convenient method for estimating a House candidate’s personal vote is to compute the difference between their 1st preference vote and their party’s share of 1st preferences cast for Senate in the candidate’s division.

  • In this report I examine variation in this quantity for Coalition, Labor and Green House candidates in the 2016 Australian Federal election.

  • Senate votes are dispersed across many more minor party tickets and candidates than House votes are. Therefore, personal votes calculated as the difference between House and Senate vote shares are almost always positive, in that House vote shares almost always exceed Senate vote shares in a given division, for the Coalition, Labor and typically the Greens as well. Accordingly, politically meaningful insight follows from observing not the absolute magnitudes of these apparent personal votes, but the relative magnitudes of the House/Senate differential and how this quantity varies across candidates.

  • Incumbents enjoy higher personal votes than challengers, by as much as 4 to 5 percentage points. This incumbency advantage in personal votes does not appear to vary by party.

  • Coalition candidates appear to have a slightly higher personal vote than Labor candidates, by about two-thirds of a percentage point. Green candidates have a smaller personal vote than Labor candidates, by as much as two percentage points.

  • Candidate gender is not associated with the magnitude of the personal vote, save for some weak evidence suggesting male Coalition candidates have a higher personal vote than female Coalition candidates.

  • Ballot position and ballot length shape the magnitude of personal votes. Being listed at the top of the House ballot paper lifts a candidate’s apparent personal vote share by 1.25 percentage points. Similarly, as the number of House candidates grows, the magnitude of the personal vote drops, by about half a percentage point for every extra House candidate. Ballot position and ballot length are not personal attributes of candidates (indeed, ballot position is assigned randomly), so care should be taken when interpreting the difference between House and Senate vote shares as an estimate of the House candidate’s “personal vote”.

  • A caveat. A House candidate may be so well known or so popular in their electoral division that they generate coattails for their party’s Senate ticket. In this case the difference between the House vote and the Senate vote could well understate the candidate’s personal vote, because the Senate vote is itself inflated by the candidate’s popularity.

Observations regarding Wentworth

  • For Coalition incumbents in NSW in 2016, the average difference between House and Senate vote was 8.2 percentage points; in Wentworth this difference was 9.3 percentage points.

  • Turnbull’s apparent personal vote in Wentworth is not especially large relative to the apparent personal votes of other NSW Coalition incumbents in 2016.

  • Wentworth is more “distinct” with respect to its large House and Senate votes for the Coalition.

  • If Turnbull’s apparent personal vote in 2016 were hypothetically set to zero, such that the Liberals’ House vote share fell to the Coalition’s Senate vote share in the seat (53.0%), Wentworth would remain a reasonably safe Liberal seat, safer than Warringah, North Sydney or Bennelong.

  • It is worth stressing that other factors besides candidate popularity determine election results. Accordingly, it would be risky to treat the upcoming by-election in Wentworth as a re-run of the 2016 election in that seat, save for the Liberal Party candidate being someone other than Turnbull. See also the caveat noted in the preceding section.

Estimating personal votes for House candidates

Adopting a technique favored by Peter “Mumble” Brent, we compute a House candidate’s “personal vote” as the difference between their 1st preference vote and their party’s share of 1st preferences cast for Senate in the candidate’s division.
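
As a minimal sketch of this calculation, the following uses made-up vote shares rather than AEC figures. The data frames `house` and `senate`, the division names and all the numbers are hypothetical, and base R is used so the example runs without extra packages.

```r
# Hypothetical first-preference shares (percent) for two invented divisions;
# these numbers are illustrative only and are not AEC results.
house <- data.frame(division = c("A", "A", "B", "B"),
                    Party    = c("ALP", "Coalition", "ALP", "Coalition"),
                    per      = c(38.0, 45.5, 47.2, 36.1))
senate <- data.frame(division = c("A", "A", "B", "B"),
                     Party    = c("ALP", "Coalition", "ALP", "Coalition"),
                     p        = c(33.5, 40.0, 39.9, 33.0))

# Match each House candidate to their party's Senate share in the same division
toy <- merge(house, senate, by = c("division", "Party"))

# Apparent personal vote: House share minus Senate share
toy$pvote <- toy$per - toy$p
print(toy)
```

The variable names `per`, `p` and `pvote` mirror those used in the real analysis that follows.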

This is easily implemented given the results files provided by the AEC, as the following code demonstrates.

Reading the AEC data

We begin by reading the Senate results data, containing votes at the divisional level.

library(tidyverse)  ## provides read_csv, %>% and the dplyr/tidyr verbs used below

baseURL <- "https://results.aec.gov.au/20499/Website/Downloads/SenateFirstPrefsByDivisionByVoteTypeDownload-20499.csv"
theFile <- "~/Downloads/SenateFirstPrefsByDivisionByVoteTypeDownload-20499.csv"

## to speed up debugging, only download once
## use saved file thereafter
if(!file.exists(theFile)){
  download.file(baseURL,theFile)
}

## read file, skipping 1st line with AEC info
out <- read_csv(theFile,
                skip=1)

We next compute division-level totals and percentages for each party.

out <- out %>%
  dplyr::rename(division=DivisionNm) %>%
  group_by(division,PartyAb) %>%
  summarise(total=sum(TotalVotes),
            State=StateAb[1]) %>%
  group_by(division) %>%
  mutate(p=total/sum(total)*100)
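
To make the two-stage grouping concrete, here is a small base R sketch of the same idea on invented counts: sum votes within division and party first, then divide by the division total. The data frame `votes` and its numbers are hypothetical.

```r
# Hypothetical vote counts (not AEC data): several rows per party within a
# division, as happens when a party runs more than one Senate candidate
votes <- data.frame(
  division   = c("A", "A", "A", "A", "B", "B", "B", "B"),
  PartyAb    = c("ALP", "ALP", "LP", "LP", "ALP", "ALP", "LP", "LP"),
  TotalVotes = c(300, 100, 500, 100, 450, 150, 350, 50))

# Step 1: total votes for each party within each division
totals <- aggregate(TotalVotes ~ division + PartyAb, data = votes, FUN = sum)
names(totals)[names(totals) == "TotalVotes"] <- "total"

# Step 2: each party's percentage of its division's total
totals$p <- totals$total / ave(totals$total, totals$division, FUN = sum) * 100
print(totals)
```

Within each division the percentages sum to 100, which is a useful sanity check on the real data as well.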

We create an auxiliary data set for mapping candidates to parties, recognizing that Senate tickets and Senate candidates have different labels and abbreviations in different states. This provides a series of mappings from party abbreviations in different states to some common tags, which we can merge against the House data for analysis (across states) of personal votes by party.
This gets tricky in the case of three-cornered contests, where Nationals and Liberals run against one another, and in some cases we will exclude data from these seats.

labs <- list()
labs[["NSW"]] <- data.frame(LP="Coalition",
                            NP="Coalition",
                            LPNP="Coalition",
                            ALP="ALP",
                            GRN="GRN")
labs[["VIC"]] <- labs[["NSW"]]
labs[["QLD"]] <- data.frame(LNP="Coalition",
                            ALP="ALP",
                            GRN="GRN")
labs[["TAS"]] <- data.frame(LP="Coalition",ALP="ALP",GRN="GRN")
labs[["SA"]] <- labs[["TAS"]]
labs[["WA"]] <- data.frame(LP="LP",NP="NP",ALP="ALP",GRN="GRN")
labs[["NT"]] <- data.frame(CLP="Coalition",
                           ALP="ALP",
                           GRN="GRN")
labs[["ACT"]] <- labs[["TAS"]]

labs <- bind_rows(labs,.id="State") %>%
  group_by(State) %>%
  gather(PartyAb,Party,-State) %>%
  ungroup() %>%
  filter(!is.na(Party))

We now implement the selections, writing Party back onto the file of results:

out2 <- left_join(out,labs,by=c("State","PartyAb")) %>%
  group_by(State,division,Party) %>%
  summarise(p=sum(p)) %>%
  filter(Party %in% c("Coalition","LP","NP","ALP","GRN","CLP"))
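
The relabel-then-aggregate step can be sketched on a toy example. The lookup table and percentages below are invented for illustration (they are not the full mapping or real results), and base R `merge` and `aggregate` stand in for the `left_join` and `summarise` calls used above.

```r
# Hypothetical lookup from (State, PartyAb) to a common Party tag
lookup <- data.frame(
  State   = c("NSW", "NSW", "QLD", "WA", "WA"),
  PartyAb = c("LP", "NP", "LNP", "LP", "NP"),
  Party   = c("Coalition", "Coalition", "Coalition", "LP", "NP"))

# Hypothetical division-level Senate percentages (illustrative only)
res <- data.frame(
  State    = c("NSW", "NSW", "QLD", "WA", "WA"),
  division = c("X", "X", "Y", "Z", "Z"),
  PartyAb  = c("LP", "NP", "LNP", "LP", "NP"),
  p        = c(30, 8, 41, 44, 5))

# Attach the common tag, then sum percentages that share a tag in a division
merged   <- merge(res, lookup, by = c("State", "PartyAb"), all.x = TRUE)
combined <- aggregate(p ~ State + division + Party, data = merged, FUN = sum)
print(combined)
```

In this toy data the two NSW rows collapse into a single Coalition share, while the WA rows stay separate because their tags differ.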

House results, by CED

We now download and parse 1st preference House of Representatives results, by division.

The last line, below, computes the difference between House vote (per) and Senate vote (p), as pvote, our estimate of the House candidate’s “personal vote.”

hdata <- read_csv("https://results.aec.gov.au/20499/Website/Downloads/HouseFirstPrefsByCandidateByVoteTypeDownload-20499.csv",skip=1)

out3 <- hdata %>%
  rename(division=DivisionNm,State=StateAb) %>%
  filter(Surname!="Informal") %>%
  group_by(division) %>%
  mutate(Party=PartyAb,
         per=TotalVotes/sum(TotalVotes)*100) %>%
  select(State,division,Party,PartyNm,
         Surname,GivenNm,CandidateID,
         Incumbent=HistoricElected,
         BallotPosition,per) %>%
  mutate(Party=if_else(State!="WA",
                       recode(Party,
                              `LP`="Coalition",
                              `NP`="Coalition",
                              `LNP`="Coalition",
                              `CLP`="Coalition"),
                       Party)) %>%
  left_join(out2,by=c("State","division","Party")) %>% 
  filter(Party %in% c("ALP","LP","NP","Coalition","GRN")) %>%
  mutate(pvote = per - p)

We now drop Coalition House candidates in three-cornered contests where the Coalition ran a joint Senate ticket, which occurred in Ballarat, Bendigo, Indi, McEwen, Murray and Whitlam. The labels “LP” and “NP” apply to Western Australia, where the Coalition does not run a joint Senate ticket and the Nationals ran candidates in five seats (all held by Liberal incumbents, incidentally).

tcc <- out3 %>% 
  group_by(division,Party) %>% 
  summarise(n=n()) %>% 
  ungroup() %>% 
  filter(Party=="Coalition",n>1)

out3 <- anti_join(out3,
                  tcc,
                  by=c("division","Party"))
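
The logic of this filter can be checked on a toy example. The candidate rows below are hypothetical, and base R subsetting is used in place of `anti_join`; the point is only that Coalition rows are dropped where a division has more than one Coalition candidate, while other parties' rows are kept.

```r
# Hypothetical candidate-level rows (not AEC data): division "B" has two
# Coalition candidates, i.e. a three-cornered contest
cands <- data.frame(
  division = c("A", "A", "B", "B", "B"),
  Party    = c("ALP", "Coalition", "ALP", "Coalition", "Coalition"),
  Surname  = c("Smith", "Jones", "Lee", "Brown", "Nguyen"))

# Count Coalition candidates per division
n_coal <- table(cands$division[cands$Party == "Coalition"])
three_cornered <- names(n_coal)[as.vector(n_coal) > 1]

# Drop Coalition candidates in those divisions; other parties are kept
kept <- cands[!(cands$Party == "Coalition" &
                cands$division %in% three_cornered), ]
print(kept)
```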

Results

We display estimated personal votes for House candidates in a table that can be filtered by state, party, incumbency, etc. Both major parties have no shortage of apparent “stars”, where the House vote outpaces the Senate vote share by more than 10 percentage points. There are 15 such Labor incumbents, 11 Liberals, 8 Nationals and 7 LNP incumbents with these high levels of apparent personal votes. Incumbents, perhaps by definition, have higher personal votes than challengers, and this is true across the parties.

library(DT)
datatable(out3 %>% 
            rename(House=per,Senate=p) %>%
            arrange(desc(pvote)) %>%
            mutate(Party=recode(PartyNm,
                                `The Nationals`="NAT",
                                `Liberal National Party of Queensland`="LNP",
                                `Australian Labor Party`="ALP",
                                `Liberal`="LIB",
                                `Labor`="ALP",
                                `The Greens`="GRN",
                                `The Greens (WA)`="GRN",
                                `Country Liberals (NT)`="CLP",
                                `Australian Labor Party (Northern Territory Branch)`="ALP")) %>%
            ##filter(abb=="lib" & HistoricElected=="Y") %>%
            select(division,State,Party,Incumbent,Surname,GivenNm,
                   House,Senate,pvote),
          options=list(pageLength=20),
          elementId = "results_table",
          filter="top",rownames=FALSE) %>%
  #formatStyle(c("division","State","PartyNm",
  #            "Incumbent","Surname","GivenNm",
  #            "House","Senate","pvote"),
  #            fontFamily="AtlasGrotesk",
  #            fontSize="12px") %>%
  formatRound(~House+Senate+pvote,digits=1)

NSW Coalition incumbents

In the graph below, data points on the left show Senate votes for the Coalition in House seats with Coalition incumbents; on the right hand side of the graph are the corresponding shares of 1st preferences for Coalition incumbents. The lines connecting each pair of data points tend to slope up, indicating that on average, Coalition House candidates outperformed the Senate ticket in their respective seats. For Coalition incumbents in NSW in 2016, the average difference between House and Senate vote was 8.2 percentage points; in Wentworth this difference was 9.3 percentage points.

Graphical inspection also indicates the extent to which Wentworth lies “off-trend”, again consistent with a larger than average personal vote for Malcolm Turnbull in that seat. But the graph also suggests that Wentworth is not especially unusual in this regard, relative to other NSW Coalition-held seats in 2016.

Wentworth is more “unusual” or “distinct” with respect to its large House and Senate votes for the Coalition. If Turnbull’s apparent personal vote in 2016 were hypothetically set to zero, such that the Liberals’ House vote share fell to the Coalition’s Senate vote share in the seat (53.0%), Wentworth would remain a safe Liberal seat, safer than Warringah, North Sydney or Bennelong.

d3d <- out3 %>% 
  ungroup() %>%
  filter(Party=="Coalition" & 
           Incumbent=="Y" & 
           State=="NSW") %>%
  dplyr::rename(Senate=p,House=per) %>%
  dplyr::mutate(name=stringr::str_to_title(Surname),
                indx=1:n()) %>%
  dplyr::select(division,indx,Senate,House,name,pvote) %>%
  arrange(division)

d3d <- jsonlite::toJSON(d3d)
                       
cat(
  paste(
  '<script>
    var data = ', d3d ,';
  </script>'
  , sep="")
)

Analysis

What predicts the size of the personal vote?

Party differences

We begin by looking at differences in personal vote by incumbency status, by party. We omit Coalition House candidates in three-cornered contests where the Coalition ran a joint Senate ticket, which occurred in Ballarat, Bendigo, Indi, McEwen, Murray and Whitlam. The labels “LP” and “NP” apply to Western Australia, where the Coalition does not run a joint Senate ticket and the Nationals ran candidates in five seats (all held by Liberal incumbents, incidentally).

# Vector of smoothing methods for each plot panel
meths <- c("loess","loess","loess","NULL","NULL")

# Smoothing function with different behaviour in the different plot panels
mysmooth <- function(formula,data,...){
   meth <- eval(parse(text=meths[unique(data$PANEL)]))
   x <- match.call()
   x[[1]] <- meth
   eval.parent(x)
}

library(ggplot2)
p <- ggplot(data=out3,
       aes(x=p,y=per,
           group=Incumbent,
           color=Incumbent)) +
  geom_abline(intercept = 0, slope = 1) +
  geom_point(pch=1) +
  geom_smooth(method="mysmooth",se=FALSE) +
  facet_wrap(~Party) +
  scale_x_continuous("Senate Vote, by Division") +
  scale_y_continuous("House Vote, by Division") +
  mytheme +
  theme(legend.position = "top")
print(p)

This graphical inspection does not suggest large differences in the apparent personal vote by party. We verify this with regression analysis.

library(pander)  ## for pander() table output
m <- lm(pvote ~ Party + Incumbent,
        data=out3)
pander(summary(m),
       digits=2,
       caption="Regression analysis of apparent personal votes: differences by party and incumbency.")

                  Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)         3.4        0.29        12       2.8e-29
PartyCoalition      0.64       0.39        1.6      0.1
PartyGRN           -1.9        0.39        -5       6.7e-07
PartyLP             0.45       0.85        0.53     0.6
PartyNP            -0.4        1.5         -0.28    0.78
IncumbentY          4.8        0.38        12       1.4e-30

Regression analysis of apparent personal votes: differences by party and incumbency.

Observations   Residual Std. Error   \(R^2\)   Adjusted \(R^2\)
449            3.182                 0.4406    0.4343

These results suggest some small differences in personal vote by party. Coalition candidates appear to have a slightly higher personal vote than Labor candidates, by about two-thirds of a percentage point. On average, Green candidates have a two percentage point smaller personal vote than do Labor candidates, consistent with support for the Greens being stronger in Senate votes than in House votes, on average.
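
Because the model is additive, the fitted apparent personal vote for any party and incumbency combination is the intercept plus the relevant party and incumbency coefficients. The sketch below does that arithmetic with the rounded coefficients from the regression table above (ALP challengers are the reference category); the object names are my own.

```r
# Coefficients copied from the regression table (rounded as reported)
intercept  <- 3.4
party      <- c(ALP = 0, Coalition = 0.64, GRN = -1.9)  # ALP is the reference
incumbency <- c(challenger = 0, incumbent = 4.8)

# Fitted apparent personal vote = intercept + party effect + incumbency effect
fitted_pvote <- intercept + outer(incumbency, party, "+")
print(round(fitted_pvote, 2))
```

For example, the fitted value for a Labor incumbent is 3.4 + 4.8 = 8.2 percentage points, and for a Green challenger it is 3.4 - 1.9 = 1.5.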

Incumbency

The regression analysis reported above also shows that the average personal vote for incumbents is almost five percentage points higher than that for challengers, a difference we can interpret as an estimate of incumbency advantage.

We test whether the incumbency effect varies across parties:

m2 <- lm(pvote ~ Party*Incumbent,
        data=out3)
pander(anova(m,m2,test="F"),digits=2,add.significance.stars = FALSE,
       caption = "F Test, invariance of incumbency effects across parties.")

F Test, invariance of incumbency effects across parties.

Res.Df   RSS    Df   Sum of Sq   F     Pr(>F)
443      4484   NA   NA          NA    NA
440      4438   3    46          1.5   0.21

We fail to reject the null hypothesis that the apparent incumbency advantage is constant across parties.
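
The F statistic can be reproduced by hand from the residual sums of squares reported in the ANOVA table. This is a sketch using the rounded values shown there, so the result matches the table only approximately; the variable names are my own.

```r
# Values copied from the ANOVA table (rounded as reported)
rss_restricted   <- 4484; df_restricted   <- 443  # additive model m
rss_unrestricted <- 4438; df_unrestricted <- 440  # interaction model m2

# Number of extra parameters in the interaction model
q <- df_restricted - df_unrestricted

# F = (drop in RSS per extra parameter) / (residual variance of larger model)
F_stat <- ((rss_restricted - rss_unrestricted) / q) /
          (rss_unrestricted / df_unrestricted)

# Upper-tail probability under the F distribution with (q, 440) df
p_value <- pf(F_stat, q, df_unrestricted, lower.tail = FALSE)
print(c(F = F_stat, p = p_value))
```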

Gender

We consider whether a candidate’s gender is associated with their apparent personal vote. The AEC noted the gender of each House candidate in a JSON file that was part of the AEC’s media feed (which I saved at the time of the 2016 election). We read this file and extract the gender of each candidate:

library(jsonlite)
gender <- read_json(path="../../../AEC/results/LightProgress/data/candidates.json",
                    flatten=TRUE,
                    simplifyVector = TRUE)
gender <- bind_rows(lapply(gender,
                           function(x){
                             data.frame(CandidateID=x$candidate_id,
                                        gender=x$gender)
                           }))
out3 <- left_join(out3,gender)

We add gender to the regression analysis reported above:

m3 <- update(m, ~ . + gender)
m4 <- update(m, ~ . + gender*Incumbent)
m5 <- update(m, ~ . + gender*Party)

library(memisc)
mt <- mtable("Base Model"=m,
             "+Gender"=m3,
             "+Gender*Incumbency"=m4,
             "+Gender*Party"=m5,
             digits=2,
             sdigits=2,
             signif.symbols = FALSE,
             summary.stats=c("sigma","R-squared","AIC"))
mt <- relabel(mt,"(Intercept)" = "Constant")
show_html(mt)

                                     Base Model   +Gender   +Gender*Incumbency   +Gender*Party
Constant                               3.45        3.45        3.56                4.04
                                      (0.29)      (0.34)      (0.36)              (0.42)
Party: Coalition/ALP                   0.64        0.64        0.61               −0.79
                                      (0.39)      (0.39)      (0.39)              (0.70)
Party: GRN/ALP                        −1.94       −1.94       −1.97               −2.64
                                      (0.39)      (0.39)      (0.39)              (0.56)
Party: LP/ALP                          0.45        0.45        0.44               −0.33
                                      (0.85)      (0.85)      (0.85)              (1.65)
Party: NP/ALP                         −0.40       −0.40       −0.42               −0.55
                                      (1.45)      (1.45)      (1.45)              (2.29)
Incumbent: Y/N                         4.78        4.78        4.29                4.73
                                      (0.38)      (0.39)      (0.66)              (0.39)
gender: male/female                                0.01       −0.15               −0.96
                                                  (0.32)      (0.36)              (0.53)
Incumbent: Y/N x gendermale                                    0.68
                                                              (0.75)
Party: Coalition/ALP x gendermale                                                  2.12
                                                                                  (0.84)
Party: GRN/ALP x gendermale                                                        1.17
                                                                                  (0.74)
Party: LP/ALP x gendermale                                                         1.26
                                                                                  (1.91)
Party: NP/ALP x gendermale                                                         0.21
                                                                                  (2.95)
sigma                                  3.18        3.19        3.19                3.18
R-squared                              0.44        0.44        0.44                0.45
AIC                                 2321.48     2323.48     2324.63             2324.64

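There is little evidence to suggest that a candidate’s gender affects the magnitude of their apparent personal vote. The one exception is among Coalition candidates: the interaction between male gender and Coalition candidacy is about two percentage points (2.12), meaning the male-female difference is about two points more favourable to men among Coalition candidates than among Labor candidates. Combined with the gender main effect of −0.96, this implies that Coalition male candidates have a personal vote roughly 1.2 percentage points higher than Coalition female candidates.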
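
The interaction coefficients in the “+Gender*Party” column are differences relative to Labor, so the implied male-minus-female gap within a party is the gender main effect plus that party’s interaction term. A sketch of that arithmetic, using the rounded coefficients from that column (object names are my own):

```r
# Coefficients copied from the "+Gender*Party" column (rounded as reported)
gender_main <- -0.96   # male minus female among ALP candidates (reference)
gender_by_party <- c(ALP = 0, Coalition = 2.12, GRN = 1.17,
                     LP = 1.26, NP = 0.21)

# Implied male-minus-female gap in apparent personal vote, by party
gap <- gender_main + gender_by_party
print(round(gap, 2))
```

These are point estimates only; the standard errors in the table are large relative to most of these gaps.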

Ballot length and ballot position

Among the predictors we’ll examine are (a) length of the ballot paper (i.e., as the ballot paper gets longer, the name recognition value associated with incumbency diminishes) and (b) ballot position (donkey vote effects, list effects). We create a ballot length variable and join it to our working data:

out3 <- left_join(out3,
                   hdata %>% 
                     dplyr::rename(division=DivisionNm) %>% 
                     group_by(division) %>% 
                     filter(BallotPosition!=999) %>% 
                     summarise(ballotLength=max(BallotPosition)),
                   by="division")
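
A toy version of the ballot length calculation, on invented rows, shows why position 999 is filtered out first. The code above treats 999 as a placeholder rather than a real ballot position; I am assuming here that it marks the informal-vote row. The data frame `cand` is hypothetical.

```r
# Hypothetical rows (not AEC data); 999 is assumed to be a placeholder
# position for the informal-vote row, not a real candidate
cand <- data.frame(
  division       = c("A", "A", "A", "B", "B", "B", "B"),
  BallotPosition = c(1, 2, 999, 1, 2, 3, 999))

# Drop the placeholder rows, then take the largest position per division
valid <- cand[cand$BallotPosition != 999, ]
ballot_length <- aggregate(BallotPosition ~ division, data = valid, FUN = max)
names(ballot_length)[2] <- "ballotLength"
print(ballot_length)
```

Without the filter, every division would report a ballot length of 999.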

We look at the relationship between ballot length and personal votes:

library(ggplot2)
ggplot(data=out3,
       aes(x=ballotLength,y=pvote,group=Incumbent,color=Incumbent)) + 
    geom_smooth(method="mysmooth",se=FALSE) +
  geom_point(pch=1) + 
  facet_wrap(~Party) + 
  mytheme +
  theme(legend.position = "top")

There does seem to be a slight diminution in the apparent personal vote as ballot length increases.

We make a similar graph to look at the relationship between ballot position and personal vote:

library(ggplot2)
ggplot(data=out3,
       aes(x=BallotPosition,y=pvote,group=Incumbent,color=Incumbent)) + 
    geom_smooth(method="mysmooth",se=FALSE) +
  geom_point(pch=1) + 
  facet_wrap(~Party) + 
  mytheme +
  theme(legend.position = "top")

For Coalition and Green challengers, we do see a diminution of the personal vote as the candidate appears lower on the ballot paper. This is not the case for ALP candidates, be they incumbents or challengers. There is no apparent difference in any ballot position effect between incumbents and challengers. There is a hint that being at the top of the ballot paper does produce a bump in a candidate’s apparent personal vote, which we now investigate with some additional regression analysis:

m6 <-  update(m, ~ . + ballotLength)
m7 <-  update(m, ~ . + BallotPosition)
m8 <-  update(m, ~ . + ballotLength + BallotPosition)
m9 <-  update(m, ~ . + ballotLength + I(BallotPosition==1))
m10 <- update(m, ~ . + ballotLength * I(BallotPosition==1))

mt <- mtable("Base Model"=m,
             "+Ballot Length"=m6,
             "+Ballot Position"=m7,
             "+Length * Position"=m8,
             "+Length + 1st place"=m9,
             "+Length * 1st place"=m10,
             digits=2,
             sdigits=2,
             signif.symbols = FALSE,
             summary.stats=c("sigma","R-squared","AIC"))
mt <- relabel(mt,"(Intercept)" = "Constant")
show_html(mt)

Columns: (1) Base Model; (2) +Ballot Length; (3) +Ballot Position; (4) +Length * Position; (5) +Length + 1st place; (6) +Length * 1st place

                                              (1)       (2)       (3)       (4)       (5)       (6)
Constant                                      3.45      6.71      4.56      6.85      6.30      6.15
                                             (0.29)    (0.60)    (0.40)    (0.60)    (0.60)    (0.64)
Party: Coalition/ALP                          0.64      0.67      0.72      0.70      0.73      0.74
                                             (0.39)    (0.38)    (0.38)    (0.38)    (0.37)    (0.37)
Party: GRN/ALP                               −1.94     −1.94     −2.10     −2.01     −2.00     −1.99
                                             (0.39)    (0.37)    (0.38)    (0.37)    (0.37)    (0.37)
Party: LP/ALP                                 0.45     −0.18      0.31     −0.17     −0.38     −0.38
                                             (0.85)    (0.82)    (0.84)    (0.82)    (0.81)    (0.81)
Party: NP/ALP                                −0.40     −0.71     −0.44     −0.69     −0.75     −0.77
                                             (1.45)    (1.40)    (1.43)    (1.39)    (1.38)    (1.38)
Incumbent: Y/N                                4.78      4.80      4.65      4.73      4.71      4.68
                                             (0.38)    (0.37)    (0.38)    (0.37)    (0.37)    (0.37)
ballotLength                                           −0.49               −0.43     −0.46     −0.44
                                                       (0.08)              (0.09)    (0.08)    (0.09)
BallotPosition                                                   −0.28     −0.14
                                                                 (0.07)    (0.08)
I(BallotPosition == 1)                                                                1.35      2.31
                                                                                     (0.38)    (1.46)
ballotLength x I(BallotPosition == 1)TRUE                                                      −0.15
                                                                                               (0.23)
sigma                                         3.18      3.06      3.13      3.05      3.02      3.02
R-squared                                     0.44      0.48      0.46      0.49      0.50      0.50
AIC                                        2321.48   2287.16   2308.41   2285.95   2276.20   2277.72

This analysis suggests that each additional candidate on the ballot degrades an apparent personal vote by about half a percentage point, offset handsomely by (randomly) securing the first place on the ballot. First place on the ballot paper adds about 1.35 points to the personal vote, on average. The “first place” effect could be larger/smaller for shorter/longer ballot papers, but the statistical evidence for an interaction between ballot position and ballot length is weak.
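
To get a feel for the relative size of these two effects, the sketch below plugs the rounded coefficients from the “+Length + 1st place” model into a small helper. The function `ballot_effect` and the five-candidate baseline are hypothetical choices made for illustration only.

```r
# Coefficients copied from the "+Length + 1st place" column (rounded)
b_length <- -0.46   # change in personal vote per additional candidate
b_first  <-  1.35   # bump from drawing first place on the ballot

# Predicted change in apparent personal vote relative to a candidate who is
# not first on a ballot of `baseline_length` candidates (baseline is arbitrary)
ballot_effect <- function(ballot_length, first_place, baseline_length = 5) {
  b_length * (ballot_length - baseline_length) + b_first * first_place
}

ballot_effect(8, FALSE)   # three extra candidates, not first
ballot_effect(8, TRUE)    # three extra candidates, drawn first

# Number of extra candidates whose effect the first-place bump cancels out
b_first / abs(b_length)
```

On these estimates, drawing first place roughly cancels the effect of three additional candidates on the ballot.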

Clustering by division

Finally, note that errors from the fitted model are likely correlated within division, given that one candidate’s over-performance with respect to their personal vote must mean under-performance for other candidates. There is some mild evidence of this in the scatterplot matrices shown below.

First, scatterplots of personal votes by party, so each plotted point represents a division; the numbers in the upper diagonal are correlations.

library(GGally)
ggscatmat(out3 %>%
            ungroup() %>%
            dplyr::select(pvote,Party,division) %>%
            tidyr::spread(Party,pvote) %>%
            dplyr::select(-division))

We also produce this graph showing the residuals from model m9:

library(modelr)
ggscatmat(out3 %>%
            add_residuals(m9) %>%
            ungroup() %>%
            dplyr::select(resid,Party,division) %>%
            tidyr::spread(Party,resid) %>%
            dplyr::select(-division))

We refit the model with the smallest AIC from the previous section (the model labelled “+Length + 1st place”), adding a random effect for division to soak up the previously unmodeled correlation within divisions. We adopt a Bayesian approach for estimation and inference for this model, using simulation methods (MCMC) to explore the posterior density of the model’s parameters via Stan.

library(rstanarm)
options(mc.cores = parallel::detectCores()-2)
m9a <- stan_lmer(pvote ~ Party + Incumbent + ballotLength + I(BallotPosition==1) + (1|division),
            data=out3)

The tabular summary reports Bayesian point estimates (posterior means) and 50% and 95% highest density intervals (HDI):

sjPlot::tab_model(m9a,show.p = FALSE,show.aic = TRUE,digits=2,ci.hyphen = ", ",
                  bpe="mean")

Dependent variable: pvote

Predictors                    Estimates   HDI (50%)        HDI (95%)
(Intercept)                    6.35       5.92, 6.81       5.11, 7.80
PartyCoalition                 0.70       0.47, 0.92       0.02, 1.32
PartyGRN                      -1.99       -2.21, -1.78     -2.60, -1.33
PartyLP                       -0.36       -0.80, 0.27      -1.88, 1.17
PartyNP                       -0.89       -1.75, 0.01      -3.37, 1.66
IncumbentY                     4.76       4.57, 5.04       4.12, 5.44
ballotLength                  -0.47       -0.53, -0.40     -0.66, -0.29
I(BallotPosition == 1)TRUE     1.24       1.05, 1.53       0.50, 1.89

Observations: 449
Bayes R2 / Standard Error: 0.590 / 0.029

This set of results confirms the findings of the previous analysis.
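
For readers unfamiliar with the HDI columns in the table above: a highest density interval is the narrowest interval containing a given share of the posterior draws. The function below is a minimal sketch of one common way to compute it from a vector of draws. It is not the code used by `sjPlot` or `rstanarm`, and the simulated draws are hypothetical stand-ins for real MCMC output.

```r
# Narrowest interval containing a share `prob` of the draws
hdi <- function(draws, prob = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  k <- ceiling(prob * n)              # number of draws inside the interval
  starts <- seq_len(n - k + 1)        # candidate lower endpoints
  widths <- sorted[starts + k - 1] - sorted[starts]
  i <- which.min(widths)              # the narrowest candidate wins
  c(lower = sorted[i], upper = sorted[i + k - 1])
}

# Hypothetical posterior draws, simulated purely for illustration
set.seed(1)
draws <- rnorm(4000, mean = 1.24, sd = 0.35)
print(hdi(draws, 0.50))
print(hdi(draws, 0.95))
```

For a symmetric, unimodal posterior the HDI is close to the usual equal-tailed interval; the two differ more when the posterior is skewed.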