1. Introduction

2020 has been an eventful year. As Covid-19 emerged, people adapted to new lifestyles: social distancing, wearing a mask whenever one leaves the house, online learning and telecommuting, just to name a few. For Singaporeans, another hot topic that mobilized all adults in 2020 was the General Election (GE), which took place on 10 July, shortly after the Circuit Breaker was lifted.

What is different about GE2020 is that there were no physical rallies. Instead, candidates got television airtime for political broadcasts, and all candidates had to don their masks during walkabouts. With these limited interactions, how did Singapore vote in GE2020 compared with GE2015? This is the main motivation of this visualization. We know that the Opposition has captured Hougang SMC, Aljunied GRC and Sengkang GRC. Is it possible for PAP to recapture them in the near future? This is another question that this visualization tries to answer.

The image below is a quick guide of the existing political parties in Singapore.

Image credit : Seedly’s A beginner’s guide to SG political parties

2. Challenges

The first challenge is the availability of the data. While the historical GE results are readily available on https://data.gov.sg/, the dataset has not been updated with the latest result. Therefore, I obtained the GE2020 result from https://www.eld.gov.sg/finalresults2020.html and merged it with the historical data.
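The merge step can be sketched roughly as follows. This is only a minimal illustration in base R with made-up numbers, not the exact code used for this assignment; the column names of the 2020 table (Constituency, Party, Votes) are assumptions.

```r
# Toy stand-in for the historical data (all values are made up)
historical <- data.frame(
  year = c(2011, 2015),
  constituency = c("A", "A"),
  party = c("PAP", "PAP"),
  vote_count = c(1000, 1200),
  stringsAsFactors = FALSE
)

# Toy stand-in for the 2020 result; the column names are hypothetical
ge2020 <- data.frame(
  Constituency = "A",
  Party = "PAP",
  Votes = "1,100",  # numbers copied from a web page often arrive as text
  stringsAsFactors = FALSE
)

# Rename and retype the 2020 rows so that they match the historical layout
ge2020_clean <- data.frame(
  year = 2020,
  constituency = ge2020$Constituency,
  party = ge2020$Party,
  vote_count = as.numeric(gsub(",", "", ge2020$Votes)),
  stringsAsFactors = FALSE
)

# Stack the two tables; rbind() matches data frame columns by name
combined <- rbind(historical, ge2020_clean)
print(combined)
```

The key point is that both tables must share the same column names and types before they are stacked.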

The second challenge is that the data are not directly comparable over time. Constituency boundaries were redrawn in most of the recent elections; for example, Fengshan SMC was subsumed under East Coast GRC in GE2020. Therefore, I manually grouped the constituencies in 2015 so that the data are comparable between both elections.
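One way to express such a manual grouping in code is a lookup table from the 2015 name to the 2020 name. The sketch below uses base R and made-up vote counts; only the Fengshan entry comes from the text above, and the column names are assumptions.

```r
# Lookup from a 2015 constituency to the 2020 constituency it belongs to.
# Only the Fengshan entry is taken from the text; extend it as needed.
boundary_map <- c("Fengshan" = "East Coast")

# Toy 2015 results (all counts are made up)
results_2015 <- data.frame(
  constituency = c("East Coast", "Fengshan", "Hougang"),
  pap_votes = c(600, 150, 40),
  total_votes = c(1000, 250, 100),
  stringsAsFactors = FALSE
)

# Constituencies without an entry in the lookup keep their own name
mapped <- unname(boundary_map[results_2015$constituency])
results_2015$group_2020 <- ifelse(is.na(mapped), results_2015$constituency, mapped)

# Sum the votes within each group, then recompute the proportion
grouped <- aggregate(cbind(pap_votes, total_votes) ~ group_2020,
                     data = results_2015, FUN = sum)
grouped$pap_proportion <- grouped$pap_votes / grouped$total_votes
print(grouped)
```

The proportion is recomputed from the summed counts rather than averaged, so that larger constituencies carry more weight.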

Different plots need different data formats or different filters. For example, the treemap shows the result of GE2020, while the historical data is displayed in the line plots. Instead of creating a separate dataframe for each plot, I do the data wrangling with tidyverse and pipe the result directly into ggplot.

There are quite a few Opposition Parties, and some of them did not receive a high proportion of votes, so it is not meaningful to display the result of each of them when making comparisons. I grouped the Opposition Parties into 1 group when comparing the results. Where I need to display them individually, I highlight the top Opposition Parties using other colours.
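A minimal sketch of both ideas, using base R and made-up vote counts (the data frame and its values are purely illustrative):

```r
# Toy vote counts by party (values are made up)
votes <- data.frame(
  party = c("PAP", "WP", "PSP", "RP"),
  vote_count = c(60, 20, 15, 5),
  stringsAsFactors = FALSE
)

# Collapse every non-PAP party into a single "Opposition" group
votes$PAP_ind <- ifelse(votes$party == "PAP", "PAP", "Opposition")
by_side <- aggregate(vote_count ~ PAP_ind, data = votes, FUN = sum)

# When parties are shown individually, highlight a few and grey out the rest
highlight <- c(PAP = "indianred2", WP = "steelblue3", PSP = "palegreen4")
votes$colour <- ifelse(votes$party %in% names(highlight),
                       highlight[votes$party], "grey80")

print(by_side)
print(votes)
```

A named colour vector like this can be passed to scale_colour_manual(values = ...) so that each party always keeps the same colour.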

In this visualization, a lot of the data are proportions. In order to display the labels properly, I concatenate the proportions with the “%” sign.
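For illustration, the label construction looks roughly like this (the proportions are made up; the sprintf() variant is just an alternative for when a fixed number of decimals is wanted):

```r
# Proportions as stored in the data (values here are made up)
proportion <- c(0.6123, 0.3877)

# Whole-number labels built with round() and paste0()
label_round <- paste0(round(proportion * 100, 0), "%")

# One decimal place; "%%" is a literal percent sign in sprintf()
label_1dp <- sprintf("%.1f%%", proportion * 100)

print(label_round)
print(label_1dp)
```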

3. Proposed Visualization

Sketch of the Design

4. Step-by-step Description

4.1 Install and Load R packages

  1. tidyverse contains a set of essential packages for data manipulation and exploration.
  2. ggplot2 allows users to create static plots (it is also loaded as part of tidyverse).
  3. xlsx allows users to import data from Excel files.
  4. treemap is a package for creating treemaps.
  5. directlabels is a package that allows users to add labels directly to plots.
  6. GGally is an extension of ggplot2 (used here to create the parallel coordinate plot).
# Install any package that is missing, then attach it
packages <- c('tidyverse', 'xlsx', 'treemap', 'directlabels', 'GGally', 'ggplot2')
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

4.2 Load the data

GE_result_data <- read.xlsx("C:/Users/User/Desktop/MITB_2019/AY2020T3_Visual_ISSS608_G2/Assignment4/GE_data_1.xlsx", sheetName = "Result")

GE_invalid_vote_data <- read_csv("C:/Users/User/Desktop/MITB_2019/AY2020T3_Visual_ISSS608_G2/Assignment4/parliamentary-general-election-registered-electors-rejected-votes-and-spoilt-ballots.csv")

GE_raw_data <- read_csv("C:/Users/User/Desktop/MITB_2019/AY2020T3_Visual_ISSS608_G2/Assignment4/parliamentary-general-election-results-by-candidate.csv")

4.3 Data-Wrangling

Convert the vote count and vote percentage columns to numeric, and convert the column that indicates whether the party won to a factor.

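# Columns 6 and 7 hold the vote count and the vote percentage
GE_raw_data[c(6, 7)] <- lapply(GE_raw_data[c(6, 7)], as.numeric)
# Column 9 indicates whether the party won; sapply() would drop the factor class
GE_raw_data[[9]] <- as.factor(GE_raw_data[[9]])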

Next, I create a new dataframe for the parallel plot, because the plot requires the 2015 and 2020 figures in separate columns, plus another column that indicates the party.

#Data wrangling for Parallel Chart
myvars <- c("Constituency","X2015_PAP_Proportion", "X2020_PAP_Proportion")
parallel_pap_data <- GE_result_data[myvars]
parallel_pap_data<-parallel_pap_data[!(is.na(parallel_pap_data$X2015_PAP_Proportion) | parallel_pap_data$X2015_PAP_Proportion==""), ]
parallel_pap_data["Party"] <- "PAP"
names(parallel_pap_data)[names(parallel_pap_data) == "X2015_PAP_Proportion"] <- "2015"
names(parallel_pap_data)[names(parallel_pap_data) == "X2020_PAP_Proportion"] <- "2020"

myvars <- c("Constituency","X2015_Opposition_Proportion", "X2020_Opposition_Proportion")
parallel_opp_data <- GE_result_data[myvars]
parallel_opp_data<-parallel_opp_data[!(is.na(parallel_opp_data$X2015_Opposition_Proportion) | parallel_opp_data$X2015_Opposition_Proportion==""), ]
names(parallel_opp_data)[names(parallel_opp_data) == "X2015_Opposition_Proportion"] <- "2015"
names(parallel_opp_data)[names(parallel_opp_data) == "X2020_Opposition_Proportion"] <- "2020"
parallel_opp_data["Party"] <- "Opposition"

parallel_data <- rbind(parallel_opp_data, parallel_pap_data)

4.4 Plotting the charts

  1. I use the treemap package to create a treemap showing the percentage of votes received by each party. Missing data have to be replaced with zero for the total number of votes to be computed properly.
  2. I create a stacked bar chart of the number of seats contested vs the number of seats won.
  3. I use a lollipop chart to show the percentage of votes received by PAP vs the percentage of votes received by the Opposition. This shows how close the contest was in each constituency.
  4. I create a grouped bar chart to compare the proportion of spoilt votes in GE2020 and GE2015.
  5. I use a parallel coordinate plot to show the increase or decrease in supporters of the ruling party and the Opposition parties from GE2015 to GE2020.
  6. Finally, I use a couple of line charts to show the proportion of supporters for each party from 1997 onwards. This is to understand the trend before the majority of voters swung their votes, and whether this trend can be found in the 2 constituencies that were previously captured by the Opposition.
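The remark in step 1 about missing data can be illustrated with a small base R sketch (the counts are made up, and NA stands in for a seat with no recorded votes, such as a walkover):

```r
# Toy vote counts; NA marks a seat with no recorded votes
vote_count <- c(1200, NA, 800)

# A single NA makes the whole total NA
total_with_na <- sum(vote_count)

# Replace the missing value with zero before summing
vote_count_filled <- ifelse(is.na(vote_count), 0, vote_count)
total_filled <- sum(vote_count_filled)

# Shares can now be computed against a valid total
share <- round(vote_count_filled * 100 / total_filled)

print(total_with_na)
print(total_filled)
print(share)
```

Passing na.rm = TRUE to sum() would also give a valid total, but replacing the NA keeps the row itself usable in later calculations.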

5. Visualization and Insights

PAP remains the preferred ruling party for 61% of voters (Figure 1). Of the 93 seats contested, PAP won 83. The rest were won by the Workers’ Party (Figure 2). The ruling party beat the Opposition Parties in most constituencies by quite a big margin, with the exception of Aljunied GRC (won by the Workers’ Party), Bukit Panjang SMC (PAP won by a narrow margin of 8%), East Coast GRC (PAP won by only 6%), Hougang SMC (won by the Workers’ Party), Sengkang GRC (won by the Workers’ Party) and West Coast GRC (PAP won by just 4%) (Figure 3). Although there were reports of voters who spoilt their votes to express their unhappiness with the long waiting time, the proportion of spoilt votes did not increase significantly in any constituency except Bukit Panjang SMC (Figure 4).
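The winning margins quoted above are simply the gap between the two vote shares. A rough base R sketch with made-up shares for three placeholder constituencies (the 10-point cut-off for a "narrow" contest is an arbitrary assumption, not a figure from the analysis):

```r
# Toy vote shares for placeholder constituencies (values are made up)
results <- data.frame(
  constituency = c("A", "B", "C"),
  pap = c(0.72, 0.53, 0.48),
  opposition = c(0.28, 0.47, 0.52),
  stringsAsFactors = FALSE
)

# Margin in percentage points; a negative value means the Opposition won
results$margin_pct <- round((results$pap - results$opposition) * 100)
results$winner <- ifelse(results$margin_pct > 0, "PAP", "Opposition")

# Flag close contests using a hypothetical threshold
threshold <- 10
results$close <- abs(results$margin_pct) < threshold

print(results)
```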

Compared with the GE2015 result, the proportion of PAP supporters decreased in almost all constituencies, while the Opposition Parties garnered more supporters (Figure 5). In fact, PAP’s share of the votes dipped to almost the same level as in 2011, which was PAP’s worst-performing election since independence in 1965 (Figure 6). In this election, PAP failed to recapture Aljunied GRC and Hougang SMC. Is it possible for PAP to recapture them in the near future? Let’s look at the historical trend of Potong Pasir SMC before voters swung their votes. In 1984, voters chose the Opposition (the Singapore Democratic Party), but before 1984, the votes for PAP had been on a downward trend (Figure 7). The same kind of trend was observed in 2011, when voters swung their votes towards PAP again. However, this trend is not observed in either Aljunied GRC or Hougang SMC (Figure 8). Therefore, PAP is not likely to recapture these constituencies within the next 5 years.

GE_raw_data %>%
  mutate(vote_count = replace_na(vote_count, 0)) %>%
  filter(year==2020) %>%
  group_by(party) %>%
  dplyr::summarize(Proportion_voters = sum(vote_count)) %>%
  mutate(Party.Index = paste(party,
                             format(Proportion_voters, big.mark = ","),
                             paste0(round(Proportion_voters * 100 / sum(Proportion_voters), 0), '%'),
                             sep = "\n")) %>%
  ##Treemap                           
  treemap(index="Party.Index",
        vSize="Proportion_voters",
        type="index",
        palette = "Paired",
        title="Figure 1 : Number of Supporters by Political Party\n",
        fontsize.title=15)

legend_title <- "Outcome"

GE_raw_data %>% 
  filter(year==2020) %>%
  group_by(party, Win_ind) %>%
  dplyr::summarize(Seat_no = sum(seat_contested))%>%
  ggplot(aes(fill=Win_ind, y=Seat_no, x = reorder(party, -Seat_no, FUN = sum),label = Seat_no))+
  geom_bar(position="stack", stat="identity", color = "black") +
  geom_text(size = 4, position = position_stack(vjust = 0.5), color="black")+
  ggtitle("Figure 2 : Number of Seats Contested by Political Party and Outcome\n") +
  labs(x = "Political Party", y = "Number of Seats") +
  theme(panel.background = element_rect(fill = "white", colour = "grey50"),
        plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color="black"),
        axis.title = element_text(face = "bold", size = 12),
        axis.text.x = element_text(size=10),
        axis.text.y = element_text(size=10),
        legend.position="bottom",
        legend.title = element_text(face = "bold"))+
  scale_fill_manual(legend_title,values=c('grey','darkseagreen'),labels = c("Lost", "Won"))

GE_diff_data <- GE_result_data %>% 
  rowwise() %>% 
  mutate(mymean = mean(c(X2020_Opposition_Proportion*100,X2020_PAP_Proportion*100) )) %>% 
  arrange(mymean) %>% 
  mutate(x=factor(Constituency, Constituency))

ggplot(GE_diff_data) +
 geom_segment(aes(x=x, xend=x, y=X2020_Opposition_Proportion*100,
                  yend=X2020_PAP_Proportion*100), color="black") +
  geom_vline(xintercept = 1:31, col = "grey80") +
  geom_point( aes(x=x, y=X2020_Opposition_Proportion*100), color="lightblue", size=11) +
  geom_point( aes(x=x, y=X2020_PAP_Proportion*100), color=rgb(0.7,0.2,0.1,0.5), size=11) +
  geom_text(aes(x=x, y=X2020_Opposition_Proportion*100,
                label=paste0(round(X2020_Opposition_Proportion*100,0),"%")), size=3.5)+
  geom_text(aes(x=x, y=X2020_PAP_Proportion*100,
                label=paste0(round(X2020_PAP_Proportion*100,0),"%")), size=3.5)+
  ggtitle("Figure 3 : Comparison of Votes Received by PAP vs the Opposition Parties") +
  coord_flip()+
  theme_classic() +
  theme(
    plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color="black"),
    axis.title = element_text(face = "bold", size = 12),
    axis.text.x = element_text(size=10),
    axis.ticks.y = element_blank(),
    axis.line = element_blank()
  ) +
  geom_text(aes(x = x, y = y, label = label, col = label),data.frame(y = c(35, 67), x = 31.4, 
                       label = c("Opposition", "PAP")), size = 5)+
  scale_color_manual(values = c("lightblue", rgb(0.7,0.2,0.1,0.5)), guide = "none")+
  xlab("") +
  ylab("Proportion of Votes Received (%)")

t <- c(2015,2020)

GE_invalid_vote_data %>%
  filter(year %in% t) %>%
  mutate(rejected_votes_proportion=no_of_rejected_votes*100/no_of_votes_cast) %>%
  ggplot(aes(fill = factor(year, levels = c(2020, 2015)),
             x=reorder(constituency,rejected_votes_proportion), y=rejected_votes_proportion)) + 
  geom_bar(width=0.7, position=position_dodge2(reverse = TRUE), stat="identity",colour="black")+
  geom_text(aes(label = paste0(round(rejected_votes_proportion,1),"%")), hjust = -0.15, 
            size = 4,position = position_dodge(width = -0.85),inherit.aes = TRUE,
            color="black")+
  scale_fill_manual(values = c("#31a354", "#a1d99b"))+
  theme_minimal() +
  labs(title = "Figure 4 : Proportion of Spoilt Votes by Constituency\n", y="Proportion (%)")+
  theme(axis.title.y = element_blank(),
        axis.title = element_text(face = "bold", size = 12),
        panel.grid.major.y = element_blank(),
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6),
        legend.background = element_blank(),
        legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color = "black")) +
  coord_flip()

ggparcoord(parallel_data,
    columns = 2:3, groupColumn = "Party",
    scale="uniminmax",
    showPoints = TRUE, 
    title = "Figure 5 : Parallel Coordinate Plot of Percentage of Votes Received by PAP and Opposition Parties",alphaLines = 0.3) + 
  # Apply the complete theme first; otherwise it overrides the title size set below
  theme_minimal()+
  theme(plot.title = element_text(size=10))+
  scale_color_manual(values=c( "steelblue3", "indianred2"))+
  geom_line(size=1)+
  theme(axis.title=element_blank(), legend.position="bottom")

t <- c(1997, 2001, 2006, 2011, 2015,2020)
legend_title <- "Political Party"

GE_raw_data %>%
  filter(year %in% t)%>%
  group_by(year, party) %>%
  mutate(vote_count = replace_na(vote_count, 0))%>%
  dplyr::summarize(sum_voters = sum(vote_count)) %>%
  mutate(proportion_voters=sum_voters/sum(sum_voters))%>%

  ggplot(aes(x=as.factor(year), y=proportion_voters*100, group=party, color=party)) +
  geom_line(size=1)+
  ggtitle("Figure 6 : Proportion of Votes Received by Year") +
  labs(x = "Election Year", y = "Proportion of Votes (%)") +
  theme_bw() + 
  theme(plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color="black"),
    plot.background = element_blank(),
        axis.title.y=element_blank(),
         panel.grid.major = element_blank(),
         panel.grid.minor = element_blank(), legend.position="none")+
  geom_point(aes(x=as.factor(year),y=proportion_voters*100), size=4)+
  scale_colour_manual(values=c(PAP="indianred2",WP="steelblue3",DPP="grey80",PV="grey80",
                   PPP="grey80",PSP="palegreen4",RDU="grey80",RP="grey80",SDA="grey80",
                   SDP="grey80",SGF="grey80",SPP="grey80",Independent="grey80", NSP="grey80"))+
  geom_dl(aes(label=party), method=list("last.bumpup",cex = 0.8, hjust = -.5))+
  ylim(c(0,100))

GE_raw_data %>%
  filter(constituency=="Potong Pasir")%>%
  group_by(year, party) %>%
  mutate(vote_count = replace_na(vote_count, 0)) %>%
  dplyr::summarize(sum_voters = sum(vote_count)) %>%
  mutate(proportion_voters=sum_voters/sum(sum_voters))%>%
  mutate(PAP_ind = ifelse(party == "PAP", "PAP", "Opposition")) %>%

  ggplot(aes(x=as.factor(year), y=proportion_voters*100, group=PAP_ind, color=PAP_ind)) +
    geom_line(size=1)+
    ggtitle("Figure 7 : Proportion of Votes in Potong Pasir SMC by Year") +
  labs(x = "Election Year", y = "Proportion of Votes (%)") +
  theme_bw() + 
  theme(plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color="black"),
    plot.background = element_blank(),
        axis.title.y=element_blank(),
         panel.grid.major = element_blank(),
         panel.grid.minor = element_blank(), legend.position="none")+
  geom_point(aes(x=as.factor(year),y=proportion_voters*100), size=4)+
  scale_colour_manual(values=c("steelblue3","indianred2"))+
  geom_dl(aes(label=party), method=list("last.points",cex = 1, hjust = -.25))+
  ylim(c(0,100))

a <- c("Aljunied","Hougang")

GE_raw_data %>%
  filter(constituency %in% a)%>%
  group_by(constituency, year, party) %>%
  mutate(vote_count = replace_na(vote_count, 0)) %>%
  dplyr::summarize(sum_voters = sum(vote_count)) %>%
    mutate(proportion_voters=sum_voters/sum(sum_voters))%>%
  mutate(PAP_ind = ifelse(party == "PAP", "PAP", "Opposition")) %>%

  ggplot(aes(x=as.factor(year), y=proportion_voters*100, group=PAP_ind, color=PAP_ind)) +
    geom_line(size=1)+
    ggtitle("Figure 8 : Proportion of Votes in Aljunied GRC and Hougang SMC by Year") +
  labs(x = "Election Year", y = "Proportion of Votes (%)",
       caption = "Note: Missing points and disconnected lines due to walkover") +
  theme_bw() + 
  theme(plot.title = element_text(size = 15, face = "bold", hjust = 0.5, color="black"),
        plot.caption = element_text(size = 12),
    plot.background = element_blank(),
        axis.title.y=element_blank(),
         panel.grid.major = element_blank(),
         panel.grid.minor = element_blank(), legend.position="none")+
  geom_point(aes(x=as.factor(year),y=proportion_voters*100), size=4)+
  scale_colour_manual(values=c("steelblue3", "indianred2"))+
  geom_dl(aes(label=party), method=list("last.points",cex = 1, hjust = -.5))+
  ylim(c(0,100))+
  facet_grid(constituency~.)