Assignment 2

Click the Original, Code and Reconstruction tabs to read about the issues and how they were fixed.

Original

Source: (eccekevin, 2020).

Objective

The objective of the original data visualisation is to show the audience how well current Democratic and Republican members of Congress represent the population they serve. It asks whether they are diverse enough in religion, ethnicity and gender to reflect the demographics of the US population, so that all voices and opinions are heard at the table.

Presumably, the visualisation was developed primarily for American voters, to raise awareness of both over- and under-representation among members of Congress from the two major political parties. It also serves a general audience with an interest in American politics.

The visualisation chosen had the following three main issues:

Pie charts are not suitable for comparisons because they lack visual accuracy. Area and angle make relative size hard to judge, especially when proportions are similar (as with Democrats vs the US population) and when many categories produce small slices that are hard to see.
Because the pie charts are used to compare groups, they rely on colour to differentiate segments. However, the number of categories leads to too many colours in the visualisation, and colour is not used to highlight the key patterns.
Data are not labelled directly, in two ways. First, there are three separate legends, so viewers have to move back and forth between the data and the legends. Second, the pie charts have no value labels to make comparison easier for viewers.
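
To make the first issue concrete, the sketch below converts the gap between Democrats and the US population into pie-slice angles, using the ethnicity figures from this report's own data. The column names Gap_points and Gap_degrees are illustrative and not part of the original code. A slice spans 3.6 degrees per percentage point, so the 0.7-point gap for the White category is only about 2.5 degrees of arc, which is hard to see when two pies sit side by side.

```r
# Ethnicity shares (percent), copied from the data used in this report.
ethnicity <- data.frame(
  Attribute = c("White", "Hispanic", "Black", "Asian", "Other"),
  US_Pop    = c(60.1, 18.5, 13.4, 5.9, 2.1),
  Democrats = c(59.4, 13.9, 19.2, 6.8, 0.7)
)

# Gap in percentage points (party share minus population share).
ethnicity$Gap_points <- ethnicity$Democrats - ethnicity$US_Pop

# The same gap as a pie-slice angle: 360 degrees cover 100 percent.
ethnicity$Gap_degrees <- ethnicity$Gap_points * 360 / 100

print(ethnicity)
```

On a shared, labelled bar axis the same gaps can be read directly as differences in bar length.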

Reference

eccekevin. (2020, August 28). How representative are the representatives? Retrieved from Reddit: https://www.reddit.com/r/dataisbeautiful/comments/iho046/how_representative_are_the_representatives_the/

Code

The following code was used to fix the issues identified in the original.

library(ggplot2)
library(tidyr)
library(dplyr)
library(gridExtra)
library(grid)

congress <- data.frame(Demographic = c("Religion", "Religion", "Religion", "Religion", "Religion", "Religion", "Religion", "Ethnicity", "Ethnicity", "Ethnicity", "Ethnicity", "Ethnicity", "Gender", "Gender"),
                       Attribute = c("Christian", "Jewish", "Muslim", "Hindu", "Other", "Unaffiliated", "Unsure", "White", "Hispanic", "Black", "Asian", "Other", "Female", "Male"),
                       US_Pop = c(71.0, 2.0, 1.0, 1.0, 1.0, 23.0, 1.0, 60.1, 18.5, 13.4, 5.9, 2.1, 50.8, 49.2),
                       Republicans = c(99.2, 0.8, 0, 0, 0, 0, 0, 93.3, 4.7, 0.8, 0.4, 0.8, 9.5, 90.5),
                       Democrats = c(78.4, 11.3, 1.1, 1.1, 1.3, 0.4, 6.4, 59.4, 13.9, 19.2, 6.8, 0.7, 38.1, 61.9))
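# Reshape from wide to long: one row per attribute and group.
# gather() is superseded by pivot_longer() in current tidyr but still works.
congress_new <- gather(congress, 'US_Pop', 'Republicans', 'Democrats', key="Group", value="Percent")
# Fix the group order and display labels so facets and fill colours line up.
congress_new$Group <- congress_new$Group %>% 
  factor(levels = c("Democrats", "US_Pop", "Republicans"), 
         labels = c("Democrats", "US Population", "Republicans"))
# Drop zero-percent rows so empty categories are not drawn or labelled.
congress_new <- congress_new %>% filter(Percent > 0)
# Keep a fixed order for the three demographic dimensions.
congress_new$Demographic <- congress_new$Demographic %>% 
  factor(levels = c("Religion", "Ethnicity", "Gender"))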
congress_r <- congress_new %>% filter(Demographic == "Religion")
congress_r$Attribute <- congress_r$Attribute %>% 
  factor(levels = c("Unsure", "Other", "Hindu", "Muslim", "Jewish", "Unaffiliated", "Christian"))
congress_e <- congress_new %>% filter(Demographic == "Ethnicity")
congress_e$Attribute <- congress_e$Attribute %>%
  factor(levels = c("Other", "Asian", "Black", "Hispanic", "White"),
         labels = c("    Other  ", "    Asian  ", "    Black  ", "   Hispanic ", "   White "))
congress_g <- congress_new %>% filter(Demographic == "Gender")
congress_g$Attribute <- congress_g$Attribute %>%
  factor(levels = c("Male", "Female"),
         labels = c("           Male", "         Female"))

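# Religion panel: horizontal bars, one facet per group, values labelled on the bars.
# p2 and p3 below repeat the same recipe for ethnicity and gender.
p1 <- ggplot(data = congress_r, aes(x=Attribute, y=Percent, fill=Group)) + 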
  geom_bar(stat = "identity", width = 0.9) + 
  theme_minimal() +
  coord_flip(ylim = c(0,130)) +
  facet_wrap( ~ Demographic + Group, ncol = 3) +
  labs(y = "", x = "") +
  scale_fill_manual(values = c("dodgerblue4", "gray28", "firebrick")) +
  geom_text(aes(label=round(Percent,1)), size=3, hjust=-0.2) +
  theme(strip.text.x = element_text(margin = margin(2, 0, 2, 0)),
        panel.grid.minor = element_line(colour = "white"),
        panel.grid.major = element_line(colour = "white"),
        axis.ticks.x = element_blank(), axis.text.x = element_blank(),
        legend.position = "none")

p2 <- ggplot(data = congress_e, aes(x=Attribute, y=Percent, fill=Group)) + 
  geom_bar(stat = "identity", width = 0.7) + 
  theme_minimal() +
  coord_flip(ylim = c(0,130)) +
  facet_wrap( ~ Demographic + Group, ncol = 3) +
  labs(y = "", x = "") +
  scale_fill_manual(values = c("dodgerblue4", "gray28", "firebrick")) +
  geom_text(aes(label=round(Percent,1)), size=3, hjust=-0.2) +
  theme(strip.text.x = element_text(margin = margin(2, 0, 2, 0)),
        panel.grid.minor = element_line(colour = "white"),
        panel.grid.major = element_line(colour = "white"),
        axis.ticks.x = element_blank(), axis.text.x = element_blank(),
        legend.position = "none")
  
p3 <- ggplot(data = congress_g, aes(x=Attribute, y=Percent, fill=Group)) + 
  geom_bar(stat = "identity", width = 0.4, position = position_dodge()) + 
  theme_minimal() +
  coord_flip(ylim = c(0,130)) +
  facet_wrap( ~ Demographic + Group, ncol = 3) +
  labs(y = "Proportion of Party / Population", x = "", 
       caption = "Source: US Census, Pew Research Center, Congressional Research Service") +
  scale_fill_manual(values = c("dodgerblue4", "gray28", "firebrick")) +
  geom_text(aes(label=round(Percent,1)), size=3, hjust=-0.2) +
  theme(strip.text.x = element_text(margin = margin(2, 0, 2, 0)),
        panel.grid.minor = element_line(colour = "white"),
        panel.grid.major = element_line(colour = "white"),
        axis.ticks.x = element_blank(), 
        axis.text.x = element_blank(),
        axis.title.y = element_text(size = 2, vjust = -0.35),
        legend.position = "none")

# Title-only panel: no geom is added, so it just carries the title and subtitle.
pt <- ggplot(data = congress_g, aes(x=Attribute, y=Percent)) +
  theme_minimal() +
  labs(y= "", x = "",
       title = "How Representative are US Representatives in 2020",
       subtitle = "The demographics of the 116th Congress broken down by party vs US Population") +
  theme(plot.title = element_text(size=18, face="bold", margin = margin(20, 0, 10, 0)),
        plot.subtitle = element_text(size = 12, margin = margin(5, 0, 0, 0)),
        axis.ticks.x = element_blank(), 
        axis.text.x = element_blank(),
        axis.ticks.y = element_blank(), 
        axis.text.y = element_blank(),
        panel.grid.minor = element_line(colour = "white"),
        panel.grid.major = element_line(colour = "white"))
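
The listing above builds pt, p1, p2 and p3 but does not show how they are combined into one figure. The sketch below is one possible way to stack them with gridExtra, which the code already loads. It is not necessarily the call used for the reconstruction. The placeholder() helper, the combined object and the heights values are assumptions made for illustration, and the placeholder plots only stand in for the real panels so that the block runs on its own.

```r
library(ggplot2)
library(gridExtra)
library(grid)

# Hypothetical helper: an empty plot carrying only a title, standing in for
# the real panels so this sketch is self-contained.
placeholder <- function(label) {
  ggplot() + theme_void() + labs(title = label)
}

# In the report these would be the pt, p1, p2 and p3 objects built above.
pt <- placeholder("title panel")
p1 <- placeholder("religion panel")
p2 <- placeholder("ethnicity panel")
p3 <- placeholder("gender panel")

# Stack the panels in a single column. The relative heights are assumed
# values, roughly proportional to the number of bars in each panel.
combined <- arrangeGrob(pt, p1, p2, p3, ncol = 1, heights = c(1.5, 7, 5, 3))

# Draw the combined figure on a fresh page.
grid.newpage()
grid.draw(combined)
```

arrangeGrob() builds the layout without drawing it, so the result can also be passed to ggsave() instead of being drawn on screen.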

Data Reference

Manning, J. E. (2020, July 22). Membership of the 116th Congress: A Profile. Retrieved from Federation of American Scientists: https://fas.org/sgp/crs/misc/R45583.pdf
Pew Research Center. (2019, January 3). Faith on the Hill - The religious composition of the 116th Congress. Retrieved from Pew Forum: https://www.pewforum.org/2019/01/03/faith-on-the-hill-116/
United States Census Bureau. (2019). QuickFacts United States. Retrieved from United States Census Bureau: https://www.census.gov/quickfacts/fact/table/US/PST045219

Reconstruction

The following plot fixes the main issues in the original.

Deconstruct, Reconstruct Web Report

Shuyu Huang (s3743291)
