Group Lab #1: Political Polarization in US Counties

Module 4: Enter the Tidyverse (Lab 4)

Open this Project on RStudio.Cloud!

Background and Instructions

Background: News stories, statements by political figures, and anecdotal accounts all suggest that Americans’ political views have grown more divided over time. Is this true? Political scientist Morris Fiorina famously disputed this claim, arguing that we are mostly seeing ‘elite polarization,’ in which party elites and officials are growing further apart while ordinary voters are not. In contrast, Shanto Iyengar, Alan Abramowitz, and others have argued that ordinary American voters are indeed also growing further apart in their political views.

The Lab: You have been commissioned to study whether American voters have grown more polarized over time. You have chosen to examine county-level election outcomes in every presidential election since 2000, using data from the MIT Election Data and Science Lab.

Together as a group, please complete this lab and write up results of your analysis, following the directions below.

Requirements: Lab reports should include:

  • Methods section, introducing your: (1a) research question, (1b) hypothesis, and (1c) how you will test it (e.g., data source, statistical test, etc.).
  • Results section, summarizing results for each hypothesis.
  • 1 visualization, described in the text.
  • Discussion section, summarizing the significance of your results and any limitations.
  • 2 pages, double spaced. No cover page, no references page.
  • Cite works from class frequently.
  • One submission per group. Everyone must participate.
  • Please write and sign at the top of your assignment: “I have neither given nor received unauthorized aid on this assignment. Signed: [All of Your Names Here].”

You will do the lab together in class on Friday. You may use your lesson, workshop, and lab notes and code from class. When you are finished, submit your lab on Canvas by 11:59 PM.


Please load the following packages and dataset using the link below.

library(tidyverse)
library(viridis)

counties <- read_csv("https://bit.ly/us_county_voteshare_2000_2020")

Task 1. Describe Data

Examine your dataset using glimpse(). Notice the following variables:

  • year is the election year.
  • fips is the unique 5-digit code for each county.
  • totalvotes is the total votes cast in that county in that election.
  • democrat is the share of voters who voted Democrat (0 to 100%).
  • republican is the share of voters who voted Republican (0 to 100%).
  • gap is the size of the gap between the share of democrat and republican votes in a county. Bigger gaps indicate greater political polarization.
  • region indicates the broader region where that county is located (1 of 9, e.g., New England).

Note: There are NAs in this dataset.
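
One way to inspect a dataset and see where the NAs are is sketched below. It uses a small made-up tibble (toy), not the real data; the values and FIPS codes are invented for illustration, and only the column names are borrowed from the list above.

```r
library(tidyverse)

# Hypothetical toy data that borrows the lab's column names (values are invented)
toy <- tibble(
  year       = c(2016, 2016, 2020, 2020),
  fips       = c("01001", "01003", "01001", "01003"),
  totalvotes = c(1000, 5000, 1200, NA),
  democrat   = c(30, 55, 28, NA),
  republican = c(68, 42, 70, 40)
)

# Print each column's name, type, and first few values
glimpse(toy)

# Count the missing values in every column
na_counts <- toy %>%
  summarize(across(everything(), ~ sum(is.na(.x))))

na_counts
```

Knowing which columns contain NAs tells you where you will need na.rm = TRUE (or a filter) later on.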


Please calculate the average share of votes for the democrat vs. republican candidate in the 2020 election. What difference do you see?


Please calculate the total votes for the democrat vs. republican candidate in the 2020 election.

Note: you will need to divide the percentages by 100 to turn them into proportions, and remove NAs.
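
The difference between an average share and a total vote count can be sketched on a small made-up tibble (toy). The numbers below are invented and say nothing about the real 2020 results; the sketch only assumes that a share times totalvotes gives a vote count for that county.

```r
library(tidyverse)

# Hypothetical toy data: three small counties, one large one with missing values
toy <- tibble(
  year       = 2020,
  fips       = c("A", "B", "C", "D"),
  totalvotes = c(1000, 2000, 100000, NA),
  democrat   = c(30, 35, 60, NA),
  republican = c(70, 65, 40, 80)
)

result <- toy %>%
  filter(year == 2020) %>%
  summarize(
    # Average share: every county counts equally, regardless of size
    mean_democrat    = mean(democrat, na.rm = TRUE),
    mean_republican  = mean(republican, na.rm = TRUE),
    # Total votes: convert percent to proportion, then weight by county turnout
    votes_democrat   = sum(democrat / 100 * totalvotes, na.rm = TRUE),
    votes_republican = sum(republican / 100 * totalvotes, na.rm = TRUE)
  )

result
```

Note that na.rm = TRUE drops any county where either the share or totalvotes is missing, so the two totals may be based on slightly different sets of counties.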



Above, you calculated the average share of votes and the total number of votes for each party. Based on that, which of the following did you find to be true? Why do you think that is?

  1. On average, counties tend to lean Republican, but more total people voted Democrat.

  2. On average, counties tend to lean Democrat, but more total people voted Republican.

  3. On average, counties tend to lean Republican, and more total people voted Republican.

  4. On average, counties tend to lean Democrat, and more total people voted Democrat.




Task 2. Visualize Distributions

Next, please examine the distribution of votes for democrats vs. republicans over time. You can use any valid strategy for visualizing a distribution, although some will be easier than others. Do these distributions converge or diverge over time? How much? (Note: you will need to pivot your dataframe.) Please add some rad colors, proper labels, and a theme to your visualization.
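
The pivoting step mentioned above can be sketched as follows, again on a small made-up tibble (toy) with invented values. Boxplots are just one possible choice here; densities, histograms, or violins would follow the same pattern once the data is in long format. The new column names party and share are assumptions chosen for this example.

```r
library(tidyverse)
library(viridis)

# Hypothetical toy data: four counties in two election years
toy <- tibble(
  year       = rep(c(2000, 2020), each = 4),
  fips       = rep(c("A", "B", "C", "D"), times = 2),
  democrat   = c(45, 50, 48, 52, 30, 65, 25, 70),
  republican = c(53, 48, 50, 46, 68, 33, 73, 28)
)

# Stack the two party columns into one 'share' column, labeled by 'party'
long <- toy %>%
  pivot_longer(
    cols      = c(democrat, republican),
    names_to  = "party",
    values_to = "share"
  )

# One box per party per year; na.rm = TRUE silently drops missing shares
p <- ggplot(long, aes(x = factor(year), y = share, fill = party)) +
  geom_boxplot(na.rm = TRUE) +
  scale_fill_viridis(discrete = TRUE, option = "plasma", begin = 0.2, end = 0.8) +
  labs(
    x     = "Election year",
    y     = "Vote share (%)",
    fill  = "Party",
    title = "Distribution of county vote shares (toy data)"
  ) +
  theme_bw()

p
```

After pivoting, each county-year appears twice (once per party), which is what lets a single fill aesthetic separate the two distributions.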




Task 3. Measuring the Partisan Gap

Third, please examine the gap variable. How much, on average, has the size of the gap between Democratic and Republican voters grown in US counties over the last 6 elections? Please calculate using summarize() and any other necessary functions.
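
A minimal sketch of a grouped summary, assuming you want the mean gap per year and then the difference between the latest and earliest year. The tibble (toy) and its values are made up, and mean_gap and change are names chosen for this example.

```r
library(tidyverse)

# Hypothetical toy data: three counties in three election years
toy <- tibble(
  year = rep(c(2000, 2004, 2020), each = 3),
  fips = rep(c("A", "B", "C"), times = 3),
  gap  = c(10, 20, NA, 15, 25, 20, 30, 40, 50)
)

# Mean gap across counties within each election year, ignoring NAs
gap_by_year <- toy %>%
  group_by(year) %>%
  summarize(mean_gap = mean(gap, na.rm = TRUE), .groups = "drop")

# Difference between the latest and the earliest election year
change <- gap_by_year %>%
  summarize(change = mean_gap[year == max(year)] - mean_gap[year == min(year)])

gap_by_year
change
```

Comparing only the first and last years is one way to measure growth; looking at the full year-by-year table shows whether the change was steady or sudden.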




Task 4. Regional Differences

Finally, please analyze the change in political polarization over time in terms of the US’s 9 different regions. Please make 1 summary visualization with 9 panels that displays regional trends over time well. You may choose to visualize average trends for each region per year, or the entire distributions for each region per year; please use a valid visualization strategy appropriate for your data.

Please add some rad colors, proper labels, and a theme to your visualization.
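
A paneled plot of average trends can be sketched as below, using a made-up tibble (toy) with only two regions and invented values. With the real data, facet_wrap() would produce one panel per region; plotting full distributions instead of averages is an equally valid choice.

```r
library(tidyverse)
library(viridis)

# Hypothetical toy data: two regions, two election years, two counties each
toy <- tibble(
  year   = rep(c(2000, 2020), times = 4),
  region = rep(c("New England", "Pacific"), each = 4),
  gap    = c(10, 20, 14, 28, 30, 45, NA, 55)
)

# Average gap for each region in each year, ignoring NAs
regional <- toy %>%
  group_by(region, year) %>%
  summarize(mean_gap = mean(gap, na.rm = TRUE), .groups = "drop")

# One panel per region, with a line tracing the average gap over time
p <- ggplot(regional, aes(x = year, y = mean_gap, color = region)) +
  geom_line() +
  geom_point() +
  facet_wrap(~region) +
  scale_color_viridis(discrete = TRUE, end = 0.8) +
  labs(
    x     = "Election year",
    y     = "Average gap (percentage points)",
    title = "Average partisan gap by region (toy data)"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

p
```

The legend is hidden because each panel title already names its region.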




Turn it in

After you have completed these steps, please remember to write up your work in a short two-page report, one per group, with methods, results, and discussion sections. Please see the instructions at the top of this document. Then, turn it in on Canvas by 11:59 PM on Friday.


Nice work!