PACKAGES USED

library(tidyverse)
library(openintro)
library("dplyr")
library(grid)
library(pander)

LOADING DATASET FROM GITHUB

urlfile="https://raw.githubusercontent.com/jonburns2454/DATA607/main/data_aging_congress.csv"

usCongressAges<-read_csv(url(urlfile))
usCongressAges <- as_tibble(usCongressAges)

OVERVIEW: [Congress Today Is Older Than It’s Ever Been](https://fivethirtyeight.com/features/aging-congress-boomers/)

The article linked above refers to a recent congressional hearing with the CEO of TikTok, during which at least three members of Congress referred to the app as “Tic Tac”. However, I wanted to look at this data because of the recent news about Dianne Feinstein and Mitch McConnell, and because many have described the 117th Congress as the oldest Congress since 1789. I was therefore interested in the age distribution of each party: which one is older on average, and which one reaches further into the upper end of the age distribution (the longer you are in Congress, the more likely you are to hold a high rank and sit on crucial committees).

GLIMPSE | SUBSETTING

glimpse(usCongressAges)
## Rows: 29,120
## Columns: 13
## $ congress      <dbl> 82, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, …
## $ start_date    <date> 1951-01-03, 1947-01-03, 1949-01-03, 1951-01-03, 1953-01…
## $ chamber       <chr> "House", "House", "House", "House", "House", "House", "H…
## $ state_abbrev  <chr> "ND", "VA", "VA", "VA", "VA", "VA", "VA", "VA", "VA", "V…
## $ party_code    <dbl> 200, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1…
## $ bioname       <chr> "AANDAHL, Fred George", "ABBITT, Watkins Moorman", "ABBI…
## $ bioguide_id   <chr> "A000001", "A000002", "A000002", "A000002", "A000002", "…
## $ birthday      <date> 1897-04-09, 1908-05-21, 1908-05-21, 1908-05-21, 1908-05…
## $ cmltv_cong    <dbl> 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1, 2, 3, 4…
## $ cmltv_chamber <dbl> 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1, 2, 3, 4…
## $ age_days      <dbl> 19626, 14106, 14837, 15567, 16298, 17028, 17759, 18489, …
## $ age_years     <dbl> 53.73306, 38.62012, 40.62149, 42.62012, 44.62149, 46.620…
## $ generation    <chr> "Lost", "Greatest", "Greatest", "Greatest", "Greatest", …
congressTable <- subset(usCongressAges, select = c('chamber', 'party_code', 'age_years'))


# Preview with upper-case column names (printed only; congressTable itself is unchanged)
rename_with(congressTable, toupper)
## # A tibble: 29,120 × 3
##    CHAMBER PARTY_CODE AGE_YEARS
##    <chr>        <dbl>     <dbl>
##  1 House          200      53.7
##  2 House          100      38.6
##  3 House          100      40.6
##  4 House          100      42.6
##  5 House          100      44.6
##  6 House          100      46.6
##  7 House          100      48.6
##  8 House          100      50.6
##  9 House          100      52.6
## 10 House          100      54.6
## # ℹ 29,110 more rows

PLOTTING WITH GGPLOT

# Keep only Democrats (100) and Republicans (200), then pipe the result into ggplot()
congressTable |>
  filter(party_code %in% c(100, 200)) |>
  ggplot(mapping = aes(factor(party_code), age_years)) +
  geom_violin(adjust = 0.5, fill = "snow1", color = "mintcream", trim = FALSE) +
  geom_boxplot(width = 0.1) +
  labs(
    x = "Party Alignment",
    y = "Age of Congressional Members",
    title = "Which Congressional party is older?",
    subtitle = "DEM OR REP"
  ) +
  scale_x_discrete(labels = c('Democrat', 'Republican'))

Conclusion:

I used geom_violin() together with geom_boxplot() to evaluate the data set I previously subset. Together, the two layers show everything I set out to examine: the median, the density (distribution), and the maximum. Looking solely at the maximum, it's clear that the Republican Party has the oldest members of Congress, with a few coming close to 90 years old. Adding the boxplot gave me a clearer idea of the median age of each party. This, along with the distribution, shows that the Republicans have a slightly higher median age than the Democrats.
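
As a rough numerical check on what the plot suggests, a sketch along these lines could tabulate the median, mean, and maximum age per party. The helper name summarise_party_ages is hypothetical and not part of the analysis above; the mapping of 100 to Democrat and 200 to Republican is assumed from the axis labels used in the plot.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis): per-party age summary.
# Assumes numeric columns `party_code` and `age_years`, as in congressTable.
summarise_party_ages <- function(df) {
  df |>
    filter(party_code %in% c(100, 200)) |>
    # Label mapping assumed from the plot's axis labels (100 = Democrat, 200 = Republican)
    mutate(party = if_else(party_code == 100, "Democrat", "Republican")) |>
    group_by(party) |>
    summarise(
      n          = n(),
      median_age = median(age_years, na.rm = TRUE),
      mean_age   = mean(age_years, na.rm = TRUE),
      max_age    = max(age_years, na.rm = TRUE),
      .groups    = "drop"
    )
}

# Usage with the subset created earlier:
# summarise_party_ages(congressTable)
```

Comparing the median_age and max_age columns side by side would confirm or correct the reading of the violin and boxplot.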

For future study: I think there is a lot of potential in this data set and many things one could visualize. One thing I would like to do is join it with vote counts on bills to see how older members of Congress vote on average compared to younger Representatives and Senators. Additionally, the data can be split into House and Senate.
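
The House and Senate split could be sketched by faceting the same plot on the chamber column. This is only one possible approach; the function name plot_ages_by_chamber is hypothetical, and the party labels are again assumed from the plot above.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: the same violin + boxplot, drawn as one panel per chamber.
# Assumes columns `chamber`, `party_code`, and `age_years`, as in congressTable.
plot_ages_by_chamber <- function(df) {
  df |>
    filter(party_code %in% c(100, 200)) |>
    ggplot(aes(x = factor(party_code), y = age_years)) +
    geom_violin(adjust = 0.5, trim = FALSE) +
    geom_boxplot(width = 0.1) +
    facet_wrap(~ chamber) +
    # Named labels so each party code maps to its label explicitly
    scale_x_discrete(labels = c("100" = "Democrat", "200" = "Republican")) +
    labs(
      x = "Party Alignment",
      y = "Age of Congressional Members"
    )
}

# Usage with the subset created earlier:
# plot_ages_by_chamber(congressTable)
```

Faceting keeps the y-axis shared across panels, so the House and Senate distributions can be compared directly.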

TABULAR OUTPUT

pandoc.table(head(congressTable[, 1:3], 6), justify = c('right', 'center', 'left'))
## 
## ----------------------------------
##   chamber  party_code  age_years  
## --------- ------------ -----------
##     House     200      53.73      
## 
##     House     100      38.62      
## 
##     House     100      40.62      
## 
##     House     100      42.62      
## 
##     House     100      44.62      
## 
##     House     100      46.62      
## ----------------------------------