607 Assignment 1

Introduction: I chose the “aging congress” dataset from FiveThirtyEight (https://fivethirtyeight.com/features/aging-congress-boomers/). The accompanying article discusses how the average age of Congress is higher than ever before and how, as of 2023, the baby boomer generation has the largest representation in Congress.

Conclusion: Based on my findings, the average age of Congress rises steeply starting from the 105th Congress. As a follow-up, it would be interesting to see whether the number of terms served has increased as well.
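
The follow-up question about terms served could be approached with the same group-and-average pattern used for age. The sketch below is a minimal example under stated assumptions: it assumes the `cmltv_cong` column counts the Congresses a member has served in so far, and it uses a small made-up table (`toy_congress`, hypothetical) in place of the real data, so the numbers are illustrative only.

```r
library(dplyr)

# Toy rows that mimic two of the dataset's columns; the values are made up.
toy_congress <- data.frame(
  congress = c(117, 117, 117, 118, 118, 118),
  cmltv_cong = c(1, 4, 7, 2, 5, 11)
)

# Average number of Congresses served by members of each Congress.
# Assumption: cmltv_cong is a cumulative count of Congresses served.
terms_summary <- toy_congress %>%
  group_by(congress) %>%
  summarise(Terms = mean(cmltv_cong))

print(terms_summary)
```

Swapping `toy_congress` for the full dataset would give one average per Congress, which could then be plotted the same way as average age.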

Including Code

library(readr)
data_aging_congress_xlxs <- read_csv("C:/Users/janej/Downloads/congress-demographics/congress-demographics/data_aging_congress.xlxs.csv")
## Rows: 29120 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (5): chamber, state_abbrev, bioname, bioguide_id, generation
## dbl  (6): congress, party_code, cmltv_cong, cmltv_chamber, age_days, age_years
## date (2): start_date, birthday
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
View(data_aging_congress_xlxs)
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.2

## Warning: package 'ggplot2' was built under R version 4.4.2

## Warning: package 'dplyr' was built under R version 4.4.2

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ purrr     1.0.2
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Average age of members in each Congress
summary_data <- data_aging_congress_xlxs %>%
  group_by(congress) %>%
  summarise(Age = mean(age_years))
View(summary_data)

ggplot(summary_data, aes(x = congress, y = Age)) +
  geom_line(color = "black", linewidth = 1) +
  geom_point(color = "black", size = 1.5) +
  labs(
    title = "Average Age by Congress Number",
    subtitle = "Trends from the 66th to 118th Congress",
    x = "Congress number",
    y = "Average Age"
  ) +
  theme_minimal()
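
The plot shows the trend visually. To put a number on where the increase becomes steeper, one option is to compute the change in average age from one Congress to the next. This is a minimal sketch, not part of the original analysis: `toy_summary` is a hypothetical stand-in for `summary_data`, and its ages are made up for illustration.

```r
library(dplyr)

# Toy stand-in for summary_data; these ages are made up for illustration.
toy_summary <- data.frame(
  congress = c(104, 105, 106, 107),
  Age = c(52.0, 52.5, 53.5, 55.0)
)

# Change in average age relative to the previous Congress.
# The first row has no predecessor, so its change is NA.
age_change <- toy_summary %>%
  arrange(congress) %>%
  mutate(change = Age - lag(Age))

print(age_change)
```

Consistently large positive values in `change` after a given Congress would support the reading that the increase begins there.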