Data Introduction: This dataset contains Federal Election Commission (FEC) data on political donors in Cincinnati. It includes contributors’ demographic information, the details of each donation, and the political committee that received the donation.
Analytical Approach: I will analyze the data to better understand the political climate and the relationships among contributors’ employers, donations, and party affiliations.
R Packages
Packages Introduction: Several R packages are necessary for the analysis:
1. tidyverse
2. dplyr (attached as part of tidyverse)
3. skimr
## -- Attaching packages ------------------------------------------------ tidyverse 1.3.0 --
## v ggplot2 3.2.1 v purrr 0.3.3
## v tibble 2.1.3 v dplyr 0.8.4
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts --------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
Loading Data
The dataset is hosted at https://myxavier-my.sharepoint.com/:x:/g/personal/mcfaddenr_xavier_edu/EdzW7z3Ij9dDvsn7ssvMOO0BWH_6QwMtvVXYaOe94X_8lA?download=1
## Parsed with column specification:
## cols(
## contributor_last_name = col_character(),
## contributor_first_name = col_character(),
## contributor_street_1 = col_character(),
## contributor_employer = col_character(),
## contributor_occupation = col_character(),
## contribution_receipt_date = col_character(),
## contribution_receipt_amount = col_double(),
## contributor_aggregate_ytd = col_double(),
## committee_name = col_character(),
## committee_type = col_character(),
## committee_party_affiliation = col_character()
## )
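A minimal sketch of how a file with this column specification might be read, not necessarily the code used for this report. The helper name `load_contributions`, the `fec_col_types` object, and the two demo rows are hypothetical; the demo writes a small temporary CSV so the example runs without network access, and the hosted URL above could be passed as `path` instead.

```r
library(readr)

# Explicit column types mirroring the parsed specification shown above:
# everything is character except the two dollar-amount columns.
fec_col_types <- cols(
  .default = col_character(),
  contribution_receipt_amount = col_double(),
  contributor_aggregate_ytd = col_double()
)

# Hypothetical helper; `path` may be a local file path or a URL.
load_contributions <- function(path) {
  read_csv(path, col_types = fec_col_types)
}

# Made-up rows for illustration only (the date format is an assumption).
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "contributor_last_name,contributor_first_name,contribution_receipt_date,contribution_receipt_amount,contributor_aggregate_ytd",
  "DOE,JANE,1/15/2019,50,250",
  "ROE,RICHARD,2/3/2019,-25,100"
), demo_path)

demo <- load_contributions(demo_path)
```

Declaring `col_types` up front keeps the import reproducible and suppresses the "Parsed with column specification" message.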
Cleaning Data
1. Convert contribution_receipt_date from character to a date format.
2. Identify the columns with missing data. 1,430 entries have no employer and 1,342 have no occupation. Because so many are missing, I will leave these rows in the dataset rather than drop them. The converted DatesFormatted column has 2 missing values, which suggests that two dates did not parse.
##       contributor_last_name      contributor_first_name
##                           6                           0
##        contributor_street_1        contributor_employer
##                          17                        1430
##      contributor_occupation   contribution_receipt_date
##                        1342                           0
## contribution_receipt_amount   contributor_aggregate_ytd
##                           0                           0
##              committee_name              committee_type
##                           0                           0
## committee_party_affiliation              DatesFormatted
##                           0                           2
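The two cleaning steps can be sketched as follows. This is an illustration on made-up rows, not the report's own code; in particular, the `%m/%d/%Y` date format is an assumption about how the raw dates are written.

```r
library(dplyr)

# Made-up rows for illustration only.
donors <- tibble(
  contributor_employer = c("KROGER", NA, "SELF"),
  contribution_receipt_date = c("1/15/2019", "2/3/2019", "not a date")
)

# Step 1: add a Date-typed column. The format string is an assumption;
# values that do not match it become NA instead of raising an error.
donors <- donors %>%
  mutate(DatesFormatted = as.Date(contribution_receipt_date, format = "%m/%d/%Y"))

# Step 2: count missing values in every column.
missing_counts <- colSums(is.na(donors))
print(missing_counts)
```

Keeping the original character column next to the converted one makes it easy to inspect any rows where the conversion produced `NA`.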
Summary Statistics
The mean individual contribution is $355 with a standard deviation of $1,522. The mean year-to-date (YTD) contribution is $909 with a standard deviation of $2,093. Both columns include negative values, which I assume reflect committees refunding contributors or paying them for something like hosting events.
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
## -30700.0     20.0     50.0    354.8    200.0  66100.0
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
##  -2700.0    250.0    395.0    908.6    780.0 100000.0
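Statistics like these can be computed with base R alone. The sketch below uses made-up amounts, with a negative value standing in for a refund, and also counts the negative entries so they can be inspected separately.

```r
# Made-up amounts for illustration; the negative value stands in for a refund.
donors <- data.frame(
  contribution_receipt_amount = c(20, 50, 200, -30),
  contributor_aggregate_ytd = c(250, 395, 780, -30)
)

# Six-number summary (min, quartiles, median, mean, max).
amount_summary <- summary(donors$contribution_receipt_amount)
print(amount_summary)

# Mean and sample standard deviation; na.rm guards against missing amounts.
amount_mean <- mean(donors$contribution_receipt_amount, na.rm = TRUE)
amount_sd <- sd(donors$contribution_receipt_amount, na.rm = TRUE)

# How many records are negative (possible refunds or payments)?
n_negative <- sum(donors$contribution_receipt_amount < 0, na.rm = TRUE)
```

Whether negative records should stay in the mean depends on the question: they matter for net dollars received but distort a "typical donation" figure.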
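Directed Analysis
Question 1: Which employers are most heavily represented in the data?
Answer: Self-employed contributors and University of Cincinnati employees are the most heavily represented. Self-employment appears under several spellings (SELF-EMPLOYED, SELF EMPLOYED, and SELF), which together account for 5,117 records; University of Cincinnati accounts for 1,375.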
## # A tibble: 3,908 x 2
##    contributor_employer                       n
##    <chr>                                  <int>
##  1 SELF-EMPLOYED                           3414
##  2 <NA>                                    1430
##  3 UNIVERSITY OF CINCINNATI                1375
##  4 SELF EMPLOYED                           1017
##  5 SELF                                     686
##  6 HOMEMAKER                                482
##  7 CINCINNATI CHILDREN'S HOSPITAL           400
##  8 INFORMATION REQUESTED PER BEST EFFORTS   256
##  9 PROCTER & GAMBLE                         202
## 10 XAVIER UNIVERSITY                        150
## # ... with 3,898 more rows
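A table like the one above can be produced with `dplyr::count()`. Because the self-employed spellings are split across several rows, the sketch below also shows one possible way to merge them before counting. The rows are made up, and the `self_variants` vector and `employer_clean` column are hypothetical names, not part of the original analysis.

```r
library(dplyr)

# Made-up rows for illustration only.
donors <- tibble(
  contributor_employer = c(
    "SELF-EMPLOYED", "SELF EMPLOYED", "SELF",
    "UNIVERSITY OF CINCINNATI", "UNIVERSITY OF CINCINNATI", NA
  )
)

# Raw counts, most frequent employer first.
raw_counts <- donors %>%
  count(contributor_employer, sort = TRUE)

# Optional: treat the spelling variants as one employer before counting.
self_variants <- c("SELF-EMPLOYED", "SELF EMPLOYED", "SELF")
merged_counts <- donors %>%
  mutate(employer_clean = if_else(
    contributor_employer %in% self_variants,
    "SELF-EMPLOYED",
    contributor_employer
  )) %>%
  count(employer_clean, sort = TRUE)
```

Missing employers stay as `NA` in both tables, so they remain visible as their own row rather than being dropped.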
Question 2: What percent of the money contributed in Cincinnati benefits Democrats? Republicans? All others?
Answer: 46.04% of the money contributed goes to Democratic Party committees, 53.64% goes to Republican Party committees, and 0.32% goes to committees of other political parties.
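One way to compute dollar shares by party is to group the affiliation column into three buckets and divide each bucket's total by the grand total. This is a sketch on made-up rows; the affiliation labels ("DEMOCRATIC PARTY", "REPUBLICAN PARTY") are an assumption about how `committee_party_affiliation` is coded, and `party_group` is a hypothetical column name.

```r
library(dplyr)

# Made-up rows for illustration; the affiliation labels are assumed.
donors <- tibble(
  committee_party_affiliation = c(
    "DEMOCRATIC PARTY", "REPUBLICAN PARTY", "REPUBLICAN PARTY", "LIBERTARIAN PARTY"
  ),
  contribution_receipt_amount = c(400, 300, 250, 50)
)

party_share <- donors %>%
  # Collapse every affiliation other than the two major parties into "Other".
  mutate(party_group = case_when(
    committee_party_affiliation == "DEMOCRATIC PARTY" ~ "Democratic",
    committee_party_affiliation == "REPUBLICAN PARTY" ~ "Republican",
    TRUE ~ "Other"
  )) %>%
  group_by(party_group) %>%
  summarise(total = sum(contribution_receipt_amount)) %>%
  # Share of all contributed dollars, in percent.
  mutate(percent = round(100 * total / sum(total), 2))
```

Note that this sums net dollars, so any negative (refund) records reduce a party's total; filtering to positive amounts first would answer a slightly different question.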
Self-Directed Analysis
Question 3: How much money are people employed by Kroger and 84.51 contributing?
Approach: I will filter to the observations whose employer is Kroger or 84.51 and sum their contribution receipt amounts.
Answer: Kroger and 84.51 employees contributed a total of $39,665.51 to political committees.
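The approach described above can be sketched as a pattern match on the employer column followed by a sum. The rows below are made up, and the regular expression is an assumption about how these employers are spelled in the data; the real file may contain other variants that need to be added.

```r
library(dplyr)

# Made-up rows for illustration only.
donors <- tibble(
  contributor_employer = c("KROGER", "THE KROGER CO.", "84.51", "PROCTER & GAMBLE", NA),
  contribution_receipt_amount = c(100, 50, 25.5, 500, 10)
)

kroger_total <- donors %>%
  # Match any employer containing "KROGER" or "84.51"; the dot is escaped
  # so it matches a literal period. grepl() returns FALSE for NA employers.
  filter(grepl("KROGER|84\\.51", contributor_employer, ignore.case = TRUE)) %>%
  summarise(total = sum(contribution_receipt_amount)) %>%
  pull(total)
```

A substring match is deliberately loose so that variants such as "THE KROGER CO." are included; reviewing the matched employer names before summing is a sensible check against false positives.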