Fall 2019 - Exam 1 - Cincinnati Money in Politics
Lonie Moore
Introduction, Definitions, and Required Packages
Introduction
The purpose of this document is to use the Federal Election Commission (FEC) data to perform an analysis on the shifting political climate in the Cincinnati region from 2015 to present day. The data will be converted into a usable format from which basic analytic insights can be derived. Proper manipulation and analysis of the data will help the reader better visualize the way that money (contributions) influences political campaigns in Cincinnati.
Definitions
The data is stored in a table where each observational row is an instance of a monetary contribution being made to a political committee. The data collected and reported by the FEC includes all monetary contributions made by individuals to politically affiliated organizations. The data includes only contributions where a political party is affiliated with the committee receiving the contribution. This means all non-affiliated super PAC committees are excluded. The decision to exclude non-affiliated contributions was made to reduce the data to a manageable size. The variables recorded describe some aspect of the contribution. The data includes the following variables:
| Column | Data Type | Description |
|---|---|---|
| Contributor First Name | Character | First Name of the person who made the donation |
| Contributor Last Name | Character | Last Name of the person who made the donation |
| Contributor Street Address | Character | Address of the person who made the donation |
| Contributor Employer | Character | Name of contributor’s employer |
| Contributor Occupation | Character | Occupation of contributor |
| Date of Contribution | Date | Date the donation was received |
| Contribution Amount | Numeric | Positive values reflect a donation whereas negative values reflect a refund of a previous donation |
| Contributor Aggregate YTD | Numeric | The total amount of contributions made year-to-date by the individual contributor |
| Committee Name | Character/Factor | Name of the political committee receiving the contribution |
| Committee Type | Character/Factor | House, PAC, Party Non-Qualified, Party Qualified, Presidential, Senate |
| Committee Party Affiliation | Character/Factor | Democratic, Republican, Libertarian, Green, Independent, and Democratic-Farmer-Labor parties are all represented in the data |
Required Packages
Some of the packages listed below are redundant. For example, dplyr is part of the tidyverse collection. Individual tidyverse packages are listed separately when they provide functions that are particularly useful in this analysis. The packages required for this markdown are:
| Package | Description |
|---|---|
| tidyverse | the tidyverse collection of packages all together |
| DT | makes interactive javascript data tables |
| dplyr | also in tidyverse; allows for easy data manipulation in R (filter, select, mutate, group_by, etc) |
| stringr | provides useful functions for searching for specific strings within a character field |
| lubridate | provides useful date parsing and manipulation functions |
| ggplot2 | makes graphs |
| skimr | has a useful “skim” function for quick summary data |
Data Cleaning
Contribution Date
Description of changes made to the date field:
| Type | Description |
|---|---|
| formatting | This field was originally showing up as a character field in a “dd-mm-yyyy” format. This was modified using the dmy() function from the lubridate package and converted to a “date” data type. |
| missing | There were 2 records with missing contribution dates. These records were excluded because the main purpose of the overall analysis relies on contribution trends over time, so no meaning can be derived from a record having an invalid date. |
| date range | 10 records were found to have contribution dates prior to 2015. These records are erroneous since the exam instructions indicated that all contribution data is from 2015 or later. Since only 10 records out of over 57,000 fell into this category, it was not worth analyzing them further, and these records were removed from the dataset. |
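
The three date steps in the table above can be sketched as follows. This is a minimal illustration on toy data, not the exact code used for the exam; the column name `contribution_receipt_date` and the sample values are assumptions.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the raw FEC extract (column names and values are assumed)
raw <- tibble(
  contribution_receipt_date   = c("15-03-2016", "01-11-2018", NA, "31-12-2014"),
  contribution_receipt_amount = c(250, 50, 100, 75)
)

cleaned <- raw %>%
  # dmy() parses day-month-year strings into Date values
  mutate(contribution_receipt_date = dmy(contribution_receipt_date)) %>%
  # drop records with a missing contribution date
  filter(!is.na(contribution_receipt_date)) %>%
  # drop records dated before 2015
  filter(contribution_receipt_date >= ymd("2015-01-01"))
```

In this toy example, the record with the missing date and the record from 2014 are both dropped, leaving two rows.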
Contribution Amounts
Description of changes made:
| Column | Description of changes |
|---|---|
| contribution_receipt_amount | There were 12 records that were removed due to having $0.00 in the contribution amount. |
| contributor_aggregate_ytd | Due to the possibility of refunds, there were many records found to have $0.00 in this field but with non-zero amounts in the contribution_receipt_amount field. These records were retained. |
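
A minimal sketch of the amount filter described above, using made-up values. Only exact $0.00 contribution amounts are removed; negative amounts (refunds) and rows with a zero year-to-date aggregate are kept.

```r
library(dplyr)

# Toy data (values are illustrative only)
amounts <- tibble(
  contribution_receipt_amount = c(100, 0, -50, 25),
  contributor_aggregate_ytd   = c(100, 0, 0, 25)
)

amounts_clean <- amounts %>%
  # remove only the records with a $0.00 contribution amount
  filter(contribution_receipt_amount != 0)
```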
Committee Information
Description of changes made:
| Column | Description of changes |
|---|---|
| Committee Name | No apparent missing or bad data. No changes were made. |
| Qualified | This is a new column added to the data set to detect whether or not a contribution is considered “Qualified” (versus Nonqualified). The column is coded as a 0,1 dummy variable. PAC and Party records contained an additional string with this “Qualified”/“Nonqualified” as part of the Committee Type record. These will be stripped off and coded as a 1 if the committee type contains “Qualified”, otherwise coded as 0 for Nonqualified. House, Senate, and Presidential committees are always considered “Qualified”. |
| Committee Type | There is a relatively small number of PAC and Party records (less than 500 out of 57,000) that contain reference to a “non-qualified” or “non-contribution account”. Because of this smaller number (and further, confirming that these same records do not add up to a material amount), these records were modified to exclude the extra “non-qualified” and “non-contribution account” descriptors from the string. Further, the inclusion of a new “Qualified” dummy variable further negates the need for a “qualified” descriptor in another 19,000 records. These roughly 19,500 records were unified into simple committee types displaying only “PAC” and “Party”. No other records were affected or modified, and there doesn’t appear to be any missing values. |
| Committee Party Affiliation | There are 6 political parties appearing in the original dataset (Democratic, Republican, Green, Libertarian, Independent, and Democratic-Farmer-Labor). However, there are only 251 contributions (out of over 57,000) associated with the 4 smaller parties (Green, Libertarian, Independent, and Democratic-Farmer-Labor). Due to the very small number of contributions, these 4 were unified into an “Other” category. In order to shorten the length of this field, I also removed the word “PARTY” from every record since this is implied by the column name. |
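
The committee recoding described above could look roughly like the sketch below. The raw strings, the column names `committee_type` and `committee_party_affiliation`, and the new `party` column are assumptions made for illustration; the real data may label these values differently. Note that the non-qualified pattern is checked explicitly, since “Nonqualified” also contains the text “qualified”.

```r
library(dplyr)
library(stringr)

# Hypothetical raw values; the actual FEC strings may differ
committees <- tibble(
  committee_type = c("PAC - Qualified", "PAC - Nonqualified",
                     "Party - Qualified", "Party - Non-Qualified",
                     "House", "Presidential"),
  committee_party_affiliation = c("DEMOCRATIC PARTY", "REPUBLICAN PARTY",
                                  "GREEN PARTY", "LIBERTARIAN PARTY",
                                  "DEMOCRATIC-FARMER-LABOR", "INDEPENDENT")
)

committees_clean <- committees %>%
  mutate(
    # 0 when the type mentions "nonqualified"/"non-qualified", otherwise 1
    # (House, Senate, and Presidential are therefore always 1)
    qualified = if_else(
      str_detect(committee_type, regex("non-?qualified", ignore_case = TRUE)),
      0L, 1L
    ),
    # collapse every PAC / Party variant to a simple label
    committee_type = case_when(
      str_detect(committee_type, regex("^PAC", ignore_case = TRUE))   ~ "PAC",
      str_detect(committee_type, regex("^Party", ignore_case = TRUE)) ~ "Party",
      TRUE ~ committee_type
    ),
    # drop the trailing word "PARTY", then group the small parties as "Other"
    party = str_remove(committee_party_affiliation,
                       regex("\\s*PARTY$", ignore_case = TRUE)),
    party = if_else(party %in% c("DEMOCRATIC", "REPUBLICAN"), party, "Other")
  )
```

The `qualified` flag is computed before `committee_type` is overwritten, because `mutate()` evaluates its arguments in order.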
Employer/Occupation
Description of changes made:
| Column | Description of changes |
|---|---|
| Employer | Records containing the word “RETIRED” (multiple variations) were unified to contain only the word “RETIRED”. Note that I was careful NOT to modify 18 records containing “ALLIANCE FOR RETIRED AMERICANS” as this appeared to me to be a legitimate employer. Likewise, records containing the word “HOMEMAKER” were also unified, as well as variations of self employed (“SELF-EMPLOYED”, “SELF EMPLOYED”, “SELF”, “SELF-NA”, “SELF NA”). Lastly, the word “University” was misspelled in a couple of different ways in “Xavier University”, and these were corrected. |
| Missing Employer Data | The following words were interpreted as “missing” and were unified to contain only “NA” (distinct words are separated by semicolons in the following list): N.A.; N A; N/A; NA; NONE; NOT NEEDED; NOT REQUIRED; NOT-EMPLOYED; NOT, EMPLOYED; NOT EMPLOYED; (Other); NA’s; NULL; EMPLOYED; 1950; 1966; REFUSED; REQUESTED; INFORMATION NA; INFORMATION NA PER BEST EFFORTS. |
| Occupation | Records containing the word “RETIRED” (multiple variations, including 1 instance of “REETIRED”) were unified to contain only the word “RETIRED”. Likewise, records containing the word “HOMEMAKER” were also unified, as well as variations of self employed (“SELF-EMPLOYED”, “SELF EMPLOYED”, “SELF”, “SELF-NA”, “SELF NA”). |
| Missing Occupation Data | The following words were interpreted as “missing” and were unified to contain only “NA” (distinct words are separated by semicolons in the following list): [INFORMATION REQUESTED]; NA; N/A; NONE; NOT EMPLOYED; NOT EMPLOYED; NOT IN WORKFORCE; NOT NEEDED; NOT REQUIRED; NOT-EMPLOYED; NULL; OCCUPATION; REFUSED; REQUESTED. |
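
A rough sketch of the employer clean-up rules is shown below. The column name `contributor_employer`, the new `employer` column, and the abbreviated `missing_labels` vector (only a subset of the full list above) are assumptions for illustration. The same pattern would apply to the occupation field.

```r
library(dplyr)
library(stringr)

# Subset of the "missing" labels listed in the table above
missing_labels <- c("N.A.", "N A", "N/A", "NA", "NONE", "NOT EMPLOYED",
                    "NULL", "REFUSED", "REQUESTED")
self_labels <- c("SELF-EMPLOYED", "SELF EMPLOYED", "SELF", "SELF-NA", "SELF NA")

# Toy data (illustrative values only)
employers <- tibble(
  contributor_employer = c("Retired", "ALLIANCE FOR RETIRED AMERICANS",
                           "SELF EMPLOYED", "N/A", "HOMEMAKER",
                           "UNIVERSITY OF CINCINNATI")
)

employers_clean <- employers %>%
  mutate(
    employer = str_to_upper(str_trim(contributor_employer)),
    employer = case_when(
      # protect the legitimate employer before the generic RETIRED rule
      employer == "ALLIANCE FOR RETIRED AMERICANS" ~ employer,
      str_detect(employer, "RETIRED")              ~ "RETIRED",
      str_detect(employer, "HOMEMAKER")            ~ "HOMEMAKER",
      employer %in% self_labels                    ~ "SELF",
      # missing values are stored as the string "NA", as described above
      employer %in% missing_labels | is.na(employer) ~ "NA",
      TRUE ~ employer
    )
  )
```

Because `case_when()` uses the first matching rule, the order of the conditions matters: the exception for “ALLIANCE FOR RETIRED AMERICANS” has to come before the general “RETIRED” match.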
Contributor Name
Description of changes made:
| Column | Description of changes |
|---|---|
| contributor_first_name | trailing spaces were removed |
| contributor_last_name | trailing spaces were removed |
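
Removing trailing spaces from both name columns can be done in one step. The sketch below uses base R's `trimws()` on made-up names.

```r
# Toy data with trailing spaces (names are made up)
contributors <- data.frame(
  contributor_first_name = c("ANN  ", "GEORGE"),
  contributor_last_name  = c("SMITH ", "JONES   "),
  stringsAsFactors = FALSE
)

name_cols <- c("contributor_first_name", "contributor_last_name")

# trimws(which = "right") strips trailing whitespace only
contributors[name_cols] <- lapply(contributors[name_cols], trimws, which = "right")
```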
Summary of Prepared Dataset
Data Table
Data Summary
Some useful data summary tables are provided below. These tables provide quick searchable contribution and donor count summaries for many of the columns in the prepared dataset, which allows the reader the self-service capability of answering many specific questions on the fly, such as “which party received the most contributions in 2017?” (filter Summary Table 1 on the year 2017 column) or “which employer is associated with the highest contributions in 2018?” (Summary Table 7) or “which Republican committee type had the most donors in 2016?” (Summary Table 12).
| Table | Description |
|---|---|
| Summary Table 1 | Contributions by Party by Year |
| Summary Table 2 | Donors by Party by Year |
| Summary Table 3 | Contributions by Committee Type by Year |
| Summary Table 4 | Donors by Committee Type by Year |
| Summary Table 5 | Contributions by Committee Name by Year |
| Summary Table 6 | Donors by Committee Name by Year |
| Summary Table 7 | Contributions by Employer by Year |
| Summary Table 8 | Donors by Employer Type by Year |
| Summary Table 9 | Contributions by Occupation by Year |
| Summary Table 10 | Donors by Occupation Type by Year |
| Summary Table 11 | Contributions by Year - All Searchable Columns |
| Summary Table 12 | Donors by Year - All Searchable Columns |
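
As an illustration of how a table such as Summary Table 1 could be built, the sketch below totals contributions by party and year and spreads the years into columns. The data are toy values and the `party` column name is an assumption.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy prepared dataset (illustrative values only)
prepared <- tibble(
  contribution_receipt_date   = ymd(c("2016-03-01", "2016-10-15",
                                      "2017-05-02", "2018-09-30")),
  party                       = c("DEMOCRATIC", "REPUBLICAN",
                                  "DEMOCRATIC", "DEMOCRATIC"),
  contribution_receipt_amount = c(100, 250, 50, 500)
)

summary_1 <- prepared %>%
  mutate(year = year(contribution_receipt_date)) %>%
  group_by(party, year) %>%
  summarise(total = sum(contribution_receipt_amount), .groups = "drop") %>%
  # one column per year; party-year combinations with no data become 0
  pivot_wider(names_from = year, values_from = total, values_fill = 0)
```

Passing the result to `DT::datatable(summary_1, filter = "top")` would give a searchable, filterable table.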
Summary Table 1 - Contributions by Party by Year
Summary Table 2 - Donors by Party by Year
Summary Table 3 - Contributions by Committee Type by Year
Summary Table 4 - Donors by Committee Type by Year
Summary Table 5 - Contributions by Committee Name by Year
Summary Table 6 - Donors by Committee Name by Year
Summary Table 7 - Contributions by Employer by Year
Summary Table 8 - Donors by Employer Type by Year
Summary Table 9 - Contributions by Occupation by Year
Summary Table 10 - Donors by Occupation Type by Year
Summary Table 11 - Contributions by Year - All Searchable Columns
Summary Table 12 - Donors by Year - All Searchable Columns
Simple Analysis & Trends
Instructions are listed above each visualization and commentary is found below each visualization.
Visualization 1
Illustrate how the amount contributed has varied over time from 2015 to present.
Contributions tend to go up in even-numbered years when there’s an election (presidential, house, senate) and dip dramatically during odd-numbered years when there’s no election. This pattern is similar when viewing by number of contributions received as well as total contribution dollars. It’s interesting to note that while the number of contributions was consistent from 2016 to 2018, the size of the Democratic party contributions increased dramatically (overtaking that of the Republican party contributions) in the 2018 House/Senate elections.
Visualization 2
Compare the amount of money each party raised for the two most recent election cycles. (2015-2016; 2017-2018)
When aggregating at the “election cycle level”, a higher-level trend becomes apparent. Contribution dollars tend to go down in aggregate in the non-presidential election cycle, although not as dramatically as I would have thought.
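One way to roll years up to the “election cycle level” is to map each odd year and the following even year to the same label. The helper `cycle_label()` below is a hypothetical function written for this sketch, and the data are toy values.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: 2015 and 2016 -> "2015-2016", 2017 and 2018 -> "2017-2018"
cycle_label <- function(date) {
  yr    <- year(date)
  start <- ifelse(yr %% 2 == 1, yr, yr - 1)
  paste0(start, "-", start + 1)
}

# Toy data (illustrative values only)
contribs <- tibble(
  contribution_receipt_date   = ymd(c("2015-06-01", "2016-10-20",
                                      "2017-04-11", "2018-11-01")),
  party                       = c("DEMOCRATIC", "DEMOCRATIC",
                                  "REPUBLICAN", "REPUBLICAN"),
  contribution_receipt_amount = c(100, 300, 50, 200)
)

by_cycle <- contribs %>%
  mutate(cycle = cycle_label(contribution_receipt_date)) %>%
  group_by(cycle, party) %>%
  summarise(total = sum(contribution_receipt_amount), .groups = "drop")
```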
Visualization 3
Illustrate the association between the amount contributed and committee type.
It’s interesting to note that while contributions went down in total from the 15-16 election cycle to the 17-18 election cycle, the contribution patterns are consistent with intuition when viewed at the specific election type level. The presidential election contributions went down dramatically between election cycles as did the house and senate contributions. We know that the presidential election occurs in 2016 and the house election occurs in 2018, and we see from the visualization that presidential election contributions went up in the 15-16 election cycle and fell dramatically in the 17-18 (non-presidential) election cycle, while the house contributions fell in the 15-16 election cycle and spiked again in the 17-18 house election cycle. The senate doesn’t show as much of a dip because the senate actually has elections every 2 years, and it doesn’t appear that senate contributions are impacted too much by the presidential election cycle. Again, this all makes intuitive sense, as contributions are expected to increase during the respective election cycles.
Directed Analysis
Analysis 1
Which Cincinnati residents spend the most money on politics and where does it go?
The visualization shows that the top donors appear to give more to House and Senate election committees than to presidential committees. Also, many of the top donors appear to be related (same last name).
The table shows that 8 out of the top 10 donors gave to the Republican party.
Analysis 2
Who makes the most individual contributions and why do you think this is?
The data table provided shows that Democrats make the most donations.
Most of these top donors (as measured by number of contributions) appear to be either retired or without a stated occupation.
Further inspection reveals that the occupations contributing most frequently are “RETIRED”, “NO OCCUPATION LISTED” (NAs), “ATTORNEY”, “PHYSICIAN” and “PROFESSOR”. My guess is that retirees give often because they have nothing better to do than to sit around and think about what to spend their money on. Plus, as people get older they tend to get choosier about what they give money to, and politics seems to be a source of identity for many people, so the more often people give, the more “dedicated to their cause” they might feel. Professors also tend to be in a more politically charged environment in academia relative to other professions, so it would make sense to me intuitively that professors donate to political campaigns more frequently.
However, on the basis of average contribution size, “GENERAL MANAGEMENT”, “CO-CHIEF EXECUTIVE OFFICER”, “REAL ESTATE INVESTMENT & MANAG”, “BOARDMEMBER” and “CANDIDATE” are the occupations that give the most per contribution.
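The difference between ranking occupations by frequency and by average size can be seen in a small sketch. The amounts below are made up, and the column name `contributor_occupation` is an assumption.

```r
library(dplyr)

# Toy data (illustrative values only)
occ <- tibble(
  contributor_occupation      = c("RETIRED", "RETIRED", "RETIRED",
                                  "ATTORNEY", "ATTORNEY", "CANDIDATE"),
  contribution_receipt_amount = c(25, 50, 15, 200, 100, 2700)
)

by_occupation <- occ %>%
  group_by(contributor_occupation) %>%
  summarise(
    n_contributions = n(),
    avg_amount      = mean(contribution_receipt_amount),
    .groups = "drop"
  )

# The two rankings can put very different occupations on top
most_frequent <- by_occupation %>% arrange(desc(n_contributions)) %>% slice(1)
largest_avg   <- by_occupation %>% arrange(desc(avg_amount)) %>% slice(1)
```

In this toy example the most frequent occupation gives small amounts often, while the occupation with the largest average gave only once.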
Analysis 3
Which employers are most heavily represented in the data? Hint: Using only the total number of individual contributions per employer will lead to a very biased outcome.
The top employers represented based on total amount contributed (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are American Financial Group, University of Cincinnati, River Trading Company, Chavez Properties, Cincinnati Children’s, and Procter & Gamble.
The top employers represented based on total number of employees contributing (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are University of Cincinnati, Cincinnati Children’s, Procter & Gamble, and TriHealth.
The top employers represented based on the average size of the employee total contributions (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are Castellini Management Company, River Trading Company, Bachman Group, Chemed, Capital Investment Group and Communicare. These employers could be thought of as having the most-generous employees (depending on how you spin it).
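The three employer rankings above come from three different measures, which can be computed together as in the sketch below. The `employer` and `donor` columns and all values are assumptions for illustration; in the real data a donor would likely be identified by the first and last name fields.

```r
library(dplyr)

# Toy data (illustrative values only)
emp <- tibble(
  employer                    = c("EMPLOYER A", "EMPLOYER A", "EMPLOYER A",
                                  "EMPLOYER B", "RETIRED"),
  donor                       = c("D1", "D2", "D2", "D3", "D4"),
  contribution_receipt_amount = c(100, 50, 50, 1000, 20)
)

by_employer <- emp %>%
  # exclude the non-employer categories
  filter(!employer %in% c("NA", "SELF", "RETIRED", "HOMEMAKER")) %>%
  group_by(employer) %>%
  summarise(
    total_amount = sum(contribution_receipt_amount),  # total dollars
    n_donors     = n_distinct(donor),                 # distinct employees giving
    .groups = "drop"
  ) %>%
  mutate(avg_per_donor = total_amount / n_donors)     # average total per employee
```

Counting distinct donors rather than rows avoids the bias mentioned in the hint, where one employee who gives many times inflates an employer's count.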
Analysis 4
How many individuals who work for Xavier have made at least one contribution? How much money in total have Xavier University employees contributed, and which employees have contributed the most?
A total of 22 Xavier University employees have made donations a total of 84 times. Total contributions sum to 10,892. Average donation size is 130, and each employee has contributed 495 on average.
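The two averages follow directly from the totals reported above, as this quick check shows (the variable names are just for illustration):

```r
# Figures taken from the paragraph above
n_employees     <- 22
n_contributions <- 84
total_amount    <- 10892

avg_per_donation <- total_amount / n_contributions  # about 129.67, reported as 130
avg_per_employee <- total_amount / n_employees      # about 495.09, reported as 495
```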
The top contributors based on number of contributions are George Traub (28), Martha Holland (10), Janet Schultz (8), Elizabeth Johnson (6), and Brenda Levya-Gardner (4).
Based on total amount contributed, the top donors at Xavier University are Ann Tracey (1,750), Martha Holland (1,350), Karl Stukenberg (1,000), Janey Schultz (700), and Annmarie Tracey (650), who may be the same person as “Ann Tracey” under a different spelling of her name.
Based on average donation size (average amount contributed per donation) of people who have donated more than once, the top donors at Xavier University are Ann Tracey (583 per donation), Annmarie Tracey (325 per donation), Amanda Pavlick (263 per donation), Terry Toepker (200 per donation), and Katherine Loveland (175 per donation).
Self-Directed Analysis
Analysis 1
Compare contributions to refunds (positive versus negative amounts in the contribution field). For example, you might consider exploring whether there is something unique about the organizations refunding donations or perhaps refunds appear to occur at the same time every year.
The largest contributors to refunded donations appear to be No Employer Listed (NA), Self-Employed and Retirees. These 3 account for the vast majority of refunds.
The vast majority of these refunds occur in the first 3 months of the year. The original contributions could either be related to tax deductions that were being sought, or donor remorse, where the donor felt pressured to give in the months leading up to election day, and then after a few months the donor changed his or her mind, perhaps due to the election not ending in his/her favor.
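A sketch of how the timing of refunds could be tabulated is shown below, treating negative amounts as refunds. The data are toy values, so the resulting share is illustrative only.

```r
library(dplyr)
library(lubridate)

# Toy data (illustrative values only); negative amounts are refunds
tx <- tibble(
  contribution_receipt_date   = ymd(c("2017-01-15", "2017-02-20", "2017-03-05",
                                      "2017-08-10", "2018-10-01")),
  contribution_receipt_amount = c(-100, -50, -25, -10, 500)
)

refunds_by_month <- tx %>%
  filter(contribution_receipt_amount < 0) %>%
  mutate(month = month(contribution_receipt_date)) %>%
  count(month, name = "n_refunds")

# Share of refunds that fall in January through March
share_q1 <- sum(refunds_by_month$n_refunds[refunds_by_month$month <= 3]) /
  sum(refunds_by_month$n_refunds)
```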
Analysis 2
Pursue one additional self-directed analytical question using the following framework:
I’m interested in exploring whether donor refunds are associated with election losses. Do donors experience “donor remorse” if an election doesn’t end in their favor? I will expand on the data used in the prior analysis to determine which party the donation refunds are associated with. For example, were Democrats more likely to ask for refunds after the 2016 Presidential election loss? And were Republicans more likely to ask for donation refunds after the 2018 House elections favored Democrats?
Based on the data shown above, I’m not seeing a strong association (on the surface) between refunds and election losses. There were plenty of Republican refunds in 2018 (when Republicans lost the House), but there were even more refunds given to Democrats. There also doesn’t appear to be a large number of refunds associated with the 2016 Presidential election. One might have expected a greater number of refunds to Democratic donors during late 2016 or early 2017 after Trump won the presidential election, but I’m not seeing that in the data.
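For reference, a comparison of this kind can be tabulated by counting refunds per party and year. The sketch below uses toy data and an assumed `party` column, so the counts are not the exam results.

```r
library(dplyr)
library(lubridate)

# Toy data (illustrative values only); negative amounts are refunds
tx_party <- tibble(
  contribution_receipt_date   = ymd(c("2017-01-10", "2018-02-03", "2018-03-12",
                                      "2018-03-20", "2018-10-05")),
  party                       = c("DEMOCRATIC", "REPUBLICAN", "DEMOCRATIC",
                                  "DEMOCRATIC", "REPUBLICAN"),
  contribution_receipt_amount = c(-100, -250, -50, -75, 500)
)

refunds_by_party <- tx_party %>%
  filter(contribution_receipt_amount < 0) %>%
  mutate(year = year(contribution_receipt_date)) %>%
  count(year, party, name = "n_refunds")
```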