Fall 2019 - Exam 1 - Cincinnati Money in Politics

Introduction, Definitions, and Required Packages

Introduction

The purpose of this document is to use Federal Election Commission (FEC) data to analyze the shifting political climate in the Cincinnati region from 2015 to the present day. The data is converted into a usable format from which basic analytic insights can be derived. Proper manipulation and analysis of the data will help the reader better visualize how money (contributions) influences political campaigns in Cincinnati.

Definitions

The data is stored in a table where each observational row is an instance of a monetary contribution being made to a political committee. The data collected and reported by the FEC includes all monetary contributions made by individuals to politically affiliated organizations. The data includes only contributions where a political party is affiliated with the committee receiving the contribution. This means all non-affiliated super PAC committees are excluded. The decision to exclude non-affiliated contributions was made to reduce the data to a manageable size. The variables recorded describe some aspect of the contribution. The data includes the following variables:

| Column | Data Type | Description |
|---|---|---|
| Contributor First Name | Character | First name of the person who made the donation |
| Contributor Last Name | Character | Last name of the person who made the donation |
| Contributor Street Address | Character | Address of the person who made the donation |
| Contributor Employer | Character | Name of contributor’s employer |
| Contributor Occupation | Character | Occupation of contributor |
| Date of Contribution | Date | Date the donation was received |
| Contribution Amount | Numeric | Positive values reflect a donation, whereas negative values reflect a refund of a previous donation |
| Contributor Aggregate YTD | Numeric | The total amount of contributions made year-to-date by the individual contributor |
| Committee Name | Character/Factor | Name of the political committee receiving the contribution |
| Committee Type | Character/Factor | House, PAC, Party Non-Qualified, Party Qualified, Presidential, Senate |
| Committee Party Affiliation | Character/Factor | Democratic, Republican, Libertarian, Green, Independent, and Democratic-Farmer-Labor parties are all represented in the data |

Required Packages

Some of the packages listed below are redundant. For example, dplyr is part of the tidyverse. Individual tidyverse packages are listed separately when they provide a particularly useful function. The packages required for this markdown are:

| Package | Description |
|---|---|
| tidyverse | the tidyverse collection of packages all together |
| DT | makes interactive JavaScript data tables |
| dplyr | also in tidyverse; allows for easy data manipulation in R (filter, select, mutate, group_by, etc.) |
| stringr | provides useful functions for searching for specific strings within a character field |
| lubridate | provides useful date parsing and manipulation functions |
| ggplot2 | makes graphs |
| skimr | has a useful “skim” function for quick summary data |

Data Cleaning

Contribution Date

Description of changes made to the date field:

| Type | Description |
|---|---|
| formatting | This field originally appeared as a character field in a “dd-mm-yyyy” format. It was parsed with the dmy() function from the lubridate package and converted to a “date” data type. |
| missing | There were 2 records with missing contribution dates. These records were excluded because the overall analysis relies on contribution trends over time, so no meaning can be derived from a record with an invalid date. |
| date range | 10 records had contribution dates prior to 2015. These records are erroneous, since the exam instructions indicated that all contribution data was from 2015 or later. Because only 10 records out of over 57,000 fell into this category, they were not analyzed further and were removed from the dataset. |
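The three date steps above can be sketched as follows. This is a minimal illustration on toy data, not the report's actual code; the column name `contribution_receipt_date` is an assumption modeled on the other FEC column names mentioned in this document.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the FEC extract (column name is an assumption).
raw <- data.frame(
  contribution_receipt_date   = c("15-03-2016", "01-11-2018", NA, "31-12-2014"),
  contribution_receipt_amount = c(50, 250, 100, 25),
  stringsAsFactors = FALSE
)

cleaned <- raw %>%
  # dmy() parses day-month-year strings into Date values; missing input stays NA.
  mutate(contribution_receipt_date = dmy(contribution_receipt_date)) %>%
  # Drop records with no usable date.
  filter(!is.na(contribution_receipt_date)) %>%
  # Keep only contributions dated 2015 or later.
  filter(year(contribution_receipt_date) >= 2015)
```

On the toy data, the record with a missing date and the record dated 2014 are dropped, leaving two rows.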

Contribution Amounts

Description of changes made:

| Column | Description of changes |
|---|---|
| contribution_receipt_amount | 12 records were removed because they had $0.00 in the contribution amount. |
| contributor_aggregate_ytd | Due to the possibility of refunds, many records were found to have $0.00 in this field but non-zero amounts in the contribution_receipt_amount field. These records were retained. |
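A small sketch of the amount rule described above, using toy values. The `is_refund` flag is a hypothetical helper column added here for illustration; it is not described in the report.

```r
library(dplyr)

# Toy data using the two column names given in the table above.
contributions <- data.frame(
  contribution_receipt_amount = c(100, 0, -50, 25),
  contributor_aggregate_ytd   = c(100, 0, 0, 25)
)

nonzero <- contributions %>%
  # Remove only records whose contribution amount is exactly zero.
  filter(contribution_receipt_amount != 0) %>%
  # Hypothetical helper: negative amounts are refunds of earlier donations.
  mutate(is_refund = contribution_receipt_amount < 0)
```

Note that the refund row keeps its $0.00 year-to-date value, since only the contribution amount drives the filter.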

Committee Information

Description of changes made:

| Column | Description of changes |
|---|---|
| Committee Name | No apparent missing or bad data. No changes were made. |
| Qualified | This is a new column added to the dataset to indicate whether a contribution is considered “Qualified” (versus Nonqualified). The column is coded as a 0/1 dummy variable. PAC and Party records contained an additional “Qualified”/“Nonqualified” string as part of the Committee Type value. These strings were stripped off and the record coded as 1 if the committee type contained “Qualified”, otherwise 0 for Nonqualified. House, Senate, and Presidential committees are always considered “Qualified”. |
| Committee Type | A relatively small number of PAC and Party records (fewer than 500 out of 57,000) contain a reference to “non-qualified” or “non-contribution account”. Because of this small number (and after confirming that these records do not add up to a material amount), these records were modified to exclude the extra “non-qualified” and “non-contribution account” descriptors from the string. In addition, the new “Qualified” dummy variable removes the need for a “qualified” descriptor in another 19,000 records. These roughly 19,500 records were unified into simple committee types displaying only “PAC” and “Party”. No other records were affected or modified, and there do not appear to be any missing values. |
| Committee Party Affiliation | There are 6 political parties in the original dataset (Democratic, Republican, Green, Libertarian, Independent, and Democratic-Farmer-Labor). However, only 251 contributions (out of over 57,000) are associated with the 4 smaller parties (Green, Libertarian, Independent, and Democratic-Farmer-Labor). Due to the very small number of contributions, these 4 were unified into an “Other” category. To shorten this field, I also removed the word “PARTY” from every record, since it is implied by the column name. |
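The committee recoding described above could look roughly like the sketch below. The column names `committee_type` and `committee_party` and the exact raw strings are assumptions for illustration; the real FEC values may be spelled differently. One detail worth noting is that “Nonqualified” contains the substring “qualified”, so the non-qualified check has to run before a record is treated as qualified.

```r
library(dplyr)
library(stringr)

# Toy values; the raw strings are assumptions, not copied from the FEC file.
committees <- data.frame(
  committee_type = c("House", "PAC - Qualified", "PAC - Nonqualified",
                     "Party - Qualified", "Party - Non-Qualified",
                     "PAC with Non-Contribution Account - Nonqualified", "Senate"),
  committee_party = c("DEMOCRATIC PARTY", "REPUBLICAN PARTY", "GREEN PARTY",
                      "LIBERTARIAN PARTY", "DEMOCRATIC-FARMER-LABOR PARTY",
                      "INDEPENDENT", "REPUBLICAN PARTY"),
  stringsAsFactors = FALSE
)

cleaned <- committees %>%
  mutate(
    # Candidate committees are always treated as qualified.
    is_candidate = committee_type %in% c("House", "Senate", "Presidential"),
    has_tag = str_detect(committee_type, regex("qualified", ignore_case = TRUE)),
    is_non  = str_detect(committee_type, regex("non-?qualified", ignore_case = TRUE)),
    qualified = as.integer(is_candidate | (has_tag & !is_non)),
    # Collapse every PAC / Party variant to its simple type.
    committee_type = case_when(
      str_detect(committee_type, "^PAC")   ~ "PAC",
      str_detect(committee_type, "^Party") ~ "Party",
      TRUE                                 ~ committee_type
    ),
    # Drop the redundant word "PARTY", then pool the small parties.
    committee_party = str_trim(str_remove(committee_party, "PARTY")),
    committee_party = if_else(committee_party %in% c("DEMOCRATIC", "REPUBLICAN"),
                              committee_party, "OTHER")
  ) %>%
  select(-is_candidate, -has_tag, -is_non)
```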

Employer/Occupation

Description of changes made:

| Column | Description of changes |
|---|---|
| Employer | Records containing the word “RETIRED” (multiple variations) were unified to contain only the word “RETIRED”. Note that I was careful NOT to modify 18 records containing “ALLIANCE FOR RETIRED AMERICANS”, as this appeared to be a legitimate employer. Likewise, records containing the word “HOMEMAKER” were unified, as were variations of self-employed (“SELF-EMPLOYED”, “SELF EMPLOYED”, “SELF”, “SELF-NA”, “SELF NA”). Lastly, the word “University” was misspelled a couple of different ways in “Xavier University”, and these were corrected. |
| Missing Employer Data | The following words were interpreted as “missing” and were unified to contain only “NA” (distinct words are separated by semicolons in the following list): N.A.; N A; N/A; NA; NONE; NOT NEEDED; NOT REQUIRED; NOT-EMPLOYED; NOT, EMPLOYED; NOT EMPLOYED; (Other); NA’s; NULL; EMPLOYED; 1950; 1966; REFUSED; REQUESTED; INFORMATION NA; INFORMATION NA PER BEST EFFORTS. |
| Occupation | Records containing the word “RETIRED” (multiple variations, including 1 instance of “REETIRED”) were unified to contain only the word “RETIRED”. Likewise, records containing the word “HOMEMAKER” were unified, as were variations of self-employed (“SELF-EMPLOYED”, “SELF EMPLOYED”, “SELF”, “SELF-NA”, “SELF NA”). |
| Missing Occupation Data | The following words were interpreted as “missing” and were unified to contain only “NA” (distinct words are separated by semicolons in the following list): [INFORMATION REQUESTED]; NA; N/A; NONE; NOT EMPLOYED; NOT EMPLOYED; NOT IN WORKFORCE; NOT NEEDED; NOT REQUIRED; NOT-EMPLOYED; NULL; OCCUPATION; REFUSED; REQUESTED. |
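A hedged sketch of the employer recoding follows. The column name `contributor_employer`, the output column `employer`, and the choice of “SELF” as the unified self-employed label are assumptions; the missing-value list is only a subset of the one above. The order of the `case_when()` branches matters: the exception for “ALLIANCE FOR RETIRED AMERICANS” has to come before the general “RETIRED” match.

```r
library(dplyr)
library(stringr)

# Subset of the "missing" labels listed above, for illustration only.
missing_labels <- c("N.A.", "N A", "N/A", "NA", "NONE", "NOT EMPLOYED", "NULL", "REFUSED")
self_labels    <- c("SELF-EMPLOYED", "SELF EMPLOYED", "SELF", "SELF-NA", "SELF NA")

employers <- data.frame(
  contributor_employer = c("RETIRED - TEACHER", "ALLIANCE FOR RETIRED AMERICANS",
                           "SELF-EMPLOYED", "SELF", "N/A", "NONE",
                           "XAVIER UNIVERSITY", "  retired "),
  stringsAsFactors = FALSE
)

cleaned <- employers %>%
  mutate(
    # Normalize whitespace and case before matching.
    employer = str_to_upper(str_trim(contributor_employer)),
    employer = case_when(
      employer %in% missing_labels                 ~ "NA",
      employer == "ALLIANCE FOR RETIRED AMERICANS" ~ employer,   # legitimate employer, keep as-is
      str_detect(employer, "RETIRED")              ~ "RETIRED",
      employer %in% self_labels                    ~ "SELF",
      TRUE                                         ~ employer
    )
  )
```

The literal string “NA” is used here to mirror the report's description; storing a true missing value (`NA_character_`) instead would be a reasonable alternative.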

Contributor Name

Description of changes made:

| Column | Description of changes |
|---|---|
| contributor_first_name | Trailing spaces were removed |
| contributor_last_name | Trailing spaces were removed |

Summary of Prepared Dataset

Data Table

Data Summary

Some useful data summary tables are provided below. These tables provide quick searchable contribution and donor count summaries for many of the columns in the prepared dataset, which allows the reader the self-service capability of answering many specific questions on the fly, such as “which party received the most contributions in 2017?” (filter Summary Table 1 on the year 2017 column) or “which employer is associated with the highest contributions in 2018?” (Summary Table 7) or “which Republican committee type had the most donors in 2016?” (Summary Table 12).

| Table | Description |
|---|---|
| Summary Table 1 | Contributions by Party by Year |
| Summary Table 2 | Donors by Party by Year |
| Summary Table 3 | Contributions by Committee Type by Year |
| Summary Table 4 | Donors by Committee Type by Year |
| Summary Table 5 | Contributions by Committee Name by Year |
| Summary Table 6 | Donors by Committee Name by Year |
| Summary Table 7 | Contributions by Employer by Year |
| Summary Table 8 | Donors by Employer Type by Year |
| Summary Table 9 | Contributions by Occupation by Year |
| Summary Table 10 | Donors by Occupation Type by Year |
| Summary Table 11 | Contributions by Year - All Searchable Columns |
| Summary Table 12 | Donors by Year - All Searchable Columns |
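As an illustration of how a contribution summary and a donor-count summary by party and year might be built, here is a minimal sketch on toy data. Identifying a donor by first and last name, and the column names `committee_party` and `contribution_receipt_date`, are assumptions; the report does not state how donors were counted.

```r
library(dplyr)
library(lubridate)

# Toy data; names and amounts are invented for illustration.
contributions <- data.frame(
  contributor_first_name = c("ANN", "ANN", "BOB", "CARA", "BOB"),
  contributor_last_name  = c("LEE", "LEE", "RAY", "DIAZ", "RAY"),
  committee_party        = c("DEMOCRATIC", "DEMOCRATIC", "REPUBLICAN", "DEMOCRATIC", "REPUBLICAN"),
  contribution_receipt_date = as.Date(c("2016-02-01", "2016-09-15", "2016-10-01",
                                        "2017-03-10", "2017-05-05")),
  contribution_receipt_amount = c(50, 25, 500, 100, 250),
  stringsAsFactors = FALSE
)

party_year <- contributions %>%
  mutate(contribution_year = year(contribution_receipt_date)) %>%
  group_by(committee_party, contribution_year) %>%
  summarise(
    total_contributed = sum(contribution_receipt_amount),
    # A donor is assumed to be a unique first/last name pair.
    donors = n_distinct(contributor_first_name, contributor_last_name)
  ) %>%
  ungroup() %>%
  arrange(contribution_year, desc(total_contributed))
```

A data frame like `party_year` could then be passed to `DT::datatable()` to get the kind of searchable table described above.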

Summary Table 1 - Contributions by Party by Year

Summary Table 2 - Donors by Party by Year

Summary Table 3 - Contributions by Committee Type by Year

Summary Table 4 - Donors by Committee Type by Year

Summary Table 5 - Contributions by Committee Name by Year

Summary Table 6 - Donors by Committee Name by Year

Summary Table 7 - Contributions by Employer by Year

Summary Table 8 - Donors by Employer Type by Year

Summary Table 9 - Contributions by Occupation by Year

Summary Table 10 - Donors by Occupation Type by Year

Summary Table 11 - Contributions by Year - All Searchable Columns

Summary Table 12 - Donors by Year - All Searchable Columns

Directed Analysis

Analysis 1

Which Cincinnati residents spend the most money on politics and where does it go?

The visualization shows that the top donors appear to give more to House and Senate election committees than to presidential committees. Also, many of the top donors appear to be related (same last name).

The table shows that 8 out of the top 10 donors gave to the Republican party.

Analysis 2

Who makes the most individual contributions and why do you think this is?

The data table provided shows that Democrats make the most donations.

Most of these top donors (as measured by number of contributions) appear to be either retired or without a stated occupation.

Further inspection reveals that the occupations contributing most frequently are “RETIRED”, “NO OCCUPATION LISTED” (NAs), “ATTORNEY”, “PHYSICIAN” and “PROFESSOR”. My guess is that retirees give often because they have more free time to think about what to spend their money on, and people tend to get choosier about what they give money to as they get older. Politics also seems to be a source of identity for many people, so the more often people give, the more “dedicated to their cause” they might feel. Professors tend to work in a more politically charged environment than people in other professions, so it makes intuitive sense to me that professors donate to political campaigns more frequently.

However, on the basis of average contribution size, “GENERAL MANAGEMENT”, “CO-CHIEF EXECUTIVE OFFICER”, “REAL ESTATE INVESTMENT & MANAG”, “BOARDMEMBER” and “CANDIDATE” are the occupations that give the most per contribution.
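The contrast between contribution frequency and average contribution size can be illustrated with a small sketch. The data below is invented and the column name `contributor_occupation` is an assumption; the point is only that the two rankings can disagree.

```r
library(dplyr)

# Invented toy data: many small gifts versus one large gift.
contributions <- data.frame(
  contributor_occupation = c("RETIRED", "RETIRED", "RETIRED", "RETIRED",
                             "ATTORNEY", "ATTORNEY", "BOARDMEMBER"),
  contribution_receipt_amount = c(25, 50, 25, 20, 500, 300, 2700),
  stringsAsFactors = FALSE
)

by_occupation <- contributions %>%
  group_by(contributor_occupation) %>%
  summarise(
    n_contributions = n(),
    avg_amount      = mean(contribution_receipt_amount)
  ) %>%
  ungroup()

# Two different rankings of the same summary.
most_frequent   <- by_occupation %>% arrange(desc(n_contributions))
largest_average <- by_occupation %>% arrange(desc(avg_amount))
```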

Analysis 3

Which employers are most heavily represented in the data? Hint: Using only the total number of individual contributions per employer will lead to a very biased outcome.

The top employers represented based on total amount contributed (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are American Financial Group, University of Cincinnati, River Trading Company, Chavez Properties, Cincinnati Children’s, and Procter & Gamble.

The top employers represented based on total number of employees contributing (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are University of Cincinnati, Cincinnati Children’s, Procter & Gamble, and TriHealth.

The top employers represented based on the average size of the employee total contributions (excluding “NA”, “SELF”, “RETIRED” and “HOMEMAKER”) are Castellini Management Company, River Trading Company, Bachman Group, Chemed, Capital Investment Group and Communicare. These employers could be thought of as having the most-generous employees (depending on how you spin it).
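The three employer measures discussed above (total amount, number of distinct contributing employees, and average total per employee) can be computed side by side. This is a sketch on invented data; the employer names are placeholders, and identifying an employee by first and last name is an assumption.

```r
library(dplyr)

# Invented toy data with placeholder employer names.
contributions <- data.frame(
  contributor_employer   = c("EMPLOYER A", "EMPLOYER A", "EMPLOYER A",
                             "EMPLOYER B", "EMPLOYER B", "RETIRED"),
  contributor_first_name = c("ANN", "BOB", "CARA", "DAN", "DAN", "EVE"),
  contributor_last_name  = c("LEE", "RAY", "DIAZ", "KIM", "KIM", "FOX"),
  contribution_receipt_amount = c(100, 100, 100, 1000, 1000, 50),
  stringsAsFactors = FALSE
)

by_employer <- contributions %>%
  # Exclude the non-employer labels, as in the analysis above.
  filter(!contributor_employer %in% c("NA", "SELF", "RETIRED", "HOMEMAKER")) %>%
  group_by(contributor_employer) %>%
  summarise(
    n_contributions = n(),
    total_amount    = sum(contribution_receipt_amount),
    n_donors        = n_distinct(contributor_first_name, contributor_last_name)
  ) %>%
  ungroup() %>%
  # Average total contributed per distinct employee.
  mutate(avg_per_donor = total_amount / n_donors)
```

In the toy data, EMPLOYER A has more contributions and more donors, while EMPLOYER B has the larger total and the larger per-employee average, which is why a single count-based ranking can mislead.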

Analysis 4

How many individuals who work for Xavier have made at least one contribution? How much money in total have Xavier University employees contributed, and which employees have contributed the most?

A total of 22 Xavier University employees have made 84 donations. Total contributions sum to $10,892. The average donation size is $130, and each employee has contributed $495 on average.
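The two averages follow directly from the three totals reported above, as this short arithmetic check shows. The variable names are illustrative only.

```r
# Totals as reported in the paragraph above.
n_employees     <- 22
n_contributions <- 84
total_amount    <- 10892

# Average size of a single donation (about 129.67, which rounds to 130).
avg_per_contribution <- total_amount / n_contributions

# Average total contributed per employee (about 495.09, which rounds to 495).
avg_per_employee <- total_amount / n_employees
```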

The top contributors based on number of contributions are George Traub (28), Martha Holland (10), Janet Schultz (8), Elizabeth Johnson (6), and Brenda Levya-Gardner (4).

Based on total amount contributed, the top donors at Xavier University are Ann Tracey ($1,750), Martha Holland ($1,350), Karl Stukenberg ($1,000), Janey Schultz ($700), and Annmarie Tracey ($650), who may be the same person as “Ann Tracey” under a different spelling of her name.

Based on average donation size (average amount contributed per donation) among people who have donated more than once, the top donors at Xavier University are Ann Tracey ($583 per donation), Annmarie Tracey ($325 per donation), Amanda Pavlick ($263 per donation), Terry Toepker ($200 per donation), and Katherine Loveland ($175 per donation).

Self-Directed Analysis

Analysis 1

Compare contributions to refunds (positive versus negative amounts in the contribution field). For example, you might consider exploring whether there is something unique about the organizations refunding donations or perhaps refunds appear to occur at the same time every year.

The largest contributors to refunded donations appear to be No Employer Listed (NA), Self-Employed and Retirees. These 3 account for the vast majority of refunds.

The vast majority of these refunds occur in the first 3 months of the year. The original contributions could be related to tax deductions that were being sought, or the refunds could reflect donor remorse: the donor felt pressured to give in the months leading up to election day, and then changed his or her mind a few months later, perhaps because the election did not end in his or her favor.
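One way to check when refunds occur is to count negative amounts by calendar month. The sketch below uses invented data, and the column name `contribution_receipt_date` is an assumption.

```r
library(dplyr)
library(lubridate)

# Invented toy data: four refunds and one ordinary contribution.
contributions <- data.frame(
  contribution_receipt_date = as.Date(c("2017-01-15", "2017-02-10", "2017-03-01",
                                        "2017-08-20", "2017-06-01")),
  contribution_receipt_amount = c(-50, -100, -25, -10, 200)
)

refunds_by_month <- contributions %>%
  # Refunds are the records with negative amounts.
  filter(contribution_receipt_amount < 0) %>%
  mutate(refund_month = month(contribution_receipt_date)) %>%
  count(refund_month, name = "n_refunds") %>%
  mutate(share = n_refunds / sum(n_refunds))

# Share of all refunds that fall in January through March.
q1_share <- sum(refunds_by_month$share[refunds_by_month$refund_month <= 3])
```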

Analysis 2

Pursue one additional self-directed analytical question using the following framework:

I’m interested in exploring whether donor refunds are associated with election losses. Do donors experience “donor remorse” if an election doesn’t end in their favor? I will expand on the data used in the prior analysis to determine which party the refunds are associated with. For example, were Democrats more likely to ask for refunds after the 2016 Presidential election loss? And were Republicans more likely to ask for donation refunds after the 2018 House elections favored Democrats?

Based on the data shown above, I’m not seeing a strong association (on the surface) between refunds and election losses. There were plenty of Republican refunds in 2018 (when Republicans lost control of the House), but there were even more refunds given to Democrats. There also doesn’t appear to be a large number of refunds associated with the 2016 Presidential election. One might have expected a greater number of refunds to Democratic donors during late 2016 or early 2017, after Trump won the presidential election, but I’m not seeing that in the data.

Lonie Moore

2019-10-22