Practice Exam
Section 1: Introduction
The purpose of this document is to illustrate how Cincinnati money influences politics. The data is a subset of all Cincinnati contribution data from 2015 to the present. It contains information about Cincinnati residents who contributed, including their names, employers, and occupations, the amounts contributed and refunded, and the type and party of the committee that received the money. The data also records the date of each contribution and each contributor's year-to-date (YTD) contribution total.
The analysis is intended to help lay readers understand how Cincinnati money influences politics. For example, it examines how the amount contributed has varied from 2015 to the present, who has made the most individual contributions, how many Xavier University employees have made at least one contribution, how much money Xavier University employees have contributed in total, and which employees have contributed the most.
Section 2: Required Packages
The packages required for this markdown are:
| Package | Summary |
|---|---|
| tidyverse | The tidyverse collection of packages |
| rmdformats | RMarkdown themes |
| knitr | RMarkdown documents |
| DT | JavaScript-enabled data tables |
| stargazer | Fancy regression tables |
| corrplot | Simple correlation plots |
| PerformanceAnalytics | Detailed plots and tables for analytics |
| pander | Pretty summary tables output |
| lubridate | Working with date formats |
Section 3: Data Preparation (Cleaning & Wrangling)
Source Data Explained
After cleaning and wrangling, 57,375 observations remain. Missing values are recorded as “N/A.” In addition, values in contributor_employer and contributor_occupation that were recorded as “Information Requested” or “None” were converted to “N/A” to mark them as missing. I also converted the contribution_receipt_date column to a proper date format.
In the committee_type column, PAC and Party committees are further labeled as qualified or nonqualified. To create a new column recording this qualification, I first replaced each nonqualified committee type with its bare name (i.e., “PAC - Nonqualified” became “PAC” and “Party - Nonqualified” became “Party”) to make string detection easier. I then used str_detect to look for “House,” “Senate,” “Presidential,” and “Qualified” rather than “PAC” and “Party,” since at that point the qualified committees were still named “PAC - Qualified” and “Party - Qualified.” Finally, I replaced the remaining strings with the bare committee name.
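As a rough illustration of this cleaning step, the sketch below converts placeholder strings to missing values and parses the date column. The sample rows, the upper-case spelling of the placeholders, the `contributor_employer` column name, and the year-month-day date format are all assumptions made for the example, not details taken from the source data.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows; the real data set is not reproduced here.
raw <- tibble(
  contributor_employer = c("XAVIER UNIVERSITY", "INFORMATION REQUESTED", "NONE"),
  contributor_occupation = c("PROFESSOR", "NONE", "RETIRED"),
  contribution_receipt_date = c("2018-02-14", "2016-10-31", "2015-06-01")
)

# Placeholder strings treated as missing (assumed to be upper case in the raw data).
placeholders <- c("INFORMATION REQUESTED", "NONE")

cleaned <- raw %>%
  mutate(
    # Replace placeholder strings with NA in both text columns.
    across(
      c(contributor_employer, contributor_occupation),
      ~ if_else(toupper(.x) %in% placeholders, NA_character_, .x)
    ),
    # Parse the character dates into Date objects (assumes year-month-day order).
    contribution_receipt_date = ymd(contribution_receipt_date)
  )
```

If the raw dates are stored in a different order, another lubridate parser such as `mdy()` would be the one to use instead.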
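The three steps described above could look roughly like the following. This is a minimal sketch, not the exact code used for the report; the raw labels in the sample data (including their capitalization) are assumed.

```r
library(dplyr)
library(stringr)

# Hypothetical raw committee types; the exact labels are assumptions.
committees <- tibble(
  committee_type = c(
    "PAC - Qualified", "PAC - Nonqualified",
    "Party - Qualified", "Party - Nonqualified",
    "House", "Senate", "Presidential"
  )
)

labelled <- committees %>%
  mutate(
    # Step 1: reduce nonqualified types to their bare name.
    committee_type = str_replace(committee_type, " - Nonqualified$", ""),
    # Step 2: everything that still matches one of these patterns is flagged TRUE.
    qualified = str_detect(committee_type, "House|Senate|Presidential|Qualified"),
    # Step 3: reduce the remaining qualified types to their bare name.
    committee_type = str_replace(committee_type, " - Qualified$", "")
  )
```

Because `mutate()` evaluates its arguments in order, step 2 already sees the names shortened in step 1, which is why searching for “Qualified” no longer risks matching the nonqualified rows.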
Below is a summary table explaining each variable in the data set.
| Variable Name | Explanation |
|---|---|
| contributor_last_name | Last name of the contributor |
| contributor_first_name | First name of the contributor |
| contributor_street_1 | Contributor Street Address |
| contributor_employer | Employer of contributor |
| contributor_occupation | Occupation of contributor |
| contribution_receipt_date | Date of contribution |
| contribution_receipt_amount | The amount of contribution contributed in an instance of a contribution being made. Positive values reflect a donation whereas negative values reflect a refund of a previous donation |
| contribution_aggregate_ytd | The total amount of contributions made year-to-date by a particular individual contributor |
| committee_name | Name of the political committee receiving the contribution |
| committee_type | Type of committee (Presidential, Senate, House, Party, or Political Action Committee) |
| committee_party_affiliation | Party affiliation of a committee |
| qualified | TRUE or FALSE attribute, where TRUE represents qualified committee |
Data Table
The data table below shows a portion of the Cincinnati Money in Politics data.
Section 4: Simple Analysis & Trends
Visualization 4.1
The bar chart below illustrates how the amount contributed has varied over time from 2015 to the present (2019).
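A yearly total of this kind can be computed with a short aggregation. The sketch below uses made-up rows and assumes that refunds (negative amounts) are left in the sum; the report does not say whether the chart nets them out.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows for illustration only.
contributions <- tibble(
  contribution_receipt_date = ymd(c("2015-03-01", "2016-10-15", "2016-11-01", "2018-02-10")),
  contribution_receipt_amount = c(100, 250, -50, 500)
)

yearly <- contributions %>%
  # Extract the calendar year of each contribution.
  mutate(year = year(contribution_receipt_date)) %>%
  group_by(year) %>%
  # Net amount per year; negative rows (refunds) reduce the total.
  summarise(total_amount = sum(contribution_receipt_amount), .groups = "drop")
```

The resulting `yearly` table has one row per year and could be passed to ggplot2's `geom_col()` to draw the bars.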
Visualization 4.2
Below are a summary table and a bar chart comparing the amount of money each party raised for the two most recent election cycles (2015-2016; 2017-2018).
| Election Cycle | Committee Party Affiliation | Amount of Money Raised |
|---|---|---|
| 2017-2018 | DEMOCRATIC-FARMER-LABOR | 21237 |
| 2017-2018 | DEMOCRATIC PARTY | 4473933 |
| 2017-2018 | GREEN PARTY | 1150 |
| 2017-2018 | INDEPENDENT | 8846 |
| 2017-2018 | LIBERTARIAN PARTY | 884 |
| 2017-2018 | REPUBLICAN PARTY | 4557983 |
| 2015-2016 | DEMOCRATIC PARTY | 4515200 |
| 2015-2016 | GREEN PARTY | 6386 |
| 2015-2016 | INDEPENDENT | 10740 |
| 2015-2016 | LIBERTARIAN PARTY | 11083 |
| 2015-2016 | REPUBLICAN PARTY | 5282867 |
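One way to produce a per-cycle summary like this is to label each contribution with its election cycle and then sum by cycle and party. The sketch below uses hypothetical rows and a hypothetical `election_cycle` column; it is not the report's own code.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows for illustration only.
contributions <- tibble(
  contribution_receipt_date = ymd(c("2015-05-01", "2016-09-30", "2017-03-15",
                                    "2018-10-20", "2019-01-05")),
  contribution_receipt_amount = c(100, 200, 300, 400, 500),
  committee_party_affiliation = c("DEMOCRATIC PARTY", "DEMOCRATIC PARTY",
                                  "REPUBLICAN PARTY", "REPUBLICAN PARTY",
                                  "DEMOCRATIC PARTY")
)

by_cycle <- contributions %>%
  mutate(
    year = year(contribution_receipt_date),
    # Years outside both cycles get NA and are dropped below.
    election_cycle = case_when(
      year %in% 2015:2016 ~ "2015-2016",
      year %in% 2017:2018 ~ "2017-2018"
    )
  ) %>%
  filter(!is.na(election_cycle)) %>%
  group_by(election_cycle, committee_party_affiliation) %>%
  summarise(amount_raised = sum(contribution_receipt_amount), .groups = "drop")
```

A single cycle column keeps the result in long format, which is usually easier to plot than a pair of TRUE/FALSE flag columns.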
Visualization 4.3
The bar chart below illustrates the association between the amount contributed and committee type.
Section 5: Directed Analysis
5.1 Which Cincinnati residents spend the most money on politics and where does it go?
Below are a summary table and a bar chart showing the five Cincinnati residents who spent the most money on politics, along with the party affiliation of the committees that received the money.
The results indicate that Richard Rosenthal has contributed the most of any Cincinnati resident, with most of his contributions going to the Democratic Party.
Note that the analysis excludes rows where contribution_receipt_amount is below 0, which are refunds, and sums the remaining contribution amounts.
| Contributor Name | Committee Party Affiliation | Total Contribution Amount |
|---|---|---|
| ROSENTHAL, RICHARD | DEMOCRATIC PARTY | 768320 |
| CASTELLINI, ROBERT | REPUBLICAN PARTY | 502100 |
| CASTELLINI, SUSAN | REPUBLICAN PARTY | 374000 |
| WARNER, GERALDINE | REPUBLICAN PARTY | 333400 |
| OETERS, DONALD | REPUBLICAN PARTY | 301560 |
5.2 Who has made the most individual contributions? What do you think is going on here?
Below is a bar chart showing the contributors with the largest number of individual contributions. This could mean that Ann Ruchhoft contributes more often but in small dollar amounts.
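The "often but small" interpretation can be checked by putting the number of contributions next to the average amount per contribution. The following is a small sketch on invented data; the contributor names are placeholders and `contributor_name` is assumed to be a combined last-name, first-name column.

```r
library(dplyr)

# Hypothetical sample rows; names are placeholders, not real contributors.
contributions <- tibble(
  contributor_name = c("DOE, JANE", "DOE, JANE", "DOE, JANE", "ROE, RICHARD"),
  contribution_receipt_amount = c(5, 10, 15, 1000)
)

frequency <- contributions %>%
  # Keep contributions only; negative amounts are refunds.
  filter(contribution_receipt_amount > 0) %>%
  group_by(contributor_name) %>%
  summarise(
    n_contributions = n(),
    mean_amount = mean(contribution_receipt_amount),
    total_amount = sum(contribution_receipt_amount),
    .groups = "drop"
  ) %>%
  arrange(desc(n_contributions))
```

A contributor with a high `n_contributions` but a low `mean_amount` would fit the pattern suggested above.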
5.3 Which employers are most heavily represented in the data?
The bar chart below shows the five employers most heavily represented in the data. “Retired” and “Self-Employed” appear most often.
5.4 How many individuals who work for Xavier have made at least one contribution? How much money in total have Xavier University employees contributed, and which employees have contributed the most?
The table below answers how many individuals who work for Xavier University have made at least one contribution. There are a total of 23 individuals who made at least one contribution.
| Contributor Employer | Number of Individuals |
|---|---|
| XAVIER UNIVERSITY | 23 |
The table below shows how much money Xavier University employees have contributed in total: $11,224.
| Contributor Employer | Total Amount of Contribution |
|---|---|
| XAVIER UNIVERSITY | 11224 |
The table below shows which employees have contributed the most. It lists the top five Xavier employees by total contribution; Ann Tracey contributed $2,400, the most of any employee.
| Contributor Name | Total Amount of Contribution |
|---|---|
| TRACEY, ANN | 2400 |
| HOLLAND, MARTHA | 1350 |
| STUKENBERG, KARL | 1000 |
| TRAUB, GEORGE | 855 |
| TOEPKER, TERRY | 600 |
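The three Xavier questions can be answered from one filtered subset. The sketch below is an illustration under several assumptions: the rows and names are invented, employers are matched with a simple "XAVIER" pattern, and refunds are not treated separately. The report does not state how these choices were actually made.

```r
library(dplyr)
library(stringr)

# Hypothetical sample rows; names are placeholders.
contributions <- tibble(
  contributor_name = c("DOE, JANE", "DOE, JANE", "ROE, RICHARD", "POE, ALEX"),
  contributor_employer = c("XAVIER UNIVERSITY", "XAVIER UNIVERSITY",
                           "XAVIER UNIVERSITY", "RETIRED"),
  contribution_receipt_amount = c(100, 50, 25, 900)
)

# Rows whose employer mentions Xavier (the matching rule is an assumption).
xavier <- contributions %>%
  filter(str_detect(contributor_employer, "XAVIER"))

# Number of distinct individuals and total amount contributed.
xavier_summary <- xavier %>%
  summarise(
    n_individuals = n_distinct(contributor_name),
    total_amount = sum(contribution_receipt_amount)
  )

# Employees ranked by total amount contributed.
top_employees <- xavier %>%
  group_by(contributor_name) %>%
  summarise(total_amount = sum(contribution_receipt_amount), .groups = "drop") %>%
  arrange(desc(total_amount)) %>%
  head(5)
```

Using `n_distinct()` rather than `n()` matters here, since one person can appear in many rows.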
Section 6: Self-Directed Analysis
6.1 Comparing contributions to refunds
This analysis explores trends in the amounts of contributions and refunds by month and by year. The faceted bar charts below show refund amounts throughout each year. Refunds are made mostly at the beginning of the year, specifically January through March. Note that there appears to be an outlier in February 2018 that stands out from the rest of the data.
The faceted bar charts below show contribution amounts throughout each year. Contributions are concentrated in 2016 and 2018, and they usually slow down by November. This makes sense because election days fall at the beginning of November.
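The monthly totals behind charts like these could be prepared as follows. This is a sketch on made-up rows; the `kind` column is a hypothetical helper that separates refunds (negative amounts) from contributions, and refund totals are reported as positive numbers for easier comparison.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows for illustration only.
contributions <- tibble(
  contribution_receipt_date = ymd(c("2018-02-03", "2018-02-20", "2018-10-12", "2016-10-01")),
  contribution_receipt_amount = c(-200, 50, 300, 120)
)

monthly <- contributions %>%
  mutate(
    year = year(contribution_receipt_date),
    month = month(contribution_receipt_date),
    # Negative amounts are refunds; everything else is a contribution.
    kind = if_else(contribution_receipt_amount < 0, "refund", "contribution")
  ) %>%
  group_by(kind, year, month) %>%
  # abs() so that refund totals are shown as positive amounts.
  summarise(total_amount = sum(abs(contribution_receipt_amount)), .groups = "drop")
```

Filtering `monthly` to one `kind` and faceting by `year` (for example with ggplot2's `facet_wrap(~ year)`) would give one panel per year. Passing `label = TRUE` to `month()` returns month names instead of numbers.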
6.2 Which 5 contributors contributed the most in 2018? What occupations do the contributors have? Which party did the money go to? How much money was contributed in total?
This question continues from 6.1 and looks specifically at contributions in 2018. What interests me here is whether the contributors who contributed the most also receive more in refunds, and what kinds of occupations they have.
To answer these questions, I first use mutate to extract the year of each contribution with the year() function and paste the contributor's last and first names together for easier grouping later. Next, I filter the data to contributions made in 2018 with an amount greater than zero, so that only contributions and not refunds are counted. I then group the data by contributor name, contributor occupation, and the party that received the money, and summarize the total amount contributed using the sum() function. Finally, I arrange the totals in descending order and select the top 5 contributors using the head() function. The resulting table is shown below.
| Contributor Name | Contributor Occupation | Party Affiliation | Amount of Contributions |
|---|---|---|---|
| ROSENTHAL, RICHARD | RETIRED | DEMOCRATIC PARTY | 178600 |
| HATFIELD, EDWARD | PRESIDENT | REPUBLICAN PARTY | 75700 |
| CASTELLINI, SUSAN | HOMEMAKER | REPUBLICAN PARTY | 74300 |
| O’MALEY, DAVID | RETIRED | REPUBLICAN PARTY | 73300 |
| WARNER, DAVID | REAL ESTATE DEVELOPER | REPUBLICAN PARTY | 61200 |
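The steps described for this question translate fairly directly into a dplyr pipeline. The version below is a sketch on invented rows, so the names and amounts are placeholders and the output does not reproduce the table above.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows; names are placeholders, not real contributors.
contributions <- tibble(
  contributor_last_name = c("DOE", "DOE", "ROE", "POE"),
  contributor_first_name = c("JANE", "JANE", "RICHARD", "ALEX"),
  contributor_occupation = c("RETIRED", "RETIRED", "PRESIDENT", "HOMEMAKER"),
  committee_party_affiliation = c("DEMOCRATIC PARTY", "DEMOCRATIC PARTY",
                                  "REPUBLICAN PARTY", "REPUBLICAN PARTY"),
  contribution_receipt_date = ymd(c("2018-03-01", "2018-09-15", "2018-06-30", "2016-10-01")),
  contribution_receipt_amount = c(500, -100, 300, 900)
)

top_2018 <- contributions %>%
  mutate(
    # Year of the contribution, and a combined "LAST, FIRST" name for grouping.
    year = year(contribution_receipt_date),
    contributor_name = paste(contributor_last_name, contributor_first_name, sep = ", ")
  ) %>%
  # Keep 2018 contributions only; negative amounts are refunds.
  filter(year == 2018, contribution_receipt_amount > 0) %>%
  group_by(contributor_name, contributor_occupation, committee_party_affiliation) %>%
  summarise(total_amount = sum(contribution_receipt_amount), .groups = "drop") %>%
  arrange(desc(total_amount)) %>%
  head(5)
```

Grouping by occupation and party in addition to name means a contributor who gave to two parties, or who is listed under two occupations, would appear as separate rows.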
In addition, below is a bar chart showing the contributions of the top contributors over the years. It turns out that they all contributed only during election years. The chart shows that Richard Rosenthal, who contributed the most, contributed in both 2016 and 2018, whereas the others contributed only in 2016. I found this interesting because the earlier analysis shows that Richard Rosenthal clearly supports the Democratic Party, and he contributed the most in 2018, when the midterm election took place during the presidency of Republican Donald Trump.