Sirron Melville
This report is an exploratory data analysis of the financial contributions made to the 2016 Presidential Campaign in the state of Massachusetts. The data set being examined is provided by the Federal Election Commission and contains the financial contributions to the campaign from April 18th 2015 to November 24th 2016.
The following questions will be answered by the analysis:
## [1] 295667 18
## 'data.frame': 295667 obs. of 18 variables:
## $ cmte_id : chr "C00577130" "C00577130" "C00577130" "C00577130" ...
## $ cand_id : chr "P60007168" "P60007168" "P60007168" "P60007168" ...
## $ cand_nm : chr "Sanders, Bernard" "Sanders, Bernard" "Sanders, Bernard" "Sanders, Bernard" ...
## $ contbr_nm : chr "LEDWELL, BENJAMIN" "LEDWELL, BENJAMIN" "LEDWELL, BENJAMIN" "LEDWELL, BENJAMIN" ...
## $ contbr_city : chr "NEWBURYPORT" "NEWBURYPORT" "NEWBURYPORT" "NEWBURYPORT" ...
## $ contbr_st : chr "MA" "MA" "MA" "MA" ...
## $ contbr_zip : int 19504700 19504700 19504700 19504700 10269501 2420 21392903 24621313 25542718 12016408 ...
## $ contbr_employer : chr "ANDOVER POLICE, MA." "ANDOVER POLICE, MA." "ANDOVER POLICE, MA." "ANDOVER POLICE, MA." ...
## $ contbr_occupation: chr "POLICE OFFICER" "POLICE OFFICER" "POLICE OFFICER" "POLICE OFFICER" ...
## $ contb_receipt_amt: num 40 35 50 27 100 ...
## $ contb_receipt_dt : chr "04-MAR-16" "04-MAR-16" "06-MAR-16" "06-MAR-16" ...
## $ receipt_desc : chr "" "" "" "" ...
## $ memo_cd : chr "" "" "" "" ...
## $ memo_text : chr "* EARMARKED CONTRIBUTION: SEE BELOW" "* EARMARKED CONTRIBUTION: SEE BELOW" "* EARMARKED CONTRIBUTION: SEE BELOW" "* EARMARKED CONTRIBUTION: SEE BELOW" ...
## $ form_tp : chr "SA17A" "SA17A" "SA17A" "SA17A" ...
## $ file_num : int 1077404 1077404 1077404 1077404 1077404 1146165 1091718 1091718 1091718 1077404 ...
## $ tran_id : chr "VPF7BKWGAE6" "VPF7BKWGCP3" "VPF7BKYF9S6" "VPF7BM0K9E6" ...
## $ election_tp : chr "P2016" "P2016" "P2016" "P2016" ...
In this dataset, there are 295667 contributions and 18 variables. Below is a visualization of the distribution of contributions.
From this, I noticed that there were a lot of outliers in the data and that better visualizations had to be created in order to answer the questions accurately.
##
## 5 10 100 50 25
## 16780 26856 34241 36978 39546
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -84240.0 15.0 28.0 116.1 100.0 86940.0
The data was scaled logarithmically in order to better represent the distribution of the contributions. On the log scale the distribution above is roughly Gaussian, and the data indicates that most donors contributed small amounts.
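As a rough illustration of why the log scale helps, the sketch below applies `log10` to a handful of made-up amounts (not taken from the dataset) and bins them into equal-width intervals on the log scale. The vector `amounts` and the bin edges are assumptions chosen only for the example.

```r
# Hypothetical contribution amounts spanning several orders of magnitude.
amounts <- c(5, 15, 25, 50, 250, 1500, 2700)

# On the raw scale the two largest values dominate the axis;
# on the log10 scale the values are spread far more evenly.
log_amounts <- log10(amounts)
round(log_amounts, 2)

# Equal-width bins on the log scale: 1-10, 10-100, 100-1000, 1000-10000.
bins <- cut(log_amounts, breaks = 0:4, include.lowest = TRUE)
table(bins)
```

With ggplot2, a similar effect is usually obtained by adding `scale_x_log10()` to a histogram; whether the plot above was produced that way is not stated in the report.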
A summary of the data shows that the most frequent amount donated is $25, followed by $50 and $100. The minimum and maximum donations were -$84,240 and $86,940 respectively.
Individuals are only permitted to donate up to $2700 to a candidate due to the contribution limit set by the FEC (Federal Election Commission). For the analysis, I omitted the negative contributions and the contributions above $2700, as these were probably refunds and redesignations. The counts below show that 5897 contributions fall into these two groups.
sum(ma$contb_receipt_amt >= 2700)
## [1] 3244
sum(ma$contb_receipt_amt < 0)
## [1] 2653
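The filtering step can be sketched as follows. This is a minimal example on a toy data frame, not the original code: `ma_toy` and `ma_clean` are hypothetical names, and the bounds (strictly positive, at most $2700) are an assumption based on the later summaries, whose maximum is exactly 2700.

```r
# Toy stand-in for the `ma` data frame; only the amount column matters here.
ma_toy <- data.frame(
  contb_receipt_amt = c(25, 50, -100, 2700, 5400, 0.04)
)

# Assumed rule: keep positive contributions up to the $2700 limit.
keep <- ma_toy$contb_receipt_amt > 0 & ma_toy$contb_receipt_amt <= 2700
ma_clean <- ma_toy[keep, , drop = FALSE]

nrow(ma_clean)                     # number of rows kept
range(ma_clean$contb_receipt_amt)  # smallest and largest kept amounts
```

Note that a `>= 2700` comparison also counts contributions of exactly $2700, so a count made that way is an upper bound on the rows removed by a strict `> 2700` filter.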
More variables need to be created for the analysis. These are donors’ gender, donors’ zip code and party affiliation of the candidate.
The 5 variables that have been created are listed below. In addition, 5897 contributions were flagged for removal because they were negative, refunded or redesignated.
Created variables:
With the added variables, I can look at the distribution of contributions by candidate, gender, party, and occupation.
## # A tibble: 3 x 5
## party sum_party number_of_candidate mean_party n
## <chr> <dbl> <int> <dbl> <int>
## 1 democrat 25832080.8 5 5166416.2 243358
## 2 others 270771.3 3 90257.1 981
## 3 republican 4605409.9 17 270906.5 24556
## [1] 268895
Based on the dataset, the total number of donations made to the presidential election is 268,895. The Democratic party received 243,358 donations, which is approximately 10 times as many as the Republican party (24,556 donations).
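A per-party summary of this shape can be sketched in base R as below. The toy data and the name `party_summary` are hypothetical; the column names mirror the table above. In that table `mean_party` equals `sum_party` divided by `number_of_candidate` (for example 25832080.8 / 5 = 5166416.2), so it is an average per candidate rather than per donation, and the sketch follows the same definition.

```r
# Toy contributions; candidate names are placeholders.
toy <- data.frame(
  party = c("democrat", "democrat", "republican", "democrat", "republican"),
  cand_nm = c("A", "B", "C", "A", "D"),
  contb_receipt_amt = c(25, 50, 100, 10, 200),
  stringsAsFactors = FALSE
)

# Per-party totals, donation counts and number of distinct candidates.
sum_party <- tapply(toy$contb_receipt_amt, toy$party, sum)
n <- tapply(toy$contb_receipt_amt, toy$party, length)
number_of_candidate <- tapply(toy$cand_nm, toy$party,
                              function(x) length(unique(x)))

party_summary <- data.frame(
  party = names(sum_party),
  sum_party = as.numeric(sum_party),
  number_of_candidate = as.integer(number_of_candidate),
  mean_party = as.numeric(sum_party / number_of_candidate),  # per candidate
  n = as.integer(n)
)
party_summary
```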
##
## Bush, Jeb Carson, Benjamin S.
## 388 2591
## Christie, Christopher J. Clinton, Hillary Rodham
## 133 147534
## Cruz, Rafael Edward 'Ted' Fiorina, Carly
## 5624 469
## Gilmore, James S III Graham, Lindsey O.
## 1 110
## Huckabee, Mike Jindal, Bobby
## 91 1
## Johnson, Gary Kasich, John R.
## 457 755
## Lessig, Lawrence McMullin, Evan
## 130 20
## O'Malley, Martin Joseph Pataki, George E.
## 269 3
## Paul, Rand Perry, James R. (Rick)
## 490 2
## Rubio, Marco Sanders, Bernard
## 1578 95408
## Santorum, Richard J. Stein, Jill
## 15 504
## Trump, Donald J. Walker, Scott
## 12256 49
## Webb, James Henry Jr.
## 17
Hillary Clinton led the 25 candidates in the number of contributions with almost 150,000 donations. Bernard Sanders and Donald Trump were second and third respectively.
## # A tibble: 2 x 3
## gender sum_gen n_gen
## <chr> <dbl> <int>
## 1 female 15029545 150055
## 2 male 15678717 118840
Women made up about 56% of the donations. Further analysis may help us determine if Hillary Clinton was the reason for this.
Who are those contributors?
## # A tibble: 10 x 4
## contbr_occupation sum_occu mean_occu n
## <ord> <dbl> <dbl> <int>
## 1 RETIRED 4480345.1 108.43830 41317
## 2 NOT EMPLOYED 1417174.5 53.55103 26464
## 3 TEACHER 389587.2 56.29060 6921
## 4 ATTORNEY 1313684.0 212.50146 6182
## 5 PROFESSOR 876504.6 142.56744 6148
## 6 PHYSICIAN 842674.2 160.11290 5263
## 7 CONSULTANT 805573.5 192.12342 4193
## 8 SOFTWARE ENGINEER 361221.3 96.48006 3744
## 9 HOMEMAKER 686431.1 205.39530 3342
## 10 ENGINEER 309927.2 99.68709 3109
It seems that the top 3 occupations of contributors are retirees, people that are not employed and teachers. Homemakers and engineers round out the top ten.
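One way to build such a ranking is sketched below in base R on made-up data: count contributions per occupation, keep the most frequent ones, then summarise the amounts. The vectors `occupation` and `amount` and the cut-off of two occupations are assumptions for illustration only.

```r
# Toy data: one entry per contribution.
occupation <- c("RETIRED", "TEACHER", "RETIRED", "ATTORNEY",
                "RETIRED", "TEACHER", "ENGINEER")
amount <- c(100, 25, 50, 500, 35, 15, 75)

# Rank occupations by number of contributions and keep the top two.
counts <- sort(table(occupation), decreasing = TRUE)
top <- names(counts)[1:2]
keep <- occupation %in% top

# Summarise total, mean and count for the retained occupations.
top_occu <- data.frame(
  contbr_occupation = sort(top),
  sum_occu = as.numeric(tapply(amount[keep], occupation[keep], sum)),
  mean_occu = as.numeric(tapply(amount[keep], occupation[keep], mean)),
  n = as.integer(tapply(amount[keep], occupation[keep], length))
)
top_occu[order(-top_occu$n), ]
```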
## Min. 1st Qu. Median Mean 3rd Qu.
## "2014-09-25" "2016-03-12" "2016-06-01" "2016-06-01" "2016-09-18"
## Max.
## "2016-12-30"
The above distribution of the date of contribution is somewhat bimodal showing that most of the contributions were around March/April 2016 and close to the election.
The dataset contains 268895 contributions and 18 variables. The variables that will be used in the analysis are:
Other observations:
The main features of interest in the dataset are candidate, contribution amount and party. Analysis using these variables will help answer the aforementioned questions. A combination of variables can also be used to build a logistic regression model to predict the party a donor contributed to.
The party that receives the contribution and contribution amount can be affected by gender, location, occupation and time of the contribution. The average contribution amount may be influenced by occupation and gender may play a role in the party that receives the contribution.
5 variables were created:
The negative contributions and contributions over $2700 were omitted because of the contribution limit set by the FEC which prohibits donors from giving more than $2700. These contributions were either refunded or redesignated.
## # A tibble: 3 x 5
## party sum_party number_of_candidate mean_party n
## <ord> <dbl> <int> <dbl> <int>
## 1 democrat 25832080.8 5 5166416.2 243358
## 2 others 270771.3 3 90257.1 981
## 3 republican 4605409.9 17 270906.5 24556
## ma$cand_nm
## Jindal, Bobby Perry, James R. (Rick)
## 250.00 750.00
## Gilmore, James S III Pataki, George E.
## 2700.00 3950.00
## Santorum, Richard J. McMullin, Evan
## 7620.10 9305.00
## Huckabee, Mike Webb, James Henry Jr.
## 11048.00 12100.09
## Walker, Scott Paul, Rand
## 46345.00 75241.48
## Lessig, Lawrence Fiorina, Carly
## 88483.86 111371.48
## Stein, Jill Graham, Lindsey O.
## 112948.03 147830.00
## Johnson, Gary Christie, Christopher J.
## 148518.27 161570.00
## O'Malley, Martin Joseph Carson, Benjamin S.
## 206496.39 269372.60
## Bush, Jeb Kasich, John R.
## 399839.00 410268.30
## Cruz, Rafael Edward 'Ted' Rubio, Marco
## 447206.14 622812.22
## Trump, Donald J. Sanders, Bernard
## 1887235.59 4603428.95
## Clinton, Hillary Rodham
## 20921571.54
## [1] 30708262
In Massachusetts, the total amount contributed to the presidential candidates was $30,708,262. Most of that money went to Hillary Clinton, Bernard Sanders and Donald Trump.
The Democratic party received $25,832,081, which is about 5.6 times as much as the Republican party's $4,605,410. What is even more striking is that there were 17 Republican candidates and only 5 Democratic candidates, so the Democratic candidates received far more on average.
Hillary Clinton received the largest amount of contributions, followed by Bernard Sanders and Donald Trump respectively.
Massachusetts is a historically blue state and Hillary Clinton has strong political roots there.
Below, boxplots are used to show contribution patterns between candidates and parties.
It is hard to compare the contributions between the parties without scaling the data logarithmically due to the presence of a lot of outliers. Below I will apply a log scale and focus my analysis on the Democratic and Republican parties by removing the “others” group.
## ma$party: democrat
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04 15.00 27.00 106.10 75.00 2700.00
## --------------------------------------------------------
## ma$party: republican
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.80 27.17 50.00 187.50 100.00 2700.00
While the Republican party has a higher median and mean contribution amount, the Democratic party has a more spread-out distribution, meaning that its donors range from small to large.
## ma$cand_nm: Bush, Jeb
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5 100 267 1031 2700 2700
## --------------------------------------------------------
## ma$cand_nm: Carson, Benjamin S.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1 25 50 104 100 2700
## --------------------------------------------------------
## ma$cand_nm: Christie, Christopher J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 20 250 1000 1215 2700 2700
## --------------------------------------------------------
## ma$cand_nm: Clinton, Hillary Rodham
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04 19.00 36.30 141.80 100.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Cruz, Rafael Edward 'Ted'
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 25.00 50.00 79.52 100.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Fiorina, Carly
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.0 25.0 100.0 237.5 200.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Gilmore, James S III
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2700 2700 2700 2700 2700 2700
## --------------------------------------------------------
## ma$cand_nm: Graham, Lindsey O.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5 500 1000 1344 2300 2700
## --------------------------------------------------------
## ma$cand_nm: Huckabee, Mike
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.0 16.0 50.0 121.4 100.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Jindal, Bobby
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 250 250 250 250 250 250
## --------------------------------------------------------
## ma$cand_nm: Kasich, John R.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.0 50.0 200.0 543.4 500.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Lessig, Lawrence
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.0 100.0 250.0 680.6 500.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: O'Malley, Martin Joseph
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.0 100.0 500.0 767.6 1000.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Pataki, George E.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 250 625 1000 1317 1850 2700
## --------------------------------------------------------
## ma$cand_nm: Paul, Rand
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 25.0 50.0 153.6 100.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Perry, James R. (Rick)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 250.0 312.5 375.0 375.0 437.5 500.0
## --------------------------------------------------------
## ma$cand_nm: Rubio, Marco
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.05 25.00 75.00 394.70 250.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Sanders, Bernard
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 15.00 27.00 48.25 50.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Santorum, Richard J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.00 17.55 100.00 508.00 500.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Trump, Donald J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.80 28.00 72.02 154.00 150.00 2700.00
## --------------------------------------------------------
## ma$cand_nm: Walker, Scott
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 75.0 250.0 500.0 945.8 1000.0 2700.0
## --------------------------------------------------------
## ma$cand_nm: Webb, James Henry Jr.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 100.0 100.1 250.0 711.8 500.0 2700.0
A look at the above visualization shows that Christopher Christie, Lindsey Graham and George Pataki have the highest medians and Jeb Bush has the greatest IQR (interquartile range). Hillary Clinton and Bernard Sanders have the lowest medians, but they also have the most outliers (big donors).
Below, I will look into how the candidates did within their parties.
## # A tibble: 22 x 5
## # Groups: party [2]
## party cand_nm sum_can mean_can n
## <chr> <chr> <dbl> <dbl> <int>
## 1 republican Jindal, Bobby 250.00 250.0000 1
## 2 republican Perry, James R. (Rick) 750.00 375.0000 2
## 3 republican Gilmore, James S III 2700.00 2700.0000 1
## 4 republican Pataki, George E. 3950.00 1316.6667 3
## 5 republican Santorum, Richard J. 7620.10 508.0067 15
## 6 republican Huckabee, Mike 11048.00 121.4066 91
## 7 democrat Webb, James Henry Jr. 12100.09 711.7700 17
## 8 republican Walker, Scott 46345.00 945.8163 49
## 9 republican Paul, Rand 75241.48 153.5540 490
## 10 democrat Lessig, Lawrence 88483.86 680.6451 130
## # ... with 12 more rows
In each party, the majority of the donations were received by only a few candidates. Hillary Clinton (81%) and Bernard Sanders (18%) received 99% of the total donations received by the Democratic party. Donald Trump received 41% of the total donations received by the Republican party. Donald Trump, Marco Rubio, Ted Cruz, John Kasich and Jeb Bush together made up about 82% of the total donations received by the Republican party. The other 12 Republican candidates accounted for the remaining 18%.
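The shares quoted here can be recomputed directly from the per-candidate and per-party totals printed earlier in this report. The sketch below copies those totals by hand; the variable names are illustrative.

```r
# Party totals and candidate totals copied from the output above.
party_total <- c(democrat = 25832080.8, republican = 4605409.9)
dem <- c(Clinton = 20921571.54, Sanders = 4603428.95)
rep_top5 <- c(Trump = 1887235.59, Rubio = 622812.22, Cruz = 447206.14,
              Kasich = 410268.30, Bush = 399839.00)

# Each candidate's share of their party's total, in percent.
dem_share <- round(100 * dem / party_total[["democrat"]], 1)
rep_share <- round(100 * rep_top5 / party_total[["republican"]], 1)

# Combined share of the five leading Republican candidates.
rep_top5_share <- round(100 * sum(rep_top5) / party_total[["republican"]], 1)

dem_share
rep_share
rep_top5_share
```

The same per-candidate shares also show which candidates clear a given threshold, such as a minimum share of the party total.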
It is clear who the top candidates in each party were in Massachusetts. Below, the analysis will continue to examine the candidates who received at least 9% of the total donations received by their party.
## [1] "Clinton, Hillary Rodham" "Sanders, Bernard"
## [3] "Trump, Donald J." "Rubio, Marco"
## [5] "Cruz, Rafael Edward 'Ted'"
We noticed that women made up 56% of the contributions. Further questions to be asked are: Do they make up 56% of the contribution amount? Who do they donate to, liberals and/or women candidates?
## ma$gender: female
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04 15.00 27.00 99.78 72.00 2700.00
## --------------------------------------------------------
## ma$gender: male
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.24 19.00 35.00 131.10 100.00 2700.00
On average, men donated $131.10 and women donated $99.78 per contribution. While women made more donations, their typical contribution was noticeably smaller than men's, as seen in the differences in the median, mean and third quartile.
## # A tibble: 2 x 3
## gender sum_gen n
## <chr> <dbl> <int>
## 1 female 14934709 149682
## 2 male 15502782 118232
The above visualization shows that the total contribution amounts by gender are very close. This is because, even though women donated less on average, they made more donations.
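The arithmetic behind this can be checked with the figures already printed: the total per gender is approximately the mean contribution times the number of contributions. The sketch below uses the rounded means and the counts from the output above, so the products only approximate the reported sums.

```r
# Means and counts copied from the summaries above.
gender <- data.frame(
  gender = c("female", "male"),
  mean_amt = c(99.78, 131.10),
  n = c(149682, 118232)
)

# Total is roughly mean * count (approximate because the means are rounded).
gender$approx_total <- gender$mean_amt * gender$n

# Share of the number of donations versus share of the money.
gender$share_of_count <- gender$n / sum(gender$n)
gender$share_of_amount <- gender$approx_total / sum(gender$approx_total)
gender
```

Women account for more than half of the donations but slightly less than half of the money, which is consistent with the two totals being close.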
## # A tibble: 10 x 3
## # Groups: cand_nm [?]
## cand_nm gender sum_gen_can
## <chr> <chr> <dbl>
## 1 Clinton, Hillary Rodham female 11598864.9
## 2 Clinton, Hillary Rodham male 9322706.6
## 3 Cruz, Rafael Edward 'Ted' female 137480.1
## 4 Cruz, Rafael Edward 'Ted' male 309726.0
## 5 Rubio, Marco female 178444.7
## 6 Rubio, Marco male 444367.6
## 7 Sanders, Bernard female 1987548.9
## 8 Sanders, Bernard male 2615880.1
## 9 Trump, Donald J. female 437974.0
## 10 Trump, Donald J. male 1449261.5
In Massachusetts, women contributed about 15 million USD to the presidential campaign in 2016. Almost 12 million USD was donated to Hillary Clinton and approximately 2 million USD was donated to Bernard Sanders. This supports the assumption that in Massachusetts, women donate more to liberal and/or women candidates.
We saw that retirees make the most contributions, now we will analyze the total contribution amount and average contribution amount across the top 10 occupations.
## # A tibble: 10 x 4
## contbr_occupation sum_occu mean_occu n
## <ord> <dbl> <dbl> <int>
## 1 RETIRED 4480345.1 108.43830 41317
## 2 NOT EMPLOYED 1417174.5 53.55103 26464
## 3 TEACHER 389587.2 56.29060 6921
## 4 ATTORNEY 1313684.0 212.50146 6182
## 5 PROFESSOR 876504.6 142.56744 6148
## 6 PHYSICIAN 842674.2 160.11290 5263
## 7 CONSULTANT 805573.5 192.12342 4193
## 8 SOFTWARE ENGINEER 361221.3 96.48006 3744
## 9 HOMEMAKER 686431.1 205.39530 3342
## 10 ENGINEER 309927.2 99.68709 3109
Looking at the above visualizations, retirees, people who are not employed and attorneys are the top three in terms of total contribution amount. Attorneys and homemakers are the top 2 when we look at the average contribution amount. People who are not employed tend to contribute the least on average, which is expected.
Above is a boxplot of the contribution amount distribution among the various occupations. It is difficult to analyze because of all the outliers.
## top_occu_df$contbr_occupation: ATTORNEY
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.04 25.00 50.00 212.60 200.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: CONSULTANT
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.24 25.00 50.00 191.50 100.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: ENGINEER
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 25.00 40.50 98.23 100.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: HOMEMAKER
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 10.0 25.0 202.9 100.0 2700.0
## --------------------------------------------------------
## top_occu_df$contbr_occupation: NOT EMPLOYED
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.05 13.50 27.00 53.55 50.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: PHYSICIAN
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.44 25.00 50.00 159.80 100.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: PROFESSOR
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.45 23.00 50.00 142.20 100.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: RETIRED
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.5 20.0 35.0 108.1 100.0 2700.0
## --------------------------------------------------------
## top_occu_df$contbr_occupation: SOFTWARE ENGINEER
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 15.00 35.00 93.86 100.00 2700.00
## --------------------------------------------------------
## top_occu_df$contbr_occupation: TEACHER
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.97 15.00 25.00 56.15 50.00 2700.00
The above boxplot excludes outliers and gives a better representation of the data. The median contributions of homemakers, teachers and people who are not employed are relatively low.
Attorneys made the largest contributions: their third quartile is about 4 times their median, and they had the most variability and the highest average donation.
Homemakers had the 2nd highest average contribution amount, but they were among the lowest in terms of median contribution. This suggests that the distribution of the data is skewed right with a lot of outliers. One can also assume that the majority of homemakers are women.
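The gap between a high mean and a low median can be reproduced with a small made-up sample. The amounts below are hypothetical (they are not the homemaker data); they only show how a few contributions at the $2700 limit pull the mean far above the median.

```r
# Hypothetical amounts: many small gifts plus a few at the $2700 limit.
amounts <- c(rep(10, 40), rep(25, 40), rep(100, 15), rep(2700, 5))

mean_amt <- mean(amounts)      # pulled upward by the few large gifts
median_amt <- median(amounts)  # stays at a typical small gift

# Points a default boxplot would flag as outliers (1.5 * IQR rule).
outliers <- boxplot.stats(amounts)$out

c(mean = mean_amt, median = median_amt, n_outliers = length(outliers))
```

In ggplot2, outlier points can be hidden with `geom_boxplot(outlier.shape = NA)`; this is one common approach and not necessarily the one used for the plot above.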
Men donated more than women on average even though there were more women donors.
Hillary Clinton raised the most money and had the most donors in Massachusetts. This wasn’t always the case throughout the campaign process. The above two visualizations show that:
Above is the time series trend for the top candidates. Hillary Clinton had steady growth in contribution amount, and so did Bernard Sanders until he dropped out to endorse Hillary Clinton. Ted Cruz had slow and consistent growth in contribution amount, which ended when he suspended his campaign in May 2016. Donald Trump’s contribution amount grew steadily from March 2016 until around September 2016. He was quoted as saying that he wanted to compete in Massachusetts, which is a predominantly Democratic state, and he even set up a Massachusetts headquarters.
Where in Massachusetts do the contributors reside?
As stated above, Massachusetts is a historically Democratic state. Most of the Republican supporters seem to be concentrated around Boston which is the largest city in Massachusetts.
Below, I will apply a logistic regression model to the data in order to predict the contributing party of a donor using their gender, donation amount and location (latitude, longitude). The steps to be taken are as follows:
##
## Call:
## glm(formula = party ~ ., family = binomial(link = "logit"), data = train)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.1990 -0.5264 -0.3468 -0.3227 2.6417
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.520e+01 1.206e+00 29.196 < 2e-16 ***
## contb_receipt_amt 3.798e-04 1.544e-05 24.591 < 2e-16 ***
## gendermale 1.000e+00 1.475e-02 67.788 < 2e-16 ***
## latitude -7.499e-01 2.751e-02 -27.253 < 2e-16 ***
## longitude -8.896e-02 1.233e-02 -7.217 5.31e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 150246 on 239877 degrees of freedom
## Residual deviance: 143860 on 239873 degrees of freedom
## (122 observations deleted due to missingness)
## AIC: 143870
##
## Number of Fisher Scoring iterations: 5
##
## model_pred_direction democrat republican
## democrat 26150 1761
## republican 0 3
## [1] "Accuracy 0.936913376800172"
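An accuracy of 0.94 on the test set looks like a very good result, but it should be read with caution. The confusion matrix shows that the model predicts “democrat” for almost every donor: only 3 of the 1,764 Republican contributions in the test set were identified, and always predicting “democrat” would already be correct for 26,150 of the 27,914 test rows (about 0.94). The result is also based on a single manual split of the data.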
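The general workflow (split, fit, predict, tabulate, compare with a baseline) can be sketched as below. This is a minimal example on simulated data, not the original analysis: the data frame `toy`, the simulated coefficients, the random 90/10 split and the 0.5 cut-off are all assumptions made for illustration. Only `glm(party ~ ., family = binomial(link = "logit"), data = train)` mirrors the call shown in the output above.

```r
set.seed(42)
n <- 2000

# Simulated donors: amount and gender only (no location in this sketch).
toy <- data.frame(
  contb_receipt_amt = round(rexp(n, rate = 1 / 100), 2),
  gender = factor(sample(c("female", "male"), n, replace = TRUE))
)

# Simulated labels: "republican" is rare, more likely for men and larger gifts.
lin <- -3 + 1.0 * (toy$gender == "male") + 0.002 * toy$contb_receipt_amt
toy$party <- factor(ifelse(runif(n) < plogis(lin), "republican", "democrat"),
                    levels = c("democrat", "republican"))

# Assumed random 90/10 train/test split.
train_idx <- sample(seq_len(n), size = floor(0.9 * n))
train <- toy[train_idx, ]
test <- toy[-train_idx, ]

# Logistic regression; the second factor level ("republican") is the event.
model <- glm(party ~ ., family = binomial(link = "logit"), data = train)

# Predicted probability of "republican", thresholded at 0.5.
prob <- predict(model, newdata = test, type = "response")
pred <- factor(ifelse(prob > 0.5, "republican", "democrat"),
               levels = levels(toy$party))

# Confusion matrix, accuracy, and the majority-class baseline.
conf_mat <- table(predicted = pred, actual = test$party)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
baseline <- max(table(test$party)) / nrow(test)

conf_mat
c(accuracy = accuracy, baseline = baseline)
```

Comparing `accuracy` with `baseline` is a quick check on whether a model adds anything beyond always predicting the majority class, which matters when one class is much more common than the other.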
For a while, it seemed as though Bernard Sanders was more popular than Hillary Clinton because he received more donations.
The financial contributions to the presidential campaign in Massachusetts were distributed unevenly. In the Democratic party, 99% of the contributions went to Hillary Clinton(81%) and Bernard Sanders(18%). Massachusetts is a historically Democratic state and Hillary Clinton also has strong political ties there.
There is a lot of disparity in the total contribution across occupations. One would assume that attorneys and engineers would be the major contributors, but it turned out that retirees contributed the most money to the 2016 Presidential campaign in Massachusetts.
It is surprising that, among the top 10 occupations, software engineers and engineers contributed the smallest total amounts to the 2016 Presidential campaign, especially since they more than likely earn an above-average salary. To gain further insight, one would have to know the political background of the industry.
Hillary Clinton was way ahead of the other candidates in the number of contributions and contribution amount received towards the election.
Bernard Sanders was on par with Hillary Clinton in the contribution amount received and ahead in the number of contributions received until he pulled out and decided to endorse her.
The downloaded dataset for the 2016 Presidential campaign for the state of Massachusetts from April 2015 to November 2016 contained 295667 donations. The challenges that I encountered during the analysis are listed below:
These were all challenges due to the fact that I had to either change the data or utilize packages and models that were new to me.
The analysis of the financial contributions to the 2016 Presidential campaign for the state of Massachusetts provided some interesting revelations.

* Most of the donations went to a few candidates.
* Massachusetts is mostly a Democratic state.
* Women seem to donate to liberals and/or a female candidate.
* Retirees are the group that made the largest number of contributions.
* Engineers and software engineers are among the top 10 occupations with the fewest contributions and are in the bottom 4 of the top 10 in average contribution amount, despite having above-average salaries.
* Bernard Sanders received more donations than Hillary Clinton until he dropped out of the Presidential campaign.

This analysis was for the state of Massachusetts. An analysis of swing states like Florida, Nevada or North Carolina, or even of the whole U.S., would likely provide some very different and interesting insights.
There was a post-election surge in contributions to groups that pledged to fight Donald Trump’s proposed policies. I think that this would be another dataset that might pique interest.