Introduction

Political writing usually discusses things at the statewide level. Here instead, using publicly available data, we will drill down to see what we can learn about the individual donor. This paper puts the same questions to three different levels: all political parties lumped together, Republicans v. Democrats, and finally the candidates against each other regardless of party. At each level we will look at the number of donors per zip, the median donation per zip, the highest participation level per zip, and the people who gave multiple times per zip.

Overview

Primarily we will be working with the Federal Election data for NY State as provided by Udacity. We will supplement this dataset with ZipPop2, a dataset that provides the population per zip code as of 2010, and lastly with Candidate_Party, a dataset that provides each candidate's political party and that party's color.

## # A tibble: 100,000 x 12
##      cmte_id         contbr_nm contbr_city   Zip
##       <fctr>            <fctr>      <fctr> <chr>
##  1 C00575795   SHIMANSKY, REBA    NEW YORK 10023
##  2 C00575795      WEBB, LOYANA   RIDGEWOOD 11385
##  3 C00580100      FELDMAN, ROB     COMMACK 11725
##  4 C00577130     KING, MICHAEL   RIDGEWOOD 11385
##  5 C00577130  DUNLOP, DAVID S.    NEWBURGH 12550
##  6 C00575795   PRIORE, PATRICK    NEW YORK 10011
##  7 C00458844         GANZ, ZEV     HEWLETT 11557
##  8 C00575795 SMAYLOVSKY, BELLA    BROOKLYN 11229
##  9 C00575795    COOPER, ANDREW    BROOKLYN 11226
## 10 C00577130     ALLISON, MATT    BROOKLYN 11206
## # ... with 99,990 more rows, and 8 more variables: contbr_employer <fctr>,
## #   contbr_occupation <fctr>, contb_receipt_amt <dbl>,
## #   contb_receipt_dt <fctr>, Last_Name <chr>, Population <int>,
## #   Party <fctr>, Party_Color <fctr>
## [1] -9300
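
As a rough sketch of how a table like the one printed above could be assembled, the three sources can be joined on zip code and on candidate last name. The tiny data frames below are made-up stand-ins; only the column names (`Zip`, `Last_Name`, `Population`, `Party`, `Party_Color`) come from the printout, and the use of base `merge()` is an assumption rather than the code actually used for this paper.

```r
# Made-up stand-ins for the three tables; only the column names mirror the printout.
contribs <- data.frame(
  contbr_nm = c("DONOR, A", "DONOR, B", "DONOR, C"),
  Zip = c("10023", "11385", "11385"),
  Last_Name = c("Clinton", "Sanders", "Trump"),
  contb_receipt_amt = c(25, 50, 100),
  stringsAsFactors = FALSE
)
zip_pop <- data.frame(
  Zip = c("10023", "11385"),
  Population = c(60000L, 98000L),   # illustrative numbers, not census values
  stringsAsFactors = FALSE
)
candidate_party <- data.frame(
  Last_Name = c("Clinton", "Sanders", "Trump"),
  Party = c("Democrat", "Democrat", "Republican"),
  Party_Color = c("Blue", "Blue", "Red"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every contribution even when a zip or candidate has no match.
joined <- merge(contribs, zip_pop, by = "Zip", all.x = TRUE)
joined <- merge(joined, candidate_party, by = "Last_Name", all.x = TRUE)
print(joined)
```

Keeping `Zip` as character in both tables avoids losing leading zeros and keeps the join keys comparable.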

Univariate Plots Section

All Of The Parties Mixed In Together

The number of donations per zip might just be another way of saying how many people live in each zip. Look at how the graph of donations per zip across all of NYS's thousands of zips and the graph of population per zip are almost interchangeable. Looked at this way, we are comparing a NYC zip code that has many apartment buildings with a farming community: the first might have tens of thousands of people living in it while the second might have tens.

But let us zoom in by applying a log scale to the Y-axis of the number of donors per zip.

Now NYS donor participation can be seen for its vibrancy. In the non-logged view it looked like only a few people participated and the rest were apathetic. Now that the log scale keeps the most populous zips from dominating the view, one can see a lot more involvement.

What was the median donation per zip in NYS for the 2016 presidential election?

Now let us look at the median donation across NYS on a log scale. As with the number of donations per zip, on the log scale the median donation per zip no longer looks as if the rest of NYS is being obscured by NYC.
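
To make the effect of the log scale concrete, here is a minimal sketch with made-up zip codes, not the paper's data. It counts donations per zip and takes `log10` of the counts, which is what a log10 Y-axis does to bar heights.

```r
# Made-up zips: one dense city zip and three small ones.
zips <- c(rep("10023", 1000), rep("12550", 10), rep("13601", 3), "14850")

donations_per_zip <- table(zips)                      # donations in each zip
log_counts <- log10(as.numeric(donations_per_zip))    # heights on a log10 axis
names(log_counts) <- names(donations_per_zip)

# On the raw scale the smallest zip is 1/1000 the height of the largest;
# on the log10 scale the heights run only from 0 to 3.
print(round(log_counts, 2))
```

In ggplot2 the same transformation is applied to an axis with `scale_y_log10()`. Note that a zip with a single donation has height 0 on this scale, so it can still vanish from a bar chart.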

Which zip codes were the most enthusiastic, i.e., which had the highest percentage of donors? We will call this our Enthusiasm level; it is a way of putting all zips on a level playing field. Our Enthusiasm level is the participation rate: the number of donors divided by a zip code's population.
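
The Enthusiasm calculation can be sketched as follows. The data are made up, and two choices are assumptions on my part: each contributor is counted once per zip, and the rate is multiplied by 100 to express it as a percentage.

```r
# Made-up donations: contributor B gave twice in zip 10023.
donors <- data.frame(
  Zip = c("10023", "10023", "10023", "12550", "12550", "13601"),
  contbr_nm = c("A", "B", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)
zip_pop <- data.frame(
  Zip = c("10023", "12550", "13601"),
  Population = c(1000L, 100L, 10L),
  stringsAsFactors = FALSE
)

# Count each contributor once per zip so repeat givers do not inflate the rate.
unique_donors <- unique(donors[, c("Zip", "contbr_nm")])
freq <- as.data.frame(table(Zip = unique_donors$Zip),
                      responseName = "Freq", stringsAsFactors = FALSE)

enth <- merge(freq, zip_pop, by = "Zip")
enth$Enthusiasm <- 100 * enth$Freq / enth$Population   # percent of residents who donated
print(enth)
```

The smallest zip comes out as the most enthusiastic here, which shows how sensitive the ratio is when the population in the denominator is tiny.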

What occupations were the most likely to make a donation to a presidential candidate in 2016?

Of all of New York State's 20 million residents, who were the people that donated the greatest number of times?

##   contbr_nm         Donation.Times   
##  Length:46623       Min.   :  1.000  
##  Class :character   1st Qu.:  1.000  
##  Mode  :character   Median :  1.000  
##                     Mean   :  2.094  
##                     3rd Qu.:  2.000  
##                     Max.   :199.000
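
A summary like the one above could come from counting how often each contributor name appears. This is a sketch on made-up names; the column name `Donation.Times` is taken from the printout, and treating identical names as the same person is an assumption.

```r
# Made-up contributor names, one entry per donation.
contbr_nm <- c("A", "A", "A", "B", "C", "C")

# One row per contributor with the number of donations they made.
times <- as.data.frame(table(contbr_nm = contbr_nm),
                       responseName = "Donation.Times", stringsAsFactors = FALSE)
times <- times[order(-times$Donation.Times), ]   # most frequent givers first

print(times)
print(summary(times$Donation.Times))
```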

Now let us separate the Republican and Democratic parties and put these same questions to them.

Our first question again is which zips gave to each party the greatest number of times. This is an absolute number and is not the same as Enthusiasm, which is a percentage of the zip's population.

Here is a big difference between the number of Republican and Democratic donors. According to this graph only the Democrats donated. But it turns out that we have to log the data and zoom in to detect the Republican donors.

Now let us look at the difference between the two parties' median donations across the entire state. The median donation for Republicans was two times that of the Democrats, $50 v. $25.

And which zip codes gave the most number of times to each party? In the ordered barplots it looks like the Republicans had a higher rate.
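
The per-party median can be sketched with base `aggregate()`. The amounts below are made up and chosen only so that the medians land on $50 and $25; the column names follow the printed table.

```r
# Made-up donations; note the single large 2700 gift on the Republican side.
d <- data.frame(
  Party = c("Republican", "Republican", "Republican",
            "Democrat", "Democrat", "Democrat"),
  contb_receipt_amt = c(25, 50, 2700, 10, 25, 100),
  stringsAsFactors = FALSE
)

# Median donation per party; the median is not pulled up by the 2700 gift.
med <- aggregate(contb_receipt_amt ~ Party, data = d, FUN = median)
print(med)
```

Using the median rather than the mean matters here because a handful of maximum-size gifts would otherwise dominate the comparison.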

##       cmte_id     contbr_nm         contbr_city             Zip     
##  C00580100:187   Length:336         Length:336         10165  :  1  
##  C00574624: 74   Class :character   Class :character   10302  :  1  
##  C00573519: 49   Mode  :character   Mode  :character   10314  :  1  
##  C00458844: 16                                         10454  :  1  
##  C00579458:  6                                         10470  :  1  
##  C00577981:  2                                         10509  :  1  
##  (Other)  :  2                                         (Other):330  
##  contbr_employer                                 contbr_occupation
##  Length:336         RETIRED                               :141    
##  Class :character   INFORMATION REQUESTED                 : 34    
##  Mode  :character   INFORMATION REQUESTED PER BEST EFFORTS: 11    
##                     SALES                                 :  5    
##                     SELF-EMPLOYED                         :  5    
##                     PHYSICIAN                             :  4    
##                     (Other)                               :136    
##  contb_receipt_amt  contb_receipt_dt    Last_Name     Population   
##  Min.   :   2.00   12-Jul-16: 14     Trump   :187   Min.   :    2  
##  1st Qu.:  25.75   11-Jul-16: 11     Cruz    : 74   1st Qu.: 1592  
##  Median :  50.00   8-Aug-16 :  9     Carson  : 49   Median : 3698  
##  Mean   : 159.18   1-Jul-16 :  8     Rubio   : 16   Mean   : 9137  
##  3rd Qu.: 100.00   9-Aug-16 :  8     Bush    :  6   3rd Qu.:11456  
##  Max.   :2700.00   19-Jul-16:  7     Huckabee:  2   Max.   :99598  
##                    (Other)  :279     (Other) :  2                  
##           Party     Party_Color       Freq          Enthusiasm      
##  Conservative:  0   Blue  :  0   Min.   :  1.00   Min.   : 0.00244  
##  Democrat    :  0   Green :  0   1st Qu.:  3.00   1st Qu.: 0.08200  
##  Green       :  0   Orange:  0   Median :  9.00   Median : 0.41503  
##  Libeterian  :  0   Red   :336   Mean   : 24.37   Mean   : 3.41790  
##  Republican  :336   Yellow:  0   3rd Qu.: 30.00   3rd Qu.: 2.08818  
##                                  Max.   :210.00   Max.   :81.51724  
##                                                                     
##   Times.Given   
##  Min.   : 1.00  
##  1st Qu.: 1.00  
##  Median : 1.00  
##  Mean   : 1.78  
##  3rd Qu.: 2.00  
##  Max.   :36.00  
## 
##       cmte_id     contbr_nm         contbr_city             Zip      
##  C00575795:611   Length:1124        Length:1124        10001  :   1  
##  C00577130:513   Class :character   Class :character   10002  :   1  
##  C00458844:  0   Mode  :character   Mode  :character   10003  :   1  
##  C00500587:  0                                         10004  :   1  
##  C00573519:  0                                         10005  :   1  
##  C00574624:  0                                         10006  :   1  
##  (Other)  :  0                                         (Other):1118  
##  contbr_employer                contbr_occupation contb_receipt_amt
##  Length:1124        RETIRED              :183     Min.   :   1.00  
##  Class :character   NOT EMPLOYED         :168     1st Qu.:  15.00  
##  Mode  :character   TEACHER              : 36     Median :  27.00  
##                     PROFESSOR            : 24     Mean   :  74.01  
##                     INFORMATION REQUESTED: 19     3rd Qu.:  50.00  
##                     ATTORNEY             : 17     Max.   :2700.00  
##                     (Other)              :677                      
##   contb_receipt_dt    Last_Name     Population              Party     
##  29-Feb-16:  17    Clinton :611   Min.   :     2   Conservative:   0  
##  30-Apr-16:  14    Sanders :513   1st Qu.:  1670   Democrat    :1124  
##  31-May-16:  14    Bush    :  0   Median :  5460   Green       :   0  
##  6-Nov-16 :  13    Carson  :  0   Mean   : 14403   Libeterian  :   0  
##  25-Oct-16:  12    Christie:  0   3rd Qu.: 19161   Republican  :   0  
##  30-Mar-16:  12    Cruz    :  0   Max.   :109931                      
##  (Other)  :1042    (Other) :  0                                       
##  Party_Color        Freq           Enthusiasm         Times.Given    
##  Blue  :1124   Min.   :   1.00   Min.   :  0.00107   Min.   : 1.000  
##  Green :   0   1st Qu.:   6.00   1st Qu.:  0.06357   1st Qu.: 1.000  
##  Orange:   0   Median :  22.00   Median :  0.31572   Median : 3.000  
##  Red   :   0   Mean   :  80.47   Mean   :  2.68706   Mean   : 4.018  
##  Yellow:   0   3rd Qu.:  68.00   3rd Qu.:  1.52381   3rd Qu.: 5.000  
##                Max.   :2582.00   Max.   :100.00000   Max.   :46.000  
## 

But if we look instead at the number of all donations made in the state, a different picture emerges.

Which occupations most frequently gave to each party?

**Do the two parties have any occupations in common among their top 10? Yes: attorneys, physicians, and the retired. More interesting, though, are the self-described groups, or “tribes”, that they do not have in common. In a face-off, for the Republicans it would be Engineers, Homemakers, Sales, and the Self-Employed against the Democrats' Consultants, Not-Employed, Professors and Teachers, and Writers. Two of these labels strike me as particularly politicized in these times: Homemaker and Not-Employed.**
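
The comparison of top occupation lists can be sketched with `table()`, `intersect()`, and `setdiff()`. The data are made up and the helper `top_occupations()` is a hypothetical name introduced only for this illustration; it uses a top 2 instead of a top 10 to keep the example small.

```r
# Made-up donations with the occupation reported by each donor.
occ <- data.frame(
  Party = c(rep("Republican", 6), rep("Democrat", 6)),
  contbr_occupation = c("RETIRED", "RETIRED", "SALES", "SALES", "ENGINEER", "ATTORNEY",
                        "RETIRED", "RETIRED", "TEACHER", "TEACHER", "WRITER", "ATTORNEY"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the n most frequent occupations for one party.
top_occupations <- function(df, party, n) {
  counts <- sort(table(df$contbr_occupation[df$Party == party]), decreasing = TRUE)
  names(head(counts, n))
}

rep_top <- top_occupations(occ, "Republican", 2)
dem_top <- top_occupations(occ, "Democrat", 2)

shared   <- intersect(rep_top, dem_top)   # occupations on both lists
only_rep <- setdiff(rep_top, dem_top)     # on the Republican list only
only_dem <- setdiff(dem_top, rep_top)     # on the Democratic list only
print(list(shared = shared, only_rep = only_rep, only_dem = only_dem))
```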

Which party was able to inspire people to give multiple times the most?

Did the Republicans and Democrats have any contributors in common in their top ten list? No.

intersect(Rep.Rank.Multi.Donors$contbr_nm,Dem.Rank.Multi.Donors$contbr_nm)
## character(0)

The Individual Candidates

Finally, let us put our questions to the candidates themselves. If we focused on all 25 we would lose the forest for the trees, but 2016 had the great fortune to have three viable candidates with seemingly very different bases of support: a mainstream Democrat who leaned right; a populist Republican, something that has not been a main attraction since at least 1970; and a populist Democrat who almost won the party nomination. Having these three viable candidates gives us the rare opportunity to refract each in turn to see the others in a different light. In other words, instead of everyone lining up behind their usual more-money-for-schools Democrat or increase-the-military-budget Republican, this cycle we chose among a more-money-for-schools Democrat, an anti-trade, free-college-and-healthcare-for-all Democrat, and an anti-trade, anti-elite, more-schooling-is-not-the-answer Republican.

How do the candidates' donations compare? As in almost every case here, we are looking at the number of donations, not at what the dollars totaled.

Here are all of the donations made to Trump, Clinton, and Sanders. Considering how very different the candidates are, the donations look similar.

Clinton clearly received a greater number of donations, but let us look into this a little closer. Trump has the highest median donation.

Did any candidates inspire their donors to give multiple times?

Were some occupations more likely to give to one candidate rather than another?

Do our three candidates have any occupations in common in their respective Top 5 lists? Trump and Sanders share only the Retired.

And what occupations do Clinton and Sanders have in common? They have the most overlap: Retired people, Lawyers, and Teachers.

Now let us look at our three candidates' respective levels of Enthusiasm.

Final Plots And Summary

Looking back on my exploration, what I expected to see was that as one zoomed in closer and closer, what is thought of as the NYS presidential polling results would disaggregate into something not at all like what statewide polls would have one think. Was it successful?

NY is thought of as a reliably Blue, uncontested state. In popular polling results it is presented the way it appears on the left, when in fact it is more like the right. Granted, the difference is that the right side has the Y-axis on a log10 scale, but I would suggest this does not exaggerate the Republican presence; rather, it dims the light on NYC so the rest of the state can be seen.

If someone were asked to name the state behind the scatterplot of all donations below, how many would answer that it is a reliably blue state like NY?

Here is how polling is often presented. It is the same information as above but leaves one with a very different impression.

Now here is a little dissonance. Trump supporters were characterized as the Left Behind, but in fact the median contribution to Trump was the highest of the three major candidates.

Which is the real populist? I think one would have to agree it is the candidate who has the category Not Employed to himself, Bernie Sanders. The other populist plays best not with people who want a job tomorrow but rather with those whose working careers are over. Nostalgia, perhaps.

And finally, what of our own statistic, Enthusiasm? The 3D graph below pits a zip code's population against its Enthusiasm. For each of the candidates, as the zip code gets less populous (that is, as one goes down the vertical axis), the Enthusiasm on the diagonal axis grows; but while for Clinton and Sanders the difference is small, for Trump the difference is pronounced.

Below, Clinton really sets herself apart. The diagonal line measures how often the same person gave. Clinton is almost alone in getting people to donate over and over.

So in conclusion I come away surprised by how many Republican donors there are in the very blue state of NY, by how many people were inspired by Clinton, and by the fact that Sanders did not do better in NY. All of these things became visible by drilling down to the zip code level.

Reflection

What were some of the struggles in this paper? They are apparent. Things like manually assigning colors on some plots either took way too long or just never got done. Another time sink was turning the axis labels 45 degrees. It would work on one or two of three plots but not all three, with seemingly identical code on the same data set. Exasperating. Clearly something was different in the data but I could not spot it. Also, in an earlier iteration of this paper, when on occasion I “Viewed” the data I would find that the columns had been multiplying. Another time I found that some of my columns were data frames INSIDE of my data frames, and they did not behave at all like I wanted. And it took me a long time to find out that when one applies FILTER to a FACTOR it does not drop the unwanted levels but only reduces their counts to zero. This is so counterintuitive, and I do not know when that would be the desired effect. I wish I had been able to import images of NYTimes polls.

Where to go from here? That is easy: do a longitudinal study. If Udacity assigned a sequel I would drill down on individual contributors through as many election cycles as I could get and see how many, if any, were swing voters and, if any were, try to find a correlation with who knows what. Do they always vote against the party that is in? Do they vote according to the economic cycle? Does the amount that they give parallel whether they are switching parties? That would be interesting to know. And perhaps take it up a notch and compare voters from different states. This would be one big data set. The NYS 2016 file was 100 MB. A hundred megs here and a hundred megs there and pretty soon…. This leads me to the next thing: is it possible to set up AWS on a Windows 10 machine and use Amazon's machines to hold the data and run RStudio on them?
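
The factor behaviour described above is easy to reproduce on a toy example. The sketch uses base subsetting as a stand-in for `filter()`, which is an assumption about what the original code did; a factor keeps its full set of levels after rows are removed either way, and `droplevels()` discards the levels that no longer occur.

```r
# Made-up data with Party stored as a factor.
d <- data.frame(
  Party = factor(c("Democrat", "Republican", "Green", "Republican")),
  amt = c(25, 50, 10, 100)
)

filtered <- d[d$Party == "Republican", ]
print(table(filtered$Party))   # Democrat and Green are still listed, with count 0

cleaned <- droplevels(filtered)   # remove factor levels that no longer occur
print(table(cleaned$Party))       # only Republican remains
```

Keeping the empty levels is useful when tables or plots for different subsets need to show the same categories, which may be why it is the default.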

Resources