Homework 4, Part 2

DACSS 603

Cynthia Hester
April 18, 2022
project_data_path <- here("project_donation_data")
here(project_data_path)
[1] "C:/Users/Bud/Desktop/project_donation_data"
#here::here()

Rough Draft

PART 2 (Final Project)

1. What is your research question for the final project?

2. What is your hypothesis (i.e. an answer to the research question) that you want to test?

3. Present some exploratory analysis. In particular:

a. Numerically summarize (e.g. with the summary() function) the variables of interest (the outcome, the explanatory variable, the control variables).

b. Plot the relationships between key variables. You can do this any way you want, but one straightforward way of doing this would be with the pairs() function or other scatter plots / box plots. Interpret what you see.


Introduction

Professional sports leagues in the United States have intersected with politics since their inception. It is therefore not surprising that team owners in these leagues donate to political parties.

For this project, I draw on the FiveThirtyEight data set of political donations by professional sports team owners, which covers the two-year election cycles from 2016 to 2020. The data is in CSV format.

The data contains every confirmed partisan political contribution from team owners and commissioners in the NFL, NBA, WNBA, NHL, MLB and NASCAR from 2016 to 2020.

Project Proposal

  1. What is your research question for the final project?

What were the political donation patterns of major sports team owners in the United States during the 2016-2020 election cycles?

  2. What is your hypothesis (i.e. an answer to the research question) that you want to test?

My hypothesis is that professional sports team franchise owners are more likely to donate to Republican candidates, issues and causes.

  3. Present some exploratory analysis. In particular:

a. Numerically summarize (e.g. with the summary() function) the variables of interest (the outcome, the explanatory variable, the control variables).

First, the CSV data is imported.

library(readr)

urlfile <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/sports-political-donations/sports-political-donations.csv"

project_donation_data <- read_csv(url(urlfile))

To gain insight I will first explore the data set.

head(project_donation_data,10)      #first 10 rows of data set
# A tibble: 10 x 7
   Owner       Team      League Recipient Amount `Election Year` Party
   <chr>       <chr>     <chr>  <chr>     <chr>            <dbl> <chr>
 1 Adam Silver Commissi~ NBA    WRIGHT 2~ $4,000            2016 Demo~
 2 Adam Silver Commissi~ NBA    BIDEN FO~ $2,800            2020 Demo~
 3 Adam Silver Commissi~ NBA    CORY 2020 $2,700            2020 Demo~
 4 Adam Silver Commissi~ NBA    Kamala H~ $2,700            2020 Demo~
 5 Adam Silver Commissi~ NBA    Win The ~ $2,700            2020 Demo~
 6 Adam Silver Commissi~ NBA    KOHL FOR~ $2,000            2018 Demo~
 7 Adam Silver Commissi~ NBA    BETO FOR~ $1,000            2018 Demo~
 8 Adam Silver Commissi~ NBA    MONTANAN~ $1,000            2018 Demo~
 9 Adam Silver Commissi~ NBA    SERVE AM~ $1,000            2018 Demo~
10 Adam Silver Commissi~ NBA    ADAM SCH~ $1,000            2020 Demo~
tail(project_donation_data,10)      #last 10 rows of data set
# A tibble: 10 x 7
   Owner     Team        League Recipient Amount `Election Year` Party
   <chr>     <chr>       <chr>  <chr>     <chr>            <dbl> <chr>
 1 Zygi Wilf Minnesota ~ NFL    AMY FOR ~ $5,600            2020 Demo~
 2 Zygi Wilf Minnesota ~ NFL    KLOBUCHA~ $5,400            2018 Demo~
 3 Zygi Wilf Minnesota ~ NFL    Gridiron~ $5,000            2020 Bipa~
 4 Zygi Wilf Minnesota ~ NFL    MCCOLLUM~ $2,700            2016 Demo~
 5 Zygi Wilf Minnesota ~ NFL    TERRI BO~ $2,700            2016 Demo~
 6 Zygi Wilf Minnesota ~ NFL    ANGIE CR~ $2,700            2018 Demo~
 7 Zygi Wilf Minnesota ~ NFL    DEAN PHI~ $2,700            2018 Demo~
 8 Zygi Wilf Minnesota ~ NFL    MENENDEZ~ $2,700            2018 Demo~
 9 Zygi Wilf Minnesota ~ NFL    TINA SMI~ $2,700            2018 Demo~
10 Zygi Wilf Minnesota ~ NFL    TOM MALI~ $2,700            2018 Demo~
summary(project_donation_data)    #provides look at structure of data
    Owner               Team              League         
 Length:2798        Length:2798        Length:2798       
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
                                                         
                                                         
                                                         
  Recipient            Amount          Election Year 
 Length:2798        Length:2798        Min.   :2016  
 Class :character   Class :character   1st Qu.:2016  
 Mode  :character   Mode  :character   Median :2018  
                                       Mean   :2018  
                                       3rd Qu.:2020  
                                       Max.   :2020  
    Party          
 Length:2798       
 Class :character  
 Mode  :character  
                   
                   
                   
glimpse(project_donation_data)
Rows: 2,798
Columns: 7
$ Owner           <chr> "Adam Silver", "Adam Silver", "Adam Silver",~
$ Team            <chr> "Commissioner", "Commissioner", "Commissione~
$ League          <chr> "NBA", "NBA", "NBA", "NBA", "NBA", "NBA", "N~
$ Recipient       <chr> "WRIGHT 2016", "BIDEN FOR PRESIDENT", "CORY ~
$ Amount          <chr> "$4,000", "$2,800", "$2,700", "$2,700", "$2,~
$ `Election Year` <dbl> 2016, 2020, 2020, 2020, 2020, 2018, 2018, 20~
$ Party           <chr> "Democrat", "Democrat", "Democrat", "Democra~
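
Note that Amount is imported as a character column (values such as "$4,000"), so it cannot be summarized numerically as is. The following is a minimal sketch of one way to convert it, using base R on a small hypothetical sample that mimics the format shown above. The names amount_chr and amount_num are illustrative and not part of the original analysis.

```r
# Hypothetical sample mimicking the format of the Amount column
amount_chr <- c("$4,000", "$2,800", "$2,700", "$1,000")

# Remove dollar signs and thousands separators, then convert to numeric
amount_num <- as.numeric(gsub("[$,]", "", amount_chr))

# A numeric summary is now possible
summary(amount_num)
```

On the full data set, the same expression could be applied to project_donation_data$Amount and stored in a new column.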

Checking for missing data

#view(project_donation_data)
#names(project_donation_data)    #names of data set variables

project_donation_data %>%
  is.na() %>%
  sum()
[1] 0
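
A result of 0 only means there are no true NA values. The Party frequency table later in this document lists 9 records labeled "N/A", which is.na() does not count because they are ordinary strings. Here is a small sketch of the difference, using a hypothetical vector (party and party_clean are illustrative names):

```r
# Hypothetical sample: "N/A" stored as text, as in the Party column
party <- c("Democrat", "Republican", "N/A", "Bipartisan", "N/A")

# is.na() does not treat the string "N/A" as missing
n_na_before <- sum(is.na(party))

# Recode the string "N/A" to a true missing value
party_clean <- replace(party, party == "N/A", NA)
n_na_after <- sum(is.na(party_clean))

c(before = n_na_before, after = n_na_after)
```

Whether these records should be treated as missing or kept as their own category is a judgment call for the analysis.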

I use the skim() function from the skimr package to glean additional insight into the data set.

skim(project_donation_data)  #provides insight into data set
Table 1: Data summary
Name project_donation_data
Number of rows 2798
Number of columns 7
_______________________
Column type frequency:
character 6
numeric 1
________________________
Group variables None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
Owner                  0              1    9   43      0       158           0
Team                   0              1    9   59      0       115           0
League                 0              1    3   14      0        16           0
Recipient              0              1    3   96      0      1274           0
Amount                 0              1    3   10      0       244           0
Party                  0              1    3   33      0         7           0

Variable type: numeric

skim_variable  n_missing  complete_rate     mean   sd    p0   p25   p50   p75  p100  hist
Election Year          0              1  2017.93  1.6  2016  2016  2018  2020  2020  ▇▁▇▁▇

For easier readability and more insight into the data, it is displayed as an interactive, filterable table using the DT package.

library(DT)
datatable(data = project_donation_data,
          rownames = FALSE,
          filter ="top",
          options = list(autoWidth = TRUE))

Variables of Interest

colnames(project_donation_data)       #variable names
[1] "Owner"         "Team"          "League"        "Recipient"    
[5] "Amount"        "Election Year" "Party"        

Here I create a new object, donation_owner, and then check which team owners donated most frequently.

#shows what owners donated most frequently to political affiliations.
donation_owner<-(sort(table(project_donation_data$Owner),decreasing = TRUE))    
head(donation_owner,10)

                 Charles Johnson                     Micky Arison 
                             213                              178 
                     John Rogers                        Dan DeVos 
                             149                              116 
Jody Allen (Paul G. Allen Trust)           Jimmy and Susan Haslam 
                             108                              102 
                    Ken Kendrick                  Jerry Reinsdorf 
                              86                               78 
                   Herbert Simon                  Stephen M. Ross 
                              68                               65 
kable(head(donation_owner, n = 10)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
Var1 Freq
Charles Johnson 213
Micky Arison 178
John Rogers 149
Dan DeVos 116
Jody Allen (Paul G. Allen Trust) 108
Jimmy and Susan Haslam 102
Ken Kendrick 86
Jerry Reinsdorf 78
Herbert Simon 68
Stephen M. Ross 65

Here I create a new object, donation_league, to check how frequently owners in each league donated.

donation_league<-(sort(table(project_donation_data$League),decreasing = TRUE))
head(donation_league,10)     #number of donation records per league, most frequent first

      MLB       NBA       NFL       NHL      WNBA NBA, WNBA  NBA, NFL 
      746       462       444       329       274       118       109 
 NBA, NHL  NBA, MLB    NASCAR 
      109        84        79 
kable(head(donation_league, n = 5)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
Var1 Freq
MLB 746
NBA 462
NFL 444
NHL 329
WNBA 274

Here I create a new object, donation_party, to check which political party received the most donations.

donation_party<-(sort(table(project_donation_data$Party),decreasing = TRUE))
head(donation_party,3)

Republican   Democrat Bipartisan 
      1625        921        195 
kable(head(donation_party)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria") %>%
  scroll_box(width = "100%", height = "400px")
Var1 Freq
Republican 1625
Democrat 921
Bipartisan 195
Bipartisan, but mostly Republican 40
N/A 9
Bipartisan, but mostly Democratic 5
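
As a rough first check of the hypothesis, the Republican and Democrat counts from the table above can be compared directly. The sketch below computes each party's share among these two categories and runs a one-sided exact binomial test against an even split. It assumes, for illustration only, that donation records are independent. That is unlikely to hold here because the same owners donate many times, so the p-value should be read as descriptive rather than conclusive.

```r
# Counts taken from the Party frequency table above
party_counts <- c(Republican = 1625, Democrat = 921)

# Share of each party among Republican and Democrat donations only
party_share <- prop.table(party_counts)
round(party_share, 3)

# One-sided exact binomial test: is the Republican share above 0.5?
# Assumption: records are treated as independent, which is a simplification
rep_test <- binom.test(
  x = party_counts[["Republican"]],
  n = sum(party_counts),
  p = 0.5,
  alternative = "greater"
)
rep_test$p.value
```

An owner-level analysis (for example, one observation per owner) would avoid counting prolific donors many times.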

b. Plot the relationships between key variables. You can do this any way you want, but one straightforward way of doing this would be with the pairs() function or other scatter plots / box plots. Interpret what you see.

Here I use a bar plot to show the relationship between donations and sports leagues.

ggplot(data = project_donation_data) +
  stat_count(mapping = aes(x = League)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

This bar plot shows the number of donations made by owners in each league. It counts donation records, not dollar amounts. Owners in MLB made the most donations (746), followed by the NBA (462), with the NFL a close third (444).
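
Because stat_count() tallies rows, the bar heights are numbers of donation records rather than dollars. The sketch below contrasts the two quantities on a small hypothetical data frame, assuming Amount has already been converted from text such as "$4,000" to a numeric column (called amount_num here for illustration).

```r
# Hypothetical sample; amount_num is assumed to be an already-parsed numeric column
donations <- data.frame(
  League = c("NBA", "NBA", "NFL", "MLB", "MLB"),
  amount_num = c(4000, 2800, 5600, 1000, 2700),
  stringsAsFactors = FALSE
)

# Number of donation records per league (what the bar plot displays)
n_by_league <- table(donations$League)

# Total dollars donated per league (a different quantity)
total_by_league <- tapply(donations$amount_num, donations$League, sum)

n_by_league
total_by_league
```

In this toy sample the NFL has the fewest records but not the smallest total, which illustrates why the two measures can rank leagues differently.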


Here I use a bar plot to show the relationship between political parties and donations.

ggplot(data = project_donation_data) +
  stat_count(mapping = aes(x = Party)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Here we see that the Republican Party received the most donations from team owners (1,625), with the Democratic Party a distant second (921).

Debug

ggplot(data = project_donation_data,
       aes(x = `Election Year`, y = League)) +
  geom_boxplot()
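
A box plot is most informative with a categorical variable on one axis and a numeric variable on the other. One possible alternative, sketched here on hypothetical data, is to plot donation amount by party. It assumes Amount has been converted to a numeric column (amount_num is an illustrative name) and uses a log scale because donation sizes tend to be highly skewed.

```r
library(ggplot2)

# Hypothetical sample; amount_num is assumed to be an already-parsed numeric column
donations <- data.frame(
  Party = c("Democrat", "Democrat", "Democrat",
            "Republican", "Republican", "Republican"),
  amount_num = c(1000, 2700, 5600, 2700, 5000, 50000)
)

# Distribution of donation amounts by party, on a log10 scale
p <- ggplot(donations, aes(x = Party, y = amount_num)) +
  geom_boxplot() +
  scale_y_log10() +
  labs(x = "Party", y = "Donation amount (USD, log scale)")

print(p)
```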