DACSS 603
1. What is your research question for the final project?
2. What is your hypothesis (i.e. an answer to the research question) that you want to test?
3. Present some exploratory analysis. In particular:
a. Numerically summarize (e.g. with the summary() function) the variables of interest (the outcome, the explanatory variable, the control variables).
b. Plot the relationships between key variables. You can do this any way you want, but one straightforward way of doing this would be with the pairs() function or other scatter plots / box plots. Interpret what you see.
Professional sports leagues in the United States have sat at the intersection of sports and politics since their inception. It is therefore not surprising that the owners of teams in these leagues donate to political parties.
For this project, I draw on the FiveThirtyEight database of political donations from professional sports team owners, which covers 2016-2020 in two-year election cycles. The data is in CSV format.
The data contains every confirmed partisan political contribution from team owners and commissioners in the NFL, NBA, WNBA, NHL, MLB and NASCAR from 2016 to 2020.
What were the political donation patterns of major sports team owners in the United States during the 2016-2020 election cycles?
My hypothesis is that professional sports team franchise owners are more likely to donate to Republican candidates, issues and causes.
a. Numerically summarize (e.g. with the summary() function) the variables of interest (the outcome, the explanatory variable, the control variables).
First, the CSV data is imported.
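The import code itself is not shown in this report. A minimal sketch of the step, assuming the readr package was used (the tibble output suggests it, but this is an assumption), might look like the following. The temporary file is a hypothetical stand-in for the real CSV, whose path is not given, and holds only two rows copied from the start of the data set.

```r
library(readr)

# Hypothetical stand-in for the real CSV file, whose path is not shown here.
# The two rows are copied from the first rows of the data set.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Owner,Team,League,Recipient,Amount,Election Year,Party",
  "Adam Silver,Commissioner,NBA,WRIGHT 2016,\"$4,000\",2016,Democrat",
  "Adam Silver,Commissioner,NBA,BIDEN FOR PRESIDENT,\"$2,800\",2020,Democrat"
), csv_path)

# read_csv() keeps column names as written (including the space in
# `Election Year`) and returns a tibble.
project_donation_data <- read_csv(csv_path, show_col_types = FALSE)
```

With the real file, only `csv_path` would need to change.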
To gain insight I will first explore the data set.
head(project_donation_data,10) #first 10 rows of data set
# A tibble: 10 x 7
Owner Team League Recipient Amount `Election Year` Party
<chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 Adam Silver Commissi~ NBA WRIGHT 2~ $4,000 2016 Demo~
2 Adam Silver Commissi~ NBA BIDEN FO~ $2,800 2020 Demo~
3 Adam Silver Commissi~ NBA CORY 2020 $2,700 2020 Demo~
4 Adam Silver Commissi~ NBA Kamala H~ $2,700 2020 Demo~
5 Adam Silver Commissi~ NBA Win The ~ $2,700 2020 Demo~
6 Adam Silver Commissi~ NBA KOHL FOR~ $2,000 2018 Demo~
7 Adam Silver Commissi~ NBA BETO FOR~ $1,000 2018 Demo~
8 Adam Silver Commissi~ NBA MONTANAN~ $1,000 2018 Demo~
9 Adam Silver Commissi~ NBA SERVE AM~ $1,000 2018 Demo~
10 Adam Silver Commissi~ NBA ADAM SCH~ $1,000 2020 Demo~
tail(project_donation_data,10) #last 10 rows of data set
# A tibble: 10 x 7
Owner Team League Recipient Amount `Election Year` Party
<chr> <chr> <chr> <chr> <chr> <dbl> <chr>
1 Zygi Wilf Minnesota ~ NFL AMY FOR ~ $5,600 2020 Demo~
2 Zygi Wilf Minnesota ~ NFL KLOBUCHA~ $5,400 2018 Demo~
3 Zygi Wilf Minnesota ~ NFL Gridiron~ $5,000 2020 Bipa~
4 Zygi Wilf Minnesota ~ NFL MCCOLLUM~ $2,700 2016 Demo~
5 Zygi Wilf Minnesota ~ NFL TERRI BO~ $2,700 2016 Demo~
6 Zygi Wilf Minnesota ~ NFL ANGIE CR~ $2,700 2018 Demo~
7 Zygi Wilf Minnesota ~ NFL DEAN PHI~ $2,700 2018 Demo~
8 Zygi Wilf Minnesota ~ NFL MENENDEZ~ $2,700 2018 Demo~
9 Zygi Wilf Minnesota ~ NFL TINA SMI~ $2,700 2018 Demo~
10 Zygi Wilf Minnesota ~ NFL TOM MALI~ $2,700 2018 Demo~
summary(project_donation_data) #provides look at structure of data
Owner Team League
Length:2798 Length:2798 Length:2798
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
Recipient Amount Election Year
Length:2798 Length:2798 Min. :2016
Class :character Class :character 1st Qu.:2016
Mode :character Mode :character Median :2018
Mean :2018
3rd Qu.:2020
Max. :2020
Party
Length:2798
Class :character
Mode :character
glimpse(project_donation_data)
Rows: 2,798
Columns: 7
$ Owner <chr> "Adam Silver", "Adam Silver", "Adam Silver",~
$ Team <chr> "Commissioner", "Commissioner", "Commissione~
$ League <chr> "NBA", "NBA", "NBA", "NBA", "NBA", "NBA", "N~
$ Recipient <chr> "WRIGHT 2016", "BIDEN FOR PRESIDENT", "CORY ~
$ Amount <chr> "$4,000", "$2,800", "$2,700", "$2,700", "$2,~
$ `Election Year` <dbl> 2016, 2020, 2020, 2020, 2020, 2018, 2018, 20~
$ Party <chr> "Democrat", "Democrat", "Democrat", "Democra~
Checking for missing data
[1] 0
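The code behind the `[1] 0` result is not shown. One common way to get such a count is `sum(is.na(...))`; the sketch below assumes that approach and uses a small made-up data frame in place of project_donation_data.

```r
# Toy data frame standing in for project_donation_data (illustrative values only).
toy <- data.frame(
  Owner = c("Owner A", "Owner B", NA),
  Party = c("Republican", "N/A", "Democrat"),
  stringsAsFactors = FALSE
)

# Total number of missing (NA) cells across the whole data frame.
total_missing <- sum(is.na(toy))

# Missing cells per column, useful for locating any gaps.
missing_by_column <- colSums(is.na(toy))

# The text label "N/A" is an ordinary string, so is.na() does not count it.
labelled_na <- sum(toy$Party == "N/A", na.rm = TRUE)
```

This distinction matters here: the Party column contains a literal N/A category (9 rows in the party table later in this report), so a missing-value count of 0 does not mean every donation has a known party.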
I use the skim() function from the skimr package to glean additional insight into the data set.
skim(project_donation_data) #provides insight into data set
Name | project_donation_data |
Number of rows | 2798 |
Number of columns | 7 |
_______________________ | |
Column type frequency: | |
character | 6 |
numeric | 1 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Owner | 0 | 1 | 9 | 43 | 0 | 158 | 0 |
Team | 0 | 1 | 9 | 59 | 0 | 115 | 0 |
League | 0 | 1 | 3 | 14 | 0 | 16 | 0 |
Recipient | 0 | 1 | 3 | 96 | 0 | 1274 | 0 |
Amount | 0 | 1 | 3 | 10 | 0 | 244 | 0 |
Party | 0 | 1 | 3 | 33 | 0 | 7 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Election Year | 0 | 1 | 2017.93 | 1.6 | 2016 | 2016 | 2018 | 2020 | 2020 | ▇▁▇▁▇ |
For easier readability and more insight into the data, the frequency counts below are also displayed as formatted tables.
Variables of Interest
colnames(project_donation_data) #variable names
[1] "Owner" "Team" "League" "Recipient"
[5] "Amount" "Election Year" "Party"
Here I create a new object, donation_owner, and check which team owners donated most frequently.
Charles Johnson Micky Arison
213 178
John Rogers Dan DeVos
149 116
Jody Allen (Paul G. Allen Trust) Jimmy and Susan Haslam
108 102
Ken Kendrick Jerry Reinsdorf
86 78
Herbert Simon Stephen M. Ross
68 65
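The code that builds donation_owner is not shown. A minimal sketch that would produce a sorted frequency table of this shape, assuming base R's `table()` and `sort()` were used, is given below with made-up owner names.

```r
# Toy vector standing in for project_donation_data$Owner (made-up names).
owners <- c("Owner A", "Owner B", "Owner A", "Owner C", "Owner A", "Owner B")

# table() counts how often each owner appears; sort() puts the most
# frequent donors first.
donation_owner <- sort(table(owners), decreasing = TRUE)

# Keep only the top entries (at most 10).
top_owners <- head(donation_owner, n = 10)
```

Each row of the data set is one donation record, so these counts measure how often an owner donated, not how much.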
# Show the ten most frequent donors as a formatted table
kable(head(donation_owner, n = 10)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
Var1 | Freq |
---|---|
Charles Johnson | 213 |
Micky Arison | 178 |
John Rogers | 149 |
Dan DeVos | 116 |
Jody Allen (Paul G. Allen Trust) | 108 |
Jimmy and Susan Haslam | 102 |
Ken Kendrick | 86 |
Jerry Reinsdorf | 78 |
Herbert Simon | 68 |
Stephen M. Ross | 65 |
Here I create a new object, donation_league, to check how frequently owners in each league donated.
MLB NBA NFL NHL WNBA NBA, WNBA NBA, NFL
746 462 444 329 274 118 109
NBA, NHL NBA, MLB NASCAR
109 84 79
# Show the five leagues with the most donation records as a formatted table
kable(head(donation_league, n = 5)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
Var1 | Freq |
---|---|
MLB | 746 |
NBA | 462 |
NFL | 444 |
NHL | 329 |
WNBA | 274 |
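Some League values combine several leagues (for example "NBA, WNBA"), so an owner with teams in more than one league is counted under a combined label rather than under each league. If per-league counts are wanted, one possible approach is to split the labels first. This is a sketch on a toy vector and not a step taken in this report.

```r
# Toy vector in the same format as the League column.
league <- c("MLB", "NBA, WNBA", "NBA", "NBA, NFL")

# Split combined labels on the comma and trim the surrounding spaces,
# so that each row contributes once to every league it names.
league_split <- trimws(unlist(strsplit(league, ",")))

# Count donation records per individual league, most frequent first.
league_counts <- sort(table(league_split), decreasing = TRUE)
```

Counting this way credits one donation to several leagues, so the per-league totals would no longer sum to the number of rows.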
Here I create a new object, donation_party, to check which political party received the most donations.
Republican Democrat Bipartisan
1625 921 195
# Show the party counts as a formatted, scrollable table
kable(head(donation_party)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria") %>%
  scroll_box(width = "100%", height = "400px")
Var1 | Freq |
---|---|
Republican | 1625 |
Democrat | 921 |
Bipartisan | 195 |
Bipartisan, but mostly Republican | 40 |
N/A | 9 |
Bipartisan, but mostly Democratic | 5 |
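To relate these counts to the hypothesis, it can help to express them as shares. The sketch below hard-codes the six counts from the table above, so it covers only the categories listed there, and computes each category's share along with the Republican share of donations that went to one of the two major parties.

```r
# Counts copied from the party table above.
party_counts <- c(
  "Republican" = 1625,
  "Democrat" = 921,
  "Bipartisan" = 195,
  "Bipartisan, but mostly Republican" = 40,
  "N/A" = 9,
  "Bipartisan, but mostly Democratic" = 5
)

# Percentage share of each category among the six categories listed.
party_share <- round(100 * prop.table(party_counts), 1)

# Republican share of donations that went to one of the two major parties.
two_party <- party_counts[c("Republican", "Democrat")]
republican_two_party_share <- two_party[["Republican"]] / sum(two_party)
```

These are shares of donation records, not of dollars donated.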
b. Plot the relationships between key variables. You can do this any way you want, but one straightforward way of doing this would be with the pairs() function or other scatter plots / box plots. Interpret what you see.
Here I use a bar plot to show the relationship between donations and sports leagues.
ggplot(data = project_donation_data) +
  # Count the number of donation records per league
  stat_count(mapping = aes(x = League)) +
  # Rotate the x-axis labels so the longer league labels stay readable
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
With this bar plot we can see the number of donations made by owners in each league, not the dollar amounts. MLB owners made the most donations (746), followed by NBA owners (462), with NFL owners a close third (444).
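The bar plot counts donation records. To compare dollar totals instead, the Amount column would first have to be converted, since it is stored as text such as "$4,000". A minimal base R sketch of that conversion on toy rows (the amounts are illustrative, and the column name `amount_usd` is introduced here for the example):

```r
# Toy rows standing in for project_donation_data. Amounts use the same text
# format as the real Amount column, but the values are illustrative.
toy <- data.frame(
  League = c("NBA", "NBA", "NFL"),
  Amount = c("$4,000", "$2,800", "$5,600"),
  stringsAsFactors = FALSE
)

# Strip the dollar sign and thousands separators, then convert to numeric.
toy$amount_usd <- as.numeric(gsub("[$,]", "", toy$Amount))

# Total dollars donated per league.
total_by_league <- tapply(toy$amount_usd, toy$League, sum)
```

The same conversion would let summary() report the minimum, median and maximum donation size, which it cannot do while Amount is a character column.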
Here I use a bar plot to show the relationship between political parties and donations.
ggplot(data = project_donation_data) +
  # Count the number of donation records per party category
  stat_count(mapping = aes(x = Party)) +
  # Rotate the x-axis labels so the longer party labels stay readable
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Here we see that Republican recipients received the most donations from sports team owners (1,625), with Democratic recipients coming in a distant second (921).
Debug
ggplot(data = project_donation_data,
aes(x=`Election Year`,y=League))+
geom_boxplot()