library(dplyr)       # case_when(), if_else()
library(knitr)
library(pollster)
library(kableExtra)
setwd("/Users/olivia/Documents/Documents/Study/Semester 5/PBA")
judges <- read.csv(file = 'judges.csv')
temp<-xtabs(~woman+republican,judges)
rownames(temp)<-c("Men","Woman")
colnames(temp)<-c("Democrat","Republican")
temp
## republican
## woman Democrat Republican
## Men 82 124
## Woman 27 11
# Cell proportions of the whole table; row and column names carry over from temp
gender_rep_table <- prop.table(temp)
gender_rep_table
## republican
## woman Democrat Republican
## Men 0.33606557 0.50819672
## Woman 0.11065574 0.04508197
cat("Sum of Judges:",sum(temp))
## Sum of Judges: 244
cat("Men Proportion:",round(sum(gender_rep_table[1,])*100,3),"%")
## Men Proportion: 84.426 %
| | Democrat | Republican |
|---|---|---|
| Men | 82 | 124 |
| Woman | 27 | 11 |

| | Democrat | Republican |
|---|---|---|
| Men | 0.3360656 | 0.5081967 |
| Woman | 0.1106557 | 0.0450820 |
hist(judges$progressive_vote,main = "Histogram of progressive_vote",breaks = 10,col="lightcoral")
mean(judges$progressive_vote)
## [1] 0.4351555
median(judges$progressive_vote)
## [1] 0.422619
The median (0.423) lies to the left of the mean (0.435), which is consistent with a right-skewed distribution.
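The mean-versus-median comparison can be complemented by a moment-based skewness measure. The following is a minimal sketch: `sample_skewness` is a hypothetical helper, not part of the original analysis, and the `toy` vector is made-up illustrative data rather than the judges data.

```r
# Moment-based sample skewness: positive values indicate a longer right tail.
# `sample_skewness` is a hypothetical helper, not part of the original analysis.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]                 # drop missing values
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

# Made-up toy vector with a long right tail (NOT the judges data).
toy <- c(0.1, 0.2, 0.2, 0.3, 0.3, 0.3, 0.4, 0.9)
mean(toy) > median(toy)     # TRUE: the mean is pulled toward the tail
sample_skewness(toy) > 0    # TRUE: positive skewness
```

Applied to this report, the call would be `sample_skewness(judges$progressive_vote)`; a positive value would agree with the reading above.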
The highest density lies between 0.25 and 0.50.
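One way to check such a reading without relying on the plot is to ask `hist()` for the bin counts directly. This is a sketch under assumptions: `modal_bin` is a hypothetical helper, the 0.25-wide bins are chosen for illustration (the plot above uses `breaks = 10`), and `toy` is made-up data rather than the judges data.

```r
# Return the bounds of the histogram bin that holds the most observations.
# `modal_bin` is a hypothetical helper, not part of the original analysis.
modal_bin <- function(x, breaks) {
  h <- hist(x, breaks = breaks, plot = FALSE)  # compute counts without drawing
  i <- which.max(h$counts)                     # index of the fullest bin
  c(lower = h$breaks[i], upper = h$breaks[i + 1])
}

# Made-up toy values (NOT the judges data).
toy <- c(0.05, 0.30, 0.35, 0.40, 0.45, 0.60, 0.80)
modal_bin(toy, breaks = seq(0, 1, by = 0.25))
```

For the report's data the equivalent call would be `modal_bin(judges$progressive_vote, breaks = seq(0, 1, by = 0.25))`, assuming all values fall within 0 and 1.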
judges$gender_party <- case_when(
  judges$woman == 1 & judges$republican == 0 ~ "F_Demo",
  judges$woman == 1 & judges$republican == 1 ~ "F_Repub",
  judges$woman == 0 & judges$republican == 0 ~ "M_Demo",
  judges$woman == 0 & judges$republican == 1 ~ "M_Repub"
)
gender_party_means<-tapply(judges$progressive_vote,judges$gender_party,mean)
gender_party_means
## F_Demo F_Repub M_Demo M_Repub
## 0.4547162 0.3069867 0.5062359 0.3952614
barplot(height = gender_party_means,col="lightcoral")
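# 1 if the judge has at least one daughter; judges with no daughters or a missing `girls` value are coded 0
judges$any_girls <- if_else(judges$girls < 1 | is.na(judges$girls), 0, 1)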
head(judges)
## name child circuit girls progressive_vote race religion
## 1 Alarcon, Arthur L. 3 9 1 0.0000000 3 4
## 2 Aldisert, Ruggero 3 3 1 0.6666667 1 4
## 3 Aldrich, Bailey 2 1 0 0.3333333 1 1
## 4 Alito, Samuel A., Jr. 2 3 1 0.5000000 1 4
## 5 Altimari, Frank X. 4 2 1 0.5000000 1 4
## 6 Ambro, Thomas NA 3 NA 0.0000000 NA NA
## republican sons woman yearb gender_party any_girls
## 1 0 2 0 1925 M_Demo 1
## 2 0 2 0 1919 M_Demo 1
## 3 1 2 0 1907 M_Repub 0
## 4 1 1 0 1950 M_Repub 1
## 5 1 3 0 1928 M_Repub 1
## 6 0 NA 0 NA M_Demo 0
parents<- subset(judges,child>=1)
head(parents)
## name child circuit girls progressive_vote race religion
## 1 Alarcon, Arthur L. 3 9 1 0.0000000 3 4
## 2 Aldisert, Ruggero 3 3 1 0.6666667 1 4
## 3 Aldrich, Bailey 2 1 0 0.3333333 1 1
## 4 Alito, Samuel A., Jr. 2 3 1 0.5000000 1 4
## 5 Altimari, Frank X. 4 2 1 0.5000000 1 4
## 7 Anderson, R. Lanier, III 3 5 NA 0.2500000 1 2
## republican sons woman yearb gender_party any_girls
## 1 0 2 0 1925 M_Demo 1
## 2 0 2 0 1919 M_Demo 1
## 3 1 2 0 1907 M_Repub 0
## 4 1 1 0 1950 M_Repub 1
## 5 1 3 0 1928 M_Repub 1
## 7 0 NA 0 1936 M_Demo 0
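# Mean progressive-vote share by group (0 = no daughters, 1 = at least one daughter)
group_means <- tapply(parents$progressive_vote, parents$any_girls, mean)
group_means
## 0 1
## 0.3976401 0.4518024
# Difference in means between the two groups
ATE <- group_means["1"] - group_means["0"]
cat("Difference : ",ATE)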
## Difference : 0.05416231
No. We cannot conclude from this difference alone that having daughters influences a judge's decisions, because many other factors can influence how a judge votes. The difference is also small (about 0.054), so it should not be read as a causal effect without further analysis.
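To judge whether a difference of this size could be due to chance, the point estimate can be paired with a confidence interval. The sketch below uses a Welch two-sample t-test, which is an addition and not part of the original analysis; `diff_in_means` is a hypothetical helper and the toy vectors are made-up data, not the judges data. Even a statistically significant result would not by itself rule out the other factors mentioned above.

```r
# Difference in group means with a Welch t-test to quantify uncertainty.
# `diff_in_means` is a hypothetical helper; `outcome` is numeric and
# `treated` is a 0/1 indicator.
diff_in_means <- function(outcome, treated) {
  y1 <- outcome[treated == 1]
  y0 <- outcome[treated == 0]
  test <- t.test(y1, y0)  # var.equal = FALSE by default, i.e. Welch's test
  list(
    estimate = mean(y1, na.rm = TRUE) - mean(y0, na.rm = TRUE),
    conf_int = as.numeric(test$conf.int),  # 95% interval for mean(y1) - mean(y0)
    p_value  = test$p.value
  )
}

# Made-up toy data (NOT the judges data).
toy_outcome <- c(0.50, 0.40, 0.60, 0.45, 0.30, 0.35, 0.50, 0.40)
toy_treated <- c(1, 1, 1, 1, 0, 0, 0, 0)
diff_in_means(toy_outcome, toy_treated)
```

For this report the call would be `diff_in_means(parents$progressive_vote, parents$any_girls)`.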