CPP 524
Kidist Gondel
Packages
library( dplyr )
library( pander )
library( ggplot2 )
Data
# load lab data
URL <- "https://github.com/DS4PS/cpp-524-sum-2020/blob/master/labs/data/female-np-entrepreneurs.rds?raw=true"
dat <- readRDS(gzcon(url( URL )))
head( dat )
## gender age income edu.level years.prof.exp experience.np.create
## 1 Female 54 79669 Graduate 11-15 No
## 2 Female 62 63474 Graduate 15+ No
## 3 Female 70 27887 Graduate 15+ Yes
## 4 Male 63 63474 Graduate 15+ Yes
## 5 Female 60 170832 Graduate 15+ Yes
## 6 Female 41 69531 Graduate 6-10 Yes
## experience.np.form experience.np.other take.on.debt seed.funding
## 1 No Yes $0 No
## 2 Yes Yes $0 No
## 3 Yes Yes $0 No
## 4 No Yes $0 No
## 5 Yes Yes $0 Yes
## 6 No No $0 Yes
## most.imp.fund.source
## 1 Donations
## 2 Gov Grant
## 3 Donations
## 4 Donations
## 5 Corp Grant
## 6 Gov Grant
Compare education levels of male and female entrepreneurs.
levels(dat$edu.level)
## [1] "None" "High School" "Some College" "Bachelor" "Graduate"
t <- table( dat$edu.level, dat$gender )
t %>% prop.table( margin=1 ) %>% round(2) %>% pander()
|   | Female | Male |
|---|---|---|
| None | 0.47 | 0.53 |
| High School | 0.4 | 0.6 |
| Some College | 0.57 | 0.43 |
| Bachelor | 0.6 | 0.4 |
| Graduate | 0.52 | 0.48 |
summary(t)
## Number of cases in table: 554
## Number of factors: 2
## Test for independence of all factors:
## Chisq = 4.383, df = 4, p-value = 0.3566
## Chi-squared approximation may be incorrect
chisq.test(t)
##
## Pearson's Chi-squared test
##
## data: t
## X-squared = 4.3831, df = 4, p-value = 0.3566
chisq.test(t, simulate.p.value = TRUE, B=10000)
##
## Pearson's Chi-squared test with simulated p-value (based on 10000
## replicates)
##
## data: t
## X-squared = 4.3831, df = NA, p-value = 0.3636
ANSWER:
The chi-squared test gives p = 0.3566 (simulated p = 0.3636), so there is not enough statistical evidence to conclude that education levels differ between female and male entrepreneurs.
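The summary(t) output above carries the warning "Chi-squared approximation may be incorrect", which R raises when some expected cell counts fall below 5. The sketch below shows one way to inspect expected counts. It uses made-up counts (the `toy` matrix is hypothetical, not the lab data), so only the mechanics carry over.

```r
# Hypothetical counts for illustration only (NOT the lab data):
# rows are education levels, columns are gender.
toy <- matrix(
  c(  3,   4,
     10,  15,
     40,  30,
     60,  40,
    150, 140),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    edu    = c("None", "High School", "Some College", "Bachelor", "Graduate"),
    gender = c("Female", "Male")
  )
)

# Expected counts under independence: row total * column total / grand total
expected <- suppressWarnings(chisq.test(toy))$expected
round(expected, 1)

# Flag cells where the large-sample approximation is questionable
small_cells <- expected < 5
any(small_cells)
```

When small expected counts appear, the simulated p-value (simulate.p.value = TRUE), as already used in this lab, is a reasonable cross-check because it does not rely on the chi-squared approximation.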
Compare work experience for male and female entrepreneurs.
levels(dat$years.prof.exp)
## [1] "0" "1-2" "3-5" "6-10" "11-15" "15+"
t2 <- table( dat$years.prof.exp, dat$gender )
t2 %>% prop.table( margin=1 ) %>% round(2) %>% pander()
|   | Female | Male |
|---|---|---|
| 0 | 0.73 | 0.27 |
| 1-2 | 0.67 | 0.33 |
| 3-5 | 0.62 | 0.38 |
| 6-10 | 0.57 | 0.43 |
| 11-15 | 0.58 | 0.42 |
| 15+ | 0.53 | 0.47 |
chisq.test(t2)
##
## Pearson's Chi-squared test
##
## data: t2
## X-squared = 4.0086, df = 5, p-value = 0.5482
chisq.test(t2, simulate.p.value = TRUE, B=10000)
##
## Pearson's Chi-squared test with simulated p-value (based on 10000
## replicates)
##
## data: t2
## X-squared = 4.0086, df = NA, p-value = 0.5671
bon_alpha2<- 0.05/6
bon_alpha2
## [1] 0.008333333
ANSWER:
The chi-squared test gives p = 0.5482 (simulated p = 0.5671), so there is not enough statistical evidence to conclude that work experience differs between male and female entrepreneurs.
Compare success in accessing seed funding for male and female entrepreneurs.
levels(dat$seed.funding)
## [1] "No" "Yes"
t3 <- table( dat$seed.funding, dat$gender)
t3 %>% prop.table( margin=1 ) %>% round(2) %>% pander()
|   | Female | Male |
|---|---|---|
| No | 0.55 | 0.45 |
| Yes | 0.54 | 0.46 |
chisq.test(t3)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: t3
## X-squared = 0.0086147, df = 1, p-value = 0.9261
chisq.test(t3, simulate.p.value = TRUE, B=10000)
##
## Pearson's Chi-squared test with simulated p-value (based on 10000
## replicates)
##
## data: t3
## X-squared = 0.033448, df = NA, p-value = 0.8562
bon_alpha3<-0.05/2
bon_alpha3
## [1] 0.025
ANSWER:
The chi-squared test gives p = 0.9261 (simulated p = 0.8562), so there is not enough statistical evidence to conclude that success in accessing seed funding differs between male and female entrepreneurs.
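The two chisq.test() calls for seed funding report different X-squared values (0.0086147 and 0.033448) for the same table. The likely reason is that chisq.test() applies Yates' continuity correction by default to 2x2 tables, but skips it when simulate.p.value = TRUE. A minimal sketch on made-up counts (the `toy` matrix is hypothetical, not the lab data):

```r
# Hypothetical 2x2 counts for illustration only (NOT the lab data)
toy <- matrix(c(125,  95,
                175, 155),
              nrow = 2, byrow = TRUE,
              dimnames = list(seed   = c("No", "Yes"),
                              gender = c("Female", "Male")))

# Default for 2x2 tables: Yates' continuity correction is applied
corrected <- chisq.test(toy)$statistic

# Plain Pearson statistic, no continuity correction
uncorrected <- chisq.test(toy, correct = FALSE)$statistic

# The simulated p-value is random, but the statistic itself is not
set.seed(1)
simulated <- chisq.test(toy, simulate.p.value = TRUE, B = 2000)$statistic

c(corrected   = unname(corrected),
  uncorrected = unname(uncorrected),
  simulated   = unname(simulated))
```

The simulated run should match the uncorrected statistic, while the default run is smaller. Either way, the conclusion for this contrast does not change.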
Compare the willingness to take on personal debt for male and female entrepreneurs.
levels(dat$take.on.debt)
## [1] "$0" "$0k-$10k" "$10k-$25k" "$25k-$50k" "$50k+"
t4 <- table( dat$take.on.debt, dat$gender )
t4 %>% prop.table( margin=1 ) %>% round(2) %>% pander()
|   | Female | Male |
|---|---|---|
| $0 | 0.57 | 0.43 |
| $0k-$10k | 0.6 | 0.4 |
| $10k-$25k | 0.39 | 0.61 |
| $25k-$50k | 0.47 | 0.53 |
| $50k+ | 0.36 | 0.64 |
chisq.test(t4)
##
## Pearson's Chi-squared test
##
## data: t4
## X-squared = 8.6158, df = 4, p-value = 0.07145
chisq.test(t4, simulate.p.value = TRUE, B=10000)
##
## Pearson's Chi-squared test with simulated p-value (based on 10000
## replicates)
##
## data: t4
## X-squared = 8.6158, df = NA, p-value = 0.06499
bon_alpha4<-0.05/5
bon_alpha4
## [1] 0.01
ANSWER:
The chi-squared test gives p = 0.07145 (simulated p = 0.06499), so there is not enough statistical evidence to conclude that willingness to take on personal debt differs between male and female entrepreneurs.
Compare sources of first year funding for male and female entrepreneurs.
levels(dat$most.imp.fund.source)
## [1] "Donations" "Founder" "Earned Revenues" "Foundation Grant"
## [5] "Gov Grant" "Member Fees" "Parent Org" "Angel"
## [9] "Corp Grant"
t5 <- table( dat$most.imp.fund.source, dat$gender )
t5 %>% prop.table( margin=1 ) %>% round(2) %>% pander()
|   | Female | Male |
|---|---|---|
| Donations | 0.51 | 0.49 |
| Founder | 0.56 | 0.44 |
| Earned Revenues | 0.66 | 0.34 |
| Foundation Grant | 0.59 | 0.41 |
| Gov Grant | 0.59 | 0.41 |
| Member Fees | 0.46 | 0.54 |
| Parent Org | 0.5 | 0.5 |
| Angel | 0.52 | 0.48 |
| Corp Grant | 0.67 | 0.33 |
chisq.test(t5)
##
## Pearson's Chi-squared test
##
## data: t5
## X-squared = 8.8304, df = 8, p-value = 0.3568
chisq.test(t5, simulate.p.value = TRUE, B=10000)
##
## Pearson's Chi-squared test with simulated p-value (based on 10000
## replicates)
##
## data: t5
## X-squared = 8.8304, df = NA, p-value = 0.3681
bon_alpha5<-0.05/9
bon_alpha5
## [1] 0.005555556
ANSWER:
The chi-squared test gives p = 0.3568 (simulated p = 0.3681), so there is not enough statistical evidence to conclude that sources of first-year funding differ between male and female entrepreneurs.
Compare age at the time of nonprofit formation for male and female entrepreneurs.
is.numeric(dat$age)
## [1] TRUE
tapply(dat$age, dat$gender, summary)
## $Female
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 22.00 44.00 52.00 51.94 59.50 85.00 15
##
## $Male
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 29.00 45.25 57.00 55.05 64.75 82.00 5
boxplot(age~gender, data=dat, col=c("pink", "light green"))
t.test(age~gender, data=dat)
##
## Welch Two Sample t-test
##
## data: age by gender
## t = -3.1749, df = 589.02, p-value = 0.001577
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -5.031881 -1.185709
## sample estimates:
## mean in group Female mean in group Male
## 51.93948 55.04828
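ANSWER:
The Welch t-test gives p = 0.001577, so there IS enough statistical evidence to conclude that age at the time of nonprofit formation differs between male and female entrepreneurs. Male founders are about 3.1 years older on average (55.05 vs. 51.94).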
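As a quick sanity check on the t.test() output for age, the reported confidence interval can be compared with the group means. The sketch below only re-uses numbers printed above; the variable names are illustrative.

```r
# Values copied from the t.test() output for age
mean_female <- 51.93948
mean_male   <- 55.04828
ci          <- c(-5.031881, -1.185709)

# Difference in means, Female minus Male (same direction as the output)
diff_means <- mean_female - mean_male

# A t-based interval is symmetric, so its midpoint should equal the estimate
ci_mid <- mean(ci)

diff_means
ci_mid

# The interval excluding zero agrees with a p-value below 0.05
excludes_zero <- ci[1] > 0 || ci[2] < 0
excludes_zero
```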
Compare income levels prior to starting the nonprofit for male and female entrepreneurs.
is.numeric(dat$income)
## [1] TRUE
tapply(dat$income, dat$gender, summary)
## $Female
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 81 33139 61112 67742 86094 199147
##
## $Male
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1026 49437 69351 80883 107363 199684
boxplot(income~gender, data=dat, col=c("pink", "light green"))
t.test(income~gender, data=dat)
##
## Welch Two Sample t-test
##
## data: income by gender
## t = -3.6353, df = 630.22, p-value = 0.0003003
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
## -20239.710 -6042.518
## sample estimates:
## mean in group Female mean in group Male
## 67741.83 80882.95
ANSWER:
The Welch t-test gives p = 0.0003003, so there IS enough statistical evidence to conclude that prior income differs between male and female entrepreneurs. Male founders report higher mean income (80,883 vs. 67,742).
Based upon these seven contrasts, would you conclude that the resources male and female nonprofit entrepreneurs have at the time of founding were equivalent?
Q8-A:
What is the adjusted decision criteria used for contrasts to maintain an alpha of 0.05 for the omnibus test of group equivalence?
ANSWER
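In an omnibus test of group equivalence, equivalence is rejected if any single contrast shows a significant difference. To keep the overall alpha at 0.05, the Bonferroni correction divides alpha by the number of contrasts, and each contrast's p-value is compared to that adjusted alpha. With seven contrasts, the decision criterion is 0.05 / 7 ≈ 0.00714.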
Q8-B:
What is the lowest p-value you observed across the seven contrasts?
ANSWER:
The lowest p-value is 0.0003003, from the t-test comparing prior income.
Q8-C:
Can we claim study group equivalency? Why or why not?
bon_alpha<-0.05/7
bon_alpha
## [1] 0.007142857
ANSWER:
The smallest p-value, 0.0003003, is well below the Bonferroni-corrected alpha of 0.007143 (the age contrast, p = 0.001577, also falls below it). Because at least one contrast shows a significant difference, we CANNOT claim study group equivalence.
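The omnibus decision can be laid out in one place by collecting the seven p-values reported above (the asymptotic chi-squared and Welch t-test values) and comparing each to the corrected alpha. This is a minimal sketch; the vector names are illustrative labels, not variables from the lab data.

```r
# p-values copied from the seven contrasts reported above
p_values <- c(
  education  = 0.3566,
  experience = 0.5482,
  seed_fund  = 0.9261,
  debt       = 0.07145,
  fund_src   = 0.3568,
  age        = 0.001577,
  income     = 0.0003003
)

alpha     <- 0.05
bon_alpha <- alpha / length(p_values)   # 0.05 / 7

# Contrasts that fall below the corrected threshold
significant <- p_values < bon_alpha
names(p_values)[significant]

# Equivalent view: inflate the p-values instead of shrinking alpha
p_adjusted <- p.adjust(p_values, method = "bonferroni")
all((p_adjusted < alpha) == significant)
```

Both views flag the same contrasts (age and income), so the equivalence claim fails under either formulation.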