Graded: 2.6, 2.8, 2.20, 2.30, 2.38, 2.44
poverty <- 14.6
ESL <- 20.7
both <- 4.2
library(VennDiagram)
## Loading required package: grid
## Loading required package: futile.logger
venn.plot <- draw.pairwise.venn(14.6, 20.7, 4.2, c("Poverty", "Not English"))
grid.draw(venn.plot)
grid.newpage();
p_a = 0.207        # P(A): ESL ("Not English")
p_b = 0.146        # P(B): poverty
p_a_and_b = 0.042  # P(A and B)
p_a_given_b = p_a_and_b/p_b
p_a_given_b
## [1] 0.2876712
p_a
## [1] 0.207
\[ P(A|B) \neq P(A) \]
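For reference, here is the same comparison with the numbers from this problem filled in, taking A as "Not English" (20.7%) and B as "Poverty" (14.6%) to match p_a and p_b in the code, with every value written as a proportion:

\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.042}{0.146} \approx 0.288 \neq 0.207 = P(A) \]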
df <- data.frame(matrix(c(78,19,11,108, 23,23,9,55,13,12,16,41,114,54,36,204),nrow=4,ncol=4))
rownames(df) <- c("mblue", "mbrown", "mgreen", "totals")
colnames(df) <- c("fblue", "fbrown", "fgreen", "totals")
df
##        fblue fbrown fgreen totals
## mblue     78     23     13    114
## mbrown    19     23     12     54
## mgreen    11      9     16     36
## totals   108     55     41    204
mb = 114
fb = 108
bfm = 78
answer = (mb+ fb-bfm)/204
answer
## [1] 0.7058824
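The calculation above is the general addition rule. Writing M for "male partner has blue eyes" and F for "female partner has blue eyes" (labels introduced here for illustration), and reading the counts from the table:

\[ P(M \cup F) = P(M) + P(F) - P(M \cap F) = \frac{114 + 108 - 78}{204} = \frac{144}{204} \approx 0.706 \]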
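p_a = fb/204          # P(A): female partner has blue eyes
p_b = mb/204          # P(B): male partner has blue eyes
p_a_and_b = bfm/204   # P(A and B): both partners have blue eyes
p_a_given_b = p_a_and_b/p_b
p_a_given_b
## [1] 0.6842105
p_a == p_a_given_b
## [1] FALSE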
pfb = 108/204
pmbr = 54/204
fbmbr = 19/204
a = pfb
b = pmbr
ab = fbmbr
pagb = ab/b
pagb
## [1] 0.3518519
a == pagb
## [1] FALSE
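The two independence checks above follow the same pattern, so they could be written once against the table of counts. The following is a minimal sketch, not the method used in the original answers: the matrix `eyes` and the helper names `p_female_given_male` and `p_female` are hypothetical and introduced only for this illustration.

```r
# Counts copied from the eye-colour table above
# (rows: male partner, columns: female partner), filled column by column.
eyes <- matrix(c(78, 19, 11,
                 23, 23,  9,
                 13, 12, 16),
               nrow = 3,
               dimnames = list(male   = c("blue", "brown", "green"),
                               female = c("blue", "brown", "green")))

# Hypothetical helper: P(female colour | male colour), from one row of counts.
p_female_given_male <- function(tab, female, male) {
  tab[male, female] / sum(tab[male, ])
}

# Hypothetical helper: marginal P(female colour), for comparison.
p_female <- function(tab, female) {
  sum(tab[, female]) / sum(tab)
}

p_female_given_male(eyes, "blue", "blue")   # 78 / 114
p_female_given_male(eyes, "blue", "brown")  # 19 / 54
p_female(eyes, "blue")                      # 108 / 204
```

If eye colours of partners were independent, each conditional value would match the marginal one. Here neither does.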
df <- data.frame(matrix(c(13,15,28,59,8,67,72,23,95),nrow=3,ncol=3))
rownames(df) <- c("fiction", "non-fiction", "totals")
colnames(df) <- c("hardcover", "paperback", "totals")
df
##             hardcover paperback totals
## fiction            13        59     72
## non-fiction        15         8     23
## totals             28        67     95
(28/95)*(67/94)
## [1] 0.2100784
fiction_and_hardcover = 13/95              # first book: fiction hardcover
fiction_not_hardcover = 59/95              # first book: fiction paperback
second_hardcover_given_fh = 27/94          # one hardcover already drawn
second_hardcover_given_fp = 28/94          # all 28 hardcovers remain
h_after_fh = second_hardcover_given_fh*fiction_and_hardcover
h_after_fp = second_hardcover_given_fp*fiction_not_hardcover
answer = h_after_fp+h_after_fh
answer
## [1] 0.2243001
(72/95)*(28/95)
## [1] 0.2233795
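Written out, the without-replacement calculation above applies the law of total probability, splitting on whether the first (fiction) book drawn was a hardcover or a paperback. The labels are introduced here for illustration: F_1 means the first book is fiction, and H_2 means the second book is a hardcover.

\[ P(F_1 \cap H_2) = \frac{13}{95}\cdot\frac{27}{94} + \frac{59}{95}\cdot\frac{28}{94} = \frac{351 + 1652}{8930} = \frac{2003}{8930} \approx 0.2243 \]

The with-replacement value, (72/95)(28/95) ≈ 0.2234, is very close to this because removing a single book out of 95 barely changes the proportions.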
We know that revenue is $25 for the first bag and $35 for the second (making revenue $60 for each person who checks 2 bags). We also know that 54% of passengers check no bags, 34% check one, and 12% check two. The revenue per passenger is therefore distributed as follows:
df <- data.frame(matrix(c(0,25,60,.54,.34,.12),nrow=3,ncol=2))
colnames(df) <- c("x","P(x)")
df
##    x P(x)
## 1  0 0.54
## 2 25 0.34
## 3 60 0.12
The expected value is
expected = 0*.54 + 25*.34+.12*60
expected
## [1] 15.7
Likewise, the variance is the probability-weighted sum of squared deviations from the expected value, and the standard deviation is its square root:
prices = c(0,25,60)
probs = c(.54,.34,.12)
vec = prices - expected
squares = vec * vec
squares %*% probs
##        [,1]
## [1,] 398.01
(squares %*% probs)^.5
##          [,1]
## [1,] 19.95019
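The same quantities can also be computed with plain vectorised sums instead of a matrix product. This is a minimal sketch that assumes only the revenue values and probabilities stated above; the variable names `revenue`, `prob`, `mu`, `sigma2`, and `sigma` are chosen here for illustration.

```r
# Revenue per passenger (dollars) and the share of passengers in each case,
# as given in the text above.
revenue <- c(0, 25, 60)
prob    <- c(0.54, 0.34, 0.12)

# Sanity check: the probabilities should sum to 1.
stopifnot(isTRUE(all.equal(sum(prob), 1)))

mu     <- sum(revenue * prob)            # expected value E[X]
sigma2 <- sum((revenue - mu)^2 * prob)   # variance Var(X)
sigma  <- sqrt(sigma2)                   # standard deviation SD(X)

c(mean = mu, variance = sigma2, sd = sigma)
```

Using `sum()` returns ordinary numbers rather than 1x1 matrices, which is usually easier to reuse in later calculations.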
df <- data.frame(matrix(c(9999, 14999, 24999, 34999, 49999, 64999, 74999, 99999, 100000,.022,.047,.158,.183,.212,.139,.058,.084,.097),nrow=9,ncol=2))
colnames(df) <- c("income up to","total")
df
##   income up to total
## 1         9999 0.022
## 2        14999 0.047
## 3        24999 0.158
## 4        34999 0.183
## 5        49999 0.212
## 6        64999 0.139
## 7        74999 0.058
## 8        99999 0.084
## 9       100000 0.097
barplot(df$total)
Personal income is concentrated in the $35,000 to $49,999 bracket (midpoint near $43k), which is the largest single group at 21.2%, while a sizeable share of people (18.1%) make $75,000 or more. In fact, 62.2% of people make less than $50,000.
sum(df$total[1:5])
## [1] 0.622
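A running total makes it possible to read off the share below any bracket limit, not only $50,000. This is a minimal sketch using the shares from the table above. It assumes that the last row is an open-ended bracket ($100,000 or more), which is consistent with the shares summing to 1, so its upper limit is written as `Inf`. The names `upper`, `share`, and `cum_share` are introduced here for illustration.

```r
# Upper bracket limits and shares copied from the table above.
# Assumption: the last bracket is open-ended, so its limit is Inf.
upper <- c(9999, 14999, 24999, 34999, 49999, 64999, 74999, 99999, Inf)
share <- c(.022, .047, .158, .183, .212, .139, .058, .084, .097)

# Cumulative share of people at or below each bracket limit.
cum_share <- cumsum(share)
data.frame(upper, share, cum_share)

# Share of people making less than $50,000.
cum_share[upper == 49999]
```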
If we assume that gender is independent of income, then the probability that a woman makes less than $50,000 is the same as the probability for the total population, 0.622. Taking the population to be 1/2 female, the probability that a randomly chosen person is a woman who makes less than $50,000 is the product of the two:
.622 * .5
## [1] 0.311
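However, since we know that the probability of a person being a woman given that they make less than $50,000 a year is 71.8%, these events are not independent, because \[ P(A) \neq P(A|B) \] where A is the event that a person is a woman and B is the event that they make less than $50,000 a year.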