Practice: 2.5, 2.7, 2.19, 2.29, 2.43

Graded: 2.6, 2.8, 2.20, 2.30, 2.38, 2.44

2.6 Dice rolls.
Answer:
1) 0, since the smallest possible sum of two dice is 2

2) The sum is 5 for 1+4 or 2+3, and each pair can occur in two orders.
P(sum = 5) = 2*(P(1)*P(4) + P(2)*P(3)) = 2*(1/6*1/6 + 1/6*1/6) = 4/36 = 1/9 = 0.11

3) The only combination that adds up to 12 is 6+6
P(6 and 6) = 1/6*1/6 = 1/36 = 0.028
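
The three dice answers can be checked by enumerating all 36 equally likely outcomes of two fair dice. This is a minimal sketch in base R; the names `sums` and `p_sum` are illustrative and not part of the original solution.

```r
# Enumerate all 36 equally likely (die1, die2) outcomes and their sums
sums <- outer(1:6, 1:6, "+")

# Probability that the two dice add up to a given total
p_sum <- function(total) mean(sums == total)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # (1,4), (4,1), (2,3), (3,2)
p_sum(12)  # only (6,6)
```
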
2.8 Poverty and language.
Answer:
  1. No, there are people who live below the poverty line and also speak a foreign language at home (4.2%), so the two events are not disjoint.

  2. 14.6 - 4.2 = 10.4% of Americans live below the poverty line and speak only English at home.

  3. 14.6 + 20.7 - 4.2 = 31.1% of Americans live below the poverty line or speak a foreign language at home. The 4.2% who do both are subtracted once so that they are not counted twice.

  4. 100 - (14.6 + 20.7 - 4.2) = 68.9% of Americans live above the poverty line and speak only English at home. (The 20.7 - 4.2 = 16.5% who speak a foreign language at home and live above the poverty line are excluded.)

  5. P(below poverty line) * P(foreign language) = (0.146 x 0.207) = 0.030

P(below poverty and foreign language) = 0.042

Since 0.042 is not equal to 0.030, the two events are not independent.
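
The general addition rule and the independence check used in this exercise can be sketched in a few lines of R. The variable names below are illustrative; the three percentages are the ones quoted in the answers.

```r
# Percentages quoted in the answers, written as proportions
p_poverty <- 0.146   # below the poverty line
p_foreign <- 0.207   # speak a foreign language at home
p_both    <- 0.042   # both

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either  <- p_poverty + p_foreign - p_both
p_neither <- 1 - p_either

# Independence would require P(A and B) = P(A) * P(B)
p_if_independent <- p_poverty * p_foreign

c(either = p_either, neither = p_neither,
  product = p_if_independent, observed = p_both)
```
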

library(VennDiagram)
## Loading required package: grid
## Loading required package: futile.logger
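# Venn diagram of the two groups; areas are percentages of Americans.
# ind = FALSE stops the function from drawing, so grid.draw() draws it once.
venn.plot <- draw.pairwise.venn(area1 = 14.6, area2 = 20.7, cross.area = 4.2,
                                category = c("Below poverty line", "Foreign language"),
                                ind = FALSE)

grid.draw(venn.plot)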

2.20 Assortative mating.
Answer:
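  1. 108 blue-eyed females + (114 - 78) = 36 blue-eyed males whose partner does not have blue eyes = 144 couples in which the male or the female has blue eyes

P(male or partner has blue eyes) = 144 / 204 = 0.71

  2. P(partner has blue eyes | male has blue eyes) = 78 / 114 = 0.68

  3. P(partner has blue eyes | male has brown eyes) = 19 / 54 = 0.35

P(partner has blue eyes | male has green eyes) = 11 / 36 = 0.31

  4. The conditional probabilities above differ sharply: a blue-eyed male has a blue-eyed partner with probability 0.68, compared with 0.35 and 0.31 for brown-eyed and green-eyed males. If eye colors were independent, all three would equal the overall share of blue-eyed partners, 108 / 204 = 0.53. Hence the eye colors of partners are not independent.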
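
The same probabilities can be reproduced from the counts quoted in the answers. This is a sketch only; the vector names are illustrative, and only the counts that appear in the answers are used.

```r
# Counts used in the answers (204 couples in total)
n_total      <- 204
male_totals  <- c(blue = 114, brown = 54, green = 36)  # males by eye color
partner_blue <- c(blue = 78,  brown = 19, green = 11)  # of those, partner has blue eyes
female_blue  <- 108                                    # all blue-eyed female partners

# P(male blue or partner blue) by the general addition rule
p_either_blue <- (male_totals[["blue"]] + female_blue - partner_blue[["blue"]]) / n_total

# P(partner blue | male eye color), one value per male eye color
p_partner_blue_given_male <- partner_blue / male_totals

# Marginal P(partner blue): every conditional would equal this under independence
p_partner_blue <- female_blue / n_total

round(p_either_blue, 3)
round(p_partner_blue_given_male, 3)
round(p_partner_blue, 3)
```
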
2.30 Books on a Bookshelf
Answer:
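  1. P(hardcover first) = 28/95 = 0.295
    P(paperback second) = 67/94 = 0.713

P(hardcover, then paperback) = (28/95) x (67/94) = 0.210

  2. P(fiction first) = 72/95 = 0.758
    P(hardcover second) = 28/94 = 0.298 (treating all 28 hardcovers as still on the shelf)

P(fiction, then hardcover) = (72/95) x (28/94) = 0.226

  3. P(fiction first) = 72/95 = 0.758
    P(hardcover second) = 28/95 = 0.295

P(fiction, then hardcover) = (72/95) x (28/95) = 0.223

  4. The two answers are nearly equal because removing one book out of 95 barely changes the second draw: the only difference is that the denominator for the hardcover draw is 94 instead of 95.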
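
Working with the unrounded fractions avoids small rounding differences in the products. The sketch below uses only the counts quoted in the answers; the variable names are illustrative. Like the answers, it treats all 28 hardcovers as still available for the second draw.

```r
n_books     <- 95
n_hardcover <- 28
n_paperback <- 67
n_fiction   <- 72

# Hardcover first, then a paperback, without replacement
p_hard_then_paper <- (n_hardcover / n_books) * (n_paperback / (n_books - 1))

# Fiction first, then hardcover, without replacement
# (assumption carried over from the answers: all 28 hardcovers remain on the shelf)
p_fic_then_hard_without <- (n_fiction / n_books) * (n_hardcover / (n_books - 1))

# Fiction first, then hardcover, with replacement
p_fic_then_hard_with <- (n_fiction / n_books) * (n_hardcover / n_books)

round(c(hard_then_paper = p_hard_then_paper,
        without_replacement = p_fic_then_hard_without,
        with_replacement = p_fic_then_hard_with), 3)
```
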
2.38 Baggage Fees
Answer:
  1. Total number of checked bags per 100 people = # with 1 checked bag + 2 x # with 2 checked bags = 34 + 24 = 58

Revenue per 100 passengers (fees of $0, $25 and $50 for 0, 1 and 2 bags):
Revenue from people with 0 bags = $0
Revenue from people with 1 bag = 34 * 25 = $850
Revenue from people with 2 bags = 12 * 50 = $600

Total revenue = $1450
Revenue per person (the mean) = $14.50

Variance terms (xi - mean)^2 * P(xi):
x1 = $0, P(x1) = 0.54; (0 - 14.50)^2 * 0.54 = 113.535
x2 = $25, P(x2) = 0.34; (25 - 14.50)^2 * 0.34 = 37.485
x3 = $50, P(x3) = 0.12; (50 - 14.50)^2 * 0.12 = 151.23

SD = sqrt(113.535 + 37.485 + 151.23) = sqrt(302.25) = 17.385

  sqrt(302.25)
## [1] 17.38534
  2. For 120 passengers the airline would expect around 120 * 14.50 = $1740
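
The mean and standard deviation can also be computed directly from the fee distribution instead of term by term. This is a minimal sketch that uses the fees and shares from the answer; the variable names are illustrative.

```r
# Fee per passenger and share of passengers, as used in the answer
fee  <- c(0, 25, 50)          # dollars for 0, 1, 2 checked bags
prob <- c(0.54, 0.34, 0.12)

mean_fee <- sum(fee * prob)                  # expected revenue per passenger
var_fee  <- sum((fee - mean_fee)^2 * prob)   # variance of revenue per passenger
sd_fee   <- sqrt(var_fee)

# Expected revenue for 120 passengers
flight_revenue <- 120 * mean_fee

c(mean = mean_fee, variance = var_fee, sd = sd_fee, flight = flight_revenue)
```
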
2.44 Income and Gender
Answer:
income <- c("$1 - $9,999 or loss", 
            "$10,000 to $14,999", 
            "$15,000 to $24,999",
            "$25,000 to $34,999",
            "$35,000 to $49,999",
            "$50,000 to $64,999",
            "$65,000 to $74,999",
            "$75,000 to $99,999",
            "$100,000 or more")

# Percent of all respondents in each income bracket
fr_total <- c( 2.2, 4.7,  15.8,18.3, 21.2, 13.9, 5.8, 8.4, 9.7)

income_df <- data.frame(income, fr_total)
income_df
##                income fr_total
## 1 $1 - $9,999 or loss      2.2
## 2  $10,000 to $14,999      4.7
## 3  $15,000 to $24,999     15.8
## 4  $25,000 to $34,999     18.3
## 5  $35,000 to $49,999     21.2
## 6  $50,000 to $64,999     13.9
## 7  $65,000 to $74,999      5.8
## 8  $75,000 to $99,999      8.4
## 9    $100,000 or more      9.7
# names.arg labels each bar with its bracket; las = 2 rotates the labels
barplot(income_df$fr_total, names.arg = as.character(income_df$income),
        las = 2, cex.names = 0.6, ylab = "Percent of respondents")

fr_less_than_fifty  <- ( 2.2 + 4.7 + 15.8 + 18.3 + 21.2)
fr_less_than_fifty
## [1] 62.2
men <- (96420486 * 59 )/ 100
women <- (96420486 * 41) / 100
men
## [1] 56888087
women
## [1] 39532399
total_lt_fifty <- (96420486 * fr_less_than_fifty) / 100
total_lt_fifty
## [1] 59973542
f_lt_fifty <- (total_lt_fifty * 41)/100
f_lt_fifty
## [1] 24589152
(62.2 * 41)/100
## [1] 25.502
f2_lt_fifty <- (women * 71.8) / 100
  1. The distribution here is right skewed

  2. fr_less_than_fifty = (2.2 + 4.7 + 15.8 + 18.3 + 21.2) = 62.2%

  3. Assuming that the ratio of males to females is the same across all income categories (41% female), that is, that income and gender are independent,
    P(female and income less than $50K) = 0.622 * 0.41 = 0.255, or 25.5%

  4. If 71.8% of women make less than $50K, that is 0.718 * 39,532,399 = 28,384,263 women. The assumption made in part 3 gives only 24,589,152 women in this category, which is 62.2% of all women rather than 71.8%. Hence the assumption is incorrect: income and gender are not independent.
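
The comparison in the last part can be stated in terms of joint probabilities rather than head counts. The sketch below uses the three percentages from the answers; the variable names are illustrative.

```r
p_lt_50k         <- 0.622   # share of all respondents earning under $50,000
p_female         <- 0.41    # share of respondents who are female
p_lt_50k_given_f <- 0.718   # share of females earning under $50,000

# Joint probability if income and gender were independent
p_joint_independent <- p_lt_50k * p_female

# Joint probability implied by the reported conditional share
p_joint_actual <- p_lt_50k_given_f * p_female

round(c(independent = p_joint_independent, actual = p_joint_actual), 3)
```

The two values differ, which is the same conclusion as the count comparison: the independence assumption does not hold.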