Dice rolls. (3.6, p. 92) If you roll a pair of fair dice, what is the probability of
The probability = P(1) = 0 as the minimum sum we can get is 1+1=2.
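The probability = P(5) = P(1+4 OR 2+3) = 2/6 * 1/6 + 2/6 * 1/6 = 4/36 = 1/9 = approximately 0.11 (for each pair, the first die can show either number, and the second die must then show the other).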
The probability = P(12) = P(6+6) = 1/6 * 1/6 = 1/36 = approximately 0.028
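The three answers above can be checked by brute force. The following is a minimal sketch that enumerates all 36 ordered outcomes of two fair dice; the names `rolls` and `p_sum` are illustrative and not part of the original solution.

```r
# Enumerate all 36 equally likely ordered outcomes of two fair dice.
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Probability of a given sum = share of the 36 outcomes with that total.
# (p_sum is a hypothetical helper introduced for this check.)
p_sum <- function(s) mean(rolls$total == s)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # outcomes (1,4), (2,3), (3,2), (4,1)
p_sum(12)  # only outcome (6,6)
```

Counting ordered outcomes makes it explicit why a sum of 5 has four favorable cases rather than two.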
Poverty and language. (3.8, p. 93) The American Community Survey is an ongoing survey that provides data every year to give communities the current information they need to plan investments and services. The 2010 American Community Survey estimates that 14.6% of Americans live below the poverty line, 20.7% speak a language other than English (foreign language) at home, and 4.2% fall into both categories.
Because 4.2% of Americans both live below the poverty line and speak a foreign language at home, the two events are not disjoint.
library(VennDiagram)
## Loading required package: grid
## Loading required package: futile.logger
both <- 4.2/100
poverty <- 14.6/100
foreignlanguage <- 20.7/100
venn.plot <- draw.pairwise.venn(area1 = poverty, area2 = foreignlanguage, cross.area = both,
c("Poverty", "Speaks Foreign Language at Home"), scaled = TRUE,
col = c("yellow", "lightblue"), fill = c("yellow", "lightblue"),
cat.cex = 1, cat.dist = -0.11)
grid.draw(venn.plot)
Americans live below the poverty line and only speak English at home
= P(Poverty and English)
= P(Poverty and ForeignLanguage’)
= 14.6% - 4.2%
= 10.4%
Americans live below the poverty line or speak a foreign language at home
= P(Poverty U ForeignLanguage)
= P(Poverty) + P(ForeignLanguage) - P(both)
= 14.6% + 20.7% - 4.2%
= 31.1%
Americans live above the poverty line and only speak English at home
= P(Poverty’ and English)
= P(Poverty’ and ForeignLanguage’)
= P(Poverty U ForeignLanguage)’
= 1 - P(Poverty U ForeignLanguage)
= 1 - 31.1%
= 68.9%
P(Poverty and ForeignLanguage) = 4.2%
P(Poverty) * P(ForeignLanguage)
= 0.146 * 0.207
= 3.02%, which does not equal P(Poverty and ForeignLanguage) = 4.2%
Therefore, they are not independent.
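The arithmetic for this exercise can be reproduced in a few lines. This is a sketch that uses only the three percentages quoted in the question; the variable names are illustrative.

```r
# Percentages from the question, expressed as probabilities.
p_poverty <- 0.146
p_foreign <- 0.207
p_both    <- 0.042

# Below the poverty line and English only: remove the overlap from poverty.
p_poverty_english <- p_poverty - p_both

# General addition rule for "poverty or foreign language".
p_either <- p_poverty + p_foreign - p_both

# Complement of the union: above the poverty line and English only.
p_neither <- 1 - p_either

# Under independence the joint probability would equal this product.
p_if_independent <- p_poverty * p_foreign

round(c(p_poverty_english, p_either, p_neither, p_if_independent), 4)
isTRUE(all.equal(p_if_independent, p_both))  # FALSE, so not independent
```

Comparing the product with the observed overlap is the same independence check as in the hand calculation.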
Assortative mating. (3.18, p. 111) Assortative mating is a nonrandom mating pattern where individuals with similar genotypes and/or phenotypes mate with one another more frequently than what would be expected under a random mating pattern. Researchers studying this topic collected data on eye colors of 204 Scandinavian men and their female partners. The table below summarizes the results. For simplicity, we only include heterosexual relationships in this exercise.
P(male = blue U female = blue)
= P(male = blue) + P(female = blue) - P(both = blue)
= 114/204 + 108/204 - 78/204
= 144/204
= 70.59%
P(female = blue | male = blue)
= P(both = blue) / P(male = blue)
= (78/204) / (114/204)
= 78/114
= 68.42%
P(female = blue | male = brown)
= P(f=blue and m=brown) / P(m=brown)
= (19/204) / (54/204)
= 19/54
= 35.19%
P(female = blue | male = green)
= P(f=blue and m=green) / P(m=green)
= (11/204) / (36/204)
= 11/36
= 30.56%
We have P(female = blue | male = blue) = 68.42%, which does not equal P(female = blue) = 108/204 = 52.94%.
Therefore, the eye colors of males and their female partners are dependent.
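The conditional probabilities above can be computed together. The sketch below uses only the counts that appear in the calculations in this section (the full table is not reproduced here); the vector names are illustrative.

```r
# Counts taken from the calculations above.
n_total <- 204
n_male  <- c(blue = 114, brown = 54, green = 36)          # men by eye color
n_female_blue_by_male <- c(blue = 78, brown = 19, green = 11)
n_female_blue <- sum(n_female_blue_by_male)                # 108 women with blue eyes

# P(male = blue or female = blue) by the general addition rule.
p_union <- (n_male["blue"] + n_female_blue - n_female_blue_by_male["blue"]) / n_total

# P(female = blue | male = each color): joint count over the male total.
p_female_blue_given_male <- n_female_blue_by_male / n_male

# Marginal P(female = blue), for the independence comparison.
p_female_blue <- n_female_blue / n_total

round(100 * p_union, 2)
round(100 * p_female_blue_given_male, 2)
round(100 * p_female_blue, 2)
```

If eye colors were independent, all three conditional probabilities would be close to the marginal value.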
Books on a bookshelf. (3.26, p. 114) The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
P(hardcover then paperback fiction w/o replacement)
= 28/95 * 59/94
= 18.50%
There are two cases, because the probability for the second book depends on whether the first fiction book is a hardcover.
In the first case, we draw a hardcover fiction book, then a hardcover book.
In the second case, we draw a paperback fiction book, then a hardcover book.
P(fiction then hardcover w/o replacement)
= P(fiction hardcover then hardcover w/o replacement) + P(fiction paperback then hardcover w/o replacement)
= 13/95 * 27/94 + 59/95 * 28/94
= 22.43%
P(fiction then hardcover w/ replacement)
= 72/95 * 28/95
= 22.34%
First, the bookcase holds 95 books, so removing a single book changes the remaining proportions only slightly.
Second, the two cases in part (b) pull in opposite directions. When the first fiction book is a hardcover, the probability that the second book is a hardcover drops to 27/94, below the 28/95 in part (c); when it is a paperback, the probability rises to 28/94. The two effects nearly cancel, so the results of parts (b) and (c) differ by only 0.09 percentage points.
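The three bookshelf probabilities can be recomputed side by side. This sketch uses only the counts that appear in the calculations above; the variable names are illustrative.

```r
# Counts taken from the calculations above.
n_books         <- 95
n_hardcover     <- 28
n_fiction_hard  <- 13
n_fiction_paper <- 59
n_fiction       <- n_fiction_hard + n_fiction_paper   # 72

# (a) hardcover first, then paperback fiction, without replacement.
p_a <- n_hardcover / n_books * n_fiction_paper / (n_books - 1)

# (b) fiction first, then hardcover, without replacement:
#     split on whether the first fiction book was itself a hardcover.
p_b <- n_fiction_hard  / n_books * (n_hardcover - 1) / (n_books - 1) +
       n_fiction_paper / n_books *  n_hardcover      / (n_books - 1)

# (c) fiction first, then hardcover, with replacement.
p_c <- n_fiction / n_books * n_hardcover / n_books

round(100 * c(a = p_a, b = p_b, c = p_c), 2)
round(100 * (p_b - p_c), 2)   # gap between (b) and (c), in percentage points
```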
Baggage fees. (3.34, p. 124) An airline charges the following baggage fees: $25 for the first bag and $35 for the second. Suppose 54% of passengers have no checked luggage, 34% have one piece of checked luggage and 12% have two pieces. We suppose a negligible portion of people check more than two bags.
Note: the standard deviation of the fee is sd = sqrt(sum of p_i * (fee_i - mean)^2), where the mean is sum of p_i * fee_i.
By calculation, the average revenue is $15.70 per passenger and the corresponding standard deviation is $19.95.
fees <- c(0, 25, 25+35)
p <- c(0.54, 0.34, 0.12)
model <- data.frame(fees, p)
model$weighted <- (model$fees * model$p)
model
## [1] 15.7
## [1] 19.95019
With 120 passengers, the expected revenue is 120 * $15.70 = $1,884. Assuming the passengers' baggage choices are independent, the standard deviation is sqrt(120) * $19.95 = $218.54.
## [1] 1884
## [1] 218.5434
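The weighted mean and standard deviation described in the note, and their scaling to 120 passengers, can be written out as follows. This is a sketch, not necessarily the code that produced the output above; it assumes passengers check bags independently, and the names `mean_fee`, `sd_fee`, `total_mean` and `total_sd` are illustrative.

```r
# Fee paid for 0, 1, or 2 checked bags, and the share of passengers in each group.
fees <- c(0, 25, 25 + 35)
p    <- c(0.54, 0.34, 0.12)

# Per-passenger expected revenue: sum of p_i * fee_i.
mean_fee <- sum(p * fees)

# Per-passenger standard deviation: sqrt(sum of p_i * (fee_i - mean)^2).
sd_fee <- sqrt(sum(p * (fees - mean_fee)^2))

# Totals for n passengers, ASSUMING independent passengers:
# means add, and variances add, so the sd scales with sqrt(n).
n <- 120
total_mean <- n * mean_fee
total_sd   <- sqrt(n) * sd_fee

c(mean_fee, sd_fee, total_mean, total_sd)
```

The standard deviation grows with sqrt(n) rather than n because it is the variances, not the standard deviations, that add across independent passengers.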
Income and gender. (3.38, p. 128) The relative frequency table below displays the distribution of annual total personal income (in 2009 inflation-adjusted dollars) for a representative sample of 96,420,486 Americans. These data come from the American Community Survey for 2005-2009. This sample is comprised of 59% males and 41% females.
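The distribution is bimodal: the largest peak is the $35,000 to $49,999 bracket, and a second, smaller peak is the $100,000 or more bracket. The skewness computed from the probabilities is positive, so the distribution is taken to be right skewed.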
income <- c("$1 to $9,999", "$10,000 to $14,999", "$15,000 to $24,999", "$25,000 to $34,999", "$35,000 to $49,999", "$50,000 to $64,999", "$65,000 to $74,999", "$75,000 to $99,999", "$100,000 or more")
p <- c(0.02, 0.047, 0.158, 0.183, 0.212, 0.139, 0.058, 0.084, 0.097)
income_p <- data.frame(income, p)
income_p
## [1] 0.165715
P(income<$50,000) = 0.022 + 0.047 + 0.158 + 0.183 + 0.212 = 0.622 = 62.2%
The sample is 59% male and 41% female. Assuming this ratio is the same across all income groups (that is, income and gender are independent),
P(income<$50,000 and female)
= P(income<$50,000) * P(female)
= 41% * 62.2%
= 25.502%
If 71.8% of females make less than $50,000 per year, the actual answer to part (c) is P(income<$50,000 and female) = 41% * 71.8% = 29.438%.
My result in part (c), 25.502%, differs from 29.438%, so the assumption made in part (c) that "the ratio of males to females is the same across all income groups" is invalid.
The result shows that females are more likely to earn less than $50,000 per year (71.8%) than to earn $50,000 or more (28.2%).
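The comparison between the independence-based answer and the answer implied by the 71.8% figure can be reproduced as below. This sketch takes the first bracket as 0.022, the value used in the hand calculation of P(income<$50,000), so that the probabilities sum to 1; the variable names are illustrative.

```r
# Relative frequencies by income bracket, lowest to highest
# (first bracket taken as 0.022, as in the hand calculation).
p <- c(0.022, 0.047, 0.158, 0.183, 0.212, 0.139, 0.058, 0.084, 0.097)

# The first five brackets cover incomes below $50,000.
p_under_50k <- sum(p[1:5])

p_female <- 0.41

# Joint probability if income and gender were independent.
p_joint_if_independent <- p_under_50k * p_female

# Joint probability using P(income < $50,000 | female) = 0.718.
p_under_50k_given_female <- 0.718
p_joint_actual <- p_under_50k_given_female * p_female

round(100 * c(p_under_50k, p_joint_if_independent, p_joint_actual), 3)
```

Independence would require P(income<$50,000 | female) to equal P(income<$50,000); here 71.8% is well above 62.2%.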