library(VennDiagram)
## Warning: package 'VennDiagram' was built under R version 3.5.1
## Loading required package: grid
## Loading required package: futile.logger
## Warning: package 'futile.logger' was built under R version 3.5.1
2.6 Dice rolls. If you roll a pair of fair dice, what is the probability of
Answers: (a) 0% (b) 4/36 or 11.1% (c) 1/36 or 2.78%
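These answers can be checked by enumerating all 36 outcomes. The sketch below assumes the three parts ask for sums of 1, 5, and 12, which is what the answers suggest; the question text is cut off, so those targets are an assumption, not something stated in the exercise as shown here.

```r
# Enumerate all 36 equally likely outcomes of two fair dice
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
totals <- rolls$die1 + rolls$die2

# Assumed targets (sums of 1, 5, and 12); the question text is truncated
targets <- c(1, 5, 12)

# Share of outcomes whose sum equals each target
probs <- sapply(targets, function(s) mean(totals == s))
names(probs) <- paste0("sum=", targets)
probs
```

Each probability is a count of matching outcomes divided by 36, so a sum of 1 has no matching outcomes, a sum of 5 has four, and a sum of 12 has one.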
2.8 Poverty and language. The American Community Survey is an ongoing survey that provides data every year to give communities the current information they need to plan investments and services. The 2010 American Community Survey estimates that 14.6% of Americans live below the poverty line, 20.7% speak a language other than English (foreign language) at home, and 4.2% fall into both categories.
Are living below the poverty line and speaking a foreign language at home disjoint? Answer: No. The two groups overlap, as evidenced by the fact that 4.2% of Americans fall into both.
Draw a Venn diagram summarizing the variables and their associated probabilities.
grid.newpage()
draw.pairwise.venn(area1 = .146, area2 = .207, cross.area = .042,
                   category = c("Below Poverty Line", "Foreign Language"), scaled = TRUE)
## (polygon[GRID.polygon.1], polygon[GRID.polygon.2], polygon[GRID.polygon.3], polygon[GRID.polygon.4], text[GRID.text.5], text[GRID.text.6], text[GRID.text.7], text[GRID.text.8], text[GRID.text.9])
What percent of Americans live below the poverty line and only speak English at home? Answer: P(Americans in Poverty) - P(Americans in Poverty and Speak Foreign Language) = 14.6% - 4.2% = 10.4%
What percent of Americans live below the poverty line or speak a foreign language at home? Answer: P(Americans in Poverty) + (P(Speak Foreign Language) - P(Americans in Poverty and Speak Foreign Language)) = 14.6% + (20.7% - 4.2%) = 31.1%
What percent of Americans live above the poverty line and only speak English at home? Answer: P(All Americans) - P(Americans in Poverty and Speak Only English) - P(Speak Foreign Language) = 100% - 10.4% - 20.7% = 68.9%
Is the event that someone lives below the poverty line independent of the event that the person speaks a foreign language at home?
Answer: If the two events were independent, the product of their individual probabilities would equal the probability that both occur, which is 4.2%.
P(Americans in Poverty) * P(Speak Foreign Language) = .146 * .207 = .030 or 3.0%, which does not equal 4.2%, so the events are not independent.
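The arithmetic for this exercise can be reproduced from the three given probabilities. This is a minimal sketch; the variable names are chosen here for illustration.

```r
# Probabilities given in the exercise
p_poor    <- 0.146  # live below the poverty line
p_foreign <- 0.207  # speak a foreign language at home
p_both    <- 0.042  # fall into both categories

p_poor_english_only <- p_poor - p_both              # poverty, English only
p_either            <- p_poor + p_foreign - p_both  # general addition rule
p_neither           <- 1 - p_either                 # complement of "either"
p_if_independent    <- p_poor * p_foreign           # joint probability under independence

round(c(poor_english_only = p_poor_english_only,
        either            = p_either,
        neither           = p_neither,
        if_independent    = p_if_independent,
        observed_both     = p_both), 3)
```

The last two entries are the ones that matter for the independence question: the product of the two marginal probabilities is about 0.030, while the observed overlap is 0.042.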
2.20 Assortative mating. Assortative mating is a nonrandom mating pattern where individuals with similar genotypes and/or phenotypes mate with one another more frequently than what would be expected under a random mating pattern. Researchers studying this topic collected data on eye colors of 204 Scandinavian men and their female partners. The table below summarizes the results. For simplicity, we only include heterosexual relationships in this exercise.
ANSWERS
= .706 or 71%
(108 + 114 -78)/ 204
## [1] 0.7058824
(78/204) / (114/204)
## [1] 0.6842105
(19/204) / (54/204)
## [1] 0.3518519
P(Partner Blue Eyes | Male Green Eyes) = P(Male Green Eyes and Partner Blue Eyes) / P(Male Green Eyes) = .306 or 30.6%
(11/204) / (36/204)
## [1] 0.3055556
(d) If the eye colors of male respondents and their partners were independent, the probability that a partner has blue eyes would be the same regardless of the male respondent's eye color, and would equal the overall probability that a partner has blue eyes.
Overall, 53% of partners have blue eyes (108/204) and 47% do not (96/204). Among male respondents with blue eyes, however, 68% have a partner with blue eyes, compared with 35% for brown-eyed and 31% for green-eyed male respondents. Because these conditional probabilities differ from 53%, the eye colors are not independent.
(108/204)
## [1] 0.5294118
96/204
## [1] 0.4705882
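One way to check independence here is to compare the probability that a partner has blue eyes within each male eye-color group against the overall probability. The sketch below uses only the counts that appear in the calculations above. Reading 114, 54, and 36 as the male blue, brown, and green totals, and 108 as the number of partners with blue eyes, is inferred from those calculations because the table itself is not reproduced here.

```r
n_total <- 204

# Male respondents by eye color (totals inferred from the calculations above)
male_totals <- c(blue = 114, brown = 54, green = 36)

# Couples where the partner has blue eyes, split by the male's eye color
partner_blue <- c(blue = 78, brown = 19, green = 11)

# Overall probability that a partner has blue eyes
p_partner_blue <- sum(partner_blue) / n_total

# Conditional probability of a blue-eyed partner within each male group
p_blue_given_male <- partner_blue / male_totals

round(p_partner_blue, 3)
round(p_blue_given_male, 3)
```

Under independence the three conditional values would all match the overall value; here they range from roughly 0.31 to 0.68.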
2.30 Books on a bookshelf. The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
(a)Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement. ANSWER:
P(Hardcover Book Selected First) = 28/95 and P(Paperback Fiction Book Selected Second, given a hardcover book was selected first) = 59/94. The probability of drawing a hardcover book first and then a paperback fiction book second is 18.5%.
(28/95)*(59/94)
## [1] 0.1849944
ANSWER There are two possible cases. In the first, a paperback fiction book is selected first, leaving 28 hardcover books among the remaining 94 for the second draw. In the second, a hardcover fiction book is selected first, leaving only 27 hardcover books among the remaining 94. Adding the probabilities of these two cases gives 22.4%.
P(Paperback Fiction Book Selected First) * P(Hardcover Book Selected Second) + P(Hardcover Fiction Book Selected First) * P(Hardcover Book Selected Second) = (59/95)*(28/94) + (13/95)*(27/94)
((59/95)*(28/94))+((27/94)*(13/95))
## [1] 0.2243001
ANSWER With replacement, then the probabilities are:
P(Fiction book selected first) = 72/95 P(Hardcover book selected second) = 28/95
(72/95)*(28/95)
## [1] 0.2233795
The probabilities are similar because removing a single book from a shelf of 95 changes the proportions for the second draw only slightly.
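The two fiction-then-hardcover calculations can be put side by side to see how small the difference is. This is a sketch using the counts quoted in the answers above (95 books, 28 hardcover, 72 fiction, 13 hardcover fiction); the variable names are illustrative.

```r
total             <- 95
hardcover         <- 28
fiction           <- 72
hardcover_fiction <- 13
paperback_fiction <- fiction - hardcover_fiction  # 59

# Without replacement: split on whether the first (fiction) book was hardcover
without_repl <- (paperback_fiction / total) * (hardcover / (total - 1)) +
  (hardcover_fiction / total) * ((hardcover - 1) / (total - 1))

# With replacement: the second draw sees the full shelf again
with_repl <- (fiction / total) * (hardcover / total)

c(without_replacement = without_repl,
  with_replacement    = with_repl,
  difference          = without_repl - with_repl)
```

The difference is under one tenth of a percentage point, which is why the two answers round to nearly the same value.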
2.38 Baggage fees. An airline charges the following baggage fees: $25 for the first bag and $35 for the second. Suppose 54% of passengers have no checked luggage, 34% have one piece of checked luggage and 12% have two pieces. We suppose a negligible portion of people check more than two bags.
ANSWER:
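price <- c(0, 25, 60)          # total fee for 0, 1, or 2 checked bags
percent <- c(.54, .34, .12)    # share of passengers in each group
exp_value <- price * percent   # each group's contribution to the expected fee
model <- matrix(c(price, percent, exp_value), ncol = 3, byrow = TRUE)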
colnames(model) <- c("Zero","One","Two")
rownames(model) <- c("x","percent","exp_val")
model <- as.table(model)
model
## Zero One Two
## x 0.00 25.00 60.00
## percent 0.54 0.34 0.12
## exp_val 0.00 8.50 7.20
avg_rev_person <- sum(model[c(3),c(1,2,3)])
avg_rev_person
## [1] 15.7
The average revenue per passenger is $15.70.
price_minus_avg <- price-avg_rev_person
price_Variance<- (price_minus_avg)^2
percentage_variance <-price_Variance*percent
model2 <- matrix(c(price, percent, exp_value, price_minus_avg,price_Variance, percentage_variance),ncol=3,byrow=TRUE)
colnames(model2) <- c("Zero","One","Two")
rownames(model2) <- c("x","P(X=x)","x*P(X)","x-u","(x-u)^2", "(x-u)^2*P(X=x)")
model2 <- as.table(model2)
model2
## Zero One Two
## x 0.0000 25.0000 60.0000
## P(X=x) 0.5400 0.3400 0.1200
## x*P(X) 0.0000 8.5000 7.2000
## x-u -15.7000 9.3000 44.3000
## (x-u)^2 246.4900 86.4900 1962.4900
## (x-u)^2*P(X=x) 133.1046 29.4066 235.4988
variance <- sum(percentage_variance)  # the sum of the last row is the variance
sqrt(variance)                        # standard deviation
## [1] 19.95019
The standard deviation of revenue per passenger is $19.95.
Assuming the fees paid by different passengers are independent, the expected revenue from the 120 passengers is $1,884.00, and the standard deviation is about $218.54.
Exp_rev_120 = avg_rev_person * 120
Exp_std_dev = sqrt(120 *sum(percentage_variance))
print(Exp_rev_120)
## [1] 1884
print(Exp_std_dev)
## [1] 218.5434
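As a cross-check, the same figures can be obtained without building the tables, using the shortcut Var(X) = E[X^2] - (E[X])^2. The sketch below uses its own variable names (`fee`, `prob`, `n`) so it can be run independently of the code above.

```r
fee  <- c(0, 25, 60)         # total fee for 0, 1, or 2 checked bags
prob <- c(0.54, 0.34, 0.12)  # share of passengers in each group

mu       <- sum(fee * prob)            # expected fee per passenger
variance <- sum(fee^2 * prob) - mu^2   # shortcut formula for the variance

n <- 120                               # passengers on the flight

# For independent passengers, means and variances both add across passengers
c(mean_per_passenger = mu,
  sd_per_passenger   = sqrt(variance),
  mean_flight        = n * mu,
  sd_flight          = sqrt(n * variance))
```

The standard deviation for the flight scales with the square root of 120 rather than with 120 itself, because it is the variances, not the standard deviations, that add.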
2.44 Income and gender. The relative frequency table below displays the distribution of annual total personal income (in 2009 inflation-adjusted dollars) for a representative sample of 96,420,486 Americans. These data come from the American Community Survey for 2005-2009. This sample is comprised of 59% males and 41% females.
a.) The income bracket of $35,000 to $49,999 has the largest share of observations at 21.2%, followed by the bracket $25,000 to $34,999 at 18.3%. The smallest share is the bracket $9,999 or less, with only 2.2% of the observations. The distribution is bimodal: in addition to the peak at $35,000 to $49,999, there is a second peak at $100,000 or more. In general, the frequency of observations grows from 0 to $49,999, then declines, and starts to increase again at $75,000 or more.
b.) The probability that someone makes less than $50,000 a year is 62.2%. We can see that by adding the relative frequencies of the brackets between 0 and $49,999.
prob <- .212 + .183 + .158 + .047 + .022
prob
## [1] 0.622
c.) Assuming independence and using the multiplication rule, we see that the probability that someone is female and makes less than $50,000 a year is 25.50%
.622*.41
## [1] 0.25502
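d.) If gender and income were independent, the share of females making less than $50,000 would equal the overall share of 62.2%. The given statistic that 71% of females make less than $50,000 differs from this, so the independence assumption used in part c does not hold: gender and income are not independent.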
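The comparison in parts c and d can be made explicit by computing the joint probability both ways. This is a sketch that takes the 71% figure as quoted in part d; the variable names are illustrative.

```r
p_under_50k              <- 0.622  # from part b
p_female                 <- 0.41   # share of females in the sample
p_under_50k_given_female <- 0.71   # figure quoted in part d

# Joint probability if gender and income were independent (part c)
joint_if_independent <- p_under_50k * p_female

# Joint probability implied by the quoted conditional figure
joint_from_given <- p_under_50k_given_female * p_female

c(if_independent = joint_if_independent,
  from_given     = joint_from_given)
```

The two joint probabilities differ (about 0.255 versus 0.291), which is another way of seeing that the conditional share for females does not match the overall share.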