If you roll a pair of fair dice, what is the probability of
I play pen-and-paper role-playing games, so I’ll assume the text means six-sided dice (D6); a roll of 2D6 has 36 equally likely outcomes.
P(1) = 0, because the minimum roll is 2, one on each die.
P(5) = 4/36 = 1/9; there are two ways to roll a 1 and a 4, and two ways to roll a 2 and a 3. Any roll that includes a 5 or a 6 cannot total 5.
P(12) = 1/36; the only way to roll a 12 is two sixes.
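As a quick cross-check, the three answers can be reproduced by enumerating all 36 outcomes in R. This is only a sketch; the names `totals` and `p_total` are illustrative.

```r
# All 36 equally likely sums of two six-sided dice, as a 6 x 6 matrix.
totals <- outer(1:6, 1:6, "+")

# Count how often each total from 1 to 12 occurs, then convert to probabilities.
# tabulate() keeps a bin for the impossible total of 1, so it shows up as 0.
p_total <- tabulate(totals, nbins = 12) / length(totals)

p_total[1]   # total of 1
p_total[5]   # total of 5
p_total[12]  # total of 12
```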
The American Community Survey is an ongoing survey that provides data every year to give communities the current information they need to plan investments and services. The 2010 American Community Survey estimates that 14.6% of Americans live below the poverty line, 20.7% speak a language other than English (foreign language) at home, and 4.2% fall into both categories
Disjoint means mutually exclusive. No, living below the poverty line and speaking a foreign language at home are not disjoint, since 4.2% of Americans fall into both categories.
I used https://rstudio-pubs-static.s3.amazonaws.com/13301_6641d73cfac741a59c0a851feb99e98b.html to help with the formatting.
# install.packages("VennDiagram")
library(VennDiagram)
## Warning: package 'VennDiagram' was built under R version 3.4.1
## Loading required package: grid
## Loading required package: futile.logger
## Warning: package 'futile.logger' was built under R version 3.4.1
draw.pairwise.venn(area1 = 0.146, area2 = 0.207, cross.area = 0.042, category = c("Below Poverty Line", "Foreign Language Speaker"), fill = c("violet","orange"), alpha = rep(0.5, 2), cat.pos = c(0,0), cat.dist = rep(0.025 ,2))
## (polygon[GRID.polygon.1], polygon[GRID.polygon.2], polygon[GRID.polygon.3], polygon[GRID.polygon.4], text[GRID.text.5], text[GRID.text.6], text[GRID.text.7], text[GRID.text.8], text[GRID.text.9])
Below the poverty line and speaking only English at home: 14.6% - 4.2% = 10.4%
Use the addition rule: P(A or B) = P(A) + P(B) - P(A and B) = 0.207 + 0.146 - 0.042 = 0.311, or 31.1%.
I created a probability table:
eng_pov <- matrix(c(0.104, 0.042, 0.146, 0.689, 0.165, 0.854, 0.793, 0.207, 1.00), nrow = 3, ncol = 3)
rownames(eng_pov) <- c("English", "Not English", "Marginal Prob")
colnames(eng_pov) <- c("Poverty", "Not Poverty", "Marginal Prob")
eng_pov <- data.frame(eng_pov)
eng_pov
## Poverty Not.Poverty Marginal.Prob
## English 0.104 0.689 0.793
## Not English 0.042 0.165 0.207
## Marginal Prob 0.146 0.854 1.000
The marginal probabilities for Poverty and Not English are given; their complements are 1 - P. The joint probability for Poverty and Not English was also given. Each row and column has to add up to its marginal probability, and the marginal probabilities have to add up to 1. So according to the table, 68.9% are not in poverty and speak only English at home.
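The fill-in logic can also be sketched in R, starting from only the three given values and letting subtraction produce the remaining cells. The variable names here are illustrative, not part of the exercise.

```r
p_pov     <- 0.146  # P(below poverty line), given
p_foreign <- 0.207  # P(foreign language at home), given
p_both    <- 0.042  # P(both), given

# Remaining joint cells follow by subtraction from the marginals.
joint <- matrix(
  c(p_pov - p_both,                      # English, Poverty
    p_both,                              # Not English, Poverty
    (1 - p_pov) - (p_foreign - p_both),  # English, Not Poverty
    p_foreign - p_both),                 # Not English, Not Poverty
  nrow = 2,
  dimnames = list(c("English", "Not English"), c("Poverty", "Not Poverty"))
)

addmargins(joint)           # appends the marginal row and column, labelled "Sum"
p_pov + p_foreign - p_both  # addition rule: P(poverty or foreign language)
```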
If they are independent, then: \[ P(A \& B) = P(A) \times P(B) \] \[ P(A \& B) = 0.042 \] \[ P(A) \times P(B) = 0.146 \times 0.207 = 0.030222 \]
The observed joint probability (0.042) does not equal the product of the marginals (0.030), so the multiplication rule for independence does not hold: living below the poverty line and speaking a foreign language at home are not independent events.
Assortative mating is a nonrandom mating pattern where individuals with similar genotypes and/or phenotypes mate with one another more frequently than what would be expected under a random mating pattern. Researchers studying this topic collected data on eye colors of 204 Scandinavian men and their female partners. The table below summarizes the results. For simplicity, we only include heterosexual relationships in this exercise.
The first thing I am going to do is adapt the code I wrote above to this problem, and let R do all the conversions from counts to probabilities by dividing by 204.
eye_color <- matrix(c(78/204, 19/204, 11/204, 108/204, 23/204, 23/204, 9/204, 55/204, 13/204,12/204,16/204,41/204,114/204,54/204,36/204,204/204), nrow = 4, ncol = 4)
rownames(eye_color) <- c("Blue", "Brown", "Green", "Total")
colnames(eye_color) <- c("Blue", "Brown", "Green", "Total")
eye_color <- data.frame(eye_color)
eye_color
## Blue Brown Green Total
## Blue 0.38235294 0.11274510 0.06372549 0.5588235
## Brown 0.09313725 0.11274510 0.05882353 0.2647059
## Green 0.05392157 0.04411765 0.07843137 0.1764706
## Total 0.52941176 0.26960784 0.20098039 1.0000000
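An alternative sketch of the same conversion: enter only the nine raw counts and let `prop.table()` and `addmargins()` compute the probabilities and totals. Rows are the male's eye color and columns the female partner's, matching the table above; the object names are illustrative.

```r
# Raw counts: rows = male eye color, columns = female partner's eye color.
counts <- matrix(
  c(78, 23, 13,
    19, 23, 12,
    11,  9, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(male   = c("Blue", "Brown", "Green"),
                  female = c("Blue", "Brown", "Green"))
)

joint <- prop.table(counts)   # every cell divided by the grand total
round(addmargins(joint), 4)   # marginal totals appear in the "Sum" row and column
```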
\[P(A \text{ or } B) = P(A) + P(B) - P(A \& B) \]
From the table:
\[P(A \text{ or } B) = 0.52941176 + 0.5588235 - 0.38235294 = 0.7058823\]
There is a 70.6% chance that either the male or his female partner has blue eyes.
This is read directly from the table: 0.38235294, or a 38.2% chance that both the male and his female partner have blue eyes.
This is also read directly from the table: for a male with brown eyes and a female with blue eyes it is 0.09313725, or 9.3%; for a male with green eyes and a female with blue eyes it is 0.05392157, or 5.4%.
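The figures read from the table are joint probabilities. If what is wanted is instead the probability that the female partner has blue eyes given the male's eye color (an assumption about how the question is read), each joint probability has to be divided by the male's marginal. A minimal sketch, with illustrative object names:

```r
# Raw counts: rows = male eye color, columns = female partner's eye color.
counts <- matrix(
  c(78, 23, 13,
    19, 23, 12,
    11,  9, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(male   = c("Blue", "Brown", "Green"),
                  female = c("Blue", "Brown", "Green"))
)

# margin = 1 divides each row by its row total, giving P(female color | male color).
cond_on_male <- prop.table(counts, margin = 1)

# P(female partner has blue eyes | male eye color), one value per male eye color.
round(cond_on_male[, "Blue"], 3)
```

Under that reading the values are 78/114, 19/54, and 11/36 for blue-, brown-, and green-eyed males respectively.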
As with the English speaker vs. poverty problem, we can use the multiplication rule for independent events:
\[P(A \& B) = P(A) \times P(B) \]
For Blue and Blue:
\[ P(A \& B) = 0.38235294\]
\[P(A) \times P(B) = 0.52941176 \times 0.5588235 = 0.2958477\]
Not Independent.
For Blue and Green:
\[ P(A \& B) = 0.05392157\]
\[P(A) \times P(B) = 0.52941176 \times 0.1764706 = 0.09342561\]
Not Independent.
For Blue and Brown:
\[ P(A \& B) = 0.09313725\]
\[P(A) \times P(B) = 0.52941176 \times 0.2647059 = 0.1401384\]
Not Independent.
For Brown and Green:
\[ P(A \& B) = 0.04411765\]
\[P(A) \times P(B) = 0.26960784 \times 0.1764706 = 0.04757786\]
Close, but Not Independent.
For Brown and Brown:
\[ P(A \& B) = 0.11274510\]
\[P(A) \times P(B) = 0.26960784 \times 0.2647059 = 0.07136679\]
Not Independent.
For Green and Green:
\[ P(A \& B) = 0.07843137\]
\[P(A) \times P(B) = 0.20098039 \times 0.1764706 = 0.03546713\]
Not Independent.
It appears that the eye colors of male respondents and their partners are not independent of each other.
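The pair-by-pair checks can be done for every cell at once by comparing the joint table with the outer product of its marginals. This is a sketch with illustrative object names; `expected` is what each cell would be if the two eye colors were independent.

```r
# Raw counts: rows = male eye color, columns = female partner's eye color.
counts <- matrix(
  c(78, 23, 13,
    19, 23, 12,
    11,  9, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(male   = c("Blue", "Brown", "Green"),
                  female = c("Blue", "Brown", "Green"))
)
joint <- prop.table(counts)

# Under independence each cell would equal P(male color) * P(female color).
expected <- outer(rowSums(joint), colSums(joint))

# Positive entries mean the pairing occurs more often than independence predicts.
round(joint - expected, 4)
```

Every same-color pairing on the diagonal comes out positive, which is the pattern assortative mating would produce.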
The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
Again I am going to load the table from the book into R using the code I wrote above:
book <- matrix(c(13/95, 15/95, 28/95, 59/95, 8/95, 67/95, 72/95, 23/95, 95/95), nrow = 3, ncol = 3)
rownames(book) <- c("Fiction", "Nonfiction", "Total")
colnames(book) <- c("Hardcover", "Paperback", "Total")
book <- data.frame(book)
book
## Hardcover Paperback Total
## Fiction 0.1368421 0.62105263 0.7578947
## Nonfiction 0.1578947 0.08421053 0.2421053
## Total 0.2947368 0.70526316 1.0000000
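The first book is not replaced, so the second draw depends on the first: the overall probability is the probability of the first draw times the probability of the second draw given the first, with 94 books left on the shelf.
\[P = (28/95) \times (59/94) = 0.1849944\]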
18.5%
Since a fiction book can also be a hardcover, we need to find the probability of hardcover given fiction.
\[P(H|F) = P(H \& F) / P(F) = 0.1368421/0.7578947 = 0.1805556 \]
In this scenario there is an 81.9445% chance that the fiction book drawn first is a paperback, in which case the probability is
\[P = (72/95) \times (28/94) = 0.2257559 \]
and an 18.0555% chance that it is a hardcover, in which case the probability is
\[P = (72/95) \times (27/94) = 0.2176932\]
Overall this is
\[P = 0.819445 \times 0.2257559 + 0.180555 \times 0.2176932 = 0.2243001\]
22.4%
In the first draw there is an 18% chance that the book will be a hardcover given that it is fiction. This will affect the second draw.
Since the first book is placed back on the shelf, it no longer affects the second draw, and all books are available to draw.
\[P = (72/95) \times (28/95) = 0.2233795\]
22.3%
Dividing by 95 gives a smaller number than dividing by 94, so on that basis alone scenario (c) would have a smaller probability than (b). However, in scenario (b) we also have to factor in the lower-probability case where the first book was both hardcover and fiction, which leaves one fewer hardcover for the second draw. Taking this into account reduces the overall probability of (b) and brings it close to (c). With different numbers of hardcovers, fiction books, etc., the two answers might not be as close.
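The three bookcase scenarios can be recomputed from the raw counts as a check on the arithmetic. This is a sketch with illustrative variable names, reading (a) as hardcover then paperback fiction without replacement, (b) as fiction then hardcover without replacement, and (c) as fiction then hardcover with replacement.

```r
n             <- 95  # books on the bookcase
hard          <- 28  # hardcovers
fiction       <- 72  # fiction books
hard_fiction  <- 13  # hardcover fiction
paper_fiction <- 59  # paperback fiction

# (a) hardcover first, then paperback fiction, without replacement
p_a <- (hard / n) * (paper_fiction / (n - 1))

# (b) fiction first, then hardcover, without replacement:
#     split on whether the fiction book drawn first was itself a hardcover
p_b <- (hard_fiction / n) * ((hard - 1) / (n - 1)) +
       (paper_fiction / n) * (hard / (n - 1))

# (c) fiction first, then hardcover, with the first book put back
p_c <- (fiction / n) * (hard / n)

round(c(a = p_a, b = p_b, c = p_c), 4)
```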
An airline charges the following baggage fees:$25 for the first bag and $35 for the second. Suppose 54% of passengers have no checked luggage, 34% have one piece of checked luggage and 12% have two pieces. We suppose a negligible portion of people check more than two bags.
bags <- matrix(c(0.54,0.34,0.12), nrow = 1, ncol = 3)
rownames(bags) <- c("Probability")
colnames(bags) <- c("$0", "1bag=$25", "2bags=$60")
bags
## $0 1bag=$25 2bags=$60
## Probability 0.54 0.34 0.12
A passenger who checks two bags pays $25 + $35 = $60 in total, so:
\[E(X) = 0 \times 0.54 + 25 \times 0.34 + 60 \times 0.12 = 15.7 \]
$15.70 per passenger.
\[Var(X) = (0-15.7)^2 \times 0.54 + (25-15.7)^2 \times 0.34 + (60-15.7)^2 \times 0.12 = 398.01\]
\[SD(X) = \sqrt{Var(X)} = \sqrt{398.01} = 19.95\]
\[120 \times E(X) = 120 \times 15.7 = 1884\]
$1884 for a flight of 120 passengers on average.
\[\sqrt{120} \times SD(X) = \sqrt{120} \times 19.95 = 218.54\]
A standard deviation of about $218.54 for 120 passengers, assuming the passengers' baggage choices are independent of one another. Variances, not standard deviations, add across independent passengers, so the standard deviation of the total grows with the square root of 120 rather than with 120.
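The baggage revenue figures can be checked with a short R sketch. It assumes that a passenger with two bags pays both fees ($25 + $35 = $60) and that passengers' choices are independent, so variances add across the 120 passengers; the variable names are illustrative.

```r
fee  <- c(0, 25, 25 + 35)    # total fee paid for 0, 1, or 2 checked bags
prob <- c(0.54, 0.34, 0.12)  # share of passengers in each group

ev     <- sum(fee * prob)           # expected revenue per passenger
var_x  <- sum((fee - ev)^2 * prob)  # variance per passenger
sd_x   <- sqrt(var_x)               # standard deviation per passenger

n <- 120                            # passengers on the flight
c(per_passenger_mean = ev,
  per_passenger_sd   = sd_x,
  flight_mean        = n * ev,
  flight_sd          = sqrt(n) * sd_x)  # sqrt(n), since variances add
```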
The relative frequency table below displays the distribution of annual total personal income (in 2009 inflation-adjusted dollars) for a representative sample of 96,420,486 Americans. These data come from the American Community Survey for 2005-2009. This sample is comprised of 59% males and 41% females.
income <- c(0.022,0.047,0.158,0.183,0.212,0.139,0.058,0.084,0.097)
barplot(income)
The distribution is bimodal with a peak in the $35,000-$49,999 bracket and a second peak in the >$100,000 bracket.
\[P = 0.022+0.047+0.158+0.183+0.212 = 0.622\]
62.2%
I am going to make the unsafe simplifying assumption that income is independent of sex. In reality, a female is more likely than a male to make less than $50,000.
\[P = 0.622 \times 0.41 = 0.25502\]
This is really a lower bound and the actual number is going to be higher.
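# Joint table of sex by income bracket (stored under its own name so the earlier eng_pov table is not overwritten)
income_sex <- matrix(c(0.328, 0.294, 0.622, 0.262, 0.116, 0.378, 0.59, 0.41, 1.00), nrow = 3, ncol = 3)
rownames(income_sex) <- c("Male", "Female", "Marginal Prob")
colnames(income_sex) <- c(" less 50K", " more 50K", "Marginal Prob")
income_sex <- data.frame(income_sex)
income_sex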
## X.less.50K X.more.50K Marginal.Prob
## Male 0.328 0.262 0.59
## Female 0.294 0.116 0.41
## Marginal Prob 0.622 0.378 1.00
With the additional information that 71.8% of females make less than $50,000, only 55.6% of males can make less than $50,000 for the overall percentage to be 62.2%. So, as I stated above, independence was not a good assumption.
\[P(\text{less than 50K} \mid \text{male}) = 0.328/0.59 = 0.556\]
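The steps for this problem can be sketched in R starting from the bracket probabilities, the 41% share of females, and the 71.8% figure for females. The variable names are illustrative, and the first five brackets are taken to be everything below $50,000.

```r
income <- c(0.022, 0.047, 0.158, 0.183, 0.212, 0.139, 0.058, 0.084, 0.097)

p_under_50k <- sum(income[1:5])  # first five brackets: below $50,000
p_female    <- 0.41              # share of females in the sample
p_under_given_female <- 0.718    # share of females making under $50,000

# Joint probabilities, using the general multiplication rule.
p_female_and_under <- p_female * p_under_given_female
p_male_and_under   <- p_under_50k - p_female_and_under

# Conditional probability for males.
p_under_given_male <- p_male_and_under / (1 - p_female)

# Compare with the value obtained under the independence assumption.
c(joint_if_independent = p_under_50k * p_female,
  joint_actual         = p_female_and_under,
  under_given_male     = p_under_given_male)
```

Without rounding the joint probabilities to three decimals first, the conditional probability for males comes out at about 0.555 rather than 0.556.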