Graded: 2.6, 2.8, 2.20, 2.30, 2.38, 2.44
A pair of dice can only sum to 2 through 12. The probability of a sum of 1 is 0.
There are 36 possible outcomes from rolling a pair of dice. Out of those 36, a sum of 5 is possible with the following combinations: {(1,4), (2,3), (3,2), (4,1)}. The probability is 4/36 = 1/9.
12 = {(6,6)}. Only one of the 36 outcomes gives a sum of 12, so the probability of rolling a sum of 12 with a pair of dice is 1/36.
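The counts above can be checked by brute force. The following is a minimal sketch that enumerates all 36 ordered outcomes of two fair dice; the names `rolls` and `p_sum` are illustrative and not part of the original solution.

```r
# Enumerate all 36 ordered outcomes of two fair six-sided dice
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Probability of a given sum = share of equally likely outcomes with that sum
# (p_sum is a hypothetical helper, introduced only for this check)
p_sum <- function(s) mean(rolls$total == s)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # (1,4), (2,3), (3,2), (4,1)
p_sum(12)  # only (6,6)
```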
No, the description states that 4.2% fall into both categories, meaning they are not disjoint.
library(VennDiagram)
## Loading required package: grid
## Loading required package: futile.logger
library(grid)
library(futile.logger)
venn.plot <- draw.pairwise.venn(area1 = 14.6, area2 = 20.7, cross.area = 4.2, category = c("Poverty","Foreign Lang"), scaled = TRUE)
grid.newpage()
grid.draw(venn.plot)
14.6% - 4.2% = 10.4%
14.6% + 20.7% - 4.2% = 31.1%
100% - 31.1% = 68.9%
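The three percentages above follow from the same figures passed to `draw.pairwise.venn` (14.6, 20.7, and 4.2). A small sketch of the arithmetic, with illustrative variable names that do not appear in the original:

```r
# Percentages used in the Venn diagram call
poverty <- 14.6   # below the poverty line
foreign <- 20.7   # speak a foreign language at home
both    <- 4.2    # in both groups

poverty_only <- poverty - both            # in the poverty group only
either       <- poverty + foreign - both  # general addition rule
neither      <- 100 - either              # complement of the union

c(poverty_only = poverty_only, either = either, neither = neither)
```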
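prob_PL <- 0.146
prob_FL <- 0.207
prob_both <- 0.042
# Multiplication rule: independent only if P(PL) * P(FL) equals P(both).
# all.equal() compares with a tolerance, which is safer than == for doubles.
independent <- isTRUE(all.equal(prob_PL * prob_FL, prob_both))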
independent
## [1] FALSE
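FALSE: P(PL) * P(FL) = 0.146 * 0.207 is about 0.0302, which is not equal to P(both) = 0.042, so the two events are not independent.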
fiction <- c(13,59,72)
nonfiction <- c(15,8,23)
totalformat <- c(28,67,95)
book_df <- t(data.frame(fiction, nonfiction, totalformat))
colnames(book_df) <- c("Hardcover", "Paperback", "Total Types")
book_df
##             Hardcover Paperback Total Types
## fiction            13        59          72
## nonfiction         15         8          23
## totalformat        28        67          95
P(hardcover first) * P(paperback fiction second | hardcover first) = 28/95 * 59/94
prob_a <- (book_df["totalformat", "Hardcover"]/95) * (book_df["fiction", "Paperback"])/94
prob_a
## [1] 0.1849944
There are 2 outcomes for drawing a fiction book first: paperback or hardcover. If we draw a paperback, the probability of drawing a hardcover book next is simply 28/94. However, if we draw a hardcover fiction book, the probability of drawing a hardcover book next is now 27/94.
prob_b <- (book_df["fiction", "Paperback"]/95) * (book_df["totalformat", "Hardcover"])/94 + (book_df["fiction", "Hardcover"]/95)*(book_df["totalformat", "Hardcover"] - 1)/94
prob_b
## [1] 0.2243001
In this case, we place the first book back, so we no longer have to worry about a hardcover fiction book being selected first and affecting the probability of a hardcover book being selected next.
prob_c <- (book_df["fiction", "Total Types"]/95) * (book_df["totalformat", "Hardcover"])/95
prob_c
## [1] 0.2233795
The explanations are written before the code for each part. In (b), we must consider two cases: i) the fiction book drawn first is a paperback, which does not lower the number of hardcover books available for the second draw, and ii) the fiction book drawn first is a hardcover, which lowers the number of hardcover books available for the second draw. In (c), we draw with replacement, so the first draw no longer affects the probability of the second. The two answers are close because removing a single book from a collection of 95 changes the probabilities only slightly.
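As a cross-check on (b) and (c), the two-case argument can be replaced by direct enumeration. This is a sketch under the assumption that every book is equally likely to be drawn; the names `books`, `pairs`, `check_b`, and `check_c` are illustrative and not part of the original solution.

```r
# Rebuild the collection of 95 books from the table counts
counts <- c(13, 59, 15, 8)
books <- data.frame(
  genre  = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts),
  format = rep(c("hardcover", "paperback", "hardcover", "paperback"), times = counts)
)
n <- nrow(books)

# All ordered (first draw, second draw) index pairs
pairs <- expand.grid(first = seq_len(n), second = seq_len(n))
hit <- books$genre[pairs$first] == "fiction" &
  books$format[pairs$second] == "hardcover"

# (b) without replacement: the same book cannot be drawn twice
distinct <- pairs$first != pairs$second
check_b <- mean(hit[distinct])

# (c) with replacement: every ordered pair is allowed
check_c <- mean(hit)

c(check_b = check_b, check_c = check_c)
```

The enumeration counts favorable ordered pairs directly, so the hardcover-fiction case in (b) is handled by excluding pairs that reuse the same book instead of by a separate term.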
passengers = 1
revenue <- .34*passengers*25 + (.12*passengers*25 + .12*passengers*35)
avg_rev_per_pass <- .34*25 + .12*25 + .12*35
avg_rev_per_pass
## [1] 15.7
# 60 is the fee for the 12% of customers who check 2 bags: 25 + 35 = $60
std_dev <- sqrt(.54*(0-15.7)^2 + .34*(25-15.7)^2 + .12*(60-15.7)^2)
std_dev
## [1] 19.95019
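The same mean and standard deviation can be written with one vector of fees and one vector of probabilities, which makes the formula easier to audit. A minimal sketch; the names `fee`, `prob`, `mean_fee`, `var_fee`, and `sd_fee` are illustrative.

```r
# Fee per passenger for 0, 1, or 2 checked bags, with matching probabilities
fee  <- c(0, 25, 25 + 35)
prob <- c(0.54, 0.34, 0.12)
stopifnot(isTRUE(all.equal(sum(prob), 1)))  # probabilities must sum to 1

mean_fee <- sum(fee * prob)                 # expected value
var_fee  <- sum(prob * (fee - mean_fee)^2)  # variance
sd_fee   <- sqrt(var_fee)                   # standard deviation

c(mean = mean_fee, variance = var_fee, sd = sd_fee)
```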
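The standard deviation for 120 passengers is not the same as in (a). Assuming passengers' baggage choices are independent, variances add, so the standard deviation of the total revenue is sqrt(120) * $19.95, or about $218.54.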
passengers = 120
revenue <- .34*passengers*25 + (.12*passengers*25 + .12*passengers*35)
revenue
## [1] 1884
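For the 120-passenger flight, here is a minimal sketch of how the per-passenger mean and standard deviation scale to the total. It assumes passengers' baggage choices are independent, which the original does not state; the variable names are illustrative.

```r
n_pass   <- 120
mean_fee <- 15.7      # per-passenger expected revenue, from (a)
sd_fee   <- 19.95019  # per-passenger standard deviation, from (a)

total_mean <- n_pass * mean_fee      # expectations always add
total_sd   <- sqrt(n_pass) * sd_fee  # variances add only under independence

c(total_mean = total_mean, total_sd = total_sd)
```

If the choices were not independent (for example, families travelling together), the variance of the total would also include covariance terms and this figure would no longer apply.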
(a) Describe the distribution of total personal income.
The distribution looks roughly normal, centered on the $35,000 to $49,999 range. The right end does have a much higher share of Americans making $100,000 or more (9.7%) than making less than $10,000 (2.2%), so it could even be considered bimodal.
library(ggplot2)
groups <- c(.022, .047, .158, .183, .212, .139, .058, .084, .097)
totals_per_group <- 96420486*groups
name_groups <- c("$1 to $9,999", "$10,000 to $14,999","$15,000 to $24,999", "$25,000 to $34,999", "$35,000 to $49,999", "$50,000 to $64,999", "$65,000 to $74,999", "$75,000 to $99,999", "$100,000 or more")
table <- data.frame(name_groups,totals_per_group)
table
##          name_groups totals_per_group
## 1       $1 to $9,999          2121251
## 2 $10,000 to $14,999          4531763
## 3 $15,000 to $24,999         15234437
## 4 $25,000 to $34,999         17644949
## 5 $35,000 to $49,999         20441143
## 6 $50,000 to $64,999         13402448
## 7 $65,000 to $74,999          5592388
## 8 $75,000 to $99,999          8099321
## 9   $100,000 or more          9352787
barplot(totals_per_group, names.arg = name_groups, cex.names = 0.5, xlab = "Income range", ylab = "Number of people")
P(income less than $50,000) is the sum of the first 5 categories: .022 + .047 + .158 + .183 + .212 = 0.622
We assume that income and gender are independent (that is, females are paid the same as males), so P(less than $50,000 and female) = 0.622 * .41 = .25502
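The two calculations above can be sketched with `cumsum`, which avoids adding the five shares by hand. The names `p_under_50k`, `p_female`, and `p_joint_if_independent` are illustrative, and the product is only valid under the independence assumption.

```r
# Income shares in the order listed in the table
groups <- c(.022, .047, .158, .183, .212, .139, .058, .084, .097)

# Cumulative share; the 5th entry covers every bracket below $50,000
p_under_50k <- cumsum(groups)[5]

# Share of the sample that is female, as used in (c)
p_female <- 0.41

# Joint probability ONLY under the assumption that income and gender are independent
p_joint_if_independent <- p_under_50k * p_female

c(p_under_50k = p_under_50k, p_joint = p_joint_if_independent)
```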
The assumption in (c) is incorrect according to the data source: income and gender are not independent.