getting a sum of 1?
ANS: The minimum sum from rolling a pair of fair dice is 2. Since 1 is not a possible outcome, the probability is 0.
getting a sum of 5?
ANS: The pairs (die 1, die 2) that sum to 5 are (1,4), (2,3), (3,2) and (4,1).
That is 4 outcomes out of 6 x 6 = 36 equally likely outcomes, so the probability is 4/36 = 1/9.
getting a sum of 12?
ANS: Only the pair (6,6) sums to 12, so the probability is 1/36.
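The three dice answers can be checked by enumerating all 36 equally likely outcomes. This is a minimal sketch in base R; the names `sums` and `pSum` are illustrative and not part of the original solution.

```r
# All 36 equally likely sums from two fair six-sided dice
sums <- outer(1:6, 1:6, "+")

# Probability of a given sum = matching outcomes / total outcomes
pSum <- function(s) sum(sums == s) / length(sums)

pSum(1)   # no pair of faces sums to 1
pSum(5)   # four pairs: (1,4), (2,3), (3,2), (4,1)
pSum(12)  # one pair: (6,6)
```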
The American Community Survey is an ongoing survey that provides data every year to give communities the current information they need to plan investments and services. The 2010 American Community Survey estimates that 14.6% of Americans live below the poverty line, 20.7% speak a language other than English (foreign language) at home, and 4.2% fall into both categories.
ANS: No. The two events are not disjoint, because 4.2% of Americans fall into both categories.
A person could live below the poverty line and speak a foreign language at home, live below the poverty line only, or speak a foreign language at home only.
library(VennDiagram)
## Loading required package: grid
## Loading required package: futile.logger
pov <- 14.6
fl <- 20.7
both <- 4.2
venn.plot <- draw.pairwise.venn(pov,
fl,
cross.area=both,
c("Poverty", "Foreign Language"),
fill=c("red", "blue"),
cat.dist=-0.08,
ind=FALSE)
grid.draw(venn.plot)
povOnly <- pov - both
povOnly #% of American live below the poverty line and only speak English at home.
## [1] 10.4
#P(A or B) = P(A) + P(B) - P(A and B)
fl + pov - both
## [1] 31.1
povOnly <- pov - both
100 - fl - povOnly # % Americans living above poverty line and only speak English at home.
## [1] 68.9
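#If the two events were independent, P(A and B) would equal P(A) x P(B)
(pov/100) * (fl/100)
## [1] 0.030222
#0.0302 differs from the observed P(A and B) = 0.042, so the two events are not independent.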
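The independence check can also be written as an explicit comparison between the observed joint probability and the product of the marginals. This is a sketch only: the variable names are illustrative, and it compares the survey's point estimates directly rather than performing a formal statistical test.

```r
# Estimates from the 2010 American Community Survey, as quoted above
pPov  <- 0.146   # live below the poverty line
pFl   <- 0.207   # speak a foreign language at home
pBoth <- 0.042   # fall into both categories

# Under independence the joint probability would equal the product
pIfIndep <- pPov * pFl

# Compare the observed joint probability with the independence value
looksIndependent <- isTRUE(all.equal(pBoth, pIfIndep))
looksIndependent
```

Because the observed 4.2% is noticeably larger than the roughly 3.0% implied by independence, the comparison comes out `FALSE`.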
#P(Male Blue or Partner Blue) =
((114+108)/204) - (78/204)
## [1] 0.7058824
#P(Partner Blue given Male Blue)
(78/114)
## [1] 0.6842105
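P(Partner Blue given Male Green) = (11/36)
ANS: No. If eye colors were independent, the probability of a blue-eyed partner would be the same regardless of the male's eye color. Instead it is 78/114 = 0.68 for blue-eyed males and 11/36 = 0.31 for green-eyed males, so the eye colors of male respondents and their partners do not appear to be independent.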
fBlue <- c(78,19,11)
fBrown <- c(23,23,9)
fGreen <- c(13,12,16)
df <- data.frame(fBlue, fBrown, fGreen)
df
## fBlue fBrown fGreen
## 1 78 23 13
## 2 19 23 12
## 3 11 9 16
row.names(df) <- c("mBlue", "mBrown", "mGreen")
df$sum <- rowSums(df)
dfProp <- df / df$sum
dfProp
## fBlue fBrown fGreen sum
## mBlue 0.6842105 0.2017544 0.1140351 1
## mBrown 0.3518519 0.4259259 0.2222222 1
## mGreen 0.3055556 0.2500000 0.4444444 1
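One way to read the row proportions above is to set them against the overall (marginal) probability of a blue-eyed partner. The sketch below rebuilds the counts as a matrix; the names `eyes`, `pPartnerBlue` and `pGivenMale` are illustrative and not part of the original solution.

```r
# Counts from the table above: rows = male eye color, columns = partner's eye color
eyes <- matrix(c(78, 23, 13,
                 19, 23, 12,
                 11,  9, 16),
               nrow = 3, byrow = TRUE,
               dimnames = list(male    = c("blue", "brown", "green"),
                               partner = c("blue", "brown", "green")))

# Marginal probability that a partner has blue eyes: 108 / 204
pPartnerBlue <- sum(eyes[, "blue"]) / sum(eyes)

# Conditional probability of a blue-eyed partner for each male eye color
pGivenMale <- eyes[, "blue"] / rowSums(eyes)

pPartnerBlue
pGivenMale
```

If the two eye colors were independent, every entry of `pGivenMale` would match `pPartnerBlue`; here they range from about 0.31 to 0.68 around a marginal of about 0.53.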
The table below shows the distribution of books on a bookcase:

Type/Format   Hardcover   Paperback   Total
Fiction              13          59      72
Nonfiction           15           8      23
Total                28          67      95
P(Hardcover first) = (28/95)
P(Paperback Fiction second, given Hardcover first) = (59/94)
P(Hardcover first and Paperback Fiction second) = ((28/95)*(59/94))
((28/95)*(59/94))
## [1] 0.1849944
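Without replacement, the chance that the second book is a hardcover depends on whether the first (fiction) book was itself a hardcover:
P(Fiction Hardcover first) = (13/95), then P(Hardcover second) = (27/94)
P(Fiction Paperback first) = (59/95), then P(Hardcover second) = (28/94)
P(Fiction first and Hardcover second) = ((13/95)*(27/94)) + ((59/95)*(28/94))
((13/95)*(27/94)) + ((59/95)*(28/94))
## [1] 0.2243001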
P(Fiction) = (72/95)
P(Hardcover) = (28/95)
P(Fiction first and Hardcover second, with replacement) = ((72/95)*(28/95))
((72/95)*(28/95))
## [1] 0.2233795
ANS: The two answers are close because the only difference is whether the first book is put back before the second is drawn. With 95 books on the bookcase, removing a single book barely changes the probabilities for the second draw.
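The with-replacement and without-replacement probabilities for "fiction first, hardcover second" can be checked by counting ordered pairs of books directly. This is a sketch under the assumption that every ordered pair is equally likely; the vectors `type` and `format` and the matrix `hit` are illustrative constructions built from the table, not part of the original solution.

```r
# One entry per book, built from the table:
# 13 fiction hardcover, 59 fiction paperback, 15 nonfiction hardcover, 8 nonfiction paperback
counts <- c(13, 59, 15, 8)
type   <- rep(c("fiction", "fiction", "nonfiction", "nonfiction"), counts)
format <- rep(c("hard", "paper", "hard", "paper"), counts)
n <- length(type)  # 95 books

# hit[i, j] is TRUE when book i is fiction and book j is a hardcover
hit <- outer(type == "fiction", format == "hard", "&")

# With replacement: all n * n ordered pairs are possible
pWith <- sum(hit) / n^2

# Without replacement: drop the pairs that pick the same book twice
pWithout <- (sum(hit) - sum(diag(hit))) / (n * (n - 1))

c(with = pWith, without = pWithout)
```

The counting approach avoids having to condition by hand on the format of the first book, since the diagonal of `hit` removes exactly the cases where the same fiction hardcover would be drawn twice.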
An airline charges the following baggage fees: $25 for the first bag and $35 for the second. Suppose 54% of passengers have no checked baggage, 34% have one piece of checked luggage, and 12% have two pieces. We suppose a negligible portion of people check more than two bags.
prob <- c(0.54, 0.34, 0.12)
bag <- c(0, 1, 2)
fee <- c(0, 25, 25 + 35)
df_rev <- data.frame(prob, bag, fee)
df_rev$pf <- df_rev$prob * df_rev$fee
df_rev
## prob bag fee pf
## 1 0.54 0 0 0.0
## 2 0.34 1 25 8.5
## 3 0.12 2 60 7.2
# Calculate the average revenue per passenger
avgRevPerPax <- sum(df_rev$pf)
avgRevPerPax
## [1] 15.7
# Calculate variance: sum of prob * (fee - mean)^2
# The deviation is taken from the fee itself, not from the weighted term pf
df_rev$DiffMean <- df_rev$fee - avgRevPerPax
df_rev$DMSqr <- df_rev$DiffMean ^ 2
df_rev$DMSP <- df_rev$DMSqr * df_rev$prob
df_rev
## prob bag fee pf DiffMean DMSqr DMSP
## 1 0.54 0 0 0.0 -15.7 246.49 133.1046
## 2 0.34 1 25 8.5 9.3 86.49 29.4066
## 3 0.12 2 60 7.2 44.3 1962.49 235.4988
# Calculate standard deviation
var <- sum(df_rev$DMSP)
sd <- sqrt(var)
sd
## [1] 19.95019
# Expected revenue for a flight of 120 passengers
pax <- 120
avgFltRev <- avgRevPerPax * pax
avgFltRev
## [1] 1884
# Standard deviation for the flight
# Variances of independent passengers add, so Var(total) = pax * var (not pax^2 * var)
var1 <- pax * var
sd1 <- sqrt(var1)
sd1
## [1] 218.5434
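As a cross-check on the per-passenger and per-flight figures, the variance can also be computed with the shortcut Var(X) = E[X^2] - (E[X])^2, and the flight total follows from adding variances. This is a sketch that assumes passengers' baggage choices are independent of one another; the names `mu`, `sigma2` and `flight` are illustrative.

```r
# Fee distribution from the problem statement
prob <- c(0.54, 0.34, 0.12)
fee  <- c(0, 25, 25 + 35)

# Per-passenger mean and variance
mu     <- sum(prob * fee)            # E[X]
sigma2 <- sum(prob * fee^2) - mu^2   # Var(X) = E[X^2] - (E[X])^2

# Flight of n passengers, assumed independent:
# the mean scales by n and the standard deviation by sqrt(n)
n <- 120
flight <- c(mean = n * mu, sd = sqrt(n * sigma2))

c(mu = mu, sd = sqrt(sigma2))
flight
```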
income <- c("$1 - $9,999 or loss",
"$10,000 to $14,999",
"$15,000 to $24,999",
"$25,000 to $34,999",
"$35,000 to $49,999",
"$50,000 to $64,999",
"$65,000 to $74,999",
"$75,000 to $99,999",
"$100,000 or more")
bounds <- c(1, 10000, 15000, 25000, 35000, 50000, 65000, 75000, 100000)
size <- c(9999, 4999, 9999, 9999, 14999, 14999, 9999, 24999, 99999)
center <- bounds + (size / 2)
total <- c(0.022, 0.047, 0.158, 0.183, 0.212, 0.139, 0.058, 0.084, 0.097)
df44 <- data.frame(income, center, total)
df44
## income center total
## 1 $1 - $9,999 or loss 5000.5 0.022
## 2 $10,000 to $14,999 12499.5 0.047
## 3 $15,000 to $24,999 19999.5 0.158
## 4 $25,000 to $34,999 29999.5 0.183
## 5 $35,000 to $49,999 42499.5 0.212
## 6 $50,000 to $64,999 57499.5 0.139
## 7 $65,000 to $74,999 69999.5 0.058
## 8 $75,000 to $99,999 87499.5 0.084
## 9 $100,000 or more 149999.5 0.097
library(ggplot2)
myTheme <- theme(axis.ticks=element_blank(),
panel.border = element_rect(color="yellow", fill=NA),
panel.background=element_rect(fill="blue"),
panel.grid.major.y=element_line(color="white", size=0.5),
panel.grid.major.x=element_line(color="white", size=0.5))
g44a <- ggplot(data=df44) +
geom_bar(aes(x=center, y=total, width=size), stat='identity', position="identity") +
labs(x="Income (US$)",
y="Proportion of Survey Sample",
title="Distribution of Total Personal Income") +
myTheme +
theme(axis.text.x = element_text(angle=45, vjust=1, hjust=1))
## Warning: Ignoring unknown aesthetics: width
g44a
prb <- sum(df44[1:5,]$total)
prb
## [1] 0.622
#Assuming income and gender are independent: P(A and B) = P(A) x P(B), with 41% females
prf <- 0.41 * prb
prf
## [1] 0.25502
#P(female and income under $50K) = P(income under $50K given female) x P(female)
F_50K <- 0.718 * 0.41
F_50K
## [1] 0.29438
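ANS: The result in (d), 0.294, is close to the result in (c), 0.255, but not equal. Equivalently, P(income under $50K given female) = 0.718 differs from P(income under $50K) = 0.622. Therefore income and gender are not independent, and the independence assumption used in (c) does not hold.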