SPS_Bridge_Course_HW2 Euclid Zhang
Problem 1. Dice Rolls
If you roll a pair of fair dice, what is the probability of: (a) getting a sum of 1? (b) getting a sum of 5? (c) getting a sum of 12?
Create a function to calculate the probability of a given sum. DiceSum represents all possible outcomes from rolling a pair of dice. The probability is the number of occurrences of the given sum divided by the total number of outcomes.
# Enumerate all 36 equally likely outcomes of two dice
Dice1 <- rep(1:6, 6)
Dice2 <- rep(1:6, each = 6)
DiceSum <- Dice1 + Dice2
# Probability of a sum: outcomes with that sum / total outcomes
dDiceSum <- function(x) sum(DiceSum == x) / length(DiceSum)
dDiceSum(1)
## [1] 0
dDiceSum(5)
## [1] 0.1111111
dDiceSum(12)
## [1] 0.02777778
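As a cross-check on the three answers above, the full distribution of the sum can be tabulated in one step. This is a minimal sketch; the names `sums` and `sum_dist` are illustrative and not part of the original solution.

```r
# All 36 equally likely sums from two fair dice (6 x 6 grid of face1 + face2)
sums <- outer(1:6, 1:6, "+")

# Relative frequency of each attainable sum (2 through 12)
sum_dist <- table(sums) / length(sums)
sum_dist

# A sum that never occurs, such as 1, is absent from the table,
# which corresponds to a probability of 0
"1" %in% names(sum_dist)
```

The entries for 5 and 12 should agree with dDiceSum(5) and dDiceSum(12).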
Problem 2. School absences
Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.
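p1 <- 0.25  # P(miss exactly 1 day)
p2 <- 0.15  # P(miss exactly 2 days)
p3 <- 0.28  # P(miss 3 or more days)
# P(miss no days): the remaining probability
p0 <- 1 - p1 - p2 - p3
p0
## [1] 0.32
# P(miss no more than 1 day)
p0 + p1
## [1] 0.57
# P(miss at least 1 day)
1-p0
## [1] 0.68
# P(neither of two kids misses any days), treating the kids as independent
p0 * p0
## [1] 0.1024
# P(both kids miss at least 1 day) = 1 - P(neither misses) - P(exactly one misses)
1 - p0*p0 - 2*p0*(1-p0)
## [1] 0.4624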
The independence assumption is not reasonable: the health of the two kids is not independent, because they share the same living conditions and some illnesses can pass from one person to another.
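The last calculation above removes the cases where neither kid or exactly one kid misses school, which leaves the case where both miss at least one day. Under the same independence assumption, that value can also be obtained directly by squaring the single-kid probability. The sketch below compares the two routes; `via_complement` and `direct` are illustrative names.

```r
p0 <- 1 - 0.25 - 0.15 - 0.28   # P(a student misses no days)

# Complement route: 1 - P(neither misses) - P(exactly one misses)
via_complement <- 1 - p0 * p0 - 2 * p0 * (1 - p0)

# Direct route: both kids miss at least one day, treated as independent
direct <- (1 - p0)^2

# TRUE when the two routes agree up to floating-point tolerance
all.equal(via_complement, direct)
```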
Problem 3. Health coverage, relative frequencies
The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.
mat=matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050, 0.2099, 0.3123, 0.2410, 0.0817, 0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
## Excellent Very Good Good Fair Poor
## No Coverage 0.0230 0.0364 0.0427 0.0192 0.0050
## Coverage 0.2099 0.3123 0.2410 0.0817 0.0289
Are being in excellent health and having health coverage mutually exclusive?
Since being in excellent health and having health coverage can occur at the same time, they are not mutually exclusive.
What is the probability that a randomly chosen individual has excellent health?
p(excellent health) = p(excellent health with no coverage) + p(excellent health with coverage)
mat["No Coverage","Excellent"] + mat["Coverage","Excellent"]
## [1] 0.2329
# P(excellent health | coverage)
mat["Coverage","Excellent"]/sum(mat["Coverage",])
## [1] 0.2402152
# P(excellent health | no coverage)
mat["No Coverage","Excellent"]/sum(mat["No Coverage",])
## [1] 0.1821061
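The two conditional probabilities above divide one cell by its row total. As a sketch of the same idea applied to the whole table, base R's `prop.table` with `margin = 1` divides every row by its own total, so both values appear in the "Excellent" column. The name `cond` is illustrative.

```r
# Joint distribution from the table above (rows: coverage, columns: health status)
mat <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("No Coverage", "Coverage"),
                              c("Excellent", "Very Good", "Good", "Fair", "Poor")))

# Divide each row by its row total: P(health status | coverage status)
cond <- prop.table(mat, margin = 1)
round(cond, 4)
```

Each row of `cond` sums to 1, since it is a full conditional distribution of health status for that coverage group.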
Problem 4. Exit Poll.
Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor of Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?
p(favor of Scott Walker | college degree)
= p(favor of Scott Walker with college degree)/p(college degree)
= p(favor of Scott Walker with college degree)/[p(favor of Scott Walker with college degree) + p(against Scott Walker with college degree)]
(0.53*0.37)/((0.53*0.37)+(1-0.53)*0.44)
## [1] 0.4867213
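The same calculation can be written with named quantities so that each step of Bayes' theorem is visible. This is only a sketch; the variable names are illustrative and do not appear in the original solution.

```r
p_walker          <- 0.53   # P(voted for Walker)
p_college_walker  <- 0.37   # P(college degree | voted for Walker)
p_college_against <- 0.44   # P(college degree | voted against Walker)

# Law of total probability: P(college degree)
p_college <- p_walker * p_college_walker + (1 - p_walker) * p_college_against

# Bayes' theorem: P(voted for Walker | college degree)
p_walker_college <- p_walker * p_college_walker / p_college
p_walker_college
```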
Problem 5. Books on a bookshelf
The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")
mymat2
## hard paper
## fiction 13 59
## nonfiction 15 8
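# P(hardcover first, then paperback fiction), drawing without replacement
(sum(mymat2[,"hard"])/sum(mymat2))*(mymat2["fiction","paper"]/(sum(mymat2)-1))
## [1] 0.1849944
# P(fiction first, then hardcover), without replacement:
# split on whether the first (fiction) book is a paperback or a hardcover
(mymat2["fiction","paper"]/sum(mymat2))*(sum(mymat2[,"hard"])/(sum(mymat2)-1)) +
  (mymat2["fiction","hard"]/sum(mymat2))*((sum(mymat2[,"hard"])-1)/(sum(mymat2)-1))
## [1] 0.2243001
# P(fiction first, then hardcover), with replacement
(sum(mymat2["fiction",])/sum(mymat2))*(sum(mymat2[,"hard"])/sum(mymat2))
## [1] 0.2233795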
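The two without-replacement results above can be checked by brute force: list every book, enumerate all ordered pairs of distinct books, and count the pairs that satisfy each event. This is a sketch under the assumption that every ordered pair is equally likely; `type`, `cover`, and `pairs` are illustrative names.

```r
# One entry per book, in the order of the table cells
counts <- c(13, 59, 15, 8)
type  <- rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts)
cover <- rep(c("hard", "paper", "hard", "paper"), times = counts)
n <- length(type)   # 95 books

# All ordered pairs of distinct books (two draws without replacement)
pairs <- expand.grid(first = seq_len(n), second = seq_len(n))
pairs <- pairs[pairs$first != pairs$second, ]

# Hardcover first, then paperback fiction
p_hard_then_pf <- mean(cover[pairs$first] == "hard" &
                         type[pairs$second] == "fiction" &
                         cover[pairs$second] == "paper")

# Fiction first, then hardcover
p_fiction_then_hard <- mean(type[pairs$first] == "fiction" &
                              cover[pairs$second] == "hard")

c(p_hard_then_pf, p_fiction_then_hard)
```

Enumeration avoids having to split the second event into cases by hand, because a pair never uses the same book twice.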
Problem 6. Is it worth it?
Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs.
dPerGame <- c("2-10" = (4*9)/52, "Jack Queen King" = (4*3)/52, "ace of clubs" = 1/52, "other aces" = 3/52)
netWinPerGame <- c("2-10" = 0-2, "Jack Queen King" = 3-2, "ace of clubs" = 25-2, "other aces" = 5-2)
The probabilities of the outcomes are
dPerGame
## 2-10 Jack Queen King ace of clubs other aces
## 0.69230769 0.23076923 0.01923077 0.05769231
The net profits of the outcomes are
netWinPerGame
## 2-10 Jack Queen King ace of clubs other aces
## -2 1 23 3
The expected profit per game is
expectedProfit <- sum(dPerGame*netWinPerGame)
expectedProfit
## [1] -0.5384615
The variance of the profit per game is
sum(((netWinPerGame - expectedProfit)^2)*dPerGame)
## [1] 13.40237
The game has a negative expected value: on average Andy loses about 54 cents per game, with a standard deviation of about $3.66. I would not recommend this game to Andy, since the more games he plays, the more money he should expect to lose.
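The outcome probabilities in dPerGame can be cross-checked by building the deck explicitly and counting the cards in each category. This is a minimal sketch; `deck`, `category`, and `deck_probs` are illustrative names.

```r
# Build a standard 52-card deck: 13 ranks x 4 suits
deck <- expand.grid(rank = c(2:10, "J", "Q", "K", "A"),
                    suit = c("clubs", "diamonds", "hearts", "spades"),
                    stringsAsFactors = FALSE)

# Assign each card to one of the four outcome categories
category <- ifelse(deck$rank %in% c("J", "Q", "K"), "Jack Queen King",
            ifelse(deck$rank == "A" & deck$suit == "clubs", "ace of clubs",
            ifelse(deck$rank == "A", "other aces", "2-10")))

# Relative frequency of each category; these should match dPerGame
deck_probs <- table(category) / nrow(deck)
deck_probs
```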
Problem 7. Scooping ice cream.
Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y. Suppose these random variables have the following means, standard deviations, and variances:
mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
## mean SD Var
## X, In Box 48 1.00 1.0000
## Y, Scooped 2 0.25 0.0625
mymat3["X, In Box","mean"] + 3*mymat3["Y, Scooped","mean"]
## [1] 54
Treating the box and the three scoops as independent, Var[X + Y + Y + Y] = Var[X] + Var[Y] + Var[Y] + Var[Y] = Var[X] + 3*Var[Y]
Std[X + Y + Y + Y] = sqrt(Var[X + Y + Y + Y])
sqrt(mymat3["X, In Box","Var"] + 3*mymat3["Y, Scooped","Var"])
## [1] 1.089725
mymat3["X, In Box","mean"] - mymat3["Y, Scooped","mean"]
## [1] 46
Treating X and Y as independent, Var[X - Y] = Var[X] + Var[Y] (variances add even when the variables are subtracted)
Std[X - Y] = sqrt(Var[X - Y])
sqrt(mymat3["X, In Box","Var"] + mymat3["Y, Scooped","Var"])
## [1] 1.030776
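The variance rules used above can be sanity-checked by simulation. The sketch below assumes the amounts are normally distributed and independent; the problem only gives means and standard deviations, so the normal shape, the seed, and the sample size are all assumptions made for illustration.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 1e5      # number of simulated boxes

# ASSUMPTION: normal distributions (not stated in the problem)
box    <- rnorm(n, mean = 48, sd = 1)
scoop1 <- rnorm(n, mean = 2, sd = 0.25)
scoop2 <- rnorm(n, mean = 2, sd = 0.25)
scoop3 <- rnorm(n, mean = 2, sd = 0.25)

# Box plus three independent scoops: compare with 54 and 1.089725
total <- box + scoop1 + scoop2 + scoop3
c(mean = mean(total), sd = sd(total))

# Box minus one scoop: compare with 46 and 1.030776
remaining <- box - scoop1
c(mean = mean(remaining), sd = sd(remaining))
```

Three separate scoops are drawn on purpose. Using 3 * scoop1 instead would give a variance of 9*Var[Y] rather than 3*Var[Y], which is not the quantity computed above.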