Assignment_Week_2

$Homework 2$

1.(a) Rolling two dice gives the following 36 equally likely outcomes:

$(1,1), (1,2), (1,3), (1,4), (1,5), (1,6)$

$(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)$

$(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)$

$(4,1), (4,2), (4,3), (4,4), (4,5), (4,6)$

$(5,1), (5,2), (5,3), (5,4), (5,5), (5,6)$

$(6,1), (6,2), (6,3), (6,4), (6,5), (6,6)$

Since the lowest possible sum is 2, the probability of getting a sum of 1 is 0.

1.(b) Getting a sum of 5: from the above matrix of all possible outcomes, the favourable outcomes are (1,4), (4,1), (2,3) and (3,2), i.e. 4 out of 36 total outcomes. Therefore, probability of sum = 5 is $\frac{4}{36} = \frac{1}{9}$.

1.(c) Getting a sum of 12: from the above matrix of all possible outcomes, the only favourable outcome is (6,6), i.e. 1 out of 36 total outcomes. Therefore, probability of sum = 12 is $\frac{1}{36}$.
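
The counting in parts (a) to (c) can be double-checked by enumerating the outcomes. The following is a minimal sketch, assuming two fair six-sided dice; the helper name `prob_sum` is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob_sum(target):
    """Exact probability that the two dice add up to `target`."""
    favourable = [o for o in outcomes if sum(o) == target]
    return Fraction(len(favourable), len(outcomes))


for target in (1, 5, 12):
    print(f"P(sum = {target}) = {prob_sum(target)}")
```

Using `Fraction` keeps the results exact, so they can be compared directly with the reduced fractions above.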

2.(a) % of students missing exactly 1 day = 25%; % of students missing exactly 2 days = 15%; % of students missing 3 or more days = 28%. Therefore % of students missing at least 1 day = 25 + 15 + 28 = 68%, and % of students not missing any day = 100 - 68 = 32%. Thus 0.32 is the probability that a student chosen at random has not missed a single day of school due to sickness.

(b) % of students missing exactly 1 day = 25% and % of students not missing any day = 32%. Therefore, the probability of a student missing no more than 1 day = 0.25 + 0.32 = 0.57.

(c) As per the explanation in (a): % of students missing exactly 1 day = 25%; % of students missing exactly 2 days = 15%; % of students missing 3 or more days = 28%. Therefore the probability of a student missing at least 1 day = 0.25 + 0.15 + 0.28 = 0.68.

(d) Probability that neither kid will miss any school:

(prob. kid 1 not missing) × (prob. kid 2 not missing) = 0.32 × 0.32 = 0.1024

The assumptions being made to answer this question are as follows: (i) The students of the elementary school that kid1 and kid2 attend miss school at the same rates as the total student population of DeKalb County does on average. (ii) The given percentages of students missing school each year were arrived at by observing attendance over 3-4 years, so that they account for spikes in sickness caused by the prevalence of a specific illness in a given year. (iii) Kid1 and kid2 have the same level of immunity as the average student in DeKalb County and will therefore have the same probability of missing school. In addition, multiplying the two probabilities assumes that the two kids' absences are independent of each other.

(e) Both kids missing some school: (prob. kid 1 missing at least 1 day) × (prob. kid 2 missing at least 1 day) = 0.68 × 0.68 = 0.4624, under the same assumptions as above.
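
The arithmetic for the absence probabilities can be reproduced with a few lines. This is only a sketch: the three input percentages are the ones quoted above, the variable names are made up for illustration, and the last two lines assume that the two kids' absences are independent.

```python
# Shares of students by days missed, as quoted in the text.
p_one = 0.25          # exactly 1 day
p_two = 0.15          # exactly 2 days
p_three_plus = 0.28   # 3 or more days

p_at_least_one = p_one + p_two + p_three_plus   # at least 1 day missed
p_none = 1 - p_at_least_one                     # no days missed
p_at_most_one = p_none + p_one                  # no more than 1 day missed

# Assumption: the two kids miss school independently of each other,
# so the joint probabilities are simple products.
p_neither_misses = p_none ** 2
p_both_miss = p_at_least_one ** 2

print(round(p_none, 4), round(p_at_most_one, 4), round(p_at_least_one, 4))
print(round(p_neither_misses, 4), round(p_both_miss, 4))
```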

(f) The reasonableness of the assumptions:

assumption (i): whether the students at the given elementary school behave like the average student in the county depends on the share of the county's students who attend this particular school. The larger that share, the better the case for applying the law of large numbers, and the more reasonable assumption (i) is.

assumption (ii): given the scope of the question, it seems reasonable to assume that year-on-year attendance behavior is stable.

assumption (iii): since no other information is given, it seems reasonable that kid1 and kid2 have a similar level of immunity to the larger student population of the county in the same age group.

Being in excellent health and having health coverage are not mutually exclusive: the distribution clearly shows that around 21% of respondents both have “excellent” health status and have health coverage.

Probability that a randomly chosen individual has excellent health: as per the table, the probability of an individual having excellent health status and no medical coverage is 0.0230, and the probability of an individual having excellent health status and medical coverage is 0.2099.

Therefore, the probability that a randomly chosen individual has excellent health is P(E) = 0.0230 + 0.2099 = 0.2329.

Probability that a randomly chosen individual has excellent health given that the individual has health coverage:

Probability that a randomly chosen individual has coverage:

P(C) = 0.2099 + 0.3123 + 0.2410 + 0.0817 + 0.0289 = 0.8738 and $P(E\cap C) = 0.2099$. Therefore, P(E|C) = 0.2099 / 0.8738 = 0.2402.

Probability that a randomly chosen individual has excellent health given that the individual does not have health coverage: P(NC) = 0.0230 + 0.0364 + 0.0427 + 0.0192 + 0.0050 = 0.1263 and $P(E\cap NC) = 0.0230$. Therefore, P(E|NC) = 0.0230 / 0.1263 = 0.1821.

Independence of excellent health and having health coverage:

If the events excellent health (E) and having health coverage (C) are independent, then $P(E\cap C) = P(E) \times P(C)$.

As per the provided table, $P(E\cap C) = 0.2099$, while P(E) = 0.2329 and P(C) = 0.8738 give $P(E) \times P(C) = 0.2035$. Since 0.2099 is not equal to 0.2035, E and C are not independent. Equivalently, P(E|C) = 0.2402 differs from P(E) = 0.2329.
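
The quantities used in this comparison can be recomputed as follows. This is a rough sketch that takes only the joint and marginal probabilities quoted in the text as inputs; the variable names are illustrative, and the code reports the gap between $P(E\cap C)$ and $P(E) \times P(C)$ without deciding how small a gap should count as negligible.

```python
# Probabilities quoted in the text above.
p_e_and_c = 0.2099    # excellent health AND coverage
p_e_and_nc = 0.0230   # excellent health AND no coverage
p_c = 0.8738          # coverage
p_nc = 0.1263         # no coverage

p_e = p_e_and_c + p_e_and_nc        # marginal P(E)
p_e_given_c = p_e_and_c / p_c       # P(E | C)
p_e_given_nc = p_e_and_nc / p_nc    # P(E | NC)

# Under independence P(E and C) would equal P(E) * P(C).
product_of_marginals = p_e * p_c
gap = p_e_and_c - product_of_marginals

print(f"P(E)       = {p_e:.4f}")
print(f"P(E|C)     = {p_e_given_c:.4f}")
print(f"P(E|NC)    = {p_e_given_nc:.4f}")
print(f"P(E)P(C)   = {product_of_marginals:.4f}")
print(f"difference = {gap:.4f}")
```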

Probability of voting in favour of Scott Walker(SW), P(S) = 0.53 Probability of having college degree given voting for SW, P(CD|S) =0.37 Probability of having college degree given not voting for SW, P(CD|S’) =0.44

to calculate probability of having voted in favour of SW given the voter had a college degree:

Applying Bayes' theorem:

$P(S|CD) = \frac{P(CD|S)P(S)}{P(CD)}$

where $P(CD) = P(CD|S)P(S) + P(CD|S')P(S') = 0.37 \times 0.53 + 0.44 \times (1 - 0.53) = 0.1961 + 0.2068 = 0.4029$.

Therefore, $P(S|CD) = \frac{0.37 \times 0.53}{0.4029} = 0.4867$.
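
The same Bayes' theorem calculation can be written as a small function. This is a minimal sketch; the function name `posterior` and its parameter names are introduced here for illustration, and the three inputs are the probabilities given above.

```python
def posterior(prior, likelihood, likelihood_complement):
    """Bayes' theorem for a binary event S and an observation CD.

    prior                 -- P(S)
    likelihood            -- P(CD | S)
    likelihood_complement -- P(CD | not S)
    Returns P(S | CD).
    """
    # Law of total probability gives the denominator P(CD).
    evidence = likelihood * prior + likelihood_complement * (1 - prior)
    return likelihood * prior / evidence


p_s_given_cd = posterior(prior=0.53, likelihood=0.37, likelihood_complement=0.44)
print(round(p_s_given_cd, 4))
```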

(a) Probability of drawing a hardcover (HC) book first, P(HC) = 28/95. Probability of drawing a fiction (F) book second without replacement, P(F) = 59/94. Joint probability of these two events = P(HC) × P(F) = 0.18499.

(b) P(F) = 59/95, then P(HC) without replacement = 28/94, so P(F) × P(HC) = 0.18499.

(c) P(F) = 59/95, then P(HC) with replacement = 28/95, so P(F) × P(HC) = 0.183.

Since in the first step of both (b) and (c) we draw just 1 book from the shelf, not putting it back raises the probability of then selecting an HC book by only 28/94 - 28/95 = 0.0031, which is small. Hence the overall probabilities in scenarios (b) and (c) barely differ. Had we removed a sizeable proportion of the 95 books in the first step of each scenario, we would have seen a major difference in the overall probabilities.
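
The effect of replacement can be seen by computing both products exactly. The sketch below uses only the counts that appear in the text (95 books, 28 hardcover, 59 for the fiction draw) and, like the text, assumes the number of hardcover books is unchanged after the first draw; the variable names are illustrative.

```python
from fractions import Fraction

total_books = 95
hardcover = 28
fiction = 59   # count used for the fiction draw in the text

p_first = Fraction(fiction, total_books)

# Second draw: hardcover, with the first book either kept out or put back.
p_without_replacement = p_first * Fraction(hardcover, total_books - 1)
p_with_replacement = p_first * Fraction(hardcover, total_books)

print(float(p_without_replacement))
print(float(p_with_replacement))
print(float(p_without_replacement - p_with_replacement))
```

Removing a single book changes the denominator only from 95 to 94, which is why the two results are so close.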

6.(a) Probability of drawing a number card (2-10) = $\frac{9 \times 4}{52} = \frac{9}{13}$; probability of drawing a face card = $\frac{3 \times 4}{52} = \frac{3}{13}$; probability of drawing any ace except the ace of clubs = $\frac{3}{52}$; probability of drawing the ace of clubs = $\frac{1}{52}$.

Expected wins = P(2-10) × 0 + P(face card) × 3 + P(ace except clubs) × 5 + P(ace of clubs) × 25

$= \frac{9}{13}(0) + \frac{3}{13}(3) + \frac{3}{52}(5) + \frac{1}{52}(25) = \frac{76}{52} \approx 1.46$

Thus his expected winnings are 1.46 dollars.

Given that he has to pay 2 dollars to participate and expects to win only 1.46 dollars, it is advisable for him not to play the game, as he is expected to make a loss of 1.46 - 2 = -0.54 dollars per game.
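
The expected value above can be verified with exact fractions. This is a small sketch using the probabilities and payouts stated in the text; the list name `payouts` and the constant `ENTRY_FEE` are introduced here for illustration.

```python
from fractions import Fraction

ENTRY_FEE = 2  # dollars paid to play, as stated in the text

# (probability, winnings in dollars) for each kind of card.
payouts = [
    (Fraction(36, 52), 0),    # number cards 2-10
    (Fraction(12, 52), 3),    # face cards
    (Fraction(3, 52), 5),     # aces other than the ace of clubs
    (Fraction(1, 52), 25),    # ace of clubs
]

# The four cases cover the whole deck, so the probabilities sum to 1.
assert sum(p for p, _ in payouts) == 1

expected_winnings = sum(p * w for p, w in payouts)
expected_net = expected_winnings - ENTRY_FEE

print(expected_winnings, float(expected_winnings))
print(expected_net, float(expected_net))
```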

$\mu_b$ = mean weight of ice-cream in a box, $\mu_s$ = mean weight of ice-cream in a scoop, $\sigma_b$ = standard deviation of the weight of ice-cream in a box, $\sigma_s$ = standard deviation of the weight of ice-cream in a scoop.

Expected ice-cream served with 1 box + 3 scoops = $\mu_b + 3\mu_s = 48 + 3(2) = 54$ fl oz.

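Standard deviation of the amount served, treating the box and the three scoops as independent so that each scoop contributes its own variance $\sigma_s^2$: $\sqrt{\sigma_b^2 + 3\sigma_s^2} = \sqrt{1^2 + 3(0.25)^2} = \sqrt{1.1875} \approx 1.09$ fl oz.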

Expected ice-cream in the box after scooping out 1 scoop = $\mu_b - \mu_s = 48 - 2 = 46$ fl oz.

SD = $\sqrt{\sigma_b^2 + \sigma_s^2} = \sqrt{1^2 + (0.25)^2} \approx 1.0308$ fl oz.

To see why the variances add, assume that the box holds 48 fl oz of ice-cream and that 1 scoop is scooped out in each of the following trials.

Since each scooped amount is a random draw from a probability distribution defined by a mean and a standard deviation, the amount scooped can be more or less than the mean scoop weight. This can be demonstrated as below:

trial 1 - amt scooped out = $\mu_s + 0.5\sigma_s$ = 2 + 0.5 × 0.25 = 2.125

trial 2 - amt scooped out = $\mu_s - 0.5\sigma_s$ = 2 - 0.5 × 0.25 = 1.875

Thus the remaining amount in the box will be: trial 1: 48 - 2.125 = 45.875; trial 2: 48 - 1.875 = 46.125

mean after trials 1 & 2 = 46; SD after trials 1 & 2 = 0.1768

Similarly, if the scoop were added to the box instead of being scooped out: trial 3: 48 + 2.125 = 50.125; trial 4: 48 + 1.875 = 49.875

mean after trials 3 & 4 = 50; SD after trials 3 & 4 = 0.1768

Thus in both cases (removing a scoop in trials 1 and 2, adding a scoop in trials 3 and 4) the spread of the weight in the box is the same; only the mean is higher in the second case, as expected. Hence, irrespective of whether a random variable (in this case the scoop) is added or subtracted, it brings more variability into the result, and therefore we always add variances.
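
The claim that variances add for both sums and differences can also be checked by simulation. The sketch below is illustrative only: it assumes the box and scoop amounts are independent and normally distributed (the problem only gives means and standard deviations), and the sample size and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU_B, SD_B = 48.0, 1.0    # box: mean and SD in fl oz
MU_S, SD_S = 2.0, 0.25    # scoop: mean and SD in fl oz
N = 50_000                # number of simulated box/scoop pairs

# Assumption: independent normal draws for box and scoop amounts.
boxes = [random.gauss(MU_B, SD_B) for _ in range(N)]
scoops = [random.gauss(MU_S, SD_S) for _ in range(N)]

removed = [b - s for b, s in zip(boxes, scoops)]   # scoop taken out
added = [b + s for b, s in zip(boxes, scoops)]     # scoop put in

sd_removed = statistics.pstdev(removed)
sd_added = statistics.pstdev(added)
sd_theory = (SD_B ** 2 + SD_S ** 2) ** 0.5         # sqrt of summed variances

print(f"mean after removing: {statistics.fmean(removed):.3f}")
print(f"mean after adding:   {statistics.fmean(added):.3f}")
print(f"SD after removing:   {sd_removed:.4f}")
print(f"SD after adding:     {sd_added:.4f}")
print(f"theoretical SD:      {sd_theory:.4f}")
```

Both simulated standard deviations should land close to the theoretical value, while the two means differ by roughly two scoops' worth.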

Assignment_Week_2_DS

Chitrarth Kaushik

5 January 2019
