16.7 Exercises
1. In 1999, in England, Sally Clark was found guilty of the murder of two of her sons. Both infants were found dead in the morning, one in 1996 and another in 1998. In both cases, she claimed the cause of death was sudden infant death syndrome (SIDS). No evidence of physical harm was found on the two infants, so the main piece of evidence against her was the testimony of Professor Sir Roy Meadow, who testified that the chance of two infants dying of SIDS was 1 in 73 million. He arrived at this figure by finding that the rate of SIDS was 1 in 8,500 and then calculating that the chance of two SIDS cases was 1 in \(8,500 \times 8,500 \approx 73 \text{ million}\). Which of the following do you agree with?
Sir Meadow assumed that the probability of the second son being affected by SIDS was independent of the first son being affected, thereby ignoring possible genetic causes. If genetics plays a role, then \(\Pr(\text{2nd case of SIDS} \mid \text{1st case of SIDS}) > \Pr(\text{1st case of SIDS})\).
2. Let’s assume that there is in fact a genetic component to SIDS and that \(\Pr(\text{2nd case of SIDS} \mid \text{1st case of SIDS}) = 1/100\), which is much higher than 1 in 8,500. What is the probability of both of her sons dying of SIDS?
PR1<-1/8500
PR2<-1/100
PR1*PR2
## [1] 1.176471e-06
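To see how much the independence assumption matters, the two calculations can be compared side by side. This is a minimal sketch; the variable names are illustrative, and the rates are the ones quoted in exercises 1 and 2.

```r
# Rate of a single SIDS case, as quoted in the exercise.
p_first <- 1 / 8500

# Independence assumption: the second case has the same rate as the first.
p_both_independent <- p_first * p_first

# Dependence assumption from exercise 2: Pr(2nd | 1st) = 1/100.
p_second_given_first <- 1 / 100
p_both_dependent <- p_first * p_second_given_first

# How many times more likely two SIDS deaths are once dependence is allowed.
ratio <- p_both_dependent / p_both_independent

1 / p_both_independent  # "1 in N" under independence
1 / p_both_dependent    # "1 in N" under dependence
ratio
```

Under these numbers, two SIDS deaths are 85 times more likely when the second case is allowed to depend on the first.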
3. Many press reports stated that the expert claimed the probability of Sally Clark being innocent was 1 in 73 million. Perhaps the jury and judge also interpreted the testimony this way. This probability can be written as the probability that a mother is a son-murdering psychopath given that two of her children are found dead with no evidence of physical harm. According to Bayes’ rule, what is this?
With \(a\) = the mother is a murderer and \(b\) = two children are found dead with no evidence of harm:
\(\Pr(a \mid b) = \frac{\Pr(b \mid a)\Pr(a)}{\Pr(b)}\)
# Probability that the 1st son dies of SIDS.
Pr1<-1/8500
# Probability that the 2nd son dies of SIDS, given that the 1st did.
Pr2<-1/100
# Pr(b): probability that both sons die with no evidence of harm,
# approximated here by the probability that both die of SIDS.
PrB<-Pr1*Pr2
# Probability that the sons do not both die of SIDS (not used below).
PrnotB<-1-PrB
# Pr(b|a): probability that a son-murdering mother finds a way to kill her
# children without leaving evidence of physical harm.
PrB_given_A<-.5
# Pr(a): probability that a mother is a murderer.
PrA<-1/10^6
# Pr(a|b) by Bayes' rule.
PrA_given_B<-PrB_given_A*PrA/PrB
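The calculation above uses the probability of two SIDS deaths as the denominator \(\Pr(b)\). A variant is to build the denominator with the law of total probability, so that it also counts deaths caused by a murdering mother. The sketch below does that under one added assumption, which is not in the exercise: for a non-murdering mother, two unexplained deaths happen only through SIDS. The variable names are illustrative.

```r
# Pr(a): prior probability that a mother is a murderer (from the exercise).
pr_murderer <- 1 / 10^6
# Pr(b | a): two deaths with no evidence of harm, given a murderer (from the exercise).
pr_deaths_given_murderer <- 0.5
# Assumption for this sketch: Pr(b | not a) equals the probability of two
# SIDS deaths from exercise 2.
pr_deaths_given_innocent <- (1 / 8500) * (1 / 100)

# Law of total probability for the denominator of Bayes' rule.
pr_deaths <- pr_deaths_given_murderer * pr_murderer +
  pr_deaths_given_innocent * (1 - pr_murderer)

# Pr(a | b): probability the mother is a murderer given the two deaths.
pr_murderer_given_deaths <- pr_deaths_given_murderer * pr_murderer / pr_deaths
pr_murderer_given_deaths
```

Under these assumptions the result is roughly 0.3, compared with 0.425 from the simpler denominator. Either way, the answer is many orders of magnitude away from 1 in 73 million.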
5. After Sally Clark was found guilty, the Royal Statistical Society issued a statement saying that there was “no statistical basis” for the expert’s claim. They expressed concern at the “misuse of statistics in the courts”. Eventually, Sally Clark was acquitted in June 2003. What did the expert miss? He made two mistakes. First, he misused the multiplication rule by treating the two deaths as independent. Second, he did not take into account how rare it is for a mother to murder her children. After using Bayes’ rule, we found a probability closer to 0.5 than to 1 in 73 million.
6. Florida is one of the most closely watched states in the U.S. election because it has many electoral votes, its elections are generally close, and it tends to be a swing state that can vote either way. Create the following table with the polls taken during the last two weeks. Take the average spread of these polls. The CLT tells us this average is approximately normal. Calculate an average and provide an estimate of the standard error. Save your results in an object called results.
library(tidyverse)
library(dslabs)
data(polls_us_election_2016)
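# Florida polls ending on or after 2016-11-04, with the spread as a proportion.
polls<-polls_us_election_2016|>filter(state == "Florida" & enddate >= "2016-11-04")|> mutate(spread=rawpoll_clinton/100 - rawpoll_trump/100)
# Average spread and its estimated standard error.
results<-summarize(polls, avg=mean(spread), se=sd(spread)/sqrt(n()))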
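As an illustration of what the average and standard error give us, the sketch below builds an approximate 95% interval from them. The spreads are hypothetical values chosen for the example, not the Florida polls, and with so few observations the normal approximation is rough.

```r
# Hypothetical poll spreads (Clinton minus Trump, as proportions); not real data.
spread <- c(0.02, -0.01, 0.01, 0.00, 0.03, -0.02, 0.01)

avg <- mean(spread)
se <- sd(spread) / sqrt(length(spread))  # standard error of the average

# Approximate 95% interval for the average spread under the CLT.
ci <- avg + c(-1, 1) * qnorm(0.975) * se
c(avg = avg, se = se, lower = ci[1], upper = ci[2])
```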
8. The CLT tells us that our estimate of the spread \(\hat{d}\) has a normal distribution with expected value \(d\) and standard deviation \(\sigma\) calculated in problem 6. Use the formulas we showed for the posterior distribution to calculate the expected value of the posterior distribution if we set \(\mu=0\) and \(\tau=0.01\). The code below also computes the posterior standard error, a 95% credible interval, and the probability that the spread is at or below 0.
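For reference, the posterior formulas used in the code can be written as follows, with \(Y\) standing for the observed average spread (the `y` in the code) and \(\sigma\) for its standard error. The notation is chosen here to match the variable names and may differ from the textbook's.

\[
B = \frac{\sigma^2}{\sigma^2 + \tau^2}, \qquad
\mathrm{E}(d \mid Y) = B\mu + (1 - B)\,Y, \qquad
\mathrm{SE}(d \mid Y) = \sqrt{\frac{1}{1/\sigma^2 + 1/\tau^2}}
\]

When \(\tau\) is small relative to \(\sigma\), \(B\) is close to 1 and the posterior mean is pulled toward the prior mean \(\mu\); when \(\tau\) is large, \(B\) is close to 0 and the posterior mean stays near \(Y\).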
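# Prior mean and standard deviation.
mu<-0
tau<-.01
# Standard error and average spread from problem 6.
std<-results[1,2]
y<-results[1,1]
# Posterior expected value.
B<-std^2/(std^2+tau^2)
expd<-B*mu+(1-B)*y
# Posterior standard error.
se<-(1/(1/std^2+1/tau^2))^.5
# 95% credible interval.
lower<-expd-qnorm(.975)*se
upper<-expd+qnorm(.975)*se
ci<-c(lower, upper)
# Posterior probability that the spread is at or below 0.
pnorm(0, expd, se)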
## [1] 0.3203769
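The same steps can be wrapped in one function so they are easy to rerun with different priors. This is a sketch only: `posterior_summary` is a hypothetical helper name, and the inputs in the example call are made-up values rather than the Florida results.

```r
# Posterior summary for a normal prior N(mu, tau^2) on the spread d and a
# normal sampling model N(d, sigma^2) for the observed average y.
posterior_summary <- function(y, sigma, mu = 0, tau = 0.01) {
  B <- sigma^2 / (sigma^2 + tau^2)  # weight placed on the prior mean
  post_mean <- B * mu + (1 - B) * y
  post_se <- sqrt(1 / (1 / sigma^2 + 1 / tau^2))
  ci <- post_mean + c(-1, 1) * qnorm(0.975) * post_se
  list(
    mean = post_mean,
    se = post_se,
    lower = ci[1],
    upper = ci[2],
    prob_d_below_0 = pnorm(0, post_mean, post_se)
  )
}

# Hypothetical inputs, not the Florida results.
example <- posterior_summary(y = 0.004, sigma = 0.007)
example
```

The posterior mean always falls between the prior mean and the observed average, and the posterior standard error is smaller than both `sigma` and `tau`.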
Use the sapply function to change the prior variance from seq(0.005, 0.05, len=100) and observe how the probability changes by making a plot.
library(ggplot2)
library(dplyr)
library(dslabs)
data(polls_us_election_2016)
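# Florida polls ending on or after 2016-11-04, with the spread as a proportion.
polls<-polls_us_election_2016|>filter(state == "Florida" & enddate >= "2016-11-04")|> mutate(spread=rawpoll_clinton/100 - rawpoll_trump/100)
# Average spread and its estimated standard error.
results<-summarize(polls, avg=mean(spread), se=sd(spread)/sqrt(n()))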
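# Candidate prior standard deviations.
mtaus<-seq(.005, .05, len=100)
mu<-0
sig<-results[1, 2]
y<-results[1, 1]
# Posterior probability that the spread is at or below 0 for a given tau.
pcalc<-function(tau) {
  B<-sig^2/(sig^2+tau^2)
  se<-sqrt(1/(1/(sig^2)+1/tau^2))
  expd<-B*mu+(1-B)*y
  pnorm(0, expd, se)
}
# Apply pcalc to each tau, as the exercise asks.
ps<-sapply(mtaus, pcalc)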
chancedf<-data.frame(mtaus, ps)
chancedf
## mtaus ps
## 1 0.005000000 0.3715680
## 2 0.005454545 0.3643095
## 3 0.005909091 0.3577231
## 4 0.006363636 0.3517554
## 5 0.006818182 0.3463529
## 6 0.007272727 0.3414632
## 7 0.007727273 0.3370371
## 8 0.008181818 0.3330285
## 9 0.008636364 0.3293949
## 10 0.009090909 0.3260978
## 11 0.009545455 0.3231022
## 12 0.010000000 0.3203769
## 13 0.010454545 0.3178937
## 14 0.010909091 0.3156276
## 15 0.011363636 0.3135562
## 16 0.011818182 0.3116598
## 17 0.012272727 0.3099206
## 18 0.012727273 0.3083230
## 19 0.013181818 0.3068530
## 20 0.013636364 0.3054982
## 21 0.014090909 0.3042476
## 22 0.014545455 0.3030914
## 23 0.015000000 0.3020207
## 24 0.015454545 0.3010277
## 25 0.015909091 0.3001054
## 26 0.016363636 0.2992476
## 27 0.016818182 0.2984486
## 28 0.017272727 0.2977034
## 29 0.017727273 0.2970073
## 30 0.018181818 0.2963564
## 31 0.018636364 0.2957469
## 32 0.019090909 0.2951756
## 33 0.019545455 0.2946393
## 34 0.020000000 0.2941353
## 35 0.020454545 0.2936613
## 36 0.020909091 0.2932148
## 37 0.021363636 0.2927939
## 38 0.021818182 0.2923967
## 39 0.022272727 0.2920215
## 40 0.022727273 0.2916667
## 41 0.023181818 0.2913309
## 42 0.023636364 0.2910129
## 43 0.024090909 0.2907113
## 44 0.024545455 0.2904251
## 45 0.025000000 0.2901533
## 46 0.025454545 0.2898950
## 47 0.025909091 0.2896493
## 48 0.026363636 0.2894154
## 49 0.026818182 0.2891925
## 50 0.027272727 0.2889800
## 51 0.027727273 0.2887774
## 52 0.028181818 0.2885839
## 53 0.028636364 0.2883990
## 54 0.029090909 0.2882223
## 55 0.029545455 0.2880533
## 56 0.030000000 0.2878915
## 57 0.030454545 0.2877366
## 58 0.030909091 0.2875881
## 59 0.031363636 0.2874458
## 60 0.031818182 0.2873092
## 61 0.032272727 0.2871781
## 62 0.032727273 0.2870523
## 63 0.033181818 0.2869313
## 64 0.033636364 0.2868150
## 65 0.034090909 0.2867032
## 66 0.034545455 0.2865956
## 67 0.035000000 0.2864920
## 68 0.035454545 0.2863922
## 69 0.035909091 0.2862961
## 70 0.036363636 0.2862034
## 71 0.036818182 0.2861140
## 72 0.037272727 0.2860278
## 73 0.037727273 0.2859445
## 74 0.038181818 0.2858641
## 75 0.038636364 0.2857865
## 76 0.039090909 0.2857115
## 77 0.039545455 0.2856389
## 78 0.040000000 0.2855688
## 79 0.040454545 0.2855009
## 80 0.040909091 0.2854353
## 81 0.041363636 0.2853717
## 82 0.041818182 0.2853101
## 83 0.042272727 0.2852505
## 84 0.042727273 0.2851927
## 85 0.043181818 0.2851367
## 86 0.043636364 0.2850823
## 87 0.044090909 0.2850296
## 88 0.044545455 0.2849785
## 89 0.045000000 0.2849289
## 90 0.045454545 0.2848807
## 91 0.045909091 0.2848340
## 92 0.046363636 0.2847885
## 93 0.046818182 0.2847444
## 94 0.047272727 0.2847015
## 95 0.047727273 0.2846598
## 96 0.048181818 0.2846192
## 97 0.048636364 0.2845798
## 98 0.049090909 0.2845414
## 99 0.049545455 0.2845040
## 100 0.050000000 0.2844677
plot(chancedf$mtaus, chancedf$ps, xlab="tau (prior standard deviation)", ylab="Pr(d <= 0 | data)")
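The table shows the probability falling and then flattening as tau grows. One way to read this is that a very diffuse prior contributes almost nothing, so the posterior approaches the answer based on the data alone. The sketch below checks that reading with hypothetical values for the average spread and its standard error (not the Florida values); `prob_below_0` is an illustrative name.

```r
# Hypothetical observed average spread and standard error; not the Florida values.
y <- 0.004
sigma <- 0.007
mu <- 0

# Posterior probability that the spread is at or below 0 for a prior SD tau.
prob_below_0 <- function(tau) {
  B <- sigma^2 / (sigma^2 + tau^2)
  se <- sqrt(1 / (1 / sigma^2 + 1 / tau^2))
  pnorm(0, B * mu + (1 - B) * y, se)
}

taus <- seq(0.005, 0.05, len = 100)
ps <- sapply(taus, prob_below_0)

# Data-only answer, which the posterior approaches as tau grows.
limit <- pnorm(0, y, sigma)
range(ps)
limit
```

With a positive average spread and a prior centered at 0, the probability decreases steadily in tau and stays above the data-only value.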