Mammography and breast cancer.
Let the prevalence be 1%. If you have a positive mammogram, what's the probability that you have breast cancer?
Just the Bayesian Approach
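The mammography question needs two numbers beyond the prevalence. As a minimal sketch, assume a sensitivity of 90% and a false-positive rate of 9%. Both values are illustrative assumptions, not figures from these notes, and `posterior_positive` is a hypothetical helper name. Bayes' rule then gives the probability of cancer given a positive test:

```r
# Posterior probability of disease given a positive test (Bayes' rule).
# prevalence = 0.01 comes from the notes; sensitivity and false_pos are ASSUMED.
posterior_positive <- function(prevalence, sensitivity, false_pos) {
  true_pos    <- prevalence * sensitivity        # P(cancer and positive test)
  false_alarm <- (1 - prevalence) * false_pos    # P(no cancer and positive test)
  true_pos / (true_pos + false_alarm)            # P(cancer | positive test)
}

p <- posterior_positive(prevalence = 0.01, sensitivity = 0.90, false_pos = 0.09)
round(p, 3)   # 0.092
```

Under these assumed rates, fewer than one in ten positive mammograms corresponds to cancer, because the healthy 99% of the population generates most of the positives.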
How to travel to the different worlds …
We've already done several examples using resampling — the idea is to see how much sampling variation there would be if the world were just like our sample.
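As a reminder of what a resampling trial does, here is a rough base-R sketch. The data are simulated stand-ins (the names `dat`, `x`, `y`, and `one_trial` are hypothetical, not from the course data), and `sample()` with replacement plays the role of a resampling helper:

```r
set.seed(1)                       # for a reproducible illustration only
n <- 200
x <- rnorm(n)                     # simulated explanatory variable (hypothetical)
y <- 2 + 0.5 * x + rnorm(n)       # simulated response (hypothetical)
dat <- data.frame(x = x, y = y)

# One resampling trial: draw n rows with replacement, refit, keep the slope.
one_trial <- function(d) {
  rows <- sample(nrow(d), replace = TRUE)
  coef(lm(y ~ x, data = d[rows, ]))[["x"]]
}

slopes <- replicate(1000, one_trial(dat))
sd(slopes)                        # resampling estimate of the slope's standard error
```

The spread of `slopes` is the sampling variation we would expect if the population looked just like the sample.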
Now to do shuffling.
Work with model coefficients and \( R^2 \) from a few models, with shuffling of an explanatory variable, or shuffling of the response.
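To make the shuffling idea concrete, the following is a sketch in base R on simulated data. The variables `g` and `y` and the helper `r2` are hypothetical, and `sample()` without replacement stands in for a shuffle. It permutes an explanatory variable, refits, and records \( R^2 \) each time:

```r
set.seed(2)
n <- 200
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))  # hypothetical grouping variable
y <- rnorm(n) + ifelse(g == "c", 0.8, 0)                  # hypothetical response with a built-in group effect
dat <- data.frame(g = g, y = y)

r2 <- function(d) summary(lm(y ~ g, data = d))$r.squared

observed <- r2(dat)

# Shuffle the explanatory variable: sample() without replacement permutes it,
# which breaks any link between g and y while keeping both sets of values.
null_r2 <- replicate(1000, {
  d <- dat
  d$g <- sample(d$g)
  r2(d)
})

mean(null_r2 >= observed)   # fraction of shuffled trials with R^2 at least as large as observed
```

With a single explanatory variable, shuffling the response instead gives the same null distribution. Once covariates are in the model the two choices differ, because shuffling one explanatory variable leaves the response's relationship to the other terms intact.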
Example: Sector of the economy and wages
mod = lm(wage ~ sector * sex + educ + exper, data = CPS85)
real = r.squared(mod)
real
## [1] 0.3113
Now go to Planet Null
s = do(1000) * lm(wage ~ shuffle(sector) * sex + educ + exper, data = CPS85)
densityplot(~r.squared, data = s)
with(s, mean(r.squared >= real)) # proportion of shuffled trials with R^2 at least as large as the real one
What's the p-value?
Do the same on a coefficient and look at the two-tailed test.
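One possible way to set up the two-tailed version, sketched in base R on simulated data (the variables `x`, `y`, `z` and the helper `slope_x` are hypothetical): compare absolute values, so that shuffled coefficients far from zero in either direction count against the null.

```r
set.seed(3)
n <- 150
z <- rnorm(n)                            # hypothetical covariate
x <- rnorm(n)                            # hypothetical explanatory variable of interest
y <- 1 + 0.7 * z + 0.5 * x + rnorm(n)    # hypothetical response
dat <- data.frame(x, y, z)

slope_x <- function(d) coef(lm(y ~ x + z, data = d))[["x"]]
observed <- slope_x(dat)

# Shuffle only x; z keeps its real relationship with y.
null_slopes <- replicate(1000, slope_x(transform(dat, x = sample(x))))

# Two-tailed p-value: shuffled slopes at least as far from zero as the observed one.
p_two <- mean(abs(null_slopes) >= abs(observed))
p_two
```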
fetchData("getDJIAdata.R")
## Retrieving from http://www.mosaic-web.org/go/datasets/getDJIAdata.R
## [1] TRUE
djia = getDJIAdata() # djia-2011.csv is the basic file
## Retrieving from http://www.mosaic-web.org/go/datasets/djia-2011.csv
xyplot(Close ~ Date, data = djia)
Look at the day-to-day differences in log prices:
dd = with(djia, diff(log(Close)))
mean(dd)
## [1] 0.000191
Subtract out the mean, shuffle, take the cumulative sum, and exponentiate to create a realization:
ddnull = dd - mean(dd)
sim = exp(cumsum(shuffle(ddnull)))
xyplot(sim ~ Date[-1], data = djia) # diff() returns one fewer value than Close, so drop the first date
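One property of this construction is worth seeing directly: because the de-meaned differences sum to zero in any order, every shuffled path ends at a relative price of 1. A small base-R sketch illustrates this, using simulated log-returns as a stand-in (the values of `dd` here are generated, not DJIA data) and `sample()` in place of a shuffle:

```r
set.seed(4)
dd <- rnorm(250, mean = 0.0002, sd = 0.01)  # stand-in for diff(log(Close)); simulated, NOT DJIA data
ddnull <- dd - mean(dd)                     # remove the drift

# Five realizations; sample() without replacement reorders the daily changes.
sims <- replicate(5, exp(cumsum(sample(ddnull))))

sims[nrow(sims), ]   # every path ends at 1, up to rounding error
matplot(sims, type = "l", lty = 1, xlab = "Trading day", ylab = "Relative price")
```

The realizations differ only in the route they take between the common start and end, which is what makes them useful for judging whether apparent trends in the real series could arise from order alone.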
Planet Null:
Planet Alt
What's the decision threshold?
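A rough sketch of how a threshold ties the two planets together, under loudly stated assumptions: the test statistic is taken to be normal on both planets, and the effect size on Planet Alt (2.5) is an arbitrary illustrative choice. Neither assumption comes from these notes.

```r
set.seed(5)
# Hypothetical sampling distributions of a test statistic on each planet.
null_stats <- rnorm(10000, mean = 0,   sd = 1)   # Planet Null: no effect
alt_stats  <- rnorm(10000, mean = 2.5, sd = 1)   # Planet Alt: an ASSUMED effect size

# One-tailed threshold at the 5% level: the 95th percentile on Planet Null.
threshold <- quantile(null_stats, 0.95)

false_pos <- mean(null_stats > threshold)   # rejection rate on Planet Null, about 0.05
power     <- mean(alt_stats > threshold)    # rejection rate on Planet Alt
c(false_pos = false_pos, power = power)
```

Moving the threshold up lowers the false-positive rate on Planet Null but also lowers the power on Planet Alt, so the choice is a trade-off rather than a fixed rule.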
What's the
fetchData("mHypTest.R")
mHypTest() # by default, a coefficient