Show the t-statistic and the translation into a p-value.
Show the p-value reported on R².
Notice that the alternative plays no role whatsoever in the regression report.
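The translation from a t-statistic to a p-value can be sketched in base R. This is only an illustration: the `mtcars` data set and the model `mpg ~ wt` are assumptions chosen because they ship with R, not examples from these notes. The calculation uses only the Null distribution (t or F with the residual degrees of freedom); no alternative hypothesis appears anywhere.

```r
# Hypothetical example model; mtcars is a built-in R data set, not from the notes.
mod <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(mod)$coefficients

# t-statistic for the slope and the residual degrees of freedom
t_stat <- coefs["wt", "t value"]
df_resid <- df.residual(mod)

# Two-sided p-value: chance, under the Null, of a |t| at least this large
p_from_t <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# The p-value on R^2 comes from the F statistic (value, numdf, dendf)
fstat <- summary(mod)$fstatistic
p_from_F <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                      lower.tail = FALSE))

# With a single explanatory variable the two p-values agree
all.equal(p_from_t, coefs["wt", "Pr(>|t|)"])
all.equal(p_from_F, p_from_t)
```

With one explanatory variable the F statistic is the square of the t-statistic, which is why the p-value on the coefficient and the p-value on R² coincide here.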
Motto: Always know what world you are thinking about.
We want to know which hypotheses are true on Earth and which are false.
The planets involved in statistical inference are:
| Planet Sample | Planet Null | Planet Alt |
|---|---|---|
How to travel to the different worlds …
Side-by-side comparison: http://en.wikipedia.org/wiki/File:PileatedIvoryWoodpecker.svg
Some of the photographic evidence. The Ivory-billed Woodpecker from a hypothesis-testing perspective (second page).
fetchData("mHypTest.R")
mHypTest() # by default, a coefficient
fetchData("getDJIAdata.R")
## Retrieving from http://www.mosaic-web.org/go/datasets/getDJIAdata.R
## [1] TRUE
djia = getDJIAdata() # djia-2011.csv is the basic file
## Retrieving from http://www.mosaic-web.org/go/datasets/djia-2011.csv
xyplot(Close ~ Date, data = djia)
Look at the day-to-day differences in log prices:
dd = with(djia, diff(log(Close)))
mean(dd)
## [1] 0.000191
Subtract out the mean, shuffle, take the cumulative sum, and exponentiate to create a realization from a world with no trend:
ddnull = dd - mean(dd)
sim = exp(cumsum(shuffle(ddnull)))
xyplot(sim ~ Date, data = djia)
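The same recipe can be tried without downloading the DJIA file. The sketch below is an assumption-laden stand-in: the price series is simulated (a hypothetical random walk in log price with a small drift), base R `sample()` replaces mosaic's `shuffle()`, and `null_path` is a helper name invented for this illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical stand-in for the DJIA closes: random walk in log price with drift
n <- 250
close <- 100 * exp(cumsum(c(0, rnorm(n - 1, mean = 0.0005, sd = 0.01))))

dd <- diff(log(close))      # day-to-day differences in log price (length n - 1)
ddnull <- dd - mean(dd)     # remove the average daily change

# One Null realization: shuffle the de-meaned steps, accumulate, exponentiate.
# Prepending 0 makes each path start at exactly 1 and have length n.
null_path <- function(steps) exp(c(0, cumsum(sample(steps))))

sims <- replicate(5, null_path(ddnull))   # n x 5 matrix, one column per path
matplot(sims, type = "l", lty = 1, xlab = "Day", ylab = "Relative price")
```

Because the de-meaned steps sum to zero, every shuffled path returns to its starting value on the last day; only the route in between differs. Apparent trends along the way are therefore produced by the ordering of the steps alone.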
A proof for the existence of Extra-Sensory Perception! If I can get you to focus on a number, I can predict, to some extent, your thought process.
Your birthday is a number that plays an important part in your thought process. Generate a random number between 0 and your birthday (the day of the month).
Spreadsheet reading command:
esp = fetchGoogle("https://docs.google.com/spreadsheet/pub?key=0Am13enSalO74dE5iMjZrcGFjTUtJSjg0T05NLW84Mmc&single=true&gid=0&output=csv")
How did I know that I could reject the Null in the shuffling problem? I did a little simulation.
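mysim <- function(n = 15) {
  # Each of n simulated students has a birthday on a random day of the month
  days = resample(1:31, size = n)
  # Each student picks a whole number between 1 and that day
  nums = ceiling(runif(n, min = 0, max = days))
  mod = lm(nums ~ days)
  # Report R^2 and the p-value on the coefficient of days
  list(r2 = r.squared(mod), p = summary(mod)$coef[2, 4])
}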
s15 = do(1000) * mysim(24) # typical R^2 is about 0.4
mean(~r2, data = s15)
## [1] 0.4169
tally(~p < 0.05, data = s15, format = "proportion")
##
## TRUE FALSE Total
## 0.97 0.03 1.00
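For contrast, here is a sketch of the same simulation carried out on Planet Null, where shuffling breaks the link between the birthday and the chosen number. It is written in base R (`sample()` and `replicate()` in place of mosaic's `shuffle()` and `do()`), and the name `null_sim` and the seed are assumptions made for this illustration.

```r
set.seed(2)  # arbitrary seed, for reproducibility only

# Hypothetical Planet Null version of the class simulation
null_sim <- function(n = 24) {
  days <- sample(1:31, size = n, replace = TRUE)    # birthday day of month
  nums <- ceiling(runif(n, min = 0, max = days))    # number chosen by each student
  shuffled_days <- sample(days)                     # shuffle: break the link
  mod <- lm(nums ~ shuffled_days)
  summary(mod)$coefficients[2, 4]                   # p-value on the slope
}

p_null <- replicate(1000, null_sim())
mean(p_null < 0.05)   # fraction of Null-world classes that reject at 0.05
```

On Planet Null the rejection fraction should sit near the nominal 5 percent, far below the rate seen in the simulation above, where the number really does depend on the birthday.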
Run mHypTest(TRUE), setting the “effect size” to about 0.4.