Stats 155 Class Notes 2012-11-21

A Day on Planet Alt

Background Software

fetchData("m155development.R")
## Retrieving from http://www.mosaic-web.org/go/datasets/m155development.R
## [1] TRUE
fetchData("simulate.r")
## Retrieving from http://www.mosaic-web.org/go/datasets/simulate.r
## [1] TRUE

Theme for the Day: The Alternative Hypothesis

A Simple Experiment

Suppose that we've just invented a device that makes cars safer and that we expect to reduce deaths by 10%. That would be a big improvement. We're going to do an experiment.

We already know something about Planet Alt: that the death rate will be lower by 10%. Now we need to do a little bit more work to make a Planet Alt that's relevant to the data we will actually be creating on Planet Earth. Death rates for vehicular accidents are given at the NHTSA web site http://www-fars.nhtsa.dot.gov/Main/index.aspx.

The site indicates that in 2010, there were 15.65 vehicle accident deaths per 100,000 drivers per year. We don't actually know that this rate will be relevant to our experiment.

Imagine that we collect data with \( n=1000 \) sites, with 100 cars at each site in each group. (This corresponds to 100,000 cars in each of the treatment and control groups.) The expected number of deaths per site per year is then \( 100 \times 15.65/100000 = 15.65/1000 \).

ourdata = groupwiseData(treatment = rpois(1000, lambda = 0.9 * 15.65/1000), 
    control = rpois(1000, lambda = 15.65/1000))
summary(lm(val ~ group, data = ourdata))
## 
## Call:
## lm(formula = val ~ group, data = ourdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.015 -0.015 -0.012 -0.012  0.988 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   0.01200    0.00365    3.29    0.001 **
## groupcontrol  0.00300    0.00516    0.58    0.561   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 0.115 on 1998 degrees of freedom
## Multiple R-squared: 0.000169,    Adjusted R-squared: -0.000331 
## F-statistic: 0.338 on 1 and 1998 DF,  p-value: 0.561

Have everyone in the class do this and see how many of them are able to reject the null hypothesis.

How many sites do we need to use to get 90% power?
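
Here's one way to estimate the power by simulation: repeat the experiment many times and count how often the p-value on the group coefficient falls below 0.05. This is only a sketch, reusing the groupwiseData() setup above; the 200 repetitions are an arbitrary choice.

# count trials with p < 0.05; that fraction estimates the power
trials = do(200) * summary(lm(val ~ group, 
    data = groupwiseData(treatment = rpois(1000, lambda = 0.9 * 15.65/1000), 
        control = rpois(1000, lambda = 15.65/1000))))$coef[2, 4]
tally(~result < 0.05, data = trials)

At this size, only a small fraction of trials should reject the null; increase the number of sites until that fraction reaches 90%.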

Looking at the Standard Error

Another way to think about this problem is to ask about the precision of our estimate of the death rate. The standard error needs to be substantially smaller than the anticipated reduction in the death rate. That reduction is 10% of 15.65/1000, or about 0.0016.

summary(lm(rpois(1000, lambda = 0.9 * 15.65/1000) ~ 1))
## 
## Call:
## lm(formula = rpois(1000, lambda = 0.9 * 15.65/1000) ~ 1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.014 -0.014 -0.014 -0.014  0.986 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.01400    0.00372    3.77  0.00018 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 0.118 on 999 degrees of freedom

This is too large. We should try to get it much smaller, perhaps one-sixth as large. Since the standard error shrinks in proportion to \( 1/\sqrt{n} \), cutting it by a factor of six requires collecting 36 times as much data.

Do the power calculation for that size.
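
The simulation sketched above can be reused for this, changing only the number of sites; again, the 200 repetitions are arbitrary.

nsites = 36 * 1000  # 36 times as much data
trials = do(200) * summary(lm(val ~ group, 
    data = groupwiseData(treatment = rpois(nsites, lambda = 0.9 * 15.65/1000), 
        control = rpois(nsites, lambda = 15.65/1000))))$coef[2, 4]
tally(~result < 0.05, data = trials)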

QUESTION: Given the number of sites, is there a practical way to do the experiment?
How could we do the experiment in a less costly way? (Find a high-risk group of cars. Find an endpoint other than deaths, say injuries.)

Here's a bit of information for Pennsylvania.

How big would the study need to be if we studied injuries rather than deaths?

A More Complicated Alternative

Aspirin and stroke.

The aspirin simulation gives a (not very detailed) model of the relationship between aspirin consumption and stroke. It has a confounder, sick, which represents how sick the patient is.

You can see the overall structure of the simulation:

aspirin
## Causal Network with  3  vars:  sick, mgPerDay, stroke 
## ===============================================
## sick is exogenous
## mgPerDay <== sick 
## stroke <== sick & mgPerDay

The simulation itself is a little bit complicated. For what it's worth, you can look inside:

equations(aspirin)
## sick <== sample(c(rep(0,20),1:10,1:3,8:10), nsamps,replace=TRUE) 
## mgPerDay <== 5*(sick<2)*sample(c(0,0,0,0:10,0:20),nsamps, replace=TRUE) + (sick>=2)*5*sample(c(0,0,0,0:20,10:20,15:20),nsamps,replace=TRUE) 
## stroke <== cut(rnorm(nsamps,mean=0,sd=5) + 2*sick - mgPerDay/10, c(-Inf,5,Inf), labels=c('N','Y'))

Imagine that your research team has constructed this model. You've been asked to design the observational study. Two questions:

Assuming that it's important to include sick as a covariate, here's one trial in a power calculation for a study of size \( n=50 \).

f = run.sim(aspirin, 50)
summary(glm(stroke == "Y" ~ mgPerDay + sick, data = f))$coef[2, 4]
## [1] 0.009233

Generate many trials and find the power.
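
A sketch of that calculation: repeat the trial above many times and count the fraction of trials in which the p-value on mgPerDay falls below 0.05. The 200 repetitions are arbitrary.

trials = do(200) * summary(glm(stroke == "Y" ~ mgPerDay + sick, 
    data = run.sim(aspirin, 50)))$coef[2, 4]
tally(~result < 0.05, data = trials)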

Then set the sample size as required to create a power of, say, 80%.

How much larger would the sample size need to be to create a power of 95%?
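
One way to answer both questions is to wrap the trial in a function of the sample size and scan over candidate sizes until the power reaches the target. This sketch uses base-R replicate() rather than the do() idiom; the candidate sizes are only starting guesses.

powerAt = function(n, reps = 200) {
    pvals = replicate(reps, summary(glm(stroke == "Y" ~ mgPerDay + sick, 
        data = run.sim(aspirin, n)))$coef[2, 4])
    mean(pvals < 0.05)  # estimated power at sample size n
}
sapply(c(50, 100, 200, 400), powerAt)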

Group activity

The power of vitamin D. Handout.

Why Most Published Research Findings Are False

Paper and book by John Ioannidis

Example: Intercessory Prayer


Abstracts

* CCU outcomes
* Retroactive prayer. How could prayer work retroactively? Look at the last paragraph of the paper.

How to avoid this problem? Insist on a p-value of 0.0000001. Repeat the aspirin study to find out how large it would need to be. The sample size is surprisingly small.
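
A sketch of that check, reusing the aspirin simulation: count how often the p-value on mgPerDay falls below 0.0000001. The sample size of 500 below is only a starting guess, to be adjusted up or down.

trials = do(200) * summary(glm(stroke == "Y" ~ mgPerDay + sick, 
    data = run.sim(aspirin, 500)))$coef[2, 4]
tally(~result < 1e-07, data = trials)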

vitaminD notes

activity

What sample size do we need to get 90 percent power? (About 300.)

s = do(100) * summary(lm(systolic ~ D + race, data = run.sim(vitaminD, 30)))$coef[2, 
    4]
tally(~result < 0.05, data = s)
## 
##  TRUE FALSE Total 
##    16    84   100
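
At \( n=30 \) the power is low. To check the figure of about 300, re-run the same calculation with a larger sample size per trial:

s = do(100) * summary(lm(systolic ~ D + race, 
    data = run.sim(vitaminD, 300)))$coef[2, 4]
tally(~result < 0.05, data = s)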