Arrival rates of “Atheist”-labeled packages

This is an analysis of data from http://www.atheistberlin.com/study

I looked at the frequency of lost packages, not the delivery time. They didn't supply the raw data for the delivery times (only the average, which was three days longer for the Atheist-labeled packages, and the one 37-day outlier), so I didn't analyze that part.

For the package arrivals, the data is much simpler; we just need to know whether a package arrived or not, and they do provide those numbers:

# Enter data
data <- read.table(header=TRUE, text='
 condition  result count
   control    lost     1
   control arrived    88
   atheist    lost     9
   atheist arrived    80
')

library(ggplot2)

ggplot(data, aes(x=condition, y=count, fill=result)) +
   geom_bar(stat="identity", colour="black") +
   scale_fill_manual(values=c(arrived="grey80", lost="#CC3300")) +
   theme_bw()

[Figure: stacked bar chart of arrived (grey) and lost (red) counts for the control and atheist conditions]


# Convert the data to contingency table format for statistical tests
ct <- xtabs(count ~ condition + result, data=data)

# Print it out
ct
##          result
## condition arrived lost
##   atheist      80    9
##   control      88    1

We want to know whether the condition variable (atheist vs. control) is independent of the result variable (arrived vs. lost). In other words, is there any relationship between the label and whether the package arrives?
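
Before running any formal tests, it helps to look at the raw rates themselves. This snippet is my addition, just for orientation: about 10% (9/89) of the Atheist-labeled packages were lost, versus about 1% (1/89) of the controls.

# Proportions within each condition (rows sum to 1)
prop.table(ct, margin=1)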

The usual way to test this is with Fisher's exact test (or a chi-squared test of independence for larger data sets). The null hypothesis is that there is in reality no relationship between the variables; in other words, that there is no difference between the arrival rates of the Atheist-labeled and control packages. If this is the true state of the world, we might still see a difference between the groups in our data sample, but it would just be due to random chance.
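
To make the "just due to random chance" idea concrete, here is a quick permutation-style simulation. This is my own illustration, not part of the original analysis: it pools all 178 packages, spreads the 10 losses at random, and checks how often a random split into two groups of 89 produces a gap in lost counts as large as the observed 9-versus-1.

# Simulate the null hypothesis: losses assigned to packages at random
set.seed(1)
n_total <- 89 + 89
n_lost  <- 1 + 9
sims <- replicate(10000, {
  lost <- sample(rep(c(1, 0), times=c(n_lost, n_total - n_lost)))
  abs(sum(lost[1:89]) - sum(lost[90:178]))
})
# Proportion of random splits with a gap of 8 or more (the observed gap)
mean(sims >= 8)

The proportion this reports should land close to the p-values from the formal tests below.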

What these frequency tests tell you is this: assuming the null hypothesis is true (there isn't a real difference), what is the probability that we would see data at least as extreme as what we actually observed? You'll want to look at the p-values below:

fisher.test(ct)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  ct 
## p-value = 0.01813
## alternative hypothesis: true odds ratio is not equal to 1 
## 95 percent confidence interval:
##  0.002284 0.764368 
## sample estimates:
## odds ratio 
##      0.102

chisq.test(ct)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  ct 
## X-squared = 5.192, df = 1, p-value = 0.0227

# chisq.test defaults to using Yates' continuity correction; other programs
# default to _not_ using it. If you want to compare results against those
# other programs, the correction can be disabled:
chisq.test(ct, correct=FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  ct 
## X-squared = 6.781, df = 1, p-value = 0.009214
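
Another way to frame the same comparison (again, my addition rather than part of the original post) is as a difference between two arrival proportions. For a 2x2 table like this, prop.test() runs the same Yates-corrected chi-squared test as above, but it also reports a confidence interval for the difference between the two proportions:

# The first column of ct ("arrived") is treated as successes in each group
prop.test(ct)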

Note that these tests tell us, assuming that there isn't a real difference between the groups, the probability of seeing data at least as extreme as what we observed. This is not the same as telling us, given the observed data, the probability that there isn't a difference between the groups. There's an important distinction between the two, and there's no clear-cut frequentist way to get the answer to the latter. (If you're really interested in this, you might want to look into Bayesian statistics.)
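
For what it's worth, here is a minimal sketch of that Bayesian angle. It is entirely my addition and assumes a uniform Beta(1, 1) prior on each group's loss rate, which makes each posterior Beta(1 + lost, 1 + arrived); simulation then estimates the posterior probability that the Atheist-labeled loss rate is the higher one.

# Posterior draws for each group's loss rate under Beta(1, 1) priors
set.seed(1)
p_atheist <- rbeta(100000, 1 + 9, 1 + 80)
p_control <- rbeta(100000, 1 + 1, 1 + 88)
# Posterior probability that the atheist-labeled loss rate is higher
mean(p_atheist > p_control)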

One more thing… I realized after writing this that they actually included these stats on their page, at the bottom of the infographic.