This document is motivated by the desire to analyse data that carry only *relative* information.

The first section highlights one of the pitfalls in ignoring the need to treat this kind of data with care, and shows that the trusty correlation coefficient is not at all trusty or appropriate as a measure of association for relative data.

Proportionality *is* an appropriate measure of association for relative data, and the second section of this document explores testing the hypothesis that two variables are proportional. It also highlights why this too must be approached with care.

To illustrate the phenomenon of *spurious correlation* (a term coined by Karl Pearson in 1896) let's look at a year's sales (in thousands) of three different kinds of books (`Cookbooks`

, `Thrillers`

and `Mysteries`

) across a number of book stores (`nstores`

):

```
n.stores <- 250
sales.df <- data.frame(Cookbooks=rnorm(n.stores, mean=10, sd=1),
Thrillers=rnorm(n.stores, mean=20, sd=1),
Mysteries=rnorm(n.stores, mean=30, sd=3))
```

Our sales data are just random numbers sampled from Normal distrubitions with different means and standard deviations.

Spurious correlation can arise when we form *ratios* of random variables, so lets do that:

```
sales.df <- transform(sales.df,
Cookbooks.over.Mysteries= Cookbooks/Mysteries,
Thrillers.over.Mysteries= Thrillers/Mysteries)
```

Let's plot the data pairwise to reassure ourselves visually of the lack of relationship between `Cookbooks`

, `Thrillers`

and `Mysteries`

Now let's look at what happens when we start working with ratios of these random, and statistically independent variables. Specifically, `Cookbooks/Mysteries`

and `Thrillers/Mysteries`

:

The distribution of points has a definite “tilt” to it and we can see the pairwise correlation indicates that something is going on even though all three variables are independent.

If we use a bit of colour, we can see that this “tilt” is created by `Mysteries`

being in the denominator of both variables:

…and this phenomenon is what Karl Pearson named *spurious correlation*.

Clearly, correlation is not a good measure of association for these relative data because a value of `rho == 0.62`

suggests that there is a relationship between `Cookbooks/Mysteries`

and `Thrillers/Mysteries`

even though all three variables are independent.

So, if correlation is not an appropriate measure of association for data carrying only relative information, what is?

Aitchison (1986) proposed the variance of the logratios of two variables \( x \) and \( y \): \[ {\mathrm Var}(\log x/y). \] If \( x \) and \( y \) are always proportional, the logarithm of their ratio will be constant, and the variance of the logarithm will be zero.

When there is some variability in the relationship between \( x \) and \( y \) we need a means to test whether the probability that a proportional relationship exists between them. We suggest the following approach for testing proportionality, after Warton et al.

To test for proportionality of \( x \) and \( y \) we test to see whether the slope of \( \log x \) versus \( \log y \) is significantly different from 1. Warton et al. describe a resampling approach to this page 290, which amounts to

- Constructing the residual plot \( (\log y - 1\cdot\log x) \) versus \( (\log y + 1\cdot\log x) \).

In other words, plot \( \log x/y \) against \( \log xy \). See the link to the variance of logratios? - Calculate the test statistic \( r_{rf}(b)^2 = \rho(\log x/y, \log xy)^2 \)
- Accumulate that statistic for permuted versions of that data
- Form the p-value as the proportion of times in which the “permuted test statistic” exceeds the test statistic that was actually obeserved

Let's try this with our `Cookbooks`

and `Thrillers`

```
log.x <- log(sales.df$Cookbooks)
log.y <- log(sales.df$Thrillers)
log.xy <- log.x + log.y
log.x.on.y <- log.x - log.y
test.statistic <- cor(log.xy, log.x.on.y)^2
permutations <- 10000
p.value <- sum(replicate(permutations, cor(log.xy, sample(log.x.on.y))^2 > test.statistic))/permutations
```

This gives a p-value of `0.0000`

and we reject the hypothesis that the slope is 1, i.e., `Cookbooks`

and `Thrillers`

don't appear to be proportional.

Just to be clear, the null hypothesis \( H_0 \) is that the slope of \( \log x \) versus \( \log y \) is 1. So, the probability of obtaining a value of the test statistic more extreme than we observed *assuming the slope really is 1* is `0`

.

- If the p-value is small (say < 0.01), we reject the hypothesis that the slope is 1
- If the p-value is not small (say > 0.01) we
*fail to reject*the null hypothesis

When we try the same with `Cookbooks/Mysteries`

and `Thrillers/Mysteries`

, we get a p-value of `0.0000`

and, again, we reject the hypothesis that `Cookbooks`

and `Thrillers`

are proportional.

Now let's see what happens when we apply the test to a pair of variables that *are* proportional, though with some variability in their relationship. We create these “roughly proportional” variables by multiplying `Cookbooks`

with two different vectors of noise:

```
log.x <- log(sales.df$Cookbooks * rnorm(n.stores, 1, 0.01))
log.y <- log(sales.df$Cookbooks * rnorm(n.stores, 5, 0.01))
```

This time, the test gives a p-value of `0.2056`

, and we fail to reject the hypothesis that the variables are proportional.

This all seems pretty good, but there are a few traps for the unwary. First, it is important to remember that we are testing the *statistical* significance of a relationship, not effect size. We can see this if we apply the test to `Cookbooks`

and `Cookbooks`

raised to the power of 1.001:

```
x <- jitter(sales.df$Cookbooks)
y <- jitter(sales.df$Cookbooks^1.001)
```

Let's plot these two and add the line of equality in green:

Despite what our eyes tell us, the resampling test gives a p-value of `0.0000`

telling us to reject the hypothesis that “`Cookbooks`

” and “`Cookbooks`

raised to the power of 1.001” are proportional. And so it should — they're not! However, I think it is pretty natural (at least for non-statisticians) to be a little bit surprised by this.

To increase our confidence that we haven't made a programming error, we check that using the `smatr`

package's `slope.test`

which says that, even though the slope of the relationship between `log x`

and `log y`

(`0.9967`

) is near as dammit to 1, we still reject the hypothesis that they are proportional with a p-value of `0.0000`

.

The second trap for the unwary comes when we apply the slope test to a pair of statistically independent variables with the same standard deviation:

```
x <- rnorm(100)
y <- rnorm(100)
result <- slope.test(x, y)
unlist(result)
```

```
## r p test.value b ci1 ci2
## 0.01447 0.88635 1.00000 -1.01453 -0.83191 -1.23723
```

In this case, the p-value of `0.8864`

means we fail to reject the hypothesis that the slope of the x-y pairs is 1. This is, in part, to do with the standardised major axis estimate of slope: \( {\mathrm sign}(s_{xy}) s_y/s_x \) where \( s_{xy} \) is the sample covariance, and \( s_x \) and \( s_y \) the sample standard deviations of \( x \) and \( y \). This issue goes away when the standard deviations of x and y differ:

```
x <- rnorm(100)
y <- rnorm(100, sd = 1.5)
result <- slope.test(x, y)
unlist(result)
```

```
## r p test.value b ci1 ci2
## -5.404e-01 6.507e-09 1.000e+00 5.463e-01 4.476e-01 6.666e-01
```

Now, the p-value of `0.0000`

suggests we reject the hypothesis that x and y have a slope of 1.

- Pearson, Karl. 1896. “Mathematical Contributions to the Theory of Evolution.–On a Form of Spurious Correlation Which May Arise When Indices Are Used in the Measurement of Organs.” Proceedings of the Royal Society of London 60: 489-498. http://www.jstor.org/stable/115879
- Aitchison, J. 1986. The Statistical Analysis of Compositional Data. Chapman & Hall, Ltd.
- Warton, David I., Ian J. Wright, Daniel S. Falster, and Mark Westoby. 2006. “Bivariate Line-fitting Methods for Allometry.” Biological Reviews 81 (2): 259-291. doi:10.1017/S1464793106007007. http://onlinelibrary.wiley.com/doi/10.1017/S1464793106007007/abstract
- Warton, David I., Duursma, Remko A., Falster, Daniel S. and Taskinen, Sara. 2012. smatr 3 - an R package for estimation and inference about allometric lines Methods in Ecology and Evolution, 3(2), 257-259