This notebook explores options for transforming rankit-normalized values to a positive scale. The motivation comes from conversations with computational biologists, who mentioned that the mix of negative and positive values on the rankit scale can confuse non-computational biologists.
Because rankit-normalized values are mapped to the standard normal distribution, they can be shifted to a positive scale while preserving the variance by adopting a normal distribution with a positive mean.
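As a minimal sketch of that idea, the snippet below rankit-normalizes a small made-up vector and adds a constant to it. The expression values are hypothetical, the `(rank - 0.5) / n` plotting position is an assumption (other rankit conventions exist), and the shift of 3 is only illustrative.

```r
# Hypothetical expression values for one gene across six samples
expr <- c(0.0, 2.5, 10.1, 0.7, 55.3, 4.2)

# Rankit: map ranks to quantiles of the standard normal distribution
# (assumes the (rank - 0.5) / n plotting position)
n <- length(expr)
rankit <- qnorm((rank(expr) - 0.5) / n)

# Adding a constant moves the mean but leaves the variance untouched
shift <- 3  # illustrative value only
rankit_shifted <- rankit + shift

all.equal(var(rankit), var(rankit_shifted))  # variances agree
range(rankit_shifted)                        # all values are now positive
```

Because the shift is a plain addition, the ordering of samples and the distances between them stay exactly as they were on the original rankit scale.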
Here I explore what a good mean is for the normal distribution. A sufficient distribution should fulfill the following:
So let’s look at the probability of observing values below zero as a function of the mean, for a normal distribution with a variance of 1.
library(ggplot2)

# Probability of a value below zero for each candidate mean (sd = 1)
means <- seq(0, 7, length.out = 100)
means <- data.frame(mean = means, prob = pnorm(0, mean = means))
means$ideal <- means$prob < 0.001

# First (smallest) mean on the grid that meets the cutoff
best <- means[means$ideal, ]
best <- best[1, , drop = FALSE]
best$prob_lab <- best$prob
# Nudge the label upward so it does not overlap the point
best$prob <- best$prob + 0.02

ggplot(means, aes(x = mean, y = prob)) +
  geom_point(aes(color = ideal)) +
  scale_color_discrete(name = "p(X<0) < 0.001") +
  ylab("p(X<0)") +
  geom_text(data = best, hjust = 0,
            label = paste("p(X<0) =", signif(best$prob_lab, 3),
                          "mean =", signif(best$mean, 3))) +
  theme_bw()
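The grid search above locates the cutoff only to the resolution of the grid. As a cross-check, the same threshold can be computed in closed form: for X ~ N(mu, 1), p(X<0) equals `pnorm(-mu)`, so the smallest mean that reaches a target probability is `-qnorm(target)`. The variable names here are illustrative and not part of the original analysis.

```r
# Target probability of observing a value below zero
target <- 0.001

# For X ~ N(mu, 1): P(X < 0) = pnorm(-mu), hence mu = -qnorm(target)
mu_exact <- -qnorm(target)
mu_exact                     # approximately 3.09

# Sanity check: plugging the mean back in recovers the target
pnorm(0, mean = mu_exact)
```

The closed-form value sits just below the first grid point that passes the cutoff, which is expected because the grid only samples the mean in steps of roughly 0.07.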
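This indicates that a suitable normal distribution would have mean ≈ 3.1 and variance = 1, since that is the smallest mean on the grid for which p(X<0) < 0.001. For readability and intuition, rounding down to mean = 3 is a good pick, even though p(X<0) then rises to about 0.00135, slightly above the 0.001 cutoff. Below is a table of probabilities for low values under a normal distribution with mean = 3 and variance = 1: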
| value | prob_greater_than | prob_smaller_than |
|---|---|---|
| -0.2 | 0.9993129 | 0.0006871 |
| -0.1 | 0.9990324 | 0.0009676 |
| 0.0 | 0.9986501 | 0.0013499 |
| 0.1 | 0.9981342 | 0.0018658 |
| 0.2 | 0.9974449 | 0.0025551 |
| 0.3 | 0.9965330 | 0.0034670 |
| 0.4 | 0.9953388 | 0.0046612 |
| 0.5 | 0.9937903 | 0.0062097 |
| 0.6 | 0.9918025 | 0.0081975 |
| 0.7 | 0.9892759 | 0.0107241 |
| 0.8 | 0.9860966 | 0.0139034 |
| 0.9 | 0.9821356 | 0.0178644 |
| 1.0 | 0.9772499 | 0.0227501 |
| 1.1 | 0.9712834 | 0.0287166 |
| 1.2 | 0.9640697 | 0.0359303 |
| 1.3 | 0.9554345 | 0.0445655 |
| 1.4 | 0.9452007 | 0.0547993 |
| 1.5 | 0.9331928 | 0.0668072 |
| 1.6 | 0.9192433 | 0.0807567 |
| 1.7 | 0.9031995 | 0.0968005 |
| 1.8 | 0.8849303 | 0.1150697 |
| 1.9 | 0.8643339 | 0.1356661 |
| 2.0 | 0.8413447 | 0.1586553 |
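For reference, a table like the one above can be regenerated with `pnorm`. This is a sketch that assumes the columns are the upper and lower tail probabilities of a normal distribution with mean 3 and standard deviation 1; the data frame name `low_values` is made up for illustration.

```r
# Values at the low end of the shifted scale
value <- seq(-0.2, 2, by = 0.1)

# Tail probabilities under N(mean = 3, sd = 1)
low_values <- data.frame(
  value = value,
  prob_greater_than = pnorm(value, mean = 3, sd = 1, lower.tail = FALSE),
  prob_smaller_than = pnorm(value, mean = 3, sd = 1)
)

head(low_values, 3)
```

The two probability columns sum to 1 in every row, so either one is enough to read off how rare a given low value is.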