## Spearman vs. Pearson vs. Kendall for polynomials

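For a parabola in the first quadrant (y = x^2 on x = 0, ..., 9), the Pearson coefficient is 0.9626907, while the Spearman and Kendall coefficients are both 1. For a cubic (y = x^3 on x = -9, ..., 9), the Pearson coefficient is 0.9182171, and the Spearman and Kendall coefficients are again both 1.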

parabola = data.frame(x = 0:9, y = (0:9)^2)
plot(parabola$y ~ parabola$x)

cor(parabola, method = 'pearson')
##           x         y
## x 1.0000000 0.9626907
## y 0.9626907 1.0000000
cor(parabola, method = 'spearman')
##   x y
## x 1 1
## y 1 1
cor(parabola, method = 'kendall')
##   x y
## x 1 1
## y 1 1
cubic = data.frame(x = -9:9, y = (-9:9)^3)
plot(cubic$y~cubic$x)

cor(cubic, method = 'pearson')
##           x         y
## x 1.0000000 0.9182171
## y 0.9182171 1.0000000
cor(cubic, method = 'spearman')
##   x y
## x 1 1
## y 1 1
cor(cubic, method = 'kendall')
##   x y
## x 1 1
## y 1 1

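We can see that the Spearman and Kendall rank correlations are better at ‘picking out’ non-linear behavior between variables, provided the relationship is monotonic. Both depend only on the ordering of the values, so any strictly increasing function gives a coefficient of exactly 1, whereas Pearson measures only how close the points lie to a straight line.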
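
The rank-based behavior can be checked directly. The following is a minimal sketch, not part of the original analysis: it recomputes Spearman as Pearson on the ranks, recomputes Kendall by counting concordant and discordant pairs (valid here because the data have no ties), and then tries a parabola on a symmetric domain, which is non-linear but not monotonic. The variable names (`rho_by_rank`, `tau_by_hand`, `x_sym`, `sym_cors`) are illustrative assumptions.

```r
# Spearman's rho is Pearson's r computed on the ranks of the data.
x <- 0:9
y <- x^2
rho_builtin <- cor(x, y, method = "spearman")
rho_by_rank <- cor(rank(x), rank(y), method = "pearson")
all.equal(rho_builtin, rho_by_rank)

# Kendall's tau as (concordant - discordant) / (number of pairs).
# This simple formula assumes no ties, which holds for x and y above.
idx <- combn(length(x), 2)                      # all index pairs (i, j), i < j
s <- sign(x[idx[1, ]] - x[idx[2, ]]) * sign(y[idx[1, ]] - y[idx[2, ]])
tau_by_hand <- sum(s) / ncol(idx)
all.equal(tau_by_hand, cor(x, y, method = "kendall"))

# A parabola on a symmetric domain is non-linear and NOT monotonic:
# here the rank correlations do no better than Pearson.
x_sym <- -9:9
y_sym <- x_sym^2
sym_cors <- sapply(c("pearson", "spearman", "kendall"),
                   function(m) cor(x_sym, y_sym, method = m))
round(sym_cors, 10)
```

Both `all.equal()` calls should return `TRUE`, and all three coefficients for the symmetric parabola should be zero up to floating-point rounding. The advantage of the rank methods is therefore specific to monotonic relationships, not to non-linear relationships in general.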

## Data with Outliers

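Two data sets follow, each a noisy line y = x on x = 0, ..., 100 with a single outlier added: one far above the line at x = 50 (near y = 200), and one high-leverage point far below the line at x = 100 (near y = 2). In the first case the Spearman coefficient is least affected (0.972, against 0.896 for Pearson and 0.883 for Kendall); in the second, Pearson and Spearman are almost equal (0.946 and 0.947). The Kendall coefficient is the most consistent across the two cases (0.883 and 0.899). Because the noise comes from `rnorm()` without a fixed seed, the exact values will differ from run to run.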

outlier_1 = data.frame(x = c(0:100,50), y = c(0:100,200)+rnorm(102,0,5))
plot(outlier_1$y ~ outlier_1$x)

cor(outlier_1, method = 'pearson')
##           x         y
## x 1.0000000 0.8956903
## y 0.8956903 1.0000000
cor(outlier_1, method = 'spearman')
##           x         y
## x 1.0000000 0.9721658
## y 0.9721658 1.0000000
cor(outlier_1, method = 'kendall')
##           x         y
## x 1.0000000 0.8834094
## y 0.8834094 1.0000000
outlier_2 = data.frame(x = c(0:100,100), y = c(0:100,2)+rnorm(102,0,5))
plot(outlier_2$y ~ outlier_2$x)

cor(outlier_2, method = 'pearson')
##           x         y
## x 1.0000000 0.9459572
## y 0.9459572 1.0000000
cor(outlier_2, method = 'spearman')
##          x        y
## x 1.000000 0.946794
## y 0.946794 1.000000
cor(outlier_2, method = 'kendall')
##           x         y
## x 1.0000000 0.8985535
## y 0.8985535 1.0000000
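
To separate the effect of the outlier from the effect of the noise, it can help to compute each coefficient on the same data with and without the extra point. The following is a minimal sketch under stated assumptions: the seed value, the helper `all_cors()`, and the names `clean`, `contaminated` and `comparison` are not from the analysis above, and the outlier is placed at exactly (100, 2) rather than having noise added to it.

```r
set.seed(42)  # arbitrary seed (an assumption), only to make the run repeatable

methods <- c("pearson", "spearman", "kendall")

# Hypothetical helper: all three coefficients for a data frame with columns x, y.
all_cors <- function(df) {
  sapply(methods, function(m) cor(df$x, df$y, method = m))
}

# Noisy line without any outlier.
clean <- data.frame(x = 0:100, y = 0:100 + rnorm(101, 0, 5))

# The same data plus one high-leverage point, similar to outlier_2.
contaminated <- rbind(clean, data.frame(x = 100, y = 2))

comparison <- rbind(clean = all_cors(clean),
                    contaminated = all_cors(contaminated))
# Third row: how much each coefficient falls when the outlier is added.
comparison <- rbind(comparison,
                    drop = comparison["clean", ] - comparison["contaminated", ])
print(round(comparison, 4))
```

The `drop` row gives a like-for-like measure of sensitivity. Comparing raw coefficients across methods can mislead, since the three are on different scales even for data without outliers.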