If you restrict the range of one variable, does the correlation remain unchanged?
#### correlation unchanged? ####
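set.seed(100) # fix the seed so the random draws are reproducible
x <- rnorm(100, mean = 10, sd = 5) # predictor: 100 draws from N(10, 5^2)
y <- x + rnorm(100, mean = 0, sd = 10) # outcome: x plus independent noise with sd 10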
plot(x, y)
cor(x, y)
## [1] 0.4547479
greater.than.10 <- x > 10
x.sub <- x[greater.than.10]
y.sub <- y[greater.than.10]
plot(x.sub, y.sub)
cor(x.sub, y.sub)
## [1] 0.4229429
# the correlations are not equal in this sample:
# cor(x, y) = 0.4547479 (full range) vs. cor(x.sub, y.sub) = 0.4229429 (x > 10 only)
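A single seed gives only one pair of correlations, so the size of the gap above depends partly on sampling noise. The sketch below repeats the same simulation many times and averages the full-range and restricted-range correlations. The names `one.run`, `n.sims`, and `results`, and the choice of 2000 repetitions, are assumptions made for illustration and are not part of the original script.

```r
# Sketch: repeat the simulation to see the average effect of range restriction.
# one.run, n.sims and results are hypothetical names; 2000 runs is an arbitrary choice.
set.seed(100)
n.sims <- 2000

one.run <- function(n = 100) {
  x <- rnorm(n, mean = 10, sd = 5)
  y <- x + rnorm(n, mean = 0, sd = 10)
  keep <- x > 10 # same restriction as above
  c(full = cor(x, y), restricted = cor(x[keep], y[keep]))
}

# 2 x n.sims matrix: one column per simulated data set
results <- replicate(n.sims, one.run())

# average correlation with and without the restriction
rowMeans(results)
```

If the restricted-range average comes out below the full-range average, that suggests the drop seen with seed 100 is systematic rather than a quirk of one sample.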
Does z-scoring this restricted-range subset lead to a smaller correlation?
#### how does z-score change? ####
z.score <- function(g) {
  # standardize: subtract the mean, divide by the standard deviation
  return((g - mean(g)) / sd(g))
}
x.z <- z.score(x)
y.z <- z.score(y)
cor(x.z, y.z)
## [1] 0.4547479
# correlation unchanged: identical to cor(x, y)
x.sub.z <- z.score(x.sub)
y.sub.z <- z.score(y.sub)
cor(x.sub.z, y.sub.z)
## [1] 0.4229429
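# correlation unchanged: identical to cor(x.sub, y.sub),
# so z-scoring the restricted subset does not shrink the correlation any further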
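Z-scoring is a linear transformation with a positive scale factor (subtract a constant, divide by a positive constant), and Pearson correlation is invariant to such transformations. The sketch below checks this on the same simulated data with arbitrary shifts and scales. The constants 3, 7, 0.5 and 2 are made up for illustration, and `scale()` from base R stands in for the `z.score` helper.

```r
# Sketch: Pearson correlation is unchanged by positive linear transformations.
# The shift and scale constants below are arbitrary, chosen only for illustration.
set.seed(100)
x <- rnorm(100, mean = 10, sd = 5)
y <- x + rnorm(100, mean = 0, sd = 10)
r <- cor(x, y)

# arbitrary positive rescaling and shifting of both variables
all.equal(cor(3 * x + 7, 0.5 * y - 2), r)

# z-scoring via base R's scale(), a special case of the above
z.x <- as.numeric(scale(x))
z.y <- as.numeric(scale(y))
all.equal(cor(z.x, z.y), r)

# a negative scale factor flips the sign but keeps the magnitude
all.equal(cor(-x, y), -r)
```

Each `all.equal()` call should return `TRUE`. What changes the correlation in the earlier example is dropping observations, not rescaling them.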