We analyze Fisher’s iris data set (which has been used to illustrate various classification methods). The data can be obtained directly in R. First, we rescale the variables by dividing with their standard deviations.
data(iris)
variris <- apply(iris[,-5],2,var)
iris.adjusted <- sweep(iris[,-5],2,sqrt(variris),"/")
Then we perform a classical Torgerson-Gower scaling. Here, we request a 2-dimensinal solution by minimizing the loss function “STRAIN”:
iris.scal <- cmdscale(dist(iris.adjusted),k=2,eig=T)
Let’s plot the results now:
library(MASS)
eqscplot(iris.scal$points,type="n")
text(iris.scal$point,row.names(iris),cex=.8)
eqscplot(iris.scal$points,type="n")
text(iris.scal$point,labels = iris[row.names(iris),5],cex=.8)
Since we use “rescaled Euclidean distance” to represent dissimilarity, by double centering the classical Torgerson-Gower MDS is comparable to PCA by using the correlation matrix (see my class slides). In this case, all the eigenvalues solved from the “doubly centered squared distance matrix” are clearly nonnegative. Let us check out the goodness-of-fit by looking at the proportion of variance explained by the first 2 dimensions (same as how we explain in PCA):
iris.scal$GOF
## [1] 0.9581321 0.9581321
which shows a good fit.
Remark: Note that there are two different ways for computing goodness of fit (see help(cmdscale)) when the eigenvalue is “negative”. This may happen when the dissimilarities do not represent Euclidean distances (e.g. some of them are negative). In this case, the doubly centered squared dissimilarity matrix can not be precisely expressed as XX’. The first output above uses the absolute value of the eigenvalue, while the second output uses the max(eigenvalue, 0) (i.e., use zero when it is negative). However, the two outputs here are the same since all eigenvalues are positive now.
The dimension reduction can be used to examine the “dissimilarity” between variables as well.
Here we use the correlation coefficient as a measure of “proximity” (similarity) between two variables.
Furthermore, the dissimilarity is taken to be the reciprocal of the correlation coefficient.
variable.scal <- cmdscale(1/cor(iris[,-5]),k=2,eig=T)
eqscplot(variable.scal$points,type="n")
text(variable.scal$point,row.names(cor(iris[,-5])),cex=.8)
A version of nonmetric scaling due to Kruskal and Shepard (using isotonic regression) is implemented in the isoMDS function in R.
library(MASS)
iris.iso <- isoMDS(dist(iris.adjusted[-102,]))
## initial value 4.822105
## iter 5 value 4.183883
## iter 5 value 4.182693
## iter 5 value 4.182693
## final value 4.182693
## converged
We are forced to exclude object 102 since the program complains that it is very close to object 143 (almost zero distance). Let’s plot the results.
eqscplot(iris.iso$points,type="n")
text(iris.iso$points,label=row.names(iris[-102,]),cex=.8)
Let’s calculate a measure of fit of the solution, the minimum value of the stress (in percentage), which is the square root of the ratio of the sum of squared differences between the input distances and those of the configuration to the sum of configuration distances squared (i.e. Kruskal’s Stress-1).
iris.iso$stress
## [1] 4.182693
which shows a fairly good fit.
We next take look at the scree plot of the “stress” (in percent): Let us run the following source code (copy and paste in the R prompt, press “enter”) for checking the stress up to dimension k.
================================
scree.plot = function(d, k) {
stresses=isoMDS(d, k=k)$stress
for(i in rev(seq(k-1)))
stresses=append(stresses,isoMDS(d, k=i)$stress)
plot(seq(k),rev(stresses), type="b", xaxp=c(1,k, k-1), ylab="Stress", xlab="Number of dimensions")
}
================================
Check out the scree plot up to 6 dimensions:
scree.plot(dist(iris.adjusted[-102,]), k=6)
## initial value 0.000000
## final value 0.000000
## converged
## initial value 0.000000
## final value 0.000000
## converged
## initial value 0.000000
## final value 0.000000
## converged
## initial value 0.758598
## iter 5 value 0.641253
## final value 0.636005
## converged
## initial value 4.822105
## iter 5 value 4.183883
## iter 5 value 4.182693
## iter 5 value 4.182693
## final value 4.182693
## converged
## initial value 22.498863
## iter 5 value 18.308776
## iter 5 value 18.304023
## iter 5 value 18.304023
## final value 18.304023
## converged
The scree plot shows a clear elbow at dimension = 3. However, it seems that a 2D solution is also adequate. Now we check out the Shepard diagram for a 2D solution:
iris.sh<-Shepard(dist(iris.adjusted[-102,]), iris.iso$points, p=2)
plot(iris.sh, pch=".")
lines(iris.sh$x, iris.sh$yf, type = "S")
The plot shows very little amount of spread around the isotonic step function, which also indicates a good fit of the 2D solution.
Another version of nonmetric scaling is implemented in the sammon function in R, which uses “nonlinear mapping” instead of isotonic regression.
iris.sammon <- sammon(dist(iris.adjusted[-102,]),k=2)
## Initial stress : 0.00963
## stress after 10 iters: 0.00622, magic = 0.500
## stress after 20 iters: 0.00621, magic = 0.500
eqscplot(iris.sammon$points,type="n")
text(iris.sammon$points,label=row.names(iris[-102,]),cex=.8)
Checking goodness of fit:
iris.sammon$stress
## [1] 0.006206077
which shows almost a perfect fit for a 2D solution.
Let us look at more on the scree plot and Shepard diagram. Run the following source code: ================================
scree.plot = function(d, k) {
stresses=sammon(d, k=k)$stress
for(i in rev(seq(k-1)))
stresses=append(stresses,sammon(d, k=i)$stress)
plot(seq(k),rev(stresses), type="b", xaxp=c(1,k, k-1), ylab="Stress", xlab="Number of dimensions")
}
================================
Check out the scree plot up to 6 dimensions:
scree.plot(dist(iris.adjusted[-102,]), k=6)
## Initial stress : 0.00000
## stress after 10 iters: 0.00000, magic = 0.092
## Initial stress : 0.00000
## stress after 10 iters: 0.00000, magic = 0.500
## Initial stress : 0.00000
## stress after 10 iters: 0.00000, magic = 0.500
## Initial stress : 0.00035
## stress after 10 iters: 0.00022, magic = 0.500
## stress after 20 iters: 0.00021, magic = 0.500
## Initial stress : 0.00963
## stress after 10 iters: 0.00622, magic = 0.500
## stress after 20 iters: 0.00621, magic = 0.500
## Initial stress : 0.12602
## stress after 10 iters: 0.08035, magic = 0.500
## stress after 20 iters: 0.07863, magic = 0.500
## stress after 30 iters: 0.07798, magic = 0.500
## stress after 40 iters: 0.07771, magic = 0.500
## stress after 50 iters: 0.07766, magic = 0.500
The scree plot shows a clear elbow at dimension = 2, which suggests that a 2D solution should be adequate. Now we check out the Shepard diagram for a 2D solution:
iris.sh<-Shepard(dist(iris.adjusted[-102,]), iris.sammon$points, p=2)
plot(iris.sh, pch=".")
lines(iris.sh$x, iris.sh$yf, type = "S")
The plot shows very little amount of spread around the fitted function, which also indicates a good fit of the 2D solution.