Exploratory mapping: correspondence analysis and multidimensional scaling

Anna Shirokanova

3/11/2019

Part 1. Correspondence Analysis

Facts:

A famous example

from La Distinction (1979). .

Example: Household Chores in Families

Load the libraries and data

library(FactoMineR)
library(factoextra)
library(ade4)
data(housetasks)
library(ca)
library(ggplot2)
library(ggrepel)
library(gplots)
library(graphics)
library(gridExtra)

2. Check if the variables are independent

chisqresults <- chisq.test(housetasks)
chisqresults
## 
##  Pearson's Chi-squared test
## 
## data:  housetasks
## X-squared = 1944.5, df = 36, p-value < 2.2e-16

If p<.05, it is a statistically significant deviation from the independence

dt <- as.table(as.matrix(housetasks))
balloonplot(t(dt), main = "Housetasks", xlab = "", ylab = "",
            label = F, show.margins = F)

balloonplot(t(dt), main = "Housetasks", xlab = "", ylab = "",
            label = F)

mosaicplot(dt, shade = TRUE, las=2,
           main = "housetasks") 

Residuals are the difference between observed and expected.

Laundry, Main_meal, Dinner and breakfeast (blue color) are mainly done by the wife in our example.

4. Do the calculations to get the CoA axes

ca <- ca(housetasks)
names(ca)
##  [1] "sv"         "nd"         "rownames"   "rowmass"    "rowdist"   
##  [6] "rowinertia" "rowcoord"   "rowsup"     "colnames"   "colmass"   
## [11] "coldist"    "colinertia" "colcoord"   "colsup"     "N"         
## [16] "call"
summary(ca)
## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.542889  48.7  48.7  ************             
##  2      0.445003  39.9  88.6  **********               
##  3      0.127048  11.4 100.0  ***                      
##         -------- -----                                 
##  Total: 1.114940 100.0                                 
## 
## 
## Rows:
##      name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1  | Lndr |  101  925  120 | -992 740 183 | -495 185  56 |
## 2  | Mn_m |   88  974   81 | -876 742 124 | -490 232  47 |
## 3  | Dnnr |   62  930   34 | -693 777  55 | -308 154  13 |
## 4  | Brkf |   80  905   37 | -509 505  38 | -453 400  37 |
## 5  | Tdyn |   70  975   22 | -394 440  20 |  434 535  30 |
## 6  | Dshs |   65  764   18 | -189 118   4 |  442 646  28 |
## 7  | Shpp |   69  811   13 | -118  64   2 |  403 748  25 |
## 8  | Offc |   55  119   48 |  227  53   5 | -254  66   8 |
## 9  | Drvn |   80  767   91 |  742 432  81 | -653 335  76 |
## 10 | Fnnc |   65  997   27 |  271 161   9 |  618 837  56 |
## 11 | Insr |   80  885   52 |  647 576  61 |  474 309  40 |
## 12 | Rprs |   95  933  281 | 1529 707 407 | -864 226 159 |
## 13 | Hldy |   92  992  176 |  252  30  11 | 1435 962 425 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 | Wife |  344  954  270 | -838 802 445 | -365 152 103 |
## 2 | Altr |  146  110  106 |  -62   5   1 | -292 105  28 |
## 3 | Hsbn |  218  980  342 | 1161 772 542 | -602 208 178 |
## 4 | Jntl |  292  998  282 |  149  21  12 | 1027 977 691 |
ca2 <- CA(housetasks)

summary.CA(ca2)
## 
## Call:
## CA(X = housetasks) 
## 
## The chi square of independence between the two variables is equal to 1944.456 (p-value =  0 ).
## 
## Eigenvalues
##                        Dim.1   Dim.2   Dim.3
## Variance               0.543   0.445   0.127
## % of var.             48.692  39.913  11.395
## Cumulative % of var.  48.692  88.605 100.000
## 
## Rows (the 10 first)
##               Iner*1000     Dim.1     ctr    cos2     Dim.2     ctr
## Laundry     |   134.160 |  -0.992  18.287   0.740 |   0.495   5.564
## Main_meal   |    90.692 |  -0.876  12.389   0.742 |   0.490   4.736
## Dinner      |    38.246 |  -0.693   5.471   0.777 |   0.308   1.321
## Breakfeast  |    41.124 |  -0.509   3.825   0.505 |   0.453   3.699
## Tidying     |    24.667 |  -0.394   1.998   0.440 |  -0.434   2.966
## Dishes      |    19.587 |  -0.189   0.426   0.118 |  -0.442   2.844
## Shopping    |    14.970 |  -0.118   0.176   0.064 |  -0.403   2.515
## Official    |    53.300 |   0.227   0.521   0.053 |   0.254   0.796
## Driving     |   101.509 |   0.742   8.078   0.432 |   0.653   7.647
## Finances    |    29.564 |   0.271   0.875   0.161 |  -0.618   5.559
##                cos2     Dim.3     ctr    cos2  
## Laundry       0.185 |  -0.317   7.968   0.075 |
## Main_meal     0.232 |  -0.164   1.859   0.026 |
## Dinner        0.154 |  -0.207   2.097   0.070 |
## Breakfeast    0.400 |   0.220   3.069   0.095 |
## Tidying       0.535 |  -0.094   0.489   0.025 |
## Dishes        0.646 |   0.267   3.634   0.236 |
## Shopping      0.748 |   0.203   2.223   0.189 |
## Official      0.066 |   0.923  36.940   0.881 |
## Driving       0.335 |   0.544  18.596   0.233 |
## Finances      0.837 |   0.035   0.062   0.003 |
## 
## Columns
##               Iner*1000     Dim.1     ctr    cos2     Dim.2     ctr
## Wife        |   301.019 |  -0.838  44.462   0.802 |   0.365  10.312
## Alternating |   117.824 |  -0.062   0.104   0.005 |   0.292   2.783
## Husband     |   381.373 |   1.161  54.234   0.772 |   0.602  17.787
## Jointly     |   314.725 |   0.149   1.200   0.021 |  -1.027  69.118
##                cos2     Dim.3     ctr    cos2  
## Wife          0.152 |  -0.200  10.822   0.046 |
## Alternating   0.105 |   0.849  82.549   0.890 |
## Husband       0.208 |  -0.189   6.133   0.020 |
## Jointly       0.977 |  -0.046   0.495   0.002 |
ca3 <- CA(housetasks, ncp = 3, graph = T)

Two main functions used for CoA are ca and CA.

They extract 3 dimensions as there are 4 columns (and 4-1 < 13-1).

Notes

Results for row variables: The principal coordinates for the first 3 dimensions (k = 1, k = 2 and k = 3).

In our example, it is the sum of the squared correlations over the three included dimensions.

5. How many dimensions (1, 2?) should you retain and what do they explain?

fviz_screeplot(ca2)#scree plot

fviz_screeplot(ca, geom = "line")

eigenvalues <- factoextra::get_eigenvalue(ca2)
head(round(eigenvalues, 2))
##       eigenvalue variance.percent cumulative.variance.percent
## Dim.1       0.54            48.69                       48.69
## Dim.2       0.45            39.91                       88.60
## Dim.3       0.13            11.40                      100.00

The average axis should account for 1/(ncol(housetasks)-1) = 1/3 = 33.33% in terms of the 4 columns.

fviz_screeplot(ca) +
  geom_hline(yintercept = 33.33, linetype = 2, color = "red")

fviz_screeplot(ca2, geom = "line") +
  geom_hline(yintercept = 33.33, linetype = 2, color = "blue")

summary(ca)$rows
##    name mass  qlt  inr  k=1 cor ctr  k=2 cor ctr
## 1  Lndr  101  925  120 -992 740 183 -495 185  56
## 2  Mn_m   88  974   81 -876 742 124 -490 232  47
## 3  Dnnr   62  930   34 -693 777  55 -308 154  13
## 4  Brkf   80  905   37 -509 505  38 -453 400  37
## 5  Tdyn   70  975   22 -394 440  20  434 535  30
## 6  Dshs   65  764   18 -189 118   4  442 646  28
## 7  Shpp   69  811   13 -118  64   2  403 748  25
## 8  Offc   55  119   48  227  53   5 -254  66   8
## 9  Drvn   80  767   91  742 432  81 -653 335  76
## 10 Fnnc   65  997   27  271 161   9  618 837  56
## 11 Insr   80  885   52  647 576  61  474 309  40
## 12 Rprs   95  933  281 1529 707 407 -864 226 159
## 13 Hldy   92  992  176  252  30  11 1435 962 425
eig <- as.data.frame(get_eigenvalue(ca2))
trace <- sum(eig$eigenvalue) 
cor.coef <- sqrt(trace)
cor.coef
## [1] 1.055907

As a rule of thumb, 0.2 is the threshold above which the correlation can be considered important (Bendixen 1995, 576; Healey 2013, 289-290).

1.0559074 indicates a strong association between row and column variables.

6. Plot and interpret the results

fviz_ca_biplot(ca2, title = "Add your title here, human", repel = T)

Additionally:

plot(ca, mass = TRUE, contrib = "relative",#relative contributions are indicated by colour intensities
     map = "rowgreen", arrows = c(FALSE, TRUE))

Points with zero contribution are displayed in white. The higher the contribution of a point, the closer the corresponding colour to the one specified by the col option

Compare solutions from ca and FactoMineR:

plot1 <- fviz_ca_biplot(ca, arrows = c(F,T), title = "Output from CA")
plot2 <- fviz_ca_biplot(ca2, arrows = c(F,T),  title = "Output from FactoMineR")
grid.arrange(plot1, plot2, ncol = 2)

Contributions of rows to dimensions

row <- get_ca_row(ca2)
head(row$contrib)
##                 Dim 1    Dim 2    Dim 3
## Laundry    18.2867003 5.563891 7.968424
## Main_meal  12.3888433 4.735523 1.858689
## Dinner      5.4713982 1.321022 2.096926
## Breakfeast  3.8249284 3.698613 3.069399
## Tidying     1.9983518 2.965644 0.488734
## Dishes      0.4261663 2.844117 3.634294
fviz_contrib(ca, choice = "row", axes = 1) +  geom_hline(yintercept=7.69, linetype = 2, color="red")

fviz_contrib(ca, choice = "row", axes = 2) +  geom_hline(yintercept=7.69, linetype = 2, color="red")

The red line: if the row contributions were uniform, the expected value would be 1/nrow(housetasks) = 1/13 = 7.69%. The red dashed line on the graph above indicates the expected average contribution.

Average contribution: The total contribution of a row, on explaining the variations retained by Dim.1 and Dim.2, is calculated as follow : (C1 * Eig1) + (C2 * Eig2). C1 and C2 are the contributions of the row to dimensions 1 and 2 The expected average contribution of a row for Dim.1 and Dim.2 is : (7.69 * Eig1) + (7.69 * Eig2) = (7.690.54) + (7.690.44) = 7.53%

Dimension 1 is mainly defined by the opposition of Repair and Driving (positive pole), and Laundry and Main_meal (negative pole).

highlight columns according to the amount of their contributions and select the top 5 contributing rows and columns

fviz_ca_col(ca, col.col="contrib")+
  scale_color_gradient2(low="white", mid="blue", 
                        high="red", midpoint=24.5)+theme_minimal()

fviz_ca_col(ca, alpha.col="contrib") # Possible values for the argument alpha.col are : "cos2", "contrib", "coord", "x", "y"

fviz_cos2(ca, choice = "col", axes = 1:2)

fviz_ca_row(ca, alpha.row = "contrib")+
  theme_minimal()

fviz_ca_biplot(ca, select.row = list(contrib = 5), 
               select.col = list(contrib = 5)) +
  theme_minimal()

Sources and handouts:

http://www.sthda.com/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining

https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Correspondence_Analysis.pdf

A very short example of correspondence analysis

library(FactoMineR)
data(tea)#load the data
summary(tea)
##          breakfast           tea.time          evening          lunch    
##  breakfast    :144   Not.tea time:131   evening    :103   lunch    : 44  
##  Not.breakfast:156   tea time    :169   Not.evening:197   Not.lunch:256  
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##         dinner           always          home           work    
##  dinner    : 21   always    :103   home    :291   Not.work:213  
##  Not.dinner:279   Not.always:197   Not.home:  9   work    : 87  
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##         tearoom           friends          resto          pub     
##  Not.tearoom:242   friends    :196   Not.resto:221   Not.pub:237  
##  tearoom    : 58   Not.friends:104   resto    : 79   pub    : 63  
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##         Tea         How           sugar                     how     
##  black    : 74   alone:195   No.sugar:155   tea bag           :170  
##  Earl Grey:193   lemon: 33   sugar   :145   tea bag+unpackaged: 94  
##  green    : 33   milk : 63                  unpackaged        : 36  
##                  other:  9                                          
##                                                                     
##                                                                     
##                                                                     
##                   where                 price          age        sex    
##  chain store         :192   p_branded      : 95   Min.   :15.00   F:178  
##  chain store+tea shop: 78   p_cheap        :  7   1st Qu.:23.00   M:122  
##  tea shop            : 30   p_private label: 21   Median :32.00          
##                             p_unknown      : 12   Mean   :37.05          
##                             p_upscale      : 53   3rd Qu.:48.00          
##                             p_variable     :112   Max.   :90.00          
##                                                                          
##            SPC               Sport       age_Q          frequency  
##  employee    :59   Not.sportsman:121   15-24:92   1/day      : 95  
##  middle      :40   sportsman    :179   25-34:69   1 to 2/week: 44  
##  non-worker  :64                       35-44:40   +2/day     :127  
##  other worker:20                       45-59:61   3 to 6/week: 34  
##  senior      :35                       +60  :38                    
##  student     :70                                                   
##  workman     :12                                                   
##              escape.exoticism           spirituality        healthy   
##  escape-exoticism    :142     Not.spirituality:206   healthy    :210  
##  Not.escape-exoticism:158     spirituality    : 94   Not.healthy: 90  
##                                                                       
##                                                                       
##                                                                       
##                                                                       
##                                                                       
##          diuretic             friendliness            iron.absorption
##  diuretic    :174   friendliness    :242   iron absorption    : 31   
##  Not.diuretic:126   Not.friendliness: 58   Not.iron absorption:269   
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##          feminine             sophisticated        slimming  
##  feminine    :129   Not.sophisticated: 85   No.slimming:255  
##  Not.feminine:171   sophisticated    :215   slimming   : 45  
##                                                              
##                                                              
##                                                              
##                                                              
##                                                              
##         exciting          relaxing              effect.on.health
##  exciting   :116   No.relaxing:113   effect on health   : 66    
##  No.exciting:184   relaxing   :187   No.effect on health:234    
##                                                                 
##                                                                 
##                                                                 
##                                                                 
## 
newtea <- tea[,c("Tea", "How", "how", "sugar", "where", "always")]#shrink
head(newtea)#look at the shrinked data
##         Tea   How     how    sugar       where     always
## 1     black alone tea bag    sugar chain store Not.always
## 2     black  milk tea bag No.sugar chain store Not.always
## 3 Earl Grey alone tea bag No.sugar chain store Not.always
## 4 Earl Grey alone tea bag    sugar chain store Not.always
## 5 Earl Grey alone tea bag No.sugar chain store     always
## 6 Earl Grey alone tea bag No.sugar chain store Not.always
cats <- apply(newtea, 2, function(x) nlevels(as.factor(x)))
cats#number of levels per variable
##    Tea    How    how  sugar  where always 
##      3      4      3      2      3      2
mca <- FactoMineR::MCA(newtea, graph = T)#get the results

mca#list of results
## **Results of the Multiple Correspondence Analysis (MCA)**
## The analysis was performed on 300 individuals, described by 6 variables
## *The results are available in the following objects:
## 
##    name              description                       
## 1  "$eig"            "eigenvalues"                     
## 2  "$var"            "results for the variables"       
## 3  "$var$coord"      "coord. of the categories"        
## 4  "$var$cos2"       "cos2 for the categories"         
## 5  "$var$contrib"    "contributions of the categories" 
## 6  "$var$v.test"     "v-test for the categories"       
## 7  "$ind"            "results for the individuals"     
## 8  "$ind$coord"      "coord. for the individuals"      
## 9  "$ind$cos2"       "cos2 for the individuals"        
## 10 "$ind$contrib"    "contributions of the individuals"
## 11 "$call"           "intermediate results"            
## 12 "$call$marge.col" "weights of columns"              
## 13 "$call$marge.li"  "weights of rows"
mca$eig # we need dimensions with eigenvalues > 1/n=variables, i.e. 1/6~0.17, 4 dimensions
##        eigenvalue percentage of variance cumulative percentage of variance
## dim 1  0.27976178              15.259733                          15.25973
## dim 2  0.25774772              14.058967                          29.31870
## dim 3  0.22013794              12.007524                          41.32622
## dim 4  0.18792961              10.250706                          51.57693
## dim 5  0.16876495               9.205361                          60.78229
## dim 6  0.16368666               8.928363                          69.71065
## dim 7  0.15288834               8.339364                          78.05002
## dim 8  0.13838682               7.548372                          85.59839
## dim 9  0.11569167               6.310455                          91.90885
## dim 10 0.08612637               4.697802                          96.60665
## dim 11 0.06221147               3.393353                         100.00000

Now let’s plot the results:

library(ggplot2)
mca_vars_df <- data.frame(mca$var$coord, Variable = rep(names(cats), cats))# column coordinates data frame
mca_obs_df <- data.frame(mca$ind$coord)#row coordinates df
ggplot(data = mca_vars_df,
       aes(x = Dim.1, y = Dim.2, label = rownames(mca_vars_df))) +
  geom_hline(yintercept = 0, colour = "red") +
  geom_vline(xintercept = 0, colour = "red") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA Plot of Tea+How it is Drunk") +
  labs(x = "Dimension 1 15%", y = "Dimension 2 14%")

mca$eig[1,2]
## [1] 15.25973
mca$eig[2,2]
## [1] 14.05897

Multidimensional scaling

Example from research: Schwartz’s basic values

Basic Values circle.

Basic Values circle.

library(MASS)
library(igraph)
library(vegan)
library(haven)

Metric MDS

Example: European cities (distance in km)

data(eurodist)
as.matrix(eurodist)[6:10, 1:5] # travel distance b/w cities in km
##            Athens Barcelona Brussels Calais Cherbourg
## Cologne      2762      1498      206    409       785
## Copenhagen   3276      2218      966   1136      1545
## Geneva       2610       803      677    747       853
## Gibraltar    4485      1172     2256   2224      2047
## Hamburg      2977      2018      597    714      1115
mds <- cmdscale(eurodist)
plot(mds, type="n")+
text(mds[,1], mds[,2], labels(eurodist))

## integer(0)
plot(mds, type="n")+
text(mds[,1], mds[,2]*(-1), labels(eurodist))

## integer(0)
rm(mds, eurodist)

Non-metric MDS

Let’s look at students’ preferences in choosing NISes in 2015.

nis<-read_spss("NIS_MDS2015.sav")
nis<-nis[,c(1,2,10,3,4,5,6,7,8,9)]
head(nis)
## # A tibble: 6 x 10
##      id group gen        city civil  cult youth  body  asbd  econ
##   <dbl> <dbl> <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1   131     2 0             7     2     1     6     4     5     3
## 2   132     2 0             3     6     4     1     7     2     5
## 3   133     2 0             4     6     3     1     2     7     5
## 4   134     2 1             2     7     6     1     4     5     3
## 5   135     2 0             4     6     2     1     3     7     5
## 6   136     2 0             7     6     1     2     3     4     5
names(nis)<-c("id","year","sex", "city","polit","cult","youth","body","bigdata","econ")
head(nis)
## # A tibble: 6 x 10
##      id  year sex        city polit  cult youth  body bigdata  econ
##   <dbl> <dbl> <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl>
## 1   131     2 0             7     2     1     6     4       5     3
## 2   132     2 0             3     6     4     1     7       2     5
## 3   133     2 0             4     6     3     1     2       7     5
## 4   134     2 1             2     7     6     1     4       5     3
## 5   135     2 0             4     6     2     1     3       7     5
## 6   136     2 0             7     6     1     2     3       4     5
nis<-nis[,c(1,2,3,4,6,7,8,9,10, 5)]
nis$year<-as.factor(nis$year)
nis$sex<-as.factor(nis$sex)
nis <- na.omit(nis)
nmds2 <- metaMDS(nis[,4:10], k=2, trymax=100)
## Run 0 stress 0.2147611 
## Run 1 stress 0.2149404 
## ... Procrustes: rmse 0.008630338  max resid 0.08177757 
## Run 2 stress 0.2147618 
## ... Procrustes: rmse 0.0003278392  max resid 0.003414574 
## ... Similar to previous best
## Run 3 stress 0.2153116 
## Run 4 stress 0.2155001 
## Run 5 stress 0.2149033 
## ... Procrustes: rmse 0.007933108  max resid 0.1061095 
## Run 6 stress 0.215689 
## Run 7 stress 0.2147622 
## ... Procrustes: rmse 0.0003430299  max resid 0.003418326 
## ... Similar to previous best
## Run 8 stress 0.2155981 
## Run 9 stress 0.2148746 
## ... Procrustes: rmse 0.01210575  max resid 0.1053541 
## Run 10 stress 0.214798 
## ... Procrustes: rmse 0.002534361  max resid 0.03293606 
## Run 11 stress 0.2146994 
## ... New best solution
## ... Procrustes: rmse 0.01027872  max resid 0.1058776 
## Run 12 stress 0.2147611 
## ... Procrustes: rmse 0.01027813  max resid 0.1064005 
## Run 13 stress 0.2152897 
## Run 14 stress 0.2147615 
## ... Procrustes: rmse 0.01031016  max resid 0.1064215 
## Run 15 stress 0.2288929 
## Run 16 stress 0.2164164 
## Run 17 stress 0.2149342 
## ... Procrustes: rmse 0.005436101  max resid 0.06899523 
## Run 18 stress 0.215011 
## ... Procrustes: rmse 0.009216866  max resid 0.1047025 
## Run 19 stress 0.214602 
## ... New best solution
## ... Procrustes: rmse 0.008844197  max resid 0.1065491 
## Run 20 stress 0.2313235 
## Run 21 stress 0.2147618 
## ... Procrustes: rmse 0.005188564  max resid 0.07040212 
## Run 22 stress 0.2147615 
## ... Procrustes: rmse 0.005185983  max resid 0.07035811 
## Run 23 stress 0.2145961 
## ... New best solution
## ... Procrustes: rmse 0.001182936  max resid 0.01098416 
## Run 24 stress 0.2145946 
## ... New best solution
## ... Procrustes: rmse 0.008657589  max resid 0.1062222 
## Run 25 stress 0.2149035 
## ... Procrustes: rmse 0.006036924  max resid 0.06919665 
## Run 26 stress 0.214761 
## ... Procrustes: rmse 0.01003512  max resid 0.1063332 
## Run 27 stress 0.2146549 
## ... Procrustes: rmse 0.002437282  max resid 0.0327501 
## Run 28 stress 0.2157982 
## Run 29 stress 0.2157119 
## Run 30 stress 0.2146034 
## ... Procrustes: rmse 0.008434247  max resid 0.1065123 
## Run 31 stress 0.214761 
## ... Procrustes: rmse 0.01005101  max resid 0.1063539 
## Run 32 stress 0.215122 
## Run 33 stress 0.215617 
## Run 34 stress 0.2193506 
## Run 35 stress 0.2163142 
## Run 36 stress 0.2153219 
## Run 37 stress 0.2310069 
## Run 38 stress 0.2312564 
## Run 39 stress 0.2155989 
## Run 40 stress 0.2162609 
## Run 41 stress 0.2148626 
## ... Procrustes: rmse 0.005580967  max resid 0.06363713 
## Run 42 stress 0.2156393 
## Run 43 stress 0.2147814 
## ... Procrustes: rmse 0.009931416  max resid 0.1060335 
## Run 44 stress 0.2286831 
## Run 45 stress 0.2148718 
## ... Procrustes: rmse 0.005828139  max resid 0.06375893 
## Run 46 stress 0.2145576 
## ... New best solution
## ... Procrustes: rmse 0.007315256  max resid 0.1052255 
## Run 47 stress 0.2145929 
## ... Procrustes: rmse 0.007317937  max resid 0.1049761 
## Run 48 stress 0.231385 
## Run 49 stress 0.2147968 
## ... Procrustes: rmse 0.005704325  max resid 0.06967537 
## Run 50 stress 0.2158008 
## Run 51 stress 0.2147617 
## ... Procrustes: rmse 0.006590453  max resid 0.07073478 
## Run 52 stress 0.2376889 
## Run 53 stress 0.2149279 
## ... Procrustes: rmse 0.009547062  max resid 0.1046612 
## Run 54 stress 0.2294389 
## Run 55 stress 0.2156894 
## Run 56 stress 0.2149861 
## ... Procrustes: rmse 0.005684492  max resid 0.06505048 
## Run 57 stress 0.214761 
## ... Procrustes: rmse 0.006519188  max resid 0.07056082 
## Run 58 stress 0.2148756 
## ... Procrustes: rmse 0.009472834  max resid 0.1045488 
## Run 59 stress 0.2331814 
## Run 60 stress 0.2163192 
## Run 61 stress 0.2156442 
## Run 62 stress 0.2147615 
## ... Procrustes: rmse 0.006573808  max resid 0.07070563 
## Run 63 stress 0.2147614 
## ... Procrustes: rmse 0.006564516  max resid 0.07067816 
## Run 64 stress 0.2145893 
## ... Procrustes: rmse 0.007412029  max resid 0.1050797 
## Run 65 stress 0.2145574 
## ... New best solution
## ... Procrustes: rmse 3.479808e-05  max resid 0.000200598 
## ... Similar to previous best
## *** Solution reached
stressplot(nmds2)

Stress is the function that is minimized; it is good when stress is <.2 :)

Large scatter around the line suggests that original dissimilarities are not well preserved in the reduced number of dimensions.

nmds3 <- metaMDS(nis[,4:10], k = 3, trymax=100)
## Run 0 stress 0.1409921 
## Run 1 stress 0.1409961 
## ... Procrustes: rmse 0.001493876  max resid 0.009450459 
## ... Similar to previous best
## Run 2 stress 0.1444306 
## Run 3 stress 0.1463538 
## Run 4 stress 0.1409928 
## ... Procrustes: rmse 0.0004933147  max resid 0.002854043 
## ... Similar to previous best
## Run 5 stress 0.1461007 
## Run 6 stress 0.1420583 
## Run 7 stress 0.1460868 
## Run 8 stress 0.1409906 
## ... New best solution
## ... Procrustes: rmse 0.0002942407  max resid 0.00152511 
## ... Similar to previous best
## Run 9 stress 0.1410043 
## ... Procrustes: rmse 0.0007619995  max resid 0.006876658 
## ... Similar to previous best
## Run 10 stress 0.1453617 
## Run 11 stress 0.1409921 
## ... Procrustes: rmse 0.0004504055  max resid 0.002404194 
## ... Similar to previous best
## Run 12 stress 0.1444132 
## Run 13 stress 0.1409949 
## ... Procrustes: rmse 0.0003383447  max resid 0.002978687 
## ... Similar to previous best
## Run 14 stress 0.146113 
## Run 15 stress 0.1453702 
## Run 16 stress 0.1456707 
## Run 17 stress 0.1409912 
## ... Procrustes: rmse 0.0003255643  max resid 0.002239785 
## ... Similar to previous best
## Run 18 stress 0.1409888 
## ... New best solution
## ... Procrustes: rmse 0.0007818997  max resid 0.006430355 
## ... Similar to previous best
## Run 19 stress 0.145683 
## Run 20 stress 0.1409929 
## ... Procrustes: rmse 0.0008266151  max resid 0.006453891 
## ... Similar to previous best
## *** Solution reached
stressplot(nmds3)

plot(nmds3)

ordiplot(nmds3,type="n")
orditorp(nmds3,display="sites",col="red",air=0.01)
orditorp(nmds3,display="species",cex=1.25,air=0.01)
orditorp(nmds3,display="species",col=c(rep("brown",4),rep("blue",3)),
         air=0.01,cex=1.25)

colors=c(rep("red",4),rep("blue",3))
ordiplot(nmds3,type="n")
for(i in unique(nis$year)) {
  ordihull(nmds3$point[grep(i,nis$year),],draw="polygon",
           groups=nis$year[nis$year==i],col=colors[grep(i,nis$year)],label=F) 
} 
orditorp(nmds3,display="sites",col="red",air=0.01)
orditorp(nmds3,display="species",col=c(rep("purple",4),
                                      rep("blue",3)),air=0.01,cex=1.25)

Conclusions: