4. Do the calculations to get the CoA axes

ca <- ca(housetasks)
names(ca)

##  [1] "sv"         "nd"         "rownames"   "rowmass"    "rowdist"   
##  [6] "rowinertia" "rowcoord"   "rowsup"     "colnames"   "colmass"   
## [11] "coldist"    "colinertia" "colcoord"   "colsup"     "N"         
## [16] "call"

summary(ca)

## 
## Principal inertias (eigenvalues):
## 
##  dim    value      %   cum%   scree plot               
##  1      0.542889  48.7  48.7  ************             
##  2      0.445003  39.9  88.6  **********               
##  3      0.127048  11.4 100.0  ***                      
##         -------- -----                                 
##  Total: 1.114940 100.0                                 
## 
## 
## Rows:
##      name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1  | Lndr |  101  925  120 | -992 740 183 | -495 185  56 |
## 2  | Mn_m |   88  974   81 | -876 742 124 | -490 232  47 |
## 3  | Dnnr |   62  930   34 | -693 777  55 | -308 154  13 |
## 4  | Brkf |   80  905   37 | -509 505  38 | -453 400  37 |
## 5  | Tdyn |   70  975   22 | -394 440  20 |  434 535  30 |
## 6  | Dshs |   65  764   18 | -189 118   4 |  442 646  28 |
## 7  | Shpp |   69  811   13 | -118  64   2 |  403 748  25 |
## 8  | Offc |   55  119   48 |  227  53   5 | -254  66   8 |
## 9  | Drvn |   80  767   91 |  742 432  81 | -653 335  76 |
## 10 | Fnnc |   65  997   27 |  271 161   9 |  618 837  56 |
## 11 | Insr |   80  885   52 |  647 576  61 |  474 309  40 |
## 12 | Rprs |   95  933  281 | 1529 707 407 | -864 226 159 |
## 13 | Hldy |   92  992  176 |  252  30  11 | 1435 962 425 |
## 
## Columns:
##     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr  
## 1 | Wife |  344  954  270 | -838 802 445 | -365 152 103 |
## 2 | Altr |  146  110  106 |  -62   5   1 | -292 105  28 |
## 3 | Hsbn |  218  980  342 | 1161 772 542 | -602 208 178 |
## 4 | Jntl |  292  998  282 |  149  21  12 | 1027 977 691 |

ca2 <- CA(housetasks)

summary.CA(ca2)

## 
## Call:
## CA(X = housetasks) 
## 
## The chi square of independence between the two variables is equal to 1944.456 (p-value =  0 ).
## 
## Eigenvalues
##                        Dim.1   Dim.2   Dim.3
## Variance               0.543   0.445   0.127
## % of var.             48.692  39.913  11.395
## Cumulative % of var.  48.692  88.605 100.000
## 
## Rows (the 10 first)
##               Iner*1000     Dim.1     ctr    cos2     Dim.2     ctr
## Laundry     |   134.160 |  -0.992  18.287   0.740 |   0.495   5.564
## Main_meal   |    90.692 |  -0.876  12.389   0.742 |   0.490   4.736
## Dinner      |    38.246 |  -0.693   5.471   0.777 |   0.308   1.321
## Breakfeast  |    41.124 |  -0.509   3.825   0.505 |   0.453   3.699
## Tidying     |    24.667 |  -0.394   1.998   0.440 |  -0.434   2.966
## Dishes      |    19.587 |  -0.189   0.426   0.118 |  -0.442   2.844
## Shopping    |    14.970 |  -0.118   0.176   0.064 |  -0.403   2.515
## Official    |    53.300 |   0.227   0.521   0.053 |   0.254   0.796
## Driving     |   101.509 |   0.742   8.078   0.432 |   0.653   7.647
## Finances    |    29.564 |   0.271   0.875   0.161 |  -0.618   5.559
##                cos2     Dim.3     ctr    cos2  
## Laundry       0.185 |  -0.317   7.968   0.075 |
## Main_meal     0.232 |  -0.164   1.859   0.026 |
## Dinner        0.154 |  -0.207   2.097   0.070 |
## Breakfeast    0.400 |   0.220   3.069   0.095 |
## Tidying       0.535 |  -0.094   0.489   0.025 |
## Dishes        0.646 |   0.267   3.634   0.236 |
## Shopping      0.748 |   0.203   2.223   0.189 |
## Official      0.066 |   0.923  36.940   0.881 |
## Driving       0.335 |   0.544  18.596   0.233 |
## Finances      0.837 |   0.035   0.062   0.003 |
## 
## Columns
##               Iner*1000     Dim.1     ctr    cos2     Dim.2     ctr
## Wife        |   301.019 |  -0.838  44.462   0.802 |   0.365  10.312
## Alternating |   117.824 |  -0.062   0.104   0.005 |   0.292   2.783
## Husband     |   381.373 |   1.161  54.234   0.772 |   0.602  17.787
## Jointly     |   314.725 |   0.149   1.200   0.021 |  -1.027  69.118
##                cos2     Dim.3     ctr    cos2  
## Wife          0.152 |  -0.200  10.822   0.046 |
## Alternating   0.105 |   0.849  82.549   0.890 |
## Husband       0.208 |  -0.189   6.133   0.020 |
## Jointly       0.977 |  -0.046   0.495   0.002 |

ca3 <- CA(housetasks, ncp = 3, graph = T)

Two main functions used for CoA are ca and CA.

They extract 3 dimensions as there are 4 columns (and 4-1 < 13-1).

Notes

Results for row variables: The principal coordinates for the first 3 dimensions (k = 1, k = 2 and k = 3).

mass: the mass (or total frequency) of each point.
qlt is the total quality of representation of points by the 3 included dimensions.

In our example, it is the sum of the squared correlations over the three included dimensions.

inr: the inertia of the point (in per mills of the total inertia). The higher, the more influential this row.
cor or cos2 (Squared correlations) and ctr (contributions) of the points. Note that cor and ctr are expressed in per mills.
standard coordinates of rows(for distances in common space): ca$rowcoord
standard coordinates of columns: ca$colcoord

5. How many dimensions (1, 2?) should you retain and what do they explain?

fviz_screeplot(ca2)#scree plot

fviz_screeplot(ca, geom = "line")

eigenvalues <- factoextra::get_eigenvalue(ca2)
head(round(eigenvalues, 2))

##       eigenvalue variance.percent cumulative.variance.percent
## Dim.1       0.54            48.69                       48.69
## Dim.2       0.45            39.91                       88.60
## Dim.3       0.13            11.40                      100.00

Eigenvalues correspond to the amount of information retained by each axis. If the data were independent, the expected value of the eigenvalue for each axis would be 1/(nrow(housetasks)-1) = 1/12 = 8.33% in terms of rows. Likewise, the average axis should account for 1/(ncol(housetasks)-1) = 1/3 = 33.33% in terms of the 4 columns.
Dimensions 1 and 2 explain approximately 48.7% and 39.9% of the total inertia, respectively. Inertia is ‘deviation from independence’ (remember the chi-square?) The larger inertia, the higher the role of this dimension.
This corresponds to a cumulative total of 88.6% of total inertia retained by the 2 dimensions.

The average axis should account for 1/(ncol(housetasks)-1) = 1/3 = 33.33% in terms of the 4 columns.

fviz_screeplot(ca) +
  geom_hline(yintercept = 33.33, linetype = 2, color = "red")

fviz_screeplot(ca2, geom = "line") +
  geom_hline(yintercept = 33.33, linetype = 2, color = "blue")

summary(ca)$rows

##    name mass  qlt  inr  k=1 cor ctr  k=2 cor ctr
## 1  Lndr  101  925  120 -992 740 183 -495 185  56
## 2  Mn_m   88  974   81 -876 742 124 -490 232  47
## 3  Dnnr   62  930   34 -693 777  55 -308 154  13
## 4  Brkf   80  905   37 -509 505  38 -453 400  37
## 5  Tdyn   70  975   22 -394 440  20  434 535  30
## 6  Dshs   65  764   18 -189 118   4  442 646  28
## 7  Shpp   69  811   13 -118  64   2  403 748  25
## 8  Offc   55  119   48  227  53   5 -254  66   8
## 9  Drvn   80  767   91  742 432  81 -653 335  76
## 10 Fnnc   65  997   27  271 161   9  618 837  56
## 11 Insr   80  885   52  647 576  61  474 309  40
## 12 Rprs   95  933  281 1529 707 407 -864 226 159
## 13 Hldy   92  992  176  252  30  11 1435 962 425

eig <- as.data.frame(get_eigenvalue(ca2))
trace <- sum(eig$eigenvalue) 
cor.coef <- sqrt(trace)
cor.coef

## [1] 1.055907

As a rule of thumb, 0.2 is the threshold above which the correlation can be considered important (Bendixen 1995, 576; Healey 2013, 289-290).

1.0559074 indicates a strong association between row and column variables.

6. Plot and interpret the results

fviz_ca_biplot(ca2, title = "Add your title here, human", repel = T)

Additionally:

the contribution of rows and columns to dimensions can be analyzed
highly contributing rows and columns can be identified

plot(ca, mass = TRUE, contrib = "relative",#relative contributions are indicated by colour intensities
     map = "rowgreen", arrows = c(FALSE, TRUE))

Points with zero contribution are displayed in white. The higher the contribution of a point, the closer the corresponding colour to the one specified by the col option

Compare solutions from ca and FactoMineR:

plot1 <- fviz_ca_biplot(ca, arrows = c(F,T), title = "Output from CA")
plot2 <- fviz_ca_biplot(ca2, arrows = c(F,T),  title = "Output from FactoMineR")
grid.arrange(plot1, plot2, ncol = 2)

Contributions of rows to dimensions

row <- get_ca_row(ca2)
head(row$contrib)

##                 Dim 1    Dim 2    Dim 3
## Laundry    18.2867003 5.563891 7.968424
## Main_meal  12.3888433 4.735523 1.858689
## Dinner      5.4713982 1.321022 2.096926
## Breakfeast  3.8249284 3.698613 3.069399
## Tidying     1.9983518 2.965644 0.488734
## Dishes      0.4261663 2.844117 3.634294

fviz_contrib(ca, choice = "row", axes = 1) +  geom_hline(yintercept=7.69, linetype = 2, color="red")

fviz_contrib(ca, choice = "row", axes = 2) +  geom_hline(yintercept=7.69, linetype = 2, color="red")

The red line: if the row contributions were uniform, the expected value would be 1/nrow(housetasks) = 1/13 = 7.69%. The red dashed line on the graph above indicates the expected average contribution.

Average contribution: The total contribution of a row, on explaining the variations retained by Dim.1 and Dim.2, is calculated as follow : (C1 * Eig1) + (C2 * Eig2). C1 and C2 are the contributions of the row to dimensions 1 and 2 The expected average contribution of a row for Dim.1 and Dim.2 is : (7.69 * Eig1) + (7.69 * Eig2) = (7.690.54) + (7.690.44) = 7.53%

Dimension 1 is mainly defined by the opposition of Repair and Driving (positive pole), and Laundry and Main_meal (negative pole).

highlight columns according to the amount of their contributions and select the top 5 contributing rows and columns

fviz_ca_col(ca, col.col="contrib")+
  scale_color_gradient2(low="white", mid="blue", 
                        high="red", midpoint=24.5)+theme_minimal()

fviz_ca_col(ca, alpha.col="contrib") # Possible values for the argument alpha.col are : "cos2", "contrib", "coord", "x", "y"

fviz_cos2(ca, choice = "col", axes = 1:2)

fviz_ca_row(ca, alpha.row = "contrib")+
  theme_minimal()

fviz_ca_biplot(ca, select.row = list(contrib = 5), 
               select.col = list(contrib = 5)) +
  theme_minimal()

Sources and handouts:

http://www.sthda.com/english/wiki/correspondence-analysis-in-r-the-ultimate-guide-for-the-analysis-the-visualization-and-the-interpretation-r-software-and-data-mining

https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Correspondence_Analysis.pdf

A very short example of correspondence analysis

library(FactoMineR)
data(tea)#load the data
summary(tea)

##          breakfast           tea.time          evening          lunch    
##  breakfast    :144   Not.tea time:131   evening    :103   lunch    : 44  
##  Not.breakfast:156   tea time    :169   Not.evening:197   Not.lunch:256  
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##                                                                          
##         dinner           always          home           work    
##  dinner    : 21   always    :103   home    :291   Not.work:213  
##  Not.dinner:279   Not.always:197   Not.home:  9   work    : 87  
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##         tearoom           friends          resto          pub     
##  Not.tearoom:242   friends    :196   Not.resto:221   Not.pub:237  
##  tearoom    : 58   Not.friends:104   resto    : 79   pub    : 63  
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##         Tea         How           sugar                     how     
##  black    : 74   alone:195   No.sugar:155   tea bag           :170  
##  Earl Grey:193   lemon: 33   sugar   :145   tea bag+unpackaged: 94  
##  green    : 33   milk : 63                  unpackaged        : 36  
##                  other:  9                                          
##                                                                     
##                                                                     
##                                                                     
##                   where                 price          age        sex    
##  chain store         :192   p_branded      : 95   Min.   :15.00   F:178  
##  chain store+tea shop: 78   p_cheap        :  7   1st Qu.:23.00   M:122  
##  tea shop            : 30   p_private label: 21   Median :32.00          
##                             p_unknown      : 12   Mean   :37.05          
##                             p_upscale      : 53   3rd Qu.:48.00          
##                             p_variable     :112   Max.   :90.00          
##                                                                          
##            SPC               Sport       age_Q          frequency  
##  employee    :59   Not.sportsman:121   15-24:92   1/day      : 95  
##  middle      :40   sportsman    :179   25-34:69   1 to 2/week: 44  
##  non-worker  :64                       35-44:40   +2/day     :127  
##  other worker:20                       45-59:61   3 to 6/week: 34  
##  senior      :35                       +60  :38                    
##  student     :70                                                   
##  workman     :12                                                   
##              escape.exoticism           spirituality        healthy   
##  escape-exoticism    :142     Not.spirituality:206   healthy    :210  
##  Not.escape-exoticism:158     spirituality    : 94   Not.healthy: 90  
##                                                                       
##                                                                       
##                                                                       
##                                                                       
##                                                                       
##          diuretic             friendliness            iron.absorption
##  diuretic    :174   friendliness    :242   iron absorption    : 31   
##  Not.diuretic:126   Not.friendliness: 58   Not.iron absorption:269   
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##                                                                      
##          feminine             sophisticated        slimming  
##  feminine    :129   Not.sophisticated: 85   No.slimming:255  
##  Not.feminine:171   sophisticated    :215   slimming   : 45  
##                                                              
##                                                              
##                                                              
##                                                              
##                                                              
##         exciting          relaxing              effect.on.health
##  exciting   :116   No.relaxing:113   effect on health   : 66    
##  No.exciting:184   relaxing   :187   No.effect on health:234    
##                                                                 
##                                                                 
##                                                                 
##                                                                 
##

newtea <- tea[,c("Tea", "How", "how", "sugar", "where", "always")]#shrink
head(newtea)#look at the shrinked data

##         Tea   How     how    sugar       where     always
## 1     black alone tea bag    sugar chain store Not.always
## 2     black  milk tea bag No.sugar chain store Not.always
## 3 Earl Grey alone tea bag No.sugar chain store Not.always
## 4 Earl Grey alone tea bag    sugar chain store Not.always
## 5 Earl Grey alone tea bag No.sugar chain store     always
## 6 Earl Grey alone tea bag No.sugar chain store Not.always

cats <- apply(newtea, 2, function(x) nlevels(as.factor(x)))
cats#number of levels per variable

##    Tea    How    how  sugar  where always 
##      3      4      3      2      3      2

mca <- FactoMineR::MCA(newtea, graph = T)#get the results

mca#list of results

## **Results of the Multiple Correspondence Analysis (MCA)**
## The analysis was performed on 300 individuals, described by 6 variables
## *The results are available in the following objects:
## 
##    name              description                       
## 1  "$eig"            "eigenvalues"                     
## 2  "$var"            "results for the variables"       
## 3  "$var$coord"      "coord. of the categories"        
## 4  "$var$cos2"       "cos2 for the categories"         
## 5  "$var$contrib"    "contributions of the categories" 
## 6  "$var$v.test"     "v-test for the categories"       
## 7  "$ind"            "results for the individuals"     
## 8  "$ind$coord"      "coord. for the individuals"      
## 9  "$ind$cos2"       "cos2 for the individuals"        
## 10 "$ind$contrib"    "contributions of the individuals"
## 11 "$call"           "intermediate results"            
## 12 "$call$marge.col" "weights of columns"              
## 13 "$call$marge.li"  "weights of rows"

mca$eig # we need dimensions with eigenvalues > 1/n=variables, i.e. 1/6~0.17, 4 dimensions

##        eigenvalue percentage of variance cumulative percentage of variance
## dim 1  0.27976178              15.259733                          15.25973
## dim 2  0.25774772              14.058967                          29.31870
## dim 3  0.22013794              12.007524                          41.32622
## dim 4  0.18792961              10.250706                          51.57693
## dim 5  0.16876495               9.205361                          60.78229
## dim 6  0.16368666               8.928363                          69.71065
## dim 7  0.15288834               8.339364                          78.05002
## dim 8  0.13838682               7.548372                          85.59839
## dim 9  0.11569167               6.310455                          91.90885
## dim 10 0.08612637               4.697802                          96.60665
## dim 11 0.06221147               3.393353                         100.00000

Now let’s plot the results:

library(ggplot2)
mca_vars_df <- data.frame(mca$var$coord, Variable = rep(names(cats), cats))# column coordinates data frame
mca_obs_df <- data.frame(mca$ind$coord)#row coordinates df
ggplot(data = mca_vars_df,
       aes(x = Dim.1, y = Dim.2, label = rownames(mca_vars_df))) +
  geom_hline(yintercept = 0, colour = "red") +
  geom_vline(xintercept = 0, colour = "red") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA Plot of Tea+How it is Drunk") +
  labs(x = "Dimension 1 15%", y = "Dimension 2 14%")

mca$eig[1,2]

## [1] 15.25973

mca$eig[2,2]

## [1] 14.05897

Multidimensional scaling

plots similarities between individual cases or variables in the abstract Cartesian space
Metric MDS deals with interval or ratio level data, Nonmetric MDS deals with ordinal data

Example from research: Schwartz’s basic values

Basic Values circle.

library(MASS)
library(igraph)
library(vegan)
library(haven)

Metric MDS

Example: European cities (distance in km)

data(eurodist)
as.matrix(eurodist)[6:10, 1:5] # travel distance b/w cities in km

##            Athens Barcelona Brussels Calais Cherbourg
## Cologne      2762      1498      206    409       785
## Copenhagen   3276      2218      966   1136      1545
## Geneva       2610       803      677    747       853
## Gibraltar    4485      1172     2256   2224      2047
## Hamburg      2977      2018      597    714      1115

mds <- cmdscale(eurodist)

plot(mds, type="n")+
text(mds[,1], mds[,2], labels(eurodist))

## integer(0)

plot(mds, type="n")+
text(mds[,1], mds[,2]*(-1), labels(eurodist))

## integer(0)

rm(mds, eurodist)

Non-metric MDS

Let’s look at students’ preferences in choosing NISes in 2015.

nis<-read_spss("NIS_MDS2015.sav")
nis<-nis[,c(1,2,10,3,4,5,6,7,8,9)]
head(nis)

## # A tibble: 6 x 10
##      id group gen        city civil  cult youth  body  asbd  econ
##   <dbl> <dbl> <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1   131     2 0             7     2     1     6     4     5     3
## 2   132     2 0             3     6     4     1     7     2     5
## 3   133     2 0             4     6     3     1     2     7     5
## 4   134     2 1             2     7     6     1     4     5     3
## 5   135     2 0             4     6     2     1     3     7     5
## 6   136     2 0             7     6     1     2     3     4     5

names(nis)<-c("id","year","sex", "city","polit","cult","youth","body","bigdata","econ")
head(nis)

## # A tibble: 6 x 10
##      id  year sex        city polit  cult youth  body bigdata  econ
##   <dbl> <dbl> <dbl+lbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl> <dbl>
## 1   131     2 0             7     2     1     6     4       5     3
## 2   132     2 0             3     6     4     1     7       2     5
## 3   133     2 0             4     6     3     1     2       7     5
## 4   134     2 1             2     7     6     1     4       5     3
## 5   135     2 0             4     6     2     1     3       7     5
## 6   136     2 0             7     6     1     2     3       4     5

nis<-nis[,c(1,2,3,4,6,7,8,9,10, 5)]
nis$year<-as.factor(nis$year)
nis$sex<-as.factor(nis$sex)
nis <- na.omit(nis)

nmds2 <- metaMDS(nis[,4:10], k=2, trymax=100)

## Run 0 stress 0.2147611 
## Run 1 stress 0.2149404 
## ... Procrustes: rmse 0.008630338  max resid 0.08177757 
## Run 2 stress 0.2147618 
## ... Procrustes: rmse 0.0003278392  max resid 0.003414574 
## ... Similar to previous best
## Run 3 stress 0.2153116 
## Run 4 stress 0.2155001 
## Run 5 stress 0.2149033 
## ... Procrustes: rmse 0.007933108  max resid 0.1061095 
## Run 6 stress 0.215689 
## Run 7 stress 0.2147622 
## ... Procrustes: rmse 0.0003430299  max resid 0.003418326 
## ... Similar to previous best
## Run 8 stress 0.2155981 
## Run 9 stress 0.2148746 
## ... Procrustes: rmse 0.01210575  max resid 0.1053541 
## Run 10 stress 0.214798 
## ... Procrustes: rmse 0.002534361  max resid 0.03293606 
## Run 11 stress 0.2146994 
## ... New best solution
## ... Procrustes: rmse 0.01027872  max resid 0.1058776 
## Run 12 stress 0.2147611 
## ... Procrustes: rmse 0.01027813  max resid 0.1064005 
## Run 13 stress 0.2152897 
## Run 14 stress 0.2147615 
## ... Procrustes: rmse 0.01031016  max resid 0.1064215 
## Run 15 stress 0.2288929 
## Run 16 stress 0.2164164 
## Run 17 stress 0.2149342 
## ... Procrustes: rmse 0.005436101  max resid 0.06899523 
## Run 18 stress 0.215011 
## ... Procrustes: rmse 0.009216866  max resid 0.1047025 
## Run 19 stress 0.214602 
## ... New best solution
## ... Procrustes: rmse 0.008844197  max resid 0.1065491 
## Run 20 stress 0.2313235 
## Run 21 stress 0.2147618 
## ... Procrustes: rmse 0.005188564  max resid 0.07040212 
## Run 22 stress 0.2147615 
## ... Procrustes: rmse 0.005185983  max resid 0.07035811 
## Run 23 stress 0.2145961 
## ... New best solution
## ... Procrustes: rmse 0.001182936  max resid 0.01098416 
## Run 24 stress 0.2145946 
## ... New best solution
## ... Procrustes: rmse 0.008657589  max resid 0.1062222 
## Run 25 stress 0.2149035 
## ... Procrustes: rmse 0.006036924  max resid 0.06919665 
## Run 26 stress 0.214761 
## ... Procrustes: rmse 0.01003512  max resid 0.1063332 
## Run 27 stress 0.2146549 
## ... Procrustes: rmse 0.002437282  max resid 0.0327501 
## Run 28 stress 0.2157982 
## Run 29 stress 0.2157119 
## Run 30 stress 0.2146034 
## ... Procrustes: rmse 0.008434247  max resid 0.1065123 
## Run 31 stress 0.214761 
## ... Procrustes: rmse 0.01005101  max resid 0.1063539 
## Run 32 stress 0.215122 
## Run 33 stress 0.215617 
## Run 34 stress 0.2193506 
## Run 35 stress 0.2163142 
## Run 36 stress 0.2153219 
## Run 37 stress 0.2310069 
## Run 38 stress 0.2312564 
## Run 39 stress 0.2155989 
## Run 40 stress 0.2162609 
## Run 41 stress 0.2148626 
## ... Procrustes: rmse 0.005580967  max resid 0.06363713 
## Run 42 stress 0.2156393 
## Run 43 stress 0.2147814 
## ... Procrustes: rmse 0.009931416  max resid 0.1060335 
## Run 44 stress 0.2286831 
## Run 45 stress 0.2148718 
## ... Procrustes: rmse 0.005828139  max resid 0.06375893 
## Run 46 stress 0.2145576 
## ... New best solution
## ... Procrustes: rmse 0.007315256  max resid 0.1052255 
## Run 47 stress 0.2145929 
## ... Procrustes: rmse 0.007317937  max resid 0.1049761 
## Run 48 stress 0.231385 
## Run 49 stress 0.2147968 
## ... Procrustes: rmse 0.005704325  max resid 0.06967537 
## Run 50 stress 0.2158008 
## Run 51 stress 0.2147617 
## ... Procrustes: rmse 0.006590453  max resid 0.07073478 
## Run 52 stress 0.2376889 
## Run 53 stress 0.2149279 
## ... Procrustes: rmse 0.009547062  max resid 0.1046612 
## Run 54 stress 0.2294389 
## Run 55 stress 0.2156894 
## Run 56 stress 0.2149861 
## ... Procrustes: rmse 0.005684492  max resid 0.06505048 
## Run 57 stress 0.214761 
## ... Procrustes: rmse 0.006519188  max resid 0.07056082 
## Run 58 stress 0.2148756 
## ... Procrustes: rmse 0.009472834  max resid 0.1045488 
## Run 59 stress 0.2331814 
## Run 60 stress 0.2163192 
## Run 61 stress 0.2156442 
## Run 62 stress 0.2147615 
## ... Procrustes: rmse 0.006573808  max resid 0.07070563 
## Run 63 stress 0.2147614 
## ... Procrustes: rmse 0.006564516  max resid 0.07067816 
## Run 64 stress 0.2145893 
## ... Procrustes: rmse 0.007412029  max resid 0.1050797 
## Run 65 stress 0.2145574 
## ... New best solution
## ... Procrustes: rmse 3.479808e-05  max resid 0.000200598 
## ... Similar to previous best
## *** Solution reached

stressplot(nmds2)

Stress is the function that is minimized; it is good when stress is <.2 :)

Large scatter around the line suggests that original dissimilarities are not well preserved in the reduced number of dimensions.

nmds3 <- metaMDS(nis[,4:10], k = 3, trymax=100)

## Run 0 stress 0.1409921 
## Run 1 stress 0.1409961 
## ... Procrustes: rmse 0.001493876  max resid 0.009450459 
## ... Similar to previous best
## Run 2 stress 0.1444306 
## Run 3 stress 0.1463538 
## Run 4 stress 0.1409928 
## ... Procrustes: rmse 0.0004933147  max resid 0.002854043 
## ... Similar to previous best
## Run 5 stress 0.1461007 
## Run 6 stress 0.1420583 
## Run 7 stress 0.1460868 
## Run 8 stress 0.1409906 
## ... New best solution
## ... Procrustes: rmse 0.0002942407  max resid 0.00152511 
## ... Similar to previous best
## Run 9 stress 0.1410043 
## ... Procrustes: rmse 0.0007619995  max resid 0.006876658 
## ... Similar to previous best
## Run 10 stress 0.1453617 
## Run 11 stress 0.1409921 
## ... Procrustes: rmse 0.0004504055  max resid 0.002404194 
## ... Similar to previous best
## Run 12 stress 0.1444132 
## Run 13 stress 0.1409949 
## ... Procrustes: rmse 0.0003383447  max resid 0.002978687 
## ... Similar to previous best
## Run 14 stress 0.146113 
## Run 15 stress 0.1453702 
## Run 16 stress 0.1456707 
## Run 17 stress 0.1409912 
## ... Procrustes: rmse 0.0003255643  max resid 0.002239785 
## ... Similar to previous best
## Run 18 stress 0.1409888 
## ... New best solution
## ... Procrustes: rmse 0.0007818997  max resid 0.006430355 
## ... Similar to previous best
## Run 19 stress 0.145683 
## Run 20 stress 0.1409929 
## ... Procrustes: rmse 0.0008266151  max resid 0.006453891 
## ... Similar to previous best
## *** Solution reached

stressplot(nmds3)

plot(nmds3)

ordiplot(nmds3,type="n")
orditorp(nmds3,display="sites",col="red",air=0.01)
orditorp(nmds3,display="species",cex=1.25,air=0.01)
orditorp(nmds3,display="species",col=c(rep("brown",4),rep("blue",3)),
         air=0.01,cex=1.25)

colors=c(rep("red",4),rep("blue",3))
ordiplot(nmds3,type="n")
for(i in unique(nis$year)) {
  ordihull(nmds3$point[grep(i,nis$year),],draw="polygon",
           groups=nis$year[nis$year==i],col=colors[grep(i,nis$year)],label=F) 
} 
orditorp(nmds3,display="sites",col="red",air=0.01)
orditorp(nmds3,display="species",col=c(rep("purple",4),
                                      rep("blue",3)),air=0.01,cex=1.25)

Exploratory mapping: correspondence analysis and multidimensional scaling

Part 1. Correspondence Analysis

A famous example

Example: Household Chores in Families

Load the libraries and data

2. Check if the variables are independent