Preliminaries

Download the vegan package

Only do this once, then comment out of the script. You may have already done this this for a previous assignment.

#install.packages("vegan")

Load the libraries

library(ggplot2)
library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.6-4

Load the msleep package

The mammals dataset is a classic dataset in the MASS package. msleep is an updated version of the data in the ggplot2 package that includes more numeric data (e.g. hours of sleep) and categorical data (e.g. if a species is endangered).

#make sure you've loaded ggplot2 w/ library(ggplot2_
data(msleep)

Subset numeric variables

msleep has a mixture of numeric and categorical variables. We want only the numeric data and a few key labels, which I’ll also have to set up a special way. Don’t worry about the details.

First, we will subset all the columns we may want:

msleep_num <- msleep[,c("sleep_total","sleep_rem",
                        "sleep_cycle","awake","brainwt" ,
                        "bodywt",
                        "vore","order")]

All data should be continuous numeric that can take on decimal values, except the last 2

summary(msleep_num[,-c(7:8)])
##   sleep_total      sleep_rem      sleep_cycle         awake      
##  Min.   : 1.90   Min.   :0.100   Min.   :0.1167   Min.   : 4.10  
##  1st Qu.: 7.85   1st Qu.:0.900   1st Qu.:0.1833   1st Qu.:10.25  
##  Median :10.10   Median :1.500   Median :0.3333   Median :13.90  
##  Mean   :10.43   Mean   :1.875   Mean   :0.4396   Mean   :13.57  
##  3rd Qu.:13.75   3rd Qu.:2.400   3rd Qu.:0.5792   3rd Qu.:16.15  
##  Max.   :19.90   Max.   :6.600   Max.   :1.5000   Max.   :22.10  
##                  NA's   :22      NA's   :51                      
##     brainwt            bodywt        
##  Min.   :0.00014   Min.   :   0.005  
##  1st Qu.:0.00290   1st Qu.:   0.174  
##  Median :0.01240   Median :   1.670  
##  Mean   :0.28158   Mean   : 166.136  
##  3rd Qu.:0.12550   3rd Qu.:  41.750  
##  Max.   :5.71200   Max.   :6654.000  
##  NA's   :27

We can look at these data with a scatterplot matrix

# add pairs()
pairs(msleep_num[,-c(7:8)])

The data are in tibble format (tbl_df)

is(msleep_num)
## [1] "tbl_df"     "tbl"        "data.frame" "list"       "oldClass"  
## [6] "vector"

Convert to dataframe because I don’t like tibbles!

# add data.frame()
data.frame(msleep_num <- (msleep_num))
##    sleep_total sleep_rem sleep_cycle awake brainwt   bodywt    vore
## 1         12.1        NA          NA 11.90      NA   50.000   carni
## 2         17.0       1.8          NA  7.00 0.01550    0.480    omni
## 3         14.4       2.4          NA  9.60      NA    1.350   herbi
## 4         14.9       2.3   0.1333333  9.10 0.00029    0.019    omni
## 5          4.0       0.7   0.6666667 20.00 0.42300  600.000   herbi
## 6         14.4       2.2   0.7666667  9.60      NA    3.850   herbi
## 7          8.7       1.4   0.3833333 15.30      NA   20.490   carni
## 8          7.0        NA          NA 17.00      NA    0.045    <NA>
## 9         10.1       2.9   0.3333333 13.90 0.07000   14.000   carni
## 10         3.0        NA          NA 21.00 0.09820   14.800   herbi
## 11         5.3       0.6          NA 18.70 0.11500   33.500   herbi
## 12         9.4       0.8   0.2166667 14.60 0.00550    0.728   herbi
## 13        10.0       0.7          NA 14.00      NA    4.750    omni
## 14        12.5       1.5   0.1166667 11.50 0.00640    0.420   herbi
## 15        10.3       2.2          NA 13.70 0.00100    0.060    omni
## 16         8.3       2.0          NA 15.70 0.00660    1.000    omni
## 17         9.1       1.4   0.1500000 14.90 0.00014    0.005    omni
## 18        17.4       3.1   0.3833333  6.60 0.01080    3.500   carni
## 19         5.3       0.5          NA 18.70 0.01230    2.950   herbi
## 20        18.0       4.9   0.3333333  6.00 0.00630    1.700    omni
## 21         3.9        NA          NA 20.10 4.60300 2547.000   herbi
## 22        19.7       3.9   0.1166667  4.30 0.00030    0.023 insecti
## 23         2.9       0.6   1.0000000 21.10 0.65500  521.000   herbi
## 24         3.1       0.4          NA 20.90 0.41900  187.000   herbi
## 25        10.1       3.5   0.2833333 13.90 0.00350    0.770    omni
## 26        10.9       1.1          NA 13.10 0.11500   10.000    omni
## 27        14.9        NA          NA  9.10      NA    0.071   herbi
## 28        12.5       3.2   0.4166667 11.50 0.02560    3.300   carni
## 29         9.8       1.1   0.5500000 14.20 0.00500    0.200    omni
## 30         1.9       0.4          NA 22.10      NA  899.995   herbi
## 31         2.7       0.1          NA 21.35      NA  800.000   carni
## 32         6.2       1.5          NA 17.80 0.32500   85.000   carni
## 33         6.3       0.6          NA 17.70 0.01227    2.625   herbi
## 34         8.0       1.9   1.5000000 16.00 1.32000   62.000    omni
## 35         9.5       0.9          NA 14.50      NA    1.670   herbi
## 36         3.3        NA          NA 20.70 5.71200 6654.000   herbi
## 37        19.4       6.6          NA  4.60      NA    0.370   carni
## 38        10.1       1.2   0.7500000 13.90 0.17900    6.800    omni
## 39        14.2       1.9          NA  9.80      NA    0.053   herbi
## 40        14.3       3.1   0.2000000  9.70 0.00100    0.120   herbi
## 41        12.8        NA          NA 11.20      NA    0.035   herbi
## 42        12.5       1.4   0.1833333 11.50 0.00040    0.022   herbi
## 43        19.9       2.0   0.2000000  4.10 0.00025    0.010 insecti
## 44        14.6        NA          NA  9.40      NA    0.266   herbi
## 45        11.0        NA          NA 13.00 0.01250    1.400   carni
## 46         7.7       0.9          NA 16.30      NA    0.210   herbi
## 47        14.5        NA          NA  9.50      NA    0.028   carni
## 48         8.4       0.9   0.4166667 15.60 0.01210    2.500   herbi
## 49         3.8       0.6          NA 20.20 0.17500   55.500   herbi
## 50         9.7       1.4   1.4166667 14.30 0.44000   52.200    omni
## 51        15.8        NA          NA  8.20      NA  162.564   carni
## 52        10.4        NA          NA 13.60 0.15700  100.000   carni
## 53        13.5        NA          NA 10.50      NA  161.499   carni
## 54         9.4       1.0   0.6666667 14.60 0.18000   25.235    omni
## 55        10.3       2.7          NA 13.70 0.00240    0.550    <NA>
## 56        11.0        NA          NA 13.00      NA    1.100    omni
## 57        11.5        NA          NA 12.50      NA    0.021    <NA>
## 58        13.7       1.8          NA 10.30 0.01140    1.620    <NA>
## 59         3.5       0.4          NA 20.50      NA   86.000   carni
## 60         5.6        NA          NA 18.45      NA   53.180   carni
## 61        11.1       1.5          NA 12.90      NA    1.100   herbi
## 62        18.1       6.1          NA  5.90 0.08100   60.000 insecti
## 63         5.4       0.5          NA 18.60 0.02100    3.600    <NA>
## 64        13.0       2.4   0.1833333 11.00 0.00190    0.320   herbi
## 65         8.7        NA          NA 15.30      NA    0.044    omni
## 66         9.6       1.4          NA 14.40 0.02000    0.743    omni
## 67         8.4       2.1   0.1666667 15.60 0.00120    0.075 insecti
## 68        11.3       1.1   0.1500000 12.70 0.00118    0.148   herbi
## 69        10.6       2.4          NA 13.40 0.00300    0.122    <NA>
## 70        16.6        NA          NA  7.40 0.00570    0.920   herbi
## 71        13.8       3.4   0.2166667 10.20 0.00400    0.101   herbi
## 72        15.9       3.0          NA  8.10      NA    0.205   herbi
## 73        12.8       2.0   0.1833333 11.20 0.00033    0.048    <NA>
## 74         9.1       2.4   0.5000000 14.90 0.18000   86.250    omni
## 75         8.6        NA          NA 15.40 0.02500    4.500 insecti
## 76        15.8        NA          NA  8.20      NA    0.112   herbi
## 77         4.4       1.0   0.9000000 19.60 0.16900  207.501   herbi
## 78        15.6       2.3          NA  8.40 0.00260    0.900    omni
## 79         8.9       2.6   0.2333333 15.10 0.00250    0.104    omni
## 80         5.2        NA          NA 18.80      NA  173.330   carni
## 81         6.3       1.3          NA 17.70 0.01750    2.000   carni
## 82        12.5        NA          NA 11.50 0.04450    3.380   carni
## 83         9.8       2.4   0.3500000 14.20 0.05040    4.230   carni
##              order
## 1        Carnivora
## 2         Primates
## 3         Rodentia
## 4     Soricomorpha
## 5     Artiodactyla
## 6           Pilosa
## 7        Carnivora
## 8         Rodentia
## 9        Carnivora
## 10    Artiodactyla
## 11    Artiodactyla
## 12        Rodentia
## 13        Primates
## 14        Rodentia
## 15    Soricomorpha
## 16        Rodentia
## 17    Soricomorpha
## 18       Cingulata
## 19      Hyracoidea
## 20 Didelphimorphia
## 21     Proboscidea
## 22      Chiroptera
## 23  Perissodactyla
## 24  Perissodactyla
## 25  Erinaceomorpha
## 26        Primates
## 27        Rodentia
## 28       Carnivora
## 29        Primates
## 30    Artiodactyla
## 31         Cetacea
## 32       Carnivora
## 33      Hyracoidea
## 34        Primates
## 35        Primates
## 36     Proboscidea
## 37 Didelphimorphia
## 38        Primates
## 39        Rodentia
## 40        Rodentia
## 41        Rodentia
## 42        Rodentia
## 43      Chiroptera
## 44        Rodentia
## 45        Primates
## 46        Rodentia
## 47        Rodentia
## 48      Lagomorpha
## 49    Artiodactyla
## 50        Primates
## 51       Carnivora
## 52       Carnivora
## 53       Carnivora
## 54        Primates
## 55  Erinaceomorpha
## 56        Primates
## 57        Rodentia
## 58   Diprotodontia
## 59       Carnivora
## 60         Cetacea
## 61   Diprotodontia
## 62       Cingulata
## 63      Hyracoidea
## 64        Rodentia
## 65        Rodentia
## 66        Primates
## 67    Soricomorpha
## 68        Rodentia
## 69        Rodentia
## 70        Rodentia
## 71        Rodentia
## 72        Rodentia
## 73    Soricomorpha
## 74    Artiodactyla
## 75     Monotremata
## 76        Rodentia
## 77  Perissodactyla
## 78    Afrosoricida
## 79      Scandentia
## 80         Cetacea
## 81       Carnivora
## 82       Carnivora
## 83       Carnivora

Remove missing values with na.omit()

#add na.omit()
msleep_num <- na.omit(msleep_num)

Principal components analysis - base R

Principal component analysis is typically done using base R functions using the prcomp() function.

Run the PCA.

# add prcomp()
pca.out <- prcomp(msleep_num[,-c(7,8)], scale = TRUE)

Plot the biplot with biplot()

# add biplot()
biplot(pca.out)

Principal components analysis - vegan

The base R PCA output isn’t very flexible. The R package vegan is has a function rda() which does PCA and has many more nice features.

For another example see https://rpubs.com/brouwern/veganpca

Run the PCA using vegan::rda()

rda.out <- vegan::rda(msleep_num[,-c(7,8)], scale = TRUE)

Extract what are known as the PCA scores using the scores() function.

# add scores()
rda_scores <- scores(rda.out)

vegan let’s you plot a scatterplot of the scores or a full biplot with the arrows (vectors). The displays the 2D scatterplot plot without the arrows. For more info on what this code does, see the RPubs document linked above.

# add biplot()
biplot(rda.out, 
       display = "sites")

vegan has some nice tools for grouping things.

In this dataset I don’t expect there to be any interesting groups, but I’ll check anyway.

PCA is an exploratory method. First I’ll see if there are any groupings based on diet (“vore”). Not really.

#make the plot with biplot()
# add biplot()
biplot(rda.out, display = "sites")

#add "hulls" with the ordihull()function
# add ordihull()
ordihull(rda.out,
         group = msleep_num$vore,
         col = 1:7,
         lty = 1:7,
         lwd = c(3,6))

Task

Copy the previous code chunk and change the group so that the taxonomic order is plotted (column “order”). Upload the image to this assignment. Consider if there are any meaningful groups.

## add your code here
#make the plot with biplot()
# add biplot()
biplot(rda.out, display = "sites")

#add "hulls" with the ordihull()function
# add ordihull()
ordihull(rda.out,
         group = msleep_num$order,
         col = 1:7,
         lty = 1:7,
         lwd = c(3,6))