Read in the packages. The working directory is wherever the R Notebook is located. Note that RStoolbox has some dependencies that are kind of difficult to install on a Mac. The package makes running a PCA on total climate space super easy and has a lot of other functionality, so I think it’s worth the headache.
packages <- c("raster", "ggbiplot", "tidyverse", "sf", "RStoolbox", "leaflet", "maps", "plotly") #RStoolbox has some dependencies like openMP that can be difficult to compile on a Mac (needed for the dependent package "caret"). If you have High Sierra OS or newer, search for instructions specific to your OS- it's a lot easier than older OS's.
lapply(packages, require, character.only = TRUE)
Read in the spreadsheet and take a look at the data.
###read in spreadsheet
loc <- read.csv("worksheets/all-species.csv")
#make locality shape file and assign WGS coord system
coord.points <- st_as_sf(loc, coords = c("longitude", "latitude"),
crs = 4326, agr = "constant")
#Take a gander at the head and structure of the data
head(loc)
str(loc)
'data.frame': 714 obs. of 9 variables:
$ genus : Factor w/ 9 levels "acanthoxyla",..: 1 1 1 1 1 1 1 1 1 1 ...
$ species : Factor w/ 17 levels "acornutus","annulata",..: 16 16 16 16 16 16 16 16 16 16 ...
$ id : Factor w/ 714 levels "acan-sp_1","acan-sp_10",..: 1 46 57 68 79 90 101 112 123 2 ...
$ latitude : num -35.3 -35.7 -35.7 -35.3 -35.9 ...
$ longitude: num 173 174 174 174 174 ...
$ locality : Factor w/ 608 levels "0.5 km S of carpark, Puhipuhi Scenic Reserve, Kaikoura",..: 585 578 539 379 98 104 106 15 172 30 ...
$ X : logi NA NA NA NA NA NA ...
$ X.1 : logi NA NA NA NA NA NA ...
$ X.2 : logi NA NA NA NA NA NA ...
Plot a leaflet map of the localities. The leaflet map is interactive. You can click on the localities and a flag with some metadata will pop up!
#load a world map for plotting
w.map <- map("world", plot = FALSE)
#make color palette
pal <- colorFactor(palette = "inferno", domain = NULL)
#create popup labels for points
popup.label <- paste(coord.points$genus, coord.points$species, "<br>",
coord.points$id, "<br>",
coord.points$locality)
#plot in leaflet
l <- leaflet(data = w.map) %>%
addTiles() %>%
fitBounds(lng1 = 165, lng2 = 185, lat1 = -30, lat2 = -55) %>%
addCircleMarkers(data = coord.points,
color = ~pal(genus),
radius = 3,
popup = popup.label) %>%
addLegend(data = coord.points, "bottomright", pal = pal, values = ~genus, opacity = .9)
l
Obtain the bioclim layers for analysis. I’m using all 19 for this preliminary exploration. I plotted the first bioclim just to make sure nothing seems wonky.
##get worldclim data
w <- getData("worldclim", var = "bio", res = 0.5, lon = 173, lat = -35)
#check it out
plot(w,1, xlim = c(165, 185), ylim= c(-55, -33))
This is a PCA of the climate data extracted for each locality, rather than a PCA of the total climate space.
Run the pca and check out variable loadings and proportion of variance explained by components.
#extract data from worldclim for each locality. Making this into a data frame with columns labeled so the row labeling lines up after I remove the NAs.
#extract data from worldclim for each locality.
loc.clim <- data.frame(genus = coord.points$genus,
species = coord.points$species,
id = coord.points$id,
locality = coord.points$locality,
coords = as.character(coord.points$geometry),
raster::extract(w, coord.points, method = "simple")) %>% na.omit()
#make a matrix of only bioclim values
clim.mat <- loc.clim[,grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca <- prcomp(clim.mat, scale = TRUE)
summary.pca <- summary(clim.pca) #check out the components
Two plots: One plot of the PCA colored according to genus, with convex hulls surrounding the genera. It looks like this reflects a latitudinal gradient in temperature! You can interact with the PCA plot by clicking on points to view associated metadata. You can isolate the genus you want to view by double clicking the genus in the legend! You can also remove a genus by clicking on it once. There’s some other functionality you can explore in the toolbar at the top of the plot.
#add pca results to loc.clim data frame
loc.clim <- data.frame(loc.clim, clim.pca$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1, df$PC2), ]
hulls <- plyr::ddply(loc.clim, "genus", find_hull)
#make an interactive plot in plotly
p <- ggplot(data = loc.clim, aes(x = PC1, y = PC2, col = genus, fill = genus)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "inferno") +
scale_fill_viridis_d(option = "inferno") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly <- ggplotly(p, tooltip = c("text"))
#plot biplot
b <- ggbiplot(clim.pca, groups = loc.clim$genus, alpha = .7) +
scale_color_viridis_d(option = "inferno") +
ggtitle("Biplot of extracted climate")
summary.pca
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.8205 2.5959 1.29925 1.03634 0.86397 0.75029 0.35359 0.22905 0.13868 0.11203
Proportion of Variance 0.4187 0.3547 0.08885 0.05653 0.03929 0.02963 0.00658 0.00276 0.00101 0.00066
Cumulative Proportion 0.4187 0.7734 0.86222 0.91874 0.95803 0.98766 0.99424 0.99700 0.99801 0.99867
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.09749 0.0756 0.07171 0.04711 0.03566 0.03063 0.01582 0.01221 5.341e-16
Proportion of Variance 0.00050 0.0003 0.00027 0.00012 0.00007 0.00005 0.00001 0.00001 0.000e+00
Cumulative Proportion 0.99917 0.9995 0.99975 0.99986 0.99993 0.99998 0.99999 1.00000 1.000e+00
knitr::kable(round(clim.pca$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.343 | -0.023 | -0.148 |
bio2_411 | 0.248 | 0.051 | -0.528 |
bio3_411 | -0.133 | 0.040 | -0.425 |
bio4_411 | 0.302 | 0.062 | -0.295 |
bio5_411 | -0.297 | -0.009 | -0.376 |
bio6_411 | -0.352 | -0.029 | 0.000 |
bio7_411 | 0.284 | 0.041 | -0.420 |
bio8_411 | -0.191 | 0.046 | -0.205 |
bio9_411 | -0.297 | -0.060 | 0.060 |
bio10_411 | -0.332 | -0.017 | -0.220 |
bio11_411 | -0.349 | -0.034 | -0.072 |
bio12_411 | 0.048 | -0.381 | -0.005 |
bio13_411 | 0.009 | -0.381 | -0.033 |
bio14_411 | 0.050 | -0.372 | -0.064 |
bio15_411 | -0.208 | -0.006 | -0.057 |
bio16_411 | 0.004 | -0.382 | -0.016 |
bio17_411 | 0.076 | -0.372 | -0.024 |
bio18_411 | 0.081 | -0.364 | -0.073 |
bio19_411 | -0.038 | -0.369 | 0.002 |
I conducted a PCA of the total climate space in New Zealand and extracted values for each locality.
Run the pca and check out variable loadings and proportion of variance explained by components.
#subset the world map to just New Zealand to make the analysis tractable
lims<-c(165,185, -55,-33)
submap<-crop(w,lims)
#run a scaled PCA on the climate data
pcamap<-rasterPCA(submap, spca=TRUE)
summary.pca.total <- summary(pcamap$model) #check out the components
Plot the principle components in geographic space.
#plot the PCs
plot(pcamap$map, 1:3)
Plot the first two PCs as a scatter plot. I couldn’t figure out how to plot a biplot with the data structure rasterPCA outputs. It seems like the latitudinal gradient in climate is spread across PC axes.
#extract data from pca for each locality. Making this into a data drame with columns labeled so the row labeling lines up after I remove the NAs.
loc.total.clim <- data.frame(genus = coord.points$genus,
species = coord.points$species,
id = coord.points$id,
locality = coord.points$locality,
coords = as.character(coord.points$geometry),
raster::extract(pcamap$map, coord.points, method = "simple")) %>% na.omit()
#make convex hull of genera
find.hull.total <- function(df) df[chull(df$PC1, df$PC2), ]
hulls.total <- plyr::ddply(loc.total.clim, "genus", find.hull.total)
#make an interactive plot in plotly
p.total <- ggplot(data = loc.total.clim, aes(x = PC1, y = PC2, col = genus, fill = genus)) +
geom_polygon(data = hulls.total, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords, "<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "inferno") +
scale_fill_viridis_d(option = "inferno") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.total$importance[2,1] * 100, 1),"%)"), x = paste("PC2 (", round(summary.pca.total$importance[2,2] * 100, 1),"%)")) +
theme_minimal()
Ignoring unknown aesthetics: textThe plyr::rename operation has created duplicates for the following name(s): (`x`)
p.plotly.total <- ggplotly(p.total, tooltip = c("text"))
p.plotly.total
summary.pca.total
Importance of components:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7
Standard deviation 2.9368052 2.4375805 1.33337693 1.02359474 0.89722816 0.77573444 0.336790738
Proportion of Variance 0.4539382 0.3127262 0.09357337 0.05514454 0.04236939 0.03167179 0.005969895
Cumulative Proportion 0.4539382 0.7666644 0.86023777 0.91538230 0.95775169 0.98942348 0.995393373
Comp.8 Comp.9 Comp.10 Comp.11 Comp.12 Comp.13
Standard deviation 0.195236299 0.1360079252 0.1101757666 0.0819089406 0.0635551390 0.0609975574
Proportion of Variance 0.002006169 0.0009735871 0.0006388789 0.0003531092 0.0002125924 0.0001958264
Cumulative Proportion 0.997399542 0.9983731292 0.9990120082 0.9993651173 0.9995777097 0.9997735362
Comp.14 Comp.15 Comp.16 Comp.17 Comp.18 Comp.19
Standard deviation 0.0458378300 3.345402e-02 2.790399e-02 1.307653e-02 1.152850e-02 0
Proportion of Variance 0.0001105846 5.890375e-05 4.098068e-05 8.999776e-06 6.995071e-06 0
Cumulative Proportion 0.9998841207 9.999430e-01 9.999840e-01 9.999930e-01 1.000000e+00 1
#Table of loading scores for the first 3 PCs.
knitr::kable(round(pcamap$model$loadings[,1:3],3))
Comp.1 | Comp.2 | Comp.3 | |
---|---|---|---|
bio1_411 | 0.287 | 0.191 | 0.179 |
bio2_411 | -0.112 | -0.238 | 0.512 |
bio3_411 | 0.186 | 0.057 | 0.034 |
bio4_411 | -0.189 | -0.260 | 0.352 |
bio5_411 | 0.263 | 0.136 | 0.384 |
bio6_411 | 0.282 | 0.225 | 0.047 |
bio7_411 | -0.174 | -0.241 | 0.453 |
bio8_411 | 0.189 | 0.012 | 0.292 |
bio9_411 | 0.204 | 0.219 | -0.058 |
bio10_411 | 0.283 | 0.170 | 0.256 |
bio11_411 | 0.284 | 0.215 | 0.097 |
bio12_411 | -0.247 | 0.279 | 0.065 |
bio13_411 | -0.225 | 0.302 | 0.088 |
bio14_411 | -0.244 | 0.269 | 0.098 |
bio15_411 | 0.148 | 0.156 | 0.125 |
bio16_411 | -0.222 | 0.307 | 0.074 |
bio17_411 | -0.264 | 0.249 | 0.075 |
bio18_411 | -0.262 | 0.241 | 0.114 |
bio19_411 | -0.191 | 0.325 | 0.028 |
These are PCAs of environmental space for species within genera. Each climate PCA is of localities for a single genus, colored by species. I’m doing this even for genera with one species, so it’s easy to see if certain localities group together.
#make a matrix of only bioclim values
clim.mat.acan <- loc.clim[loc.clim$genus == "acanthoxyla",grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.acan <- prcomp(clim.mat.acan, scale = TRUE)
summary.pca.acan <- summary(clim.pca.acan) #check out the components
#add pca results to loc.clim data frame
loc.clim.acan <- data.frame(loc.clim[loc.clim$genus == "acanthoxyla",], clim.pca.acan$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.acan, "genus", find_hull)
#make an interactive plot in plotly
p.acan <- ggplot(data = loc.clim.acan, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "inferno") +
scale_fill_viridis_d(option = "inferno") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.acan <- ggplotly(p.acan, tooltip = c("text"))
#plot biplot
b.acan <- ggbiplot(clim.pca.acan, groups = loc.clim.acan$species, alpha = .7) +
scale_color_viridis_d(option = "inferno") +
ggtitle("Biplot of extracted climate")
p.plotly.acan
b.acan
summary.pca.acan
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.8976 2.4081 1.4499 1.03416 0.99391 0.64837 0.37316 0.17888 0.15065 0.10865
Proportion of Variance 0.4419 0.3052 0.1106 0.05629 0.05199 0.02213 0.00733 0.00168 0.00119 0.00062
Cumulative Proportion 0.4419 0.7471 0.8578 0.91404 0.96603 0.98816 0.99549 0.99717 0.99836 0.99899
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.07958 0.07073 0.05713 0.05208 0.03016 0.02200 0.01953 0.01351 1.003e-16
Proportion of Variance 0.00033 0.00026 0.00017 0.00014 0.00005 0.00003 0.00002 0.00001 0.000e+00
Cumulative Proportion 0.99932 0.99958 0.99975 0.99990 0.99994 0.99997 0.99999 1.00000 1.000e+00
knitr::kable(round(clim.pca.acan$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.220 | -0.303 | 0.131 |
bio2_411 | 0.191 | 0.155 | 0.499 |
bio3_411 | -0.062 | 0.141 | 0.324 |
bio4_411 | 0.244 | 0.106 | 0.354 |
bio5_411 | -0.152 | -0.279 | 0.381 |
bio6_411 | -0.256 | -0.274 | -0.004 |
bio7_411 | 0.232 | 0.117 | 0.434 |
bio8_411 | -0.027 | -0.275 | -0.091 |
bio9_411 | -0.251 | -0.172 | 0.116 |
bio10_411 | -0.197 | -0.308 | 0.204 |
bio11_411 | -0.243 | -0.289 | 0.059 |
bio12_411 | 0.284 | -0.232 | -0.046 |
bio13_411 | 0.268 | -0.253 | 0.009 |
bio14_411 | 0.289 | -0.219 | -0.049 |
bio15_411 | -0.146 | -0.087 | 0.300 |
bio16_411 | 0.264 | -0.256 | -0.004 |
bio17_411 | 0.293 | -0.211 | -0.065 |
bio18_411 | 0.295 | -0.205 | -0.054 |
bio19_411 | 0.220 | -0.277 | 0.039 |
#make a matrix of only bioclim values
clim.mat.argo <- loc.clim[loc.clim$genus == "argosarchus", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.argo <- prcomp(clim.mat.argo, scale = TRUE)
summary.pca.argo <- summary(clim.pca.argo) #check out the components
#add pca results to loc.clim data frame
loc.clim.argo <- data.frame(loc.clim[loc.clim$genus == "argosarchus",], clim.pca.argo$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.argo, "genus", find_hull)
#make an interactive plot in plotly
p.argo <- ggplot(data = loc.clim.argo, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "inferno") +
scale_fill_viridis_d(option = "inferno") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.argo$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.argo$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.argo <- ggplotly(p.argo, tooltip = c("text"))
#plot biplot
b.argo <- ggbiplot(clim.pca.argo, groups = loc.clim.argo$species, alpha = .7) +
scale_color_viridis_d(option = "inferno") +
ggtitle("Biplot of extracted climate")
p.plotly.argo
b.argo
summary.pca.argo
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.6107 2.5769 1.5880 1.08155 0.94245 0.84133 0.3698 0.2092 0.1509 0.13650
Proportion of Variance 0.3587 0.3495 0.1327 0.06157 0.04675 0.03725 0.0072 0.0023 0.0012 0.00098
Cumulative Proportion 0.3587 0.7082 0.8409 0.90251 0.94926 0.98652 0.9937 0.9960 0.9972 0.99820
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.12389 0.08684 0.07717 0.05097 0.03317 0.02797 0.02345 0.02016 2.68e-16
Proportion of Variance 0.00081 0.00040 0.00031 0.00014 0.00006 0.00004 0.00003 0.00002 0.00e+00
Cumulative Proportion 0.99900 0.99940 0.99971 0.99985 0.99991 0.99995 0.99998 1.00000 1.00e+00
knitr::kable(round(clim.pca.argo$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | 0.253 | -0.268 | 0.137 |
bio2_411 | -0.139 | 0.198 | 0.487 |
bio3_411 | 0.018 | 0.020 | 0.421 |
bio4_411 | -0.210 | 0.213 | 0.300 |
bio5_411 | 0.192 | -0.174 | 0.437 |
bio6_411 | 0.254 | -0.286 | -0.040 |
bio7_411 | -0.166 | 0.222 | 0.418 |
bio8_411 | -0.038 | -0.077 | -0.082 |
bio9_411 | 0.223 | -0.227 | 0.115 |
bio10_411 | 0.239 | -0.247 | 0.233 |
bio11_411 | 0.265 | -0.272 | 0.049 |
bio12_411 | 0.281 | 0.261 | -0.038 |
bio13_411 | 0.303 | 0.232 | 0.003 |
bio14_411 | 0.246 | 0.289 | -0.045 |
bio15_411 | 0.133 | -0.160 | 0.152 |
bio16_411 | 0.307 | 0.228 | -0.007 |
bio17_411 | 0.248 | 0.289 | -0.046 |
bio18_411 | 0.225 | 0.305 | -0.047 |
bio19_411 | 0.331 | 0.173 | 0.032 |
#make a matrix of only bioclim values
clim.mat.aste <- loc.clim[loc.clim$genus == "asteliaphasma", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.aste <- prcomp(clim.mat.aste, scale = TRUE)
summary.pca.aste <- summary(clim.pca.aste) #check out the components
#add pca results to loc.clim data frame
loc.clim.aste <- data.frame(loc.clim[loc.clim$genus == "asteliaphasma",], clim.pca.aste$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.aste, "genus", find_hull)
#make an interactive plot in plotly
p.aste <- ggplot(data = loc.clim.aste, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.aste$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.aste$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.aste <- ggplotly(p.aste, tooltip = c("text"))
#plot biplot
b.aste <- ggbiplot(clim.pca.aste, groups = loc.clim.aste$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.aste
b.aste
summary.pca.aste
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 3.5289 1.9449 1.08847 0.93035 0.68419 0.40024 0.19212 0.13894 0.10016 0.09172
Proportion of Variance 0.6554 0.1991 0.06236 0.04556 0.02464 0.00843 0.00194 0.00102 0.00053 0.00044
Cumulative Proportion 0.6554 0.8545 0.91687 0.96242 0.98706 0.99549 0.99743 0.99845 0.99898 0.99942
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.05687 0.05344 0.04780 0.03677 0.02653 0.02020 0.01274 9.185e-16 1.8e-16
Proportion of Variance 0.00017 0.00015 0.00012 0.00007 0.00004 0.00002 0.00001 0.000e+00 0.0e+00
Cumulative Proportion 0.99959 0.99974 0.99986 0.99993 0.99997 0.99999 1.00000 1.000e+00 1.0e+00
knitr::kable(round(clim.pca.aste$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.264 | -0.159 | -0.068 |
bio2_411 | 0.226 | 0.164 | -0.445 |
bio3_411 | -0.069 | -0.207 | -0.688 |
bio4_411 | 0.243 | 0.193 | -0.241 |
bio5_411 | -0.245 | -0.123 | -0.322 |
bio6_411 | -0.261 | -0.190 | 0.087 |
bio7_411 | 0.234 | 0.203 | -0.314 |
bio8_411 | -0.264 | -0.175 | 0.009 |
bio9_411 | -0.225 | -0.102 | -0.134 |
bio10_411 | -0.260 | -0.151 | -0.114 |
bio11_411 | -0.264 | -0.177 | -0.001 |
bio12_411 | 0.239 | -0.272 | 0.017 |
bio13_411 | 0.203 | -0.349 | -0.021 |
bio14_411 | 0.230 | -0.260 | 0.112 |
bio15_411 | -0.148 | -0.237 | -0.093 |
bio16_411 | 0.210 | -0.341 | 0.022 |
bio17_411 | 0.246 | -0.227 | 0.041 |
bio18_411 | 0.229 | -0.275 | -0.022 |
bio19_411 | 0.210 | -0.341 | 0.021 |
#make a matrix of only bioclim values
clim.mat.clita <- loc.clim[loc.clim$genus == "clitarchus", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.clita <- prcomp(clim.mat.clita, scale = TRUE)
summary.pca.clita <- summary(clim.pca.clita) #check out the components
#add pca results to loc.clim data frame
loc.clim.clita <- data.frame(loc.clim[loc.clim$genus == "clitarchus",], clim.pca.clita$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.clita, "species", find_hull)
#make an interactive plot in plotly
p.clita <- ggplot(data = loc.clim.clita, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.clita$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.clita$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.clita <- ggplotly(p.clita, tooltip = c("text"))
#plot biplot
b.clita <- ggbiplot(clim.pca.clita, groups = loc.clim.clita$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.clita
b.clita
summary.pca.clita
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.8287 2.6276 1.32815 1.1156 0.73394 0.53892 0.3776 0.22177 0.14594 0.13870
Proportion of Variance 0.4211 0.3634 0.09284 0.0655 0.02835 0.01529 0.0075 0.00259 0.00112 0.00101
Cumulative Proportion 0.4211 0.7845 0.87737 0.9429 0.97121 0.98650 0.9940 0.99659 0.99771 0.99873
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.10155 0.07717 0.06308 0.04870 0.02999 0.01809 0.01515 0.01187 4.963e-16
Proportion of Variance 0.00054 0.00031 0.00021 0.00012 0.00005 0.00002 0.00001 0.00001 0.000e+00
Cumulative Proportion 0.99927 0.99958 0.99979 0.99992 0.99996 0.99998 0.99999 1.00000 1.000e+00
knitr::kable(round(clim.pca.clita$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.317 | -0.142 | -0.099 |
bio2_411 | 0.129 | 0.256 | -0.457 |
bio3_411 | -0.166 | 0.081 | -0.530 |
bio4_411 | 0.252 | 0.216 | -0.236 |
bio5_411 | -0.305 | -0.055 | -0.299 |
bio6_411 | -0.304 | -0.191 | 0.037 |
bio7_411 | 0.182 | 0.251 | -0.355 |
bio8_411 | -0.189 | -0.209 | 0.105 |
bio9_411 | -0.280 | -0.066 | -0.167 |
bio10_411 | -0.314 | -0.119 | -0.152 |
bio11_411 | -0.318 | -0.157 | -0.040 |
bio12_411 | -0.176 | 0.322 | 0.126 |
bio13_411 | -0.207 | 0.294 | 0.060 |
bio14_411 | -0.180 | 0.303 | 0.161 |
bio15_411 | -0.140 | -0.061 | -0.250 |
bio16_411 | -0.211 | 0.293 | 0.078 |
bio17_411 | -0.145 | 0.332 | 0.151 |
bio18_411 | -0.149 | 0.325 | 0.149 |
bio19_411 | -0.216 | 0.289 | 0.067 |
#make a matrix of only bioclim values
clim.mat.micra <- loc.clim[loc.clim$genus == "micrarchus", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.micra <- prcomp(clim.mat.micra, scale = TRUE)
summary.pca.micra <- summary(clim.pca.micra) #check out the components
#add pca results to loc.clim data frame
loc.clim.micra <- data.frame(loc.clim[loc.clim$genus == "micrarchus",], clim.pca.micra$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.micra, "species", find_hull)
#make an interactive plot in plotly
p.micra <- ggplot(data = loc.clim.micra, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.micra$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.micra$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.micra <- ggplotly(p.micra, tooltip = c("text"))
#plot biplot
b.micra <- ggbiplot(clim.pca.micra, groups = loc.clim.micra$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.micra
b.micra
summary.pca.micra
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 3.4602 1.8070 1.4755 1.04628 0.47265 0.37427 0.28044 0.13523 0.12413 0.08040
Proportion of Variance 0.6302 0.1719 0.1146 0.05762 0.01176 0.00737 0.00414 0.00096 0.00081 0.00034
Cumulative Proportion 0.6302 0.8020 0.9166 0.97422 0.98598 0.99335 0.99749 0.99846 0.99927 0.99961
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.04868 0.04396 0.03954 0.02696 0.01967 0.01488 0.01171 0.01059 5.976e-17
Proportion of Variance 0.00012 0.00010 0.00008 0.00004 0.00002 0.00001 0.00001 0.00001 0.000e+00
Cumulative Proportion 0.99973 0.99983 0.99992 0.99995 0.99998 0.99999 0.99999 1.00000 1.000e+00
knitr::kable(round(clim.pca.micra$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | 0.265 | 0.140 | 0.187 |
bio2_411 | 0.044 | -0.384 | 0.474 |
bio3_411 | 0.215 | -0.052 | 0.376 |
bio4_411 | -0.145 | -0.454 | 0.113 |
bio5_411 | 0.264 | 0.036 | 0.260 |
bio6_411 | 0.264 | 0.203 | 0.092 |
bio7_411 | -0.069 | -0.439 | 0.365 |
bio8_411 | 0.183 | 0.271 | 0.080 |
bio9_411 | 0.265 | 0.137 | 0.127 |
bio10_411 | 0.266 | 0.105 | 0.207 |
bio11_411 | 0.264 | 0.171 | 0.164 |
bio12_411 | -0.259 | 0.189 | 0.187 |
bio13_411 | -0.253 | 0.205 | 0.199 |
bio14_411 | -0.256 | 0.177 | 0.201 |
bio15_411 | 0.130 | 0.077 | 0.062 |
bio16_411 | -0.252 | 0.206 | 0.204 |
bio17_411 | -0.256 | 0.184 | 0.209 |
bio18_411 | -0.255 | 0.186 | 0.211 |
bio19_411 | -0.254 | 0.175 | 0.189 |
#make a matrix of only bioclim values
clim.mat.nive <- loc.clim[loc.clim$genus == "niveaphasma", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.nive <- prcomp(clim.mat.nive, scale = TRUE)
summary.pca.nive <- summary(clim.pca.nive) #check out the components
#add pca results to loc.clim data frame
loc.clim.nive <- data.frame(loc.clim[loc.clim$genus == "niveaphasma",], clim.pca.nive$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.nive, "species", find_hull)
#make an interactive plot in plotly
p.nive <- ggplot(data = loc.clim.nive, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.nive$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.nive$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.nive <- ggplotly(p.nive, tooltip = c("text"))
#plot biplot
b.nive <- ggbiplot(clim.pca.nive, groups = loc.clim.nive$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.nive
b.nive
summary.pca.nive
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.9075 2.3439 1.6667 1.03074 0.77498 0.58305 0.48456 0.11336 0.09999 0.07972
Proportion of Variance 0.4449 0.2892 0.1462 0.05592 0.03161 0.01789 0.01236 0.00068 0.00053 0.00033
Cumulative Proportion 0.4449 0.7341 0.8803 0.93621 0.96782 0.98571 0.99807 0.99874 0.99927 0.99960
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.05147 0.04226 0.03754 0.02765 0.02184 0.01340 0.01209 0.01062 6.88e-16
Proportion of Variance 0.00014 0.00009 0.00007 0.00004 0.00003 0.00001 0.00001 0.00001 0.00e+00
Cumulative Proportion 0.99974 0.99984 0.99991 0.99995 0.99998 0.99999 0.99999 1.00000 1.00e+00
knitr::kable(round(clim.pca.nive$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.212 | -0.286 | 0.234 |
bio2_411 | -0.053 | 0.355 | 0.221 |
bio3_411 | -0.050 | -0.292 | -0.117 |
bio4_411 | -0.042 | 0.388 | 0.209 |
bio5_411 | -0.218 | 0.082 | 0.424 |
bio6_411 | -0.139 | -0.385 | 0.066 |
bio7_411 | -0.027 | 0.387 | 0.230 |
bio8_411 | -0.218 | -0.008 | 0.258 |
bio9_411 | -0.002 | -0.238 | 0.191 |
bio10_411 | -0.245 | -0.148 | 0.347 |
bio11_411 | -0.155 | -0.369 | 0.118 |
bio12_411 | 0.322 | -0.071 | 0.184 |
bio13_411 | 0.314 | -0.077 | 0.217 |
bio14_411 | 0.329 | -0.061 | 0.130 |
bio15_411 | -0.150 | 0.011 | 0.384 |
bio16_411 | 0.315 | -0.087 | 0.205 |
bio17_411 | 0.330 | -0.054 | 0.145 |
bio18_411 | 0.319 | -0.067 | 0.185 |
bio19_411 | 0.326 | -0.079 | 0.148 |
#make a matrix of only bioclim values
clim.mat.spin <- loc.clim[loc.clim$genus == "spinotectarchus", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.spin <- prcomp(clim.mat.spin, scale = TRUE)
summary.pca.spin <- summary(clim.pca.spin) #check out the components
#add pca results to loc.clim data frame
loc.clim.spin <- data.frame(loc.clim[loc.clim$genus == "spinotectarchus",], clim.pca.spin$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.spin, "species", find_hull)
#make an interactive plot in plotly
p.spin <- ggplot(data = loc.clim.spin, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.spin$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.spin$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.spin <- ggplotly(p.spin, tooltip = c("text"))
#plot biplot
b.spin <- ggbiplot(clim.pca.spin, groups = loc.clim.spin$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.spin
b.spin
summary.pca.spin
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 3.6740 1.7717 1.04961 0.74836 0.67601 0.41693 0.17179 0.13411 0.09313 0.07134
Proportion of Variance 0.7105 0.1652 0.05798 0.02948 0.02405 0.00915 0.00155 0.00095 0.00046 0.00027
Cumulative Proportion 0.7105 0.8757 0.93363 0.96311 0.98716 0.99631 0.99786 0.99881 0.99927 0.99954
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.05938 0.05019 0.03604 0.02482 0.02193 0.01971 6.536e-16 1.269e-16 3.144e-18
Proportion of Variance 0.00019 0.00013 0.00007 0.00003 0.00003 0.00002 0.000e+00 0.000e+00 0.000e+00
Cumulative Proportion 0.99972 0.99985 0.99992 0.99995 0.99998 1.00000 1.000e+00 1.000e+00 1.000e+00
knitr::kable(round(clim.pca.spin$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | -0.257 | -0.130 | -0.194 |
bio2_411 | 0.230 | 0.165 | -0.351 |
bio3_411 | -0.106 | -0.360 | -0.197 |
bio4_411 | 0.230 | 0.233 | -0.239 |
bio5_411 | -0.239 | -0.097 | -0.390 |
bio6_411 | -0.254 | -0.197 | 0.014 |
bio7_411 | 0.228 | 0.235 | -0.273 |
bio8_411 | -0.258 | -0.165 | -0.104 |
bio9_411 | -0.215 | -0.014 | -0.395 |
bio10_411 | -0.252 | -0.109 | -0.265 |
bio11_411 | -0.258 | -0.165 | -0.104 |
bio12_411 | 0.239 | -0.264 | -0.037 |
bio13_411 | 0.220 | -0.323 | 0.010 |
bio14_411 | 0.240 | -0.237 | -0.137 |
bio15_411 | -0.152 | -0.249 | 0.447 |
bio16_411 | 0.221 | -0.322 | 0.029 |
bio17_411 | 0.247 | -0.211 | -0.129 |
bio18_411 | 0.238 | -0.241 | -0.172 |
bio19_411 | 0.221 | -0.322 | 0.029 |
#make a matrix of only bioclim values
clim.mat.tect <- loc.clim[loc.clim$genus == "tectarchus", grep("bio", names(loc.clim))] %>% as.matrix()
#run pca on climate variables
clim.pca.tect <- prcomp(clim.mat.tect, scale = TRUE)
summary.pca.tect <- summary(clim.pca.tect) #check out the components
#add pca results to loc.clim data frame
loc.clim.tect <- data.frame(loc.clim[loc.clim$genus == "tectarchus",], clim.pca.tect$x)
#make convex hull of genera
find_hull <- function(df) df[chull(df$PC1.1, df$PC2.1), ]
hulls <- plyr::ddply(loc.clim.tect, "species", find_hull)
#make an interactive plot in plotly
p.tect <- ggplot(data = loc.clim.tect, aes(x = PC1.1, y = PC2.1, col = species, fill = species)) +
geom_polygon(data = hulls, alpha = 0.4) +
geom_point(alpha = .7, size = 1, aes(text = paste("ID:", id, "<br>",
genus, species, "<br>",
"Long-Lat:", coords,"<br>",
"Locality:", locality))) +
scale_color_viridis_d(option = "viridis") +
scale_fill_viridis_d(option = "viridis") +
labs(title = "PCA of extracted climate", x = paste("PC1 (", round(summary.pca.tect$importance[2,1] * 100, 1),"%)"), y = paste("PC2 (", round(summary.pca.tect$importance[2,2] * 100, 1),"%)")) + theme_minimal()
Ignoring unknown aesthetics: text
p.plotly.tect <- ggplotly(p.tect, tooltip = c("text"))
#plot biplot
b.tect <- ggbiplot(clim.pca.tect, groups = loc.clim.tect$species, alpha = .7) +
scale_color_viridis_d(option = "viridis") +
ggtitle("Biplot of extracted climate")
p.plotly.tect
b.tect
summary.pca.tect
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
Standard deviation 2.7649 2.5672 1.5947 1.13698 0.63754 0.57252 0.35428 0.1689 0.11908 0.11043
Proportion of Variance 0.4024 0.3469 0.1338 0.06804 0.02139 0.01725 0.00661 0.0015 0.00075 0.00064
Cumulative Proportion 0.4024 0.7492 0.8831 0.95112 0.97251 0.98976 0.99637 0.9979 0.99862 0.99926
PC11 PC12 PC13 PC14 PC15 PC16 PC17 PC18 PC19
Standard deviation 0.07835 0.05082 0.04736 0.04012 0.02414 0.02156 0.01715 0.01214 1.15e-16
Proportion of Variance 0.00032 0.00014 0.00012 0.00008 0.00003 0.00002 0.00002 0.00001 0.00e+00
Cumulative Proportion 0.99958 0.99972 0.99984 0.99992 0.99995 0.99998 0.99999 1.00000 1.00e+00
knitr::kable(round(clim.pca.tect$rotation[,1:3],3)) #Table of loading scores for the first 3 PCs.
PC1 | PC2 | PC3 | |
---|---|---|---|
bio1_411 | 0.309 | -0.182 | 0.117 |
bio2_411 | -0.122 | 0.137 | 0.539 |
bio3_411 | 0.159 | -0.033 | 0.464 |
bio4_411 | -0.214 | 0.247 | 0.289 |
bio5_411 | 0.273 | -0.146 | 0.318 |
bio6_411 | 0.306 | -0.201 | -0.020 |
bio7_411 | -0.175 | 0.158 | 0.465 |
bio8_411 | 0.221 | -0.137 | -0.105 |
bio9_411 | 0.268 | -0.200 | 0.100 |
bio10_411 | 0.305 | -0.167 | 0.181 |
bio11_411 | 0.304 | -0.205 | 0.060 |
bio12_411 | -0.211 | -0.315 | 0.021 |
bio13_411 | -0.209 | -0.310 | 0.040 |
bio14_411 | -0.214 | -0.304 | 0.069 |
bio15_411 | 0.058 | 0.027 | 0.082 |
bio16_411 | -0.204 | -0.316 | 0.036 |
bio17_411 | -0.216 | -0.306 | 0.038 |
bio18_411 | -0.223 | -0.299 | 0.038 |
bio19_411 | -0.197 | -0.314 | 0.061 |
Nothing. Only one locality.