Dissertation Research: The purpose of this RPub is to aid in partitioning the data set for Fst tests. The vegan-generated PCs may aid in grouping samples according to climate similarity for Fst testing. For example, I could partition samples into high,moderate, and low values on PC1,PC2,PC3 etc.
PCs were generated with climate data from WorldClim, Bioclim variables. Fick, S.E. and R.J. Hijmans, 2017. Worldclim 2: New 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology.
The Cedrela climate space was estimated with GBIF data. GBIF.org (4th August 2017) GBIF Occurrence Download http://doi.org/10.15468/dl.tl0eqz
Data included:
Latitude and Longitude were included in the variable to produce the PCA.
These plots also show the portion of Cedrela spp climate space that is captured by out Cedrela spp samples.
Options:
A: Partition my samples according to their position on one or more of the climate PCs for Fst tests as described above.
B: Partition samples according to one or more climate variable (forming three ~balanced groups based on Temperature etc.)
These partitions would be in addition to species and regional groups.
library(raster)
library(sp)
library(ggplot2)
library(vegan)
library(gridExtra)
#load data from worldclim
r <- getData("worldclim",var="bio",res=10)
names(r)<-c("a.m.t", "m.d.r", "iso", "t.s", "max.t.wm", "min.t.cm", "t.a.r", "m.t.wetq", "m.t.dryq", "m.t.warmq", "m.t.coldq", "a.pre", "pre.wetm", "pre.drym", "pre.s", "pre.wetq", "pre.dryq", "pre.warmq", "pre.coldq")
#load my data and get climate data for these locations
all.spp<-read.csv("20170803_mapping.csv",header=1)
names(all.spp)
## [1] "Sample.ID" "Origin" "simplified.name"
## [4] "Spp" "Tube_ID" "ElevationDisplay"
## [7] "LatitudeDecimal" "LongitudeDecimal" "Accession.Numbers"
## [10] "altitude..ft."
coords<-all.spp[c(8,7)]
samps<-all.spp[c(1:6,9:10)]
points <- SpatialPoints(coords, proj4string = r@crs)
values <- extract(r,points)
my.data <- cbind.data.frame(coordinates(points),values)
my.data<-cbind.data.frame(samps, my.data)
#faster to load the dataset
df<-read.csv("20170804_ced_spp_gbif_clim.csv",header=1)
#combine the gbif data with my data
group<-rep("gbif",3754)
df<-cbind.data.frame(df,group)
names(my.data)[4]<-"group"
mine<-my.data[c(4,9,10,8,11:29)]
names(mine)[2]<-"long"
names(mine)[3]<-"lat"
names(mine)[4]<-"alt"
gbif<-df[c(27,6,7,5,8:26)]
comb.clim<-rbind.data.frame(gbif,mine)
summary(comb.clim) #make sure there are no NAs
## group long lat alt
## gbif:3754 Min. :-110.95 Min. :-35.3869 Min. : 0.0
## CEAN: 18 1st Qu.: -89.63 1st Qu.:-13.5833 1st Qu.: 576.4
## CEFI: 39 Median : -76.45 Median : 4.0840 Median : 1428.6
## CEMO: 22 Mean : -75.45 Mean : 0.9195 Mean : 2477.3
## CENE: 8 3rd Qu.: -63.74 3rd Qu.: 17.1204 3rd Qu.: 3673.4
## CEOD: 97 Max. : 32.57 Max. : 26.8167 Max. :16057.4
## CESA: 8
## a.m.t m.d.r iso t.s
## Min. : 48 Min. : 60.0 Min. :44.00 Min. : 129
## 1st Qu.:195 1st Qu.: 99.0 1st Qu.:63.00 1st Qu.: 626
## Median :231 Median :111.0 Median :70.00 Median :1287
## Mean :221 Mean :114.6 Mean :70.22 Mean :1494
## 3rd Qu.:255 3rd Qu.:128.0 3rd Qu.:77.00 3rd Qu.:2114
## Max. :285 Max. :184.0 Max. :93.00 Max. :5744
##
## max.t.wm min.t.cm t.a.r m.t.wetq
## Min. : 94.0 Min. :-73.0 Min. : 74.0 Min. : 45.0
## 1st Qu.:282.0 1st Qu.:100.0 1st Qu.:134.0 1st Qu.:209.0
## Median :314.0 Median :145.0 Median :164.0 Median :242.0
## Mean :302.3 Mean :136.1 Mean :166.2 Mean :229.5
## 3rd Qu.:331.0 3rd Qu.:174.0 3rd Qu.:195.0 3rd Qu.:260.0
## Max. :403.0 Max. :230.0 Max. :322.0 Max. :296.0
##
## m.t.dryq m.t.warmq m.t.coldq a.pre
## Min. : 34.0 Min. : 50.0 Min. : 30.0 Min. : 223
## 1st Qu.:174.0 1st Qu.:216.0 1st Qu.:166.0 1st Qu.:1157
## Median :218.0 Median :251.0 Median :208.0 Median :1542
## Mean :209.4 Mean :237.7 Mean :200.1 Mean :1705
## 3rd Qu.:247.0 3rd Qu.:268.0 3rd Qu.:234.0 3rd Qu.:2126
## Max. :295.0 Max. :309.0 Max. :277.0 Max. :7962
##
## pre.wetm pre.drym pre.s pre.wetq
## Min. : 55.0 Min. : 0.00 Min. : 6.00 Min. : 154.0
## 1st Qu.:188.2 1st Qu.: 9.00 1st Qu.: 43.00 1st Qu.: 502.2
## Median :247.5 Median : 28.00 Median : 61.00 Median : 664.0
## Mean :271.1 Mean : 43.68 Mean : 61.24 Mean : 728.4
## 3rd Qu.:344.0 3rd Qu.: 57.00 3rd Qu.: 81.75 3rd Qu.: 934.0
## Max. :826.0 Max. :471.00 Max. :125.00 Max. :2356.0
##
## pre.dryq pre.warmq pre.coldq
## Min. : 0.0 Min. : 37.0 Min. : 3.0
## 1st Qu.: 38.0 1st Qu.: 298.0 1st Qu.: 66.0
## Median : 112.0 Median : 421.0 Median : 176.5
## Mean : 154.4 Mean : 441.4 Mean : 293.9
## 3rd Qu.: 204.0 3rd Qu.: 548.0 3rd Qu.: 394.0
## Max. :1521.0 Max. :1845.0 Max. :2125.0
##
#generate the PCA
comb.clim.pca <- rda(comb.clim[c(2:23)])
PCs<-data.frame(scores(comb.clim.pca, c(1,2,3,4,5), display=c("sites")))
gbif.mine.clim.pcs<-cbind.data.frame(comb.clim,PCs)
#plot
finch <- gbif.mine.clim.pcs[ which(gbif.mine.clim.pcs$group!='gbif'), ]
pc1.2<-ggplot(data=gbif.mine.clim.pcs,aes(PC1,PC2))+geom_hex(bins=50)+theme_bw()+
scale_fill_gradient(low = "grey", high = "black",
name="Sample\nDensity")+
geom_point(data=finch, aes(PC1,PC2, color=group))+
scale_color_brewer(palette = "Set1",
name="Species")+
labs(x="PC1",y="PC2")+
theme(panel.grid = element_blank(),
plot.title = element_text(hjust = 0),
axis.title=element_text(size=8))+ggtitle("A")
pc1.3<-ggplot(data=gbif.mine.clim.pcs,aes(PC1,PC3))+geom_hex(bins=50)+theme_bw()+
scale_fill_gradient(low = "grey", high = "black",
name="Sample\nDensity")+
geom_point(data=finch, aes(PC1,PC3, color=group))+
scale_color_brewer(palette = "Set1",
name="Species")+
labs(x="PC1",y="PC3")+
theme(panel.grid = element_blank(),
plot.title = element_text(hjust = 0),
axis.title=element_text(size=8))+ggtitle("B")
pc2.3<-ggplot(data=gbif.mine.clim.pcs,aes(PC2,PC3))+geom_hex(bins=50)+theme_bw()+
scale_fill_gradient(low = "grey", high = "black",
name="Sample\nDensity")+
geom_point(data=finch, aes(PC2,PC3, color=group))+
scale_color_brewer(palette = "Set1",
name="Species")+
labs(x="PC2",y="PC3")+
theme(panel.grid = element_blank(),
plot.title = element_text(hjust = 0),
axis.title=element_text(size=8))+ggtitle("C")
pc1.2
pc1.3
pc2.3