Joyce Ong 29 October 2019
This is an R Markdown document containing the results for the AR1 analyses of FAO catch data within 59 LMEs for the period 1955-2014 (60 year timeseries), consisting of 3789 species, with at least 55 years of data for each species timeseries.
library(ggplot2)
library(dplyr, warn.conflicts = F, quietly = T)
## Warning: package 'dplyr' was built under R version 3.5.3
library(reshape2)
FAO catch data was taken from Sea Around Us Project, which organized the catch data into 65 LMEs globally. This was taken from Chris Free’s Github. This dataset contained only marine finfish and invertebrates. Data were filtered according to the following conditions:
Time period 1955-2014 (60 years)
Each species time series contained <1/4 NAs
Unidentified and miscellaneous groups were removed
Latitudes for each LME were taken from centroids.
Function ar() was used with order=1 to generate coefficients for each species timeseries
lme.ar.lat2<-read.csv("lme_species_lat.csv")
lme.ar.lat2<-lme.ar.lat2[,-1]
lme.ar.lat2$lat_fct<-as.factor(lme.ar.lat2$lat_dd2)
str(lme.ar.lat2)
## 'data.frame': 3789 obs. of 7 variables:
## $ species: Factor w/ 927 levels "Ablennes hians",..: 3 4 9 41 46 55 69 74 76 84 ...
## $ arcoef : num 0.627 0.7 0.787 0.777 0.784 ...
## $ lme2 : Factor w/ 59 levels "Agulhas Current",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ lat_dd : num -22.1 -22.1 -22.1 -22.1 -22.1 ...
## $ lat_fct: Factor w/ 59 levels "-46.33","-41.03",..: 9 9 9 9 9 9 9 9 9 9 ...
## $ lat_dd2: num -22.1 -22.1 -22.1 -22.1 -22.1 ...
## $ lat2 : num 486 486 486 486 486 ...
lme.mean.ar<-lme.ar.lat2 %>% group_by(lat_dd2) %>%
summarize(mean=mean(arcoef), med=median(arcoef), sd=sd(arcoef)) %>% as.data.frame()
str(lme.mean.ar)#mean, median and sd of ar1 coeff within each LME grouping
## 'data.frame': 59 obs. of 4 variables:
## $ lat_dd2: num -46.3 -41 -40.8 -35.2 -32.5 ...
## $ mean : num 0.734 0.825 0.869 0.819 0.638 ...
## $ med : num 0.786 0.885 0.881 0.884 0.67 ...
## $ sd : num 0.1569 0.1519 0.0715 0.1627 0.1881 ...
lme.mean.ar$lat_fct<-as.factor(lme.mean.ar$lat_dd2)
de1<-lme.ar.lat2 %>% group_by(lme2) %>% summarise(nsp=n_distinct(species))
head(de1)
## # A tibble: 6 x 2
## lme2 nsp
## <fct> <int>
## 1 Agulhas Current 115
## 2 Aleutian Islands 12
## 3 Arabian Sea 190
## 4 Baltic Sea 25
## 5 Barents Sea 24
## 6 Bay of Bengal 157
Graph of number of species within each LME
lmekeysub<-read.csv("lmekey.csv")
lme.nsp<-de1 %>% left_join(lmekeysub, by=c("lme2" = "name"))
## Warning: Column `lme2`/`name` joining factors with different levels,
## coercing to character vector
nsp1<-ggplot(lme.nsp, aes(lat_dd, nsp))
nsp1 + geom_point() + theme_bw()
Note, X-axis is messy because of all the factor labels for latitude.
p3<-ggplot(lme.ar.lat2, aes(lat_fct, arcoef))
p3 + geom_boxplot() + theme_bw()
p4<-ggplot(lme.ar.lat2, aes(arcoef, lat_dd2))
p4 + geom_point() + theme_bw()
p5a<-ggplot(lme.mean.ar, aes(lat_dd2, mean))
p5a + geom_point() + theme_bw()
p5b<-ggplot(lme.mean.ar, aes(lat_dd2, med))
p5b + geom_point() + theme_bw()
p5c<-ggplot(lme.mean.ar, aes(lat_dd2, sd))
p5c + geom_point() + theme_bw()
## Warning: Removed 1 rows containing missing values (geom_point).
Note that for sd plot, Canadian High artic only had 1 species time series so there is no value for sd. Also, pattern for sd is probably due to the number of species, especially since at the higher latitudes, there are much lower number of species.
m1<-lm(arcoef ~ lat_dd, data=lme.ar.lat2)
summary(m1)#low Adj Rsq, p=0.028, coef=-0.0002055
##
## Call:
## lm(formula = arcoef ~ lat_dd, data = lme.ar.lat2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.60346 -0.07375 0.05063 0.12489 0.20090
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.827e-01 2.891e-03 270.713 <2e-16 ***
## lat_dd -2.055e-04 9.353e-05 -2.197 0.0281 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1621 on 3787 degrees of freedom
## Multiple R-squared: 0.001273, Adjusted R-squared: 0.001009
## F-statistic: 4.826 on 1 and 3787 DF, p-value: 0.0281
par(mfrow=c(2,2), oma=c(0,0,1,0))
plot(m1)
Note that I tried using quadratic equations and even a gam function, but the linear model worked best.
Note, I first tried categorizing into 4 categories of tropical, subtropical, temperate and polar, as well as 7 categories of N & S tropics, N & S subtropics, N and S temperate and N polar (no data exists for S polar), however there was not much difference between these regions, see plots below. Categories were taken from this website.
lme.ar.lat2$lat_abs<-abs(lme.ar.lat2$lat_dd2)
summary(lme.ar.lat2$lat_abs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.42 11.94 24.67 26.21 37.11 80.54
lme.ar.lat2$climate<-ifelse(lme.ar.lat2$lat_abs<23.5, "tropics",
ifelse(lme.ar.lat2$lat_abs<40, "subtropics",
ifelse(lme.ar.lat2$lat_abs<60, "temperate", "polar")))
lme.ar.lat2$clim_fct<-as.factor(lme.ar.lat2$climate)
lme.ar.lat2$clim_fct<-factor(lme.ar.lat2$clim_fct, levels=c("tropics", "subtropics", "temperate", "polar"))
de2<-lme.ar.lat2 %>% group_by(clim_fct) %>% summarise(nsp=n_distinct(species))%>% as.data.frame()
de2#number of species in each climate zone
## clim_fct nsp
## 1 tropics 604
## 2 subtropics 541
## 3 temperate 353
## 4 polar 54
p7<-ggplot(lme.ar.lat2, aes(clim_fct, arcoef))
p7 + geom_boxplot() + theme_bw()#boxplot of 4 categories
Boxplots for 7 climate categories, separated by hemispheres.
lme.ar.lat2$climNS<-ifelse(lme.ar.lat2$lat_dd2<23.5 & lme.ar.lat2$lat_dd2>0, "N.trop",
ifelse(lme.ar.lat2$lat_dd2>(-23.5) & lme.ar.lat2$lat_dd2<0, "S.trop",
ifelse(lme.ar.lat2$lat_dd2>23.5 & lme.ar.lat2$lat_dd2<40, "N.subtrop",
ifelse(lme.ar.lat2$lat_dd2<(-23.5) & lme.ar.lat2$lat_dd2>(-40), "S.subtrop",
ifelse(lme.ar.lat2$lat_dd2>40 & lme.ar.lat2$lat_dd2<60, "N.temp",
ifelse(lme.ar.lat2$lat_dd2<(-40) & lme.ar.lat2$lat_dd2>(-60), "S.temp", "N.polar"))))))
lme.ar.lat2$climNS_fct<-as.factor(lme.ar.lat2$climNS)
lme.ar.lat2$climNS_fct<-factor(lme.ar.lat2$climNS_fct,
levels=c("S.temp", "S.subtrop", "S.trop",
"N.trop", "N.subtrop", "N.temp", "N.polar"))
p9<-ggplot(lme.ar.lat2, aes(climNS_fct, arcoef))
p9 + geom_boxplot() + theme_bw()#boxplot of 7 categories by N and S hemispheres
lme.ar.lat2$clim2<-ifelse(lme.ar.lat2$clim_fct=="tropics", "trop", "non-trop")
t.test(arcoef ~ clim2, data=lme.ar.lat2)#unequal variances
##
## Welch Two Sample t-test
##
## data: arcoef by clim2
## t = -2.5775, df = 3742.8, p-value = 0.009991
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.023936362 -0.003253744
## sample estimates:
## mean in group non-trop mean in group trop
## 0.7735909 0.7871859
This dataset consists of 3789 species timeseries globally, with 1803 sp timeseries in the tropics (-23.5 to 23.5 latitude), and 1986 species timeseries not in the tropics. Of these, there were 604 unique species in tropics, and 645 unique species at higher latitudes. Biologically, the difference in ar1 coeff is not that different, with tropics mean = 0.79 and non-tropics mean = 0.77.
p10<-ggplot(lme.ar.lat2, aes(clim2, arcoef))
p10 + geom_violin(draw_quantiles=0.5, scale="area") + theme_bw()