1 Introduction

For the project I am working on now, I have been playing with biogeographic regions of the world (zoogeographic regions, Wallace). This gave me the opportunity to dive into ‘sf’, which is really excellent.

All text describing the classification schemes was simply copied from the web pages where the data were found, or from the bibliography.

2 Global classification schemes

“At least 25 marine classifications have been developed for the purpose of fisheries, and environment and conservation management, and some used the term “ecosystem.” Most were based on expert opinion or ad hoc management areas, but recent studies have used data analysis to map ecosystems at a global scale. Thus, there are now global ocean ecosystem classifications that distinguish up to 28 “units” based on environmental data analysis to a depth of 5500 m. However, whether these units have unique species communities remains to be determined. It is possible that animals move between these units vertically (diel migrations) and horizontally. The increased availability of data on species distributions will enable such questions to be answered.” For a recent review, please see: Zhao, Q., & Costello, M. J. (2019). Marine Ecosystems of the World. In Encyclopedia of the World’s Biomes

“The delineation of biogeographical regions has been one of the main focuses of naturalists, and later of ecologists, since Buffon (1761) noted the contrasting mammalian faunas of the tropical Old World and New World. Today, the process of gathering localities into regions and setting biogeographical boundaries is often considered a primary step in the establishment and evaluation of conservation priorities (Whiting et al., 2000; Olson et al., 2001).” “the basic reason for delineating biogeographical units is to distinguish between regional species pools over large spatial scales; that is, to identify regions with distinct faunas or floras (Kreft & Jetz, 2010).” in Mouillot, D., De Bortoli, J., Leprieur, F., Parravicini, V., Kulbicki, M., & Bellwood, D. R. (2013). The challenge of delineating biogeographical regions: nestedness matters for Indo-Pacific coral reef fishes. Journal of Biogeography, 40(12), 2228–2237.

2.1 Ocean divisions

2.1.1 Marine Ecoregions of the World (MEOW) (Spalding et al., 2007)

These are the ones we used in the data exploration. MEOW is a biogeographic classification of the world’s coasts and shelves. It is the first ever comprehensive marine classification system with clearly defined boundaries and definitions and was developed to closely link to existing regional systems. The ecoregions nest within the broader biogeographic tiers of Realms and Provinces. MEOW represents broad-scale patterns of species and communities in the ocean, and was designed as a tool for planning conservation across a range of scales and assessing conservation efforts and gaps worldwide. The current system focuses on coast and shelf areas (as this is where the majority of human activity and conservation action is focused) and does not consider realms in pelagic or deep benthic environments. It is hoped that parallel but distinct systems for pelagic and deep benthic biotas will be devised in the near future. The project was led by The Nature Conservancy (TNC) and the World Wildlife Fund (WWF), with broad input from a working group representing key NGO, academic and intergovernmental conservation partners.

In Encyclopedia of the World’s Biomes

“In order that generally small habitats or ecological communities can be recognized for management purposes, a new publication entitled Marine Ecoregions of the World is now available (Spalding et al., 2007). The 15 authors of this comprehensive map utilized the global biogeographic arrangement of Briggs (1974, 1995) together with many additional sources. The result was a classification that generally recognized the traditional biogeographic regions (realms) and provinces but, nested within the latter, a new series of 232 ecoregions (…)” (in Briggs, J. C., & Bowen, B. W. (2012). A realignment of marine biogeographic provinces with particular reference to fish distributions. Journal of Biogeography, 39(1), 12–30.)

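library(sf)
library(ggplot2)       # ggplot(), geom_sf(), ggsave()
library(dplyr)         # %>%, group_by(), summarise()
library(RColorBrewer)  # brewer.pal(), used for the palettes further down

# Set to TRUE to save each map as a PNG
plot.save <- FALSE

# World map
library(rnaturalearth)
world_map <- rnaturalearth::ne_countries(scale = 'small', returnclass = "sf")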


# Base map
kk <- ggplot() +
  geom_sf(data = world_map, size = .2, fill = "gray80", col = "gray90") +
  theme(panel.grid.major = element_line(color = gray(0.9), linetype = "dashed", size = 0.5))
# meow
meow <- sf::read_sf("shapes/MEOW/meow_ecos.shp") # it read as #c("MEOW.ECOREGION","MEOW.PROVINCE","MEOW.REALM","MEOW.Lat_Zone")

# see map
kk+  
  geom_sf(data = meow, aes(fill = REALM), size = .2, col = 0, alpha=.3)+
  ggtitle(paste("Marine Ecoregions of the World (MEOW) (Spalding et al., 2007) - ", length(unique(meow$REALM)),"realms"))+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("meow.png")

2.1.2 Marine Ecoregions and Pelagic Provinces of the World (2007, 2012)

This dataset combines two separately published datasets: the “Marine Ecoregions Of the World” (MEOW; 2007) and the “Pelagic Provinces Of the World” (PPOW; 2012). These datasets were developed by Mark Spalding and colleagues in The Nature Conservancy. Alongside the individual authors, partners for the MEOW layer included WWF, Ramsar, WCS, and UNEP-WCMC. The ecoregions and pelagic provinces are broadly aligned with each other and are non-overlapping. The MEOW dataset shows a biogeographic classification of the world’s coastal and continental shelf waters, following a nested hierarchy of realms, provinces and ecoregions. It describes 232 ecoregions, which lie within 62 provinces and 12 large realms. The regions aim to capture generic patterns of biodiversity across habitats and taxa, with regions extending from the coast (intertidal zone) to the 200 m depth contour (extended beyond these waters out by a 5 km buffer). The PPOW dataset shows a biogeographic classification of the surface pelagic (i.e. epipelagic) waters of the world’s oceans. It describes 37 pelagic provinces of the world, nested into four broad realms. A system of seven biomes are also identified ecologically, and these are spatially disjoint but united by common abiotic conditions, thereby creating physiognomically similar communities.
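Because the hierarchy described above is nested, polygons at the finest level can be dissolved into their parent units. The following is a minimal sketch of that idea using `group_by()` and `summarise()` on an `sf` object; the four squares, their names and the `n_ecoregions` column are hypothetical and only stand in for real ecoregions.

```r
library(sf)
library(dplyr)

# Helper (hypothetical): a unit square whose lower-left corner is at (x0, 0)
unit_square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Four toy "ecoregions" in two provinces of a single realm (no CRS, planar units)
toy <- st_sf(
  ECOREGION = c("eco1", "eco2", "eco3", "eco4"),
  PROVINCE  = c("provA", "provA", "provB", "provB"),
  REALM     = rep("realm1", 4),
  geometry  = st_sfc(unit_square(0), unit_square(1),
                     unit_square(2), unit_square(3))
)

# summarise() on a grouped sf object unions the geometries within each group
provinces <- toy %>%
  group_by(REALM, PROVINCE) %>%
  summarise(n_ecoregions = n()) %>%
  ungroup()

provinces
```

Each province ends up as a single polygon covering its two squares. Grouping by `REALM` alone would dissolve one level further.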

Data

#### Add MEOW/PPOW regions information:
meowppo <- sf::read_sf("shapes/DataPack-14_001_WCMC036_MEOW_PPOW_2007_2012_v1/01_Data/WCMC-036-MEOW-PPOW-2007-2012.shp") 
names(meowppo)
## [1] "ECOREGION" "REALM"     "TYPE"      "PROVINC"   "BIOME"     "geometry"
# simplify the object to make it 'usable'
object.size(meowppo)
## 263464576 bytes
meowppo <- meowppo %>% 
  sf::st_simplify(dTolerance = 0.01 )

object.size(meowppo)
## 12054344 bytes
meowppo <- meowppo %>% 
  dplyr::group_by(PROVINC, REALM, TYPE) %>% 
  dplyr::summarise()
# plot(meowppo)
object.size(meowppo)
## 11933648 bytes
# plot
coli.fun <- colorRampPalette(RColorBrewer::brewer.pal(12, "Set3"))
kk+  
  geom_sf(data = meowppo, aes(fill = REALM, linetype = TYPE), 
          size = .2, alpha=1)+
  scale_fill_manual(values=coli.fun(16))+ 
  ggtitle(paste("Marine Ecoregions and Pelagic Provinces of the World (MEOW.PPO) - ", length(unique(meowppo$REALM)),"realms"))+
  coord_sf(expand = FALSE)+
  theme(legend.position="bottom")

if(plot.save==TRUE)
   ggsave("meowppo.png")
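The palette step above stretches a 12-colour Brewer palette to however many classes a map needs. A small sketch of how that works in isolation; `n_classes` is an illustrative value, not something taken from the shapefile.

```r
library(RColorBrewer)

# "Set3" offers at most 12 colours; colorRampPalette() returns a function
# that interpolates between them to produce any number of colours
coli.fun <- colorRampPalette(brewer.pal(12, "Set3"))

n_classes <- 37   # illustrative, e.g. the 37 pelagic provinces mentioned above
cols <- coli.fun(n_classes)

length(cols)
head(cols, 3)
```

Passing `length(unique(...))` of the mapped column instead of a fixed number keeps `scale_fill_manual()` from failing when the number of classes changes.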


# plot
coli.fun <- colorRampPalette(RColorBrewer::brewer.pal(12, "Set3"))
kk+  
  geom_sf(data = meowppo, aes(fill = PROVINC, linetype = TYPE), 
          size = .2, alpha=1)+
  scale_fill_manual(values=coli.fun(length(unique(meowppo$PROVINC))))+ 
  ggtitle(paste("Marine Ecoregions and Pelagic Provinces of the World (MEOW.PPO) - ", length(unique(meowppo$PROVINC)),"provinces"))+
  geom_sf_text(data = meowppo %>% group_by(PROVINC) %>% summarise() %>% st_centroid(), aes(label = PROVINC), colour = "grey10", size=3, check_overlap =TRUE)+
  coord_sf(expand = FALSE)+
  theme(legend.position="none")

if(plot.save==TRUE)
   ggsave("meowppo_provinc.png")

2.1.3 Biogeographic Marine Realms based on species endemicity (BMRE; Costello et al., 2017, Nature Communications)

“Marine biogeographic realms have been inferred from small groups of species in particular environments (e.g., coastal, pelagic), without a global map of realms based on statistical analysis of species across all higher taxa. Here we analyze the distribution of 65,000 species of marine animals and plants, and distinguish 30 distinct marine realms, a similar proportion per area as found for land. On average, 42% of species are unique to the realms. We reveal 18 continental-shelf and 12 offshore deep-sea realms, reflecting the wider ranges of species in the pelagic and deep-sea compared to coastal areas. The most widespread species are pelagic microscopic plankton and megafauna. Analysis of pelagic species recognizes five realms within which other realms are nested. These maps integrate the biogeography of coastal and deep-sea, pelagic and benthic environments, and show how land-barriers, salinity, depth, and environmental heterogeneity relate to the evolution of biota. The realms have applications for marine reserves, biodiversity assessments, and as an evolution relevant context for climate change studies.” (Costello et al., 2017, Nature Communications).

Data was downloaded from: https://auckland.figshare.com/articles/GIS_shape_files_of_realm_maps/5596840/1

#### Add BMRE regions information:
bmre <- sf::read_sf("shapes/MarineRealmsShapeFile/MarineRealms.shp") 
names(bmre)
## [1] "Realm"    "geometry"
# Simplify to make it more friendly
object.size(bmre)
## 23607168 bytes
bmre <- bmre %>% 
  sf::st_simplify(dTolerance = 0.01)
object.size(bmre)
## 219272 bytes
kk+  
  geom_sf(data = bmre, aes(fill = factor(Realm)), size = .2, alpha=.3)+
  ggtitle(paste("Biogeographic Marine Realms based on species endemicity (BMRE) - ", length(unique(bmre$Realm)),"realms"))+
  geom_sf_text(data = bmre, aes(label = Realm), colour = "white")+
  coord_sf(expand = FALSE)+
  theme(legend.position="none")

if(plot.save==TRUE)
   ggsave("bmre.png")
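The drop in object size after `st_simplify()` comes from removing vertices. A rough sketch of that effect on a hypothetical shape (a circle built with `st_buffer()`, with no CRS so the tolerance is in plain coordinate units) is shown here. Note that for lon/lat data, when `sf_use_s2()` is `TRUE`, `dTolerance` is interpreted in metres rather than degrees, so a value such as 0.01 may behave differently depending on the `sf` version in use.

```r
library(sf)

# A hypothetical detailed shape: a unit circle approximated by many vertices
detailed <- st_buffer(st_sfc(st_point(c(0, 0))), dist = 1, nQuadSegs = 90)

# Remove vertices that deviate less than dTolerance from the simplified outline
simplified <- st_simplify(detailed, dTolerance = 0.01)

# Count vertices before and after
n_vertices <- function(x) nrow(st_coordinates(x))
c(before = n_vertices(detailed), after = n_vertices(simplified))
```

Comparing vertex counts (or `object.size()`) before and after is a quick way to check that the chosen tolerance actually changes anything.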

2.1.4 Large Marine Ecosystems of the World (66 classes; LME)

LMEs are natural regions of ocean space encompassing coastal waters from river basins and estuaries to the seaward boundary of continental shelves and the outer margins of coastal currents. They are relatively large regions of 200,000 km² or greater, the natural boundaries of which are based on four ecological criteria: bathymetry, hydrography, productivity, and trophically related populations. The theory, measurement, and modeling relevant to monitoring the changing states of LMEs are imbedded in reports on ecosystems with multiple steady states, and on the pattern formation and spatial diffusion within ecosystems. The concept that critical processes controlling the structure and function of biological communities can best be addressed on a regional basis has been applied to the ocean by using LMEs as the distinct units for marine resources assessment, monitoring, and management.

#### Add LME regions information:
lme <- sf::read_sf("shapes/LME66/LMEs66.shp")
names(lme)
##  [1] "OBJECTID"   "LME_NUMBER" "LME_NAME"   "GROUPING"   "ARCTIC"    
##  [6] "USLMES"     "Shape_Leng" "Shape_Area" "SUM_GIS_KM" "geometry"
# simplify the object to make it 'usable'
lme <- lme %>% 
  sf::st_simplify(dTolerance = 0.01)

# plot
kk+  
  geom_sf(data = lme, aes(fill = LME_NAME), size = .2, col = 0, alpha=.3)+
  ggtitle(paste("LME - Large Marine Ecosystems of the World - ", length(unique(lme$LME_NAME)),"ecosystems"))+
  geom_sf_text(data = lme, aes(label = LME_NAME), colour = "grey40", check_overlap =TRUE)+
  coord_sf(expand = FALSE)+theme(legend.position="none")

if(plot.save==TRUE)
   ggsave("lme.png")

2.1.5 Longhurst Provinces

The dataset represents the division of the world oceans into provinces as defined by Longhurst (1995; 1998; 2006). The division has been based on the prevailing role of physical forcing as a regulator of phytoplankton distribution. The dataset contains the initial static boundaries developed at the Bedford Institute of Oceanography, Canada. Note that the boundaries of these provinces are not fixed in time and space, but are dynamic and move under seasonal and interannual changes in physical forcing. At the first level of reduction, Longhurst recognised four principal biomes: the Polar biome, the Westerlies biome, the Trade winds biome, and the Coastal biome. These four biomes are recognised in every major ocean basin. At the next level of reduction, the ocean basins are divided into provinces, roughly ten for each basin. These regions provide a template for data analysis or for making parameter assignments on a global scale. Please refer to Longhurst’s publications when using these shapefiles.

A summary table has been prepared by Mathias Taeger and David Lazarus, Museum für Naturkunde, Berlin (2010-03-26). This table makes it easier to relate the classification of Longhurst to the original quantitative parameters used to create it. Productivity values are from the table in Longhurst (1995); chlorophyll values, photic depth and mixed layer depth originate from graphs in Longhurst (1998). The sea temperatures at 0 and 50 m are from the World Ocean Atlas (2005); average values were calculated in ArcGIS. Each parameter value was set into 5 equal intervals.
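As an illustration of binning a parameter into 5 equal intervals, here is a small sketch with base R's `cut()`. It assumes "equal" means equal-width (the text does not say whether equal-width or equal-count bins were used), and the temperature values are made up.

```r
# Hypothetical sea-surface temperatures (°C) for six provinces
sst <- c(-1.2, 3.5, 9.8, 15.1, 22.4, 28.7)

# Giving cut() a single number of breaks splits the data range
# into that many equal-width bins
sst_class <- cut(sst, breaks = 5, labels = 1:5)

data.frame(sst, class = as.integer(sst_class))
```

For equal-count bins, the breaks could instead come from `quantile(sst, probs = seq(0, 1, 0.2))`.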

Data from marine regions site

#### Add longhurst regions information:
longhurst <- sf::read_sf("shapes/longhurst_v4_2010/Longhurst_world_v4_2010.shp")
names(longhurst)
## [1] "ProvCode"  "ProvDescr" "geometry"
head(longhurst)
## Simple feature collection with 6 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -180 ymin: 25.5 xmax: 180 ymax: 90
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## # A tibble: 6 x 3
##   ProvCode ProvDescr                                               geometry
##   <chr>    <chr>                                        <MULTIPOLYGON [°]>
## 1 BPLR     Polar - Boreal Polar Pro~ (((-161.1843 63.5, -161.5 63.5, -161.~
## 2 ARCT     Polar - Atlantic Arctic ~ (((-21.51305 64.64409, -21.55945 64.6~
## 3 SARC     Polar - Atlantic Subarct~ (((11.26472 63.96082, 11.09548 63.886~
## 4 NADR     Westerlies - N. Atlantic~ (((-11.5 57.5, -11.5 56.5, -11.5 55.5~
## 5 GFST     Westerlies - Gulf Stream~ (((-43.5 43.5, -43.5 42.5, -43.5 41.5~
## 6 NASW     Westerlies - N. Atlantic~ (((-39.5 25.5, -40.5 25.5, -41.5 25.5~
# simplify the object to make it 'usable'
longhurst <- longhurst %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(ProvCode,ProvDescr) %>% 
  dplyr::summarise()
# plot(longhurst)

# plot
kk+  
  geom_sf(data = longhurst, aes(fill = ProvCode), size = .2, col = "grey50", alpha=.4)+
  ggtitle(paste("Longhurst Biogeochemical Provinces -", length(unique(longhurst$ProvCode)),"provinces"))+
  theme(legend.position="none")+
  geom_sf_text(data = longhurst %>% group_by(ProvDescr) %>% summarize(n()), aes(label = ProvDescr), colour = "grey20", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("longhurst.png")

2.1.6 Pelagic Provinces of the World (PPOW)

The Pelagic Provinces of the World (PPOW) dataset shows a biogeographic classification of the surface pelagic (i.e. epipelagic) waters of the world’s oceans. It describes 37 pelagic provinces of the world, nested into four broad realms. A system of seven biomes are also identified ecologically, and these are spatially disjoint but united by common abiotic conditions, thereby creating physiognomically similar communities.

This dataset was merged with MEOW by UNEP-WCMC and is shown in section 2.1.2 above.

Abstract: Off-shelf waters cover 66% of the planet. Growing concerns about the state of natural resources in these waters, and of future threats have led to a growing movement to improve management and conservation of natural resources. However, efforts to assess progress and to further plan and prioritise management interventions have been held back in part by the lack of a comprehensive biogeographic classification for the high seas. In this work we review existing efforts at classifying the surface pelagic waters of the world’s oceans and we present a synthesis classification which draws both on known taxonomic biogeography and on the oceanographic forces which are major drivers of ecological patterns. We describe a nested system of 37 pelagic provinces of the world, nested into a system of four broad realms. Ecologically we have also differentiated a system of 7 biomes which are spatially disjoint but united by common abiotic conditions creating physiognomically similar communities. This system builds on existing work and is further intended to align closely with the coastal biogeographic regionalisation provided by the Marine Ecoregions of the World classification. It is hoped that it will provide a valuable tool in supporting threat analysis, priority setting, policy development and active management of the world’s pelagic oceans. Spalding, M. D., Agostini, V. N., Rice, J., & Grant, S. M. (2012). Pelagic provinces of the world: A biogeographic classification of the world’s surface pelagic waters. Ocean & Coastal Management, 60, 19–30. https://doi.org/10.1016/J.OCECOAMAN.2011.12.016

2.1.7 Ecological Marine Units (62 realms; 13 ocean upper col; ArcGIS)

This 3D model (encompassing depth, latitude and longitude) was developed in ArcGIS using k-means cluster analysis coupled with data mining tools, applied to XX variables (species distribution (realm and modelled), temperature, salinity, oxygen, nitrate, phosphate and silicate) to summarise/group the world ocean. The dataset corresponds to a 57-year average, 0.25° latitude (~27 km) bands and 102m depth bands.

# From Edzer Pebesma, topic on R-sig-geo

# Although we were unable to open the complete data set in R 
# directly, we were able to do so by opening a section, querying 
# it with SQL 

library(sf)
library(raster)  # provides rasterFromXYZ() and rasterToPolygons()
system.time(r.d1 <- st_read("D:/EMU_Z_Opendata/EMU.gpkg", query = "select * from EMU_Master where depth_lvl = 1"))

# Code chunks copied from below (adapted with the help of Barry Rowlingson):
library(dplyr)
dr <- r.d1 %>% 
  dplyr::rename("x"=POINT_X,"y"=POINT_Y) %>%
  # factor() first: if NameEMU is read as character, as.numeric() alone gives NA
  dplyr::mutate(NameEMUn = as.numeric(factor(NameEMU))) %>%
  dplyr::select(x,y, NameEMUn) %>% 
  st_set_geometry(NULL)

drz <- rasterFromXYZ(dr)
system.time(drzp <- rasterToPolygons(drz,dissolve=TRUE))
# spplot(drzp,"layer")

emu <- st_as_sf(drzp)
st_crs(emu)=4326
# 'd' only exists in the commented-out csv version below; use r.d1 here
emu$NameEMU = levels(factor(r.d1$NameEMU))
plot(emu)
## Allright!!!
# Export as a shape file:
# write_sf(emu, "emu.shp")
library(sf)

# # Thanks to Barry Rowlingson (R-sig-geo)
# system.time(d <- read.table("shapes/top_EMU.csv",sep=",",head=TRUE))
# dim(d); names(d)
# head(d)
# levels(d$NameEMU)
# 
# # Using the package:
# dr <- d %>% 
#   dplyr::rename("x"=POINT_X,"y"=POINT_Y) %>% 
#   dplyr::mutate(NameEMUn = as.numeric(NameEMU)) %>% 
#   dplyr::select(x,y, NameEMUn)
# drz <- rasterFromXYZ(dr) 
# 
# system.time(drzp <- rasterToPolygons(drz,dissolve=TRUE))
# # spplot(drzp,"layer")
# 
# emu <- st_as_sf(drzp)
# st_crs(emu)=4326
# emu$NameEMU = levels(d$NameEMU)
# plot(emu)
### Allright!!!
# # Export as a shape file:
# write_sf(emu, "emu.shp")

#### Add EMU regions information:
emu <- sf::read_sf("shapes/emu/emu.shp")

# simplify the object to make it 'usable'
emu <- emu %>% 
  sf::st_simplify(dTolerance = 0.01)
# plot(emu)

# plot
coli.fun <- colorRampPalette(RColorBrewer::brewer.pal(12, "Set3"))
kk+  
  geom_sf(data = emu, aes(fill = factor(layer)), size = .2, col = 0, alpha=.8)+
  ggtitle(paste("Ecological Marine Units top layer -", length(unique(emu$layer)),"units"))+
  scale_fill_manual(values=coli.fun(length(unique(emu$layer))))+
  theme(legend.position="none")+
  #geom_sf_text(data = emu, aes(label = layer), colour = "grey80", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("emu_top.png")

#### EMU sent by Keith (ESRI)
# emu.esri <- sf::read_sf("shapes/EMU_Top/EMU_Top.shp")
# head(emu.esri)
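The trick used above for the full EMU file, reading only part of a large layer through the `query` argument of `st_read()`, can be tried on a small scale. The sketch below writes a made-up point table to a temporary GeoPackage and reads back one depth level. The column and layer names mimic those in the code above, but the data are hypothetical.

```r
library(sf)

# Hypothetical stand-in for the EMU point table: two depth levels, three points each
pts <- st_as_sf(
  data.frame(
    POINT_X   = c(0, 1, 2, 0, 1, 2),
    POINT_Y   = c(0, 0, 0, 0, 0, 0),
    depth_lvl = c(1, 1, 1, 2, 2, 2),
    NameEMU   = c("a", "a", "b", "c", "c", "c")
  ),
  coords = c("POINT_X", "POINT_Y"), crs = 4326, remove = FALSE
)

# Write to a temporary GeoPackage, in a layer named like the one queried above
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "EMU_Master", quiet = TRUE)

# Only the rows matching the WHERE clause are read into R
top <- st_read(gpkg,
               query = "select * from EMU_Master where depth_lvl = 1",
               quiet = TRUE)
nrow(top)
```

The filtering happens in the data source rather than in R, which is what makes this workable for files too large to load whole.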

2.1.8 FAO fishing areas (FAO)

Food and Agriculture Organization of the United Nations - Fisheries and Aquaculture Department. For statistical purposes, 27 major fishing areas have been internationally established to date. These comprise eight major inland fishing areas covering the inland waters of the continents, and nineteen major marine fishing areas covering the waters of the Atlantic, Indian, Pacific and Southern Oceans, with their adjacent seas.

#### Add fao regions information:
fao <- sf::read_sf("shapes/FAO_AREAS/FAO_AREAS.shp") 
names(fao)
##  [1] "FID"        "F_CODE"     "F_LEVEL"    "F_STATUS"   "OCEAN"     
##  [6] "SUBOCEAN"   "F_AREA"     "F_SUBAREA"  "F_DIVISION" "F_SUBDIVIS"
## [11] "F_SUBUNIT"  "ID"         "NAME_EN"    "NAME_FR"    "NAME_ES"   
## [16] "SURFACE"    "geometry"
# simplify the object to make it 'usable'
fao <- fao %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(F_AREA) %>% 
  dplyr::summarise()
# plot(fao)

# plot
kk+  
  geom_sf(data = fao, aes(fill = F_AREA), size = .2, col = 0)+
  #geom_text(data=st_centroid(fao),aes(x=x,y=y,label=F_CODE))+
  ggtitle(paste("FAO fishing areas -", length(unique(fao$F_AREA)),"areas"))+
  theme(legend.position="none")+
  geom_sf_text(data = fao %>% group_by(F_AREA) %>% summarize(n()), aes(label = F_AREA), colour = "grey30", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("fao.png")
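The commented-out `geom_text()` line above expects plain `x` and `y` columns, which an `sf` object does not have. One possible way to obtain them is sketched here with two hypothetical square areas (no CRS, planar coordinates). `geom_sf_text()`, as used in the plot, avoids this step altogether.

```r
library(sf)

# Two hypothetical areas as 2 x 2 squares
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 2, 0), c(x0 + 2, 2), c(x0, 2), c(x0, 0))))
}
areas <- st_sf(F_AREA = c("A1", "A2"), geometry = st_sfc(sq(0), sq(4)))

# One label point per polygon, taken from the geometry column only
ctr <- st_centroid(st_geometry(areas))

# Attribute table plus plain x/y columns, usable with
# geom_text(aes(x = x, y = y, label = F_AREA))
labels <- cbind(st_drop_geometry(areas), st_coordinates(ctr))
names(labels) <- c("F_AREA", "x", "y")
labels
```

For irregular or multi-part polygons the centroid can fall outside the shape; `st_point_on_surface()` is an alternative that returns a point on the polygon itself.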

2.2 Freshwater/terrestrial divisions

2.2.1 Global International Waters Assessment’s Terrestrial WSs and Large Marine Ecosystems (GIWA)

GIWA_LME: Global International Waters Assessment’s Terrestrial WSs and Large Marine Ecosystems, a medium resolution WS delineation based on terrestrial modifications to the NOAA-URI Large Marine Ecosystems. The GIWA_LME shapefile data layer is comprised of 2936 derivative vector large marine ecosystems and terrestrial basins features derived based on ~100 000 cell data originally from GIWA - URI. The layer provides nominal analytical/mapping at 1:20 000 00. Data processing is complete and under revision globally.

Data from FAO

#### Add giwa regions information:
giwa <- sf::read_sf("shapes/giwa_lme/giwa_lme.shp")
names(giwa)
## [1] "GIWALME_ID" "LAKE"       "LAND"       "GIWAREGION" "MEGAREGION"
## [6] "NAME"       "CHANGED"    "NOTES"      "geometry"
# simplify the object to make it 'usable'
giwa <- giwa %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(MEGAREGION) %>% 
  dplyr::summarise()
# plot(giwa)

#plot
kk+  
  geom_sf(data = giwa, aes(fill = MEGAREGION), size = .2, col = 0, alpha=.3)+
  ggtitle(paste("GIWA - Global International Water Assessment -", length(unique(giwa$MEGAREGION)),"megaregions"))+
  geom_sf_text(data = giwa %>% group_by(MEGAREGION) %>% summarize(n()), aes(label = MEGAREGION), colour = "grey40", check_overlap =TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("giwa.png")

2.2.2 CMEC Zoogeographic Realms and Regions (updated from Wallace)

Authors: Holt et al. (DK)

Description: Modern attempts to produce biogeographic maps focus on the distribution of species, and the maps are typically drawn without phylogenetic considerations. Here, we generate a global map of zoogeographic regions by combining data on the distributions and phylogenetic relationships of 21,037 species of amphibians, birds, and mammals. We identify 20 distinct zoogeographic regions, which are grouped into 11 larger realms. We document the lack of support for several regions previously defined based on distributional data and show that spatial turnover in the phylogenetic composition of vertebrate assemblages is higher in the Southern than in the Northern Hemisphere. We further show that the integration of phylogenetic information provides valuable insight on historical relationships among regions, permitting the identification of evolutionarily unique regions of the world.

Macroecology DK

References: Holt, B. G., Lessard, J. P., Borregaard, M. K., Fritz, S. A., Araújo, M. B., Dimitrov, D., … Rahbek, C. (2013). An update of Wallace’s zoogeographic regions of the world. Science, 339(6115), 74–78. https://doi.org/10.1126/science.1228282

Kreft, H., & Jetz, W. (2013). Comment on: “An Update of Wallace’s Zoogeographic Regions of the World.” Science, 341(6144), 343. https://doi.org/10.1517/14656566.2013.795544

Note that the projection needs to be verified/changed.
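A possible way to verify and change a projection with `sf` is sketched below, on a single hypothetical point stored in Web Mercator (EPSG:3857). The CRS of the CMEC shapefiles themselves is not known here, so the codes are only examples.

```r
library(sf)

# Hypothetical layer stored in Web Mercator (EPSG:3857) rather than lon/lat
pt_merc <- st_sfc(st_point(c(0, 0)), crs = 3857)

# Inspect the CRS the layer claims to have
st_crs(pt_merc)$epsg

# Reproject to WGS84 lon/lat (EPSG:4326), the CRS reported for the other layers
pt_ll <- st_transform(pt_merc, 4326)
st_crs(pt_ll)$epsg
st_coordinates(pt_ll)
```

`st_transform()` recomputes the coordinates, whereas assigning with `st_crs(x) <- ...` only relabels them. The latter is appropriate only when the file's CRS metadata is missing or wrong.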

#### New ones
cmec <- sf::read_sf("shapes/CMEC regions & realms/newRealms.shp")
names(cmec)
## [1] "OBJECTID"   "fullupgmar" "Shape_Leng" "Shape_Area" "Realm"     
## [6] "geometry"
# simplify the object to make it 'usable'
cmec <- cmec %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(Realm) %>% 
  dplyr::summarise()
# plot(cmec)

# plot
kk+  
  geom_sf(data = cmec, aes(fill = Realm), size = .2, col = 0, alpha=.4)+
  ggtitle(paste("Zoogeographic terrestrial realms -", length(unique(cmec$Realm)),"realms"))+
  theme(legend.position="none")+
  geom_sf_text(data = cmec, aes(label = Realm), colour = "grey30", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("Wallace1.png")



#### Regions
cmec.regions <- sf::read_sf("shapes/CMEC regions & realms/Regions.shp")
names(cmec.regions)
## [1] "OBJECTID_1" "OBJECTID"   "Shape_Leng" "Regions"    "Shape_Le_1"
## [6] "Shape_Area" "geometry"
# simplify the object to make it 'usable'
cmec.regions <- cmec.regions %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(Regions) %>% 
  dplyr::summarise()

# plot
ggplot() +
  theme(panel.grid.major = element_line(color = gray(0.9), linetype = "dashed", size = 0.5))+
  geom_sf(data = cmec.regions, aes(fill = Regions), size = .2, col = 0, alpha=.4)+
  ggtitle(paste("Zoogeographic terrestrial regions -", length(unique(cmec.regions$Regions)),"regions"))+
  theme(legend.position="none")+
  geom_sf_text(data = cmec.regions, aes(label = Regions), colour = "grey30", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if(plot.save==TRUE)
   ggsave("Wallace2.png")

2.2.3 Global River Classification (GloRiC)

Authors: WWF

Description: The Global River Classification (GloRiC) provides a database of river types and sub-classifications for all river reaches globally. Version 1.0 of GloRiC provides a hydrologic, physio-climatic, and geomorphic sub-classification, as well as a combined type for every river reach, resulting in a total of 127 river reach types. It also offers a k-means statistical clustering of the reaches into 30 groups. The dataset comprises 8.5 million river reaches with a total length of 35.9 million km.

link from the WWF more info

Notes: The file is very large, so it takes time to read. I have tried to simplify it but have not yet managed to. In progress…

#### Add gloric regions information:
gloric <- sf::read_sf("shapes/GloRiC_v10_shapefile/GloRiC_v10_shapefile/GloRiC_v10.shp")
names(gloric)
gloric %>% data.frame() %>% head()

# simplify the object to make it 'usable'
object.size(gloric)
gloric <- gloric %>% 
  dplyr::filter(Length_km > 1)

gloric <- gloric %>% 
  sf::st_simplify(dTolerance = 0.1) 
object.size(gloric)

# # try to simplify it:
# kk1 <- gloric %>% 
#   dplyr::filter(Length_km > 1) %>%  
#   dplyr::group_by(Reach_type) %>% 
#   sf::st_simplify( dTolerance = 0.01)
#   # sf::st_cast("MULTILINESTRING")
#   #  # alternative

2.2.4 Hydrosheds

Author: WWF

Description: HydroSHEDS (Hydrological data and maps based on SHuttle Elevation Derivatives at multiple Scales) provides hydrographic information in a consistent and comprehensive format for regional and global-scale applications. HydroSHEDS offers a suite of geo-referenced data sets in raster and vector format, including stream networks, watershed boundaries, drainage directions, and ancillary data layers such as flow accumulations, distances, and river topology information. The goal of developing HydroSHEDS was to generate key data layers to support regional and global watershed analyses, hydrological modeling, and freshwater conservation planning at a quality, resolution and extent that has previously been unachievable. Available resolutions range from 3 arc-second (approx. 90 meters at the equator) to 5 minute (approx. 10 km at the equator) with seamless near-global extent.

HydroSHEDS has been developed by the Conservation Science Program of World Wildlife Fund (WWF), in partnership or collaboration with the U.S. Geological Survey (USGS); the International Centre for Tropical Agriculture (CIAT); The Nature Conservancy (TNC); McGill University, Montreal, Canada; the Australian National University, Canberra, Australia; and the Center for Environmental Systems Research (CESR), University of Kassel, Germany. Major funding for this project was provided to WWF by JohnsonDiversey, Inc. and Sealed Air Corporation.

References: Lehner, B., Verdin, K., Jarvis, A. (2008): New global hydrography derived from spaceborne elevation data. Eos, Transactions, AGU, 89(10): 93-94.

Data: data file

Notes: decided not to add - file split into multiple chunks with very high resolution.

2.2.5 Hydrobasins

Author: WWF

Description: HydroBASINS is a series of polygon layers that depict watershed boundaries and sub-basin delineations at a global scale. The goal of this product is to provide a seamless global coverage of consistently sized and hierarchically nested sub-basins at different scales (from tens to millions of square kilometers), supported by a coding scheme that allows for analysis of watershed topology such as up- and downstream connectivity.

Using the HydroSHEDS database at 15 arc-second resolution, watersheds were delineated in a consistent manner at different scales, and a hierarchical sub-basin breakdown was created following the topological concept of the Pfafstetter coding system. The resulting polygon layers are termed HydroBASINS and represent a subset of the HydroSHEDS database.

The HydroBASINS product has been developed on behalf of World Wildlife Fund US (WWF), with support and in collaboration with the EU BioFresh project, Berlin, Germany; the International Union for Conservation of Nature (IUCN), Cambridge, UK; and McGill University, Montreal, Canada.

References: Lehner, B., Grill G. (2013): Global river hydrography and network routing: baseline data and new approaches to study the world’s large river systems. Hydrological Processes, 27(15): 2171–2186. Data is available at www.hydrosheds.org

Data: data file

Notes: decided not to add - file split into multiple chunks with very high resolution.

2.2.6 Major Hydrologic Basins

Author: FAO/World Bank Description: References:

FAO file: Major_hydrological_basins; World Bank file: major_basins_of_the_world_0_0_0

basins <- sf::read_sf("shapes/Major_hydrological_basins/major_hydrobasins.shp")
head(basins)
## Simple feature collection with 6 features and 3 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -43.69583 ymin: 83.2375 xmax: -28.56667 ymax: 83.4625
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## # A tibble: 6 x 4
##   MAJ_BAS MAJ_NAME      MAJ_AREA                                   geometry
##     <dbl> <chr>            <int>                             <POLYGON [°]>
## 1    4050 Arctic Ocean~  2166086 ((-28.65 83.4375, -28.675 83.4375, -28.67~
## 2    4050 Arctic Ocean~  2166086 ((-41.15417 83.3375, -41.19583 83.3375, -~
## 3    4050 Arctic Ocean~  2166086 ((-41.2875 83.2875, -41.34167 83.2875, -4~
## 4    4050 Arctic Ocean~  2166086 ((-39.06667 83.2875, -39.25417 83.2875, -~
## 5    4050 Arctic Ocean~  2166086 ((-43.39583 83.25, -43.67917 83.25, -43.6~
## 6    4050 Arctic Ocean~  2166086 ((-42.49583 83.2375, -42.54167 83.2375, -~
object.size(basins)
## 93334864 bytes
# to reduce object size, simplify the shapes and merge polygons by basin name
basins <- basins %>% 
  sf::st_simplify(dTolerance = 0.01) %>% 
  dplyr::group_by(MAJ_NAME) %>% 
  dplyr::summarise()
  #dplyr::filter(MAJ_AREA > quantile(basins$MAJ_AREA, .25)) %>% 
  #dplyr::select(MAJ_NAME)
object.size(basins)
## 12908624 bytes
# plot
kk+  
  geom_sf(data = basins, aes(fill = MAJ_NAME), size = .2, col = 0, alpha=.3)+
  ggtitle(paste("Major hydrological basins (FAO) -", length(unique(basins$MAJ_NAME)),"basins"))+
  theme(legend.position="none")+
  # basins already holds one row per MAJ_NAME, so it can be used directly for the labels
  geom_sf_text(data = basins, aes(label = MAJ_NAME), colour = "grey50", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if (isTRUE(plot.save)) {
  ggsave("hydrobasin1.png")
}

## world bank basins:
basins2 <- sf::read_sf("shapes/major_basins_of_the_world_0_0_0/Major_Basins_of_the_World.shp")
head(basins2)
## Simple feature collection with 6 features and 15 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -164.6808 ymin: 49.81419 xmax: 169.2929 ymax: 71.77809
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## # A tibble: 6 x 16
##   BASWC4_ID    ID     N NAME   CONT    NN FISH_ ACRES SOURCETHM NO_COUNTRI
##       <int> <int> <int> <chr> <int> <int> <int> <dbl> <chr>          <int>
## 1         2   408    11 Indi~     2  2011     0 0.002 geoff2.d~          0
## 2         3   436    14 Koly~     2  2014    29 0.003 geoff2.d~          0
## 3         4    38    36 Yeni~     2  2036    42 0.007 final_dr~          0
## 4         5   148    78 Tana      0     0     0 0     final_dr~          0
## 5         6   104    11 Mack~     5  5011    53 0.006 final_dr~          0
## 6         7    98    26 Yukon     5  5026    31 0.004 final_dr~          0
## # ... with 6 more variables: Q3 <chr>, CHECKED <int>, LAEA_HA <dbl>,
## #   LAEA_ACRES <dbl>, LAEA_PRMTR <dbl>, geometry <POLYGON [°]>
# plot
kk+  
  geom_sf(data = basins2, aes(fill = NAME), size = .2, col = 0, alpha=.3)+
  ggtitle(paste("Major hydrological basins (WB) -", length(unique(basins2$NAME)),"basins"))+
  theme(legend.position="none")+
  geom_sf_text(data = basins2, aes(label = NAME), colour = "grey30", check_overlap=TRUE)+
  coord_sf(expand = FALSE)

if (isTRUE(plot.save)) {
  ggsave("hydrobasin2.png")
}

2.2.7 FADA

File sent by SH.

## [1] "fadaregion" "geometry"

3 Other databases

-Costello’s useful links

GMED: Global Marine Environment Datasets
Ecological Marine Units in 3D and data layers are here.
NODC: NOAA Oceanographic Data Center
NSIDC: National Snow & Ice Data Center
OceanColor: NASA Oceancolor Portal
APRDC: In-situ oceanographic or atmospheric data from a Dapper OpenDap server
Glovis: USGS Earth Resource Observation and Science Center Data Portal
CCCMA: Canadian Centre for Climate Modelling and Analysis
GOOS: Global Ocean Observing System
WorldClim: Global Climate Data
CliMond: Global climatologies for bioclimatic modelling
Bio-Oracle: Ocean Rasters for Analysis of Climate Environment
APDRC: Asia-Pacific Data-Research Center
ARGOS: Worldwide tracking and environmental monitoring by satellite
Global Islands Explorer and details here

-Marine Regions Site

-UN

library(sf)
print("All variables found in Spalding shape file")
# Disabled: reads the MEOW shapefile (as multipolygons) and plots every attribute
# plot(meow <- sf::read_sf(paste0(DataDir, "shapes/MEOW/meow_ecos.shp")))

print("Spalding marine regions, with centroids overlaid")
plot(st_geometry(meow), col = sf.colors(12,categorical = TRUE), border = 'grey', axes = TRUE)
plot(st_geometry(st_centroid(meow)), pch = 3, col = 'grey10', add = TRUE, cex=.5)
# plot(meow["REALM"], border = 'grey', axes = TRUE, key.pos = 4)

# Using ggplot:
ggplot() + 
  geom_sf(data = meow, aes(fill = REALM))+
  scale_fill_brewer(palette = "Set3")

# tmap is another package often recommended online for plotting maps
library(tmap)
tmap_mode("plot") #view
# qtm(meow)

tmap_style("classic")
#"white", "gray", "natural", "cobalt", "col_blind", "albatross", "beaver", "bw", "watercolor" 

library(spData)
data(world)
tm_shape(world, is.master=TRUE) +
   tm_borders("grey20") +
   tm_polygons(col="grey", alpha=.8)+
   tm_grid(projection="longlat", labels.size = .5, col="grey90") +
   #tm_text("name", size="AREA") +
tm_shape(meow)+
  tm_polygons("REALM", legend.title = "Realm", alpha=.5)+
  tm_layout(legend.outside=TRUE)

tmap_mode("view") # switch to the interactive viewer
tmap_style("natural")
tm_shape(world, is.master=TRUE) +
   tm_borders("grey20") +
   tm_polygons(col="grey", alpha=.8)+
   tm_grid(projection="longlat", labels.size = .5, col="grey90") +
   
tm_shape(meow)+
  tm_polygons("REALM", legend.title = "Realm", alpha=.5)+
  tm_layout(legend.outside=TRUE)

# tmap_tip()

Next steps: properly project each of the above files!
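As a starting point for that step, here is a minimal sketch of reprojecting an sf layer with `st_transform()`. The equal-area Mollweide target, given as a PROJ string, is only an illustrative assumption and not a choice made in this document. The toy polygon stands in for any of the layers above.

```r
library(sf)

# Toy polygon in longitude/latitude (EPSG:4326), standing in for a real layer
ring <- rbind(c(-10, 40), c(5, 40), c(5, 50), c(-10, 50), c(-10, 40))
toy <- st_sf(name = "toy region",
             geometry = st_sfc(st_polygon(list(ring)), crs = 4326))

# Reproject to Mollweide (equal-area); the target CRS is an assumption
toy_moll <- st_transform(toy, crs = "+proj=moll +lon_0=0 +datum=WGS84 +units=m +no_defs")

st_is_longlat(toy)       # TRUE: coordinates are in degrees
st_is_longlat(toy_moll)  # FALSE: coordinates are now projected
```

The same call should apply to each object read with `read_sf()`, once a suitable target projection has been chosen for the maps.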

to be continued…