There are multiple ways to create maps using the ggplot2 package; in
this tutorial, I present two methods using three packages and a function
from the rgdal package, only if you need more accuracy in your map. The
first approach involves starting with a shapefile. In the second, I
downloaded occurrence data and plotted the data set using functions and
arguments from ggplot2. To obtain the compass rose with the North
direction and the scalebar, the ggspatial package is required. The
projection for each map is obtained using transformation methods
available through the sf package. The available projections for R are in
the rgdal package using the function make_EPSG().
In this example, I utilized the Iceland map provided by Berkeley GeoData
from. Once you have obtained the file, proceed to load the ggplot2,
ggspatial, and sf packages. Subsequently, use the st_read
function from the sf package, employing the dns argument.
Loading the shapefile with this function is a crucial step, as it
provides essential information to delineate your map.
pck = c("ggplot2","ggspatial","sf", "patchwork", "rgbif")
suppressMessages(lapply(pck, require, character.only = TRUE))
shp = st_read(dsn = "C:/Shapefile4/ISL_adm1.shp")
## Reading layer `ISL_adm1' from data source `C:\Shapefile4\ISL_adm1.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -27.98073 ymin: 60.00208 xmax: -12.05052 ymax: 66.70208
## Geodetic CRS: WGS 84
The initial attempt involves plotting the file using the basic syntax of ggplot. The resulting output is a single map featuring x and y labels in degrees. However, there is potential to maximize the available space for presenting a study area, country, or data (e.g. occurrence, abundance, biomass).
ggplot(data = shp) + geom_sf()
Fig. 1: Shapefile of Iceland without scale bar, compass rose, or specific projection.
In maps, it is important to indicate the geographic location with
respect to the north and include a scale bar, which serves as a visual
reference to indicate the real distances represented on the map. These
elements are provided by the ggspatial package. Additionally, in this
attempt, empty space is removed using the coord_sf function
and the arguments xlim and ylim.
A = ggplot(data = shp) + geom_sf() +
coord_sf(xlim = c(-27.9,-12.05), ylim = c(63, 67)) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm")) +
annotation_scale()
A
## Scale on map varies by more than 10%, scale bar may be inaccurate
Fig. 2: Shapefile of Iceland with scale bar and compass rose, without specific projection.
There is a warning about the accuracy about the representation of
distance. To improve the map is necessary search the best projection for
each area. To Iceland the best projection is ISN2004/LAEA, which is a
Lambert projection (conic) specifically for Iceland that have the code
9947. The transformation is conducted using st_transform
function from sf package, and throught st_bbox function is
obtained the file limits or bounding box in the new units (metric units
in the case of Lambert projection).
shp_LAM = st_transform(shp, 9947)
st_bbox(shp_LAM)
## xmin ymin xmax ymax
## 1229383.1 755303.5 2050290.9 1502427.1
Now, the file name in the data position, as well as the xlim and ylim, must be changed. There is still considerable empty space, so it is necessary to adjust the value of the south limit, and the warning should not be displayed again. The differences can be visualized using the patchwork package. For more information about the patchwork package visit.
B = ggplot(data = shp_LAM) + geom_sf() +
coord_sf(xlim = c(1229383.1,2050290.9), ylim = c(755303.5, 1502427.1)) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm")) +
annotation_scale()
C = ggplot(data = shp_LAM) + geom_sf() +
coord_sf(xlim = c(1229383.1,2050290.9), ylim = c(1083030.5, 1502427.1)) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm")) +
annotation_scale()
(B | C) + plot_layout(widths = c(2, 2)) + plot_annotation(tag_levels = 'A')
Fig. 3: Shapefile of Iceland with scale bar, compass rose, and ISN2004/LAEA projection. A. Figure with the original area, B. Figure with modified size.
Using the information provided in the shapefile, it is possible to
visualize the administrative regions of Iceland by employing the ‘fill’
argument and utilizing the ‘factor’ for the column name (in this case,
NAME_1). The column name is modified using the labs
function with the fill argument, and axis titles for
Latitude and Longitude have also been included.
D = ggplot() + geom_sf(data = shp_LAM, aes(fill = factor(NAME_1))) +
coord_sf(xlim = c(1229383.1,2050290.9), ylim = c(1083030.5, 1502427.1)) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm")) +
annotation_scale() + labs(x = "Longitude", y = "Latitude", fill = "Administrative Regions")
D
Fig. 4: Shapefile of Iceland with scale bar, compass rose, and ISN2004/LAEA projection. A. Figure with the original area, B. Figure with modified size.
The chondrichthyes most common in Iceland are (Somniosus
microcephalus) and (Amblyraja radiata) SEA-ICELAND.
Using the occ_data function from the rgbif package, the
occurrence data of these species is stored and processed. NAs are
removed in all columns, but in the ‘individualCount’ column, the values
are changed to one, considering the presence of data for location
(longitude and latitude), species (scientificName), and the entity or
institution providing the record (datasetName). In (Amblyraja
radiata) there is a synonymy (Raja radiata), this last
records were changed using gsub.
Sm = occ_data(scientificName = "Somniosus microcephalus",hasCoordinate = TRUE, limit = 20000, country = "IS")
Sm2 = data.frame(Sm$data)
head(Sm2[,1:4],3)
## key scientificName decimalLatitude
## 1 1883924369 Somniosus microcephalus (Bloch & Schneider, 1801) 65.02006
## 2 1439137300 Somniosus microcephalus (Bloch & Schneider, 1801) 65.09065
## 3 1439137316 Somniosus microcephalus (Bloch & Schneider, 1801) 65.09065
## decimalLongitude
## 1 -22.66576
## 2 -18.20023
## 3 -18.20023
dim(Sm2)
## [1] 16 113
Sm3 = Sm2[!is.na(Sm2$decimalLongitude) & !is.na(Sm2$decimalLongitude) & !is.na(Sm2$decimalLatitude) & !is.na(Sm2$datasetName),]
Sm3 = Sm3[,c("decimalLongitude","decimalLatitude","scientificName","individualCount")]
Sm3$individualCount = Sm3$individualCount[is.na(Sm3$individualCount)] = 1
dim(Sm3)
## [1] 11 4
Ar = occ_data(scientificName = "Amblyraja radiata",hasCoordinate = TRUE, limit = 20000, country = "IS")
Ar2 = data.frame(Ar$data)
head(Ar2[,1:4],3)
## key scientificName decimalLatitude decimalLongitude
## 1 4424683100 Amblyraja radiata (Donovan, 1808) 63.47836 -19.05652
## 2 3881772028 Amblyraja radiata (Donovan, 1808) 63.45673 -19.15955
## 3 4076403371 Amblyraja radiata (Donovan, 1808) 63.57236 -19.09094
dim(Ar2)
## [1] 55 115
Ar3 = Ar2[!is.na(Ar2$decimalLongitude) & !is.na(Ar2$decimalLongitude) & !is.na(Ar2$decimalLatitude) & !is.na(Ar2$datasetName),]
Ar3 = Ar3[,c("decimalLongitude","decimalLatitude","scientificName","individualCount")]
Ar3$scientificName = gsub("Raja radiata Donovan, 1808","Amblyraja radiata (Donovan, 1808)", Ar3$scientificName)
Ar3$individualCount = Ar3$individualCount[is.na(Ar3$individualCount)] = 1
dim(Ar3)
## [1] 16 4
Now, due to the data being in different projections, there are some issues in representing the data. This is evident in the approach without the shapefile. Here, the values of longitude and latitude are transformed. Initially, the current coordinate system is defined and then converted to the projection used in Iceland. I know it might seem redundant to convert the column names twice, but that’s the way I’ve done it.
colnames(Sm3)[1:2] = c("longitude","latitude")
colnames(Ar3)[1:2] = c("longitude","latitude")
sf_datSm = st_as_sf(Sm3[,1:2], coords = c("longitude","latitude"))
sf_datAr = st_as_sf(Ar3[,1:2], coords = c("longitude","latitude"))
proj_latlon = st_crs("+proj=longlat +datum=WGS84")
ProjLam = st_crs("+init=epsg:9947")
## Warning in CPL_crs_from_input(x): GDAL Message 1: +init=epsg:XXXX syntax is
## deprecated. It might return a CRS with a non-EPSG compliant axis order.
st_crs(sf_datSm) = proj_latlon
st_crs(sf_datAr) = proj_latlon
DatCha1 = st_transform(sf_datSm, ProjLam)
DatCha2 = st_transform(sf_datAr, ProjLam)
DatCha11 = as.data.frame(st_coordinates(DatCha1))
colnames(DatCha11) = c("decimalLongitude", "decimalLatitude")
DatCha12 = data.frame(cbind(DatCha11, Sm3[,3:4]))
head(DatCha12,3)
## decimalLongitude decimalLatitude
## 1 1527297 1307239
## 2 1737601 1310344
## 3 1737601 1310344
## scientificName individualCount
## 1 Somniosus microcephalus (Bloch & Schneider, 1801) 1
## 2 Somniosus microcephalus (Bloch & Schneider, 1801) 1
## 3 Somniosus microcephalus (Bloch & Schneider, 1801) 1
DatCha21 = as.data.frame(st_coordinates(DatCha2))
colnames(DatCha21) = c("decimalLongitude", "decimalLatitude")
DatCha22 = data.frame(cbind(DatCha21, Ar3[,3:4]))
head(DatCha22,3)
## decimalLongitude decimalLatitude scientificName
## 1 1697183 1130370 Amblyraja radiata (Donovan, 1808)
## 2 1692041 1127968 Amblyraja radiata (Donovan, 1808)
## 3 1695482 1140849 Amblyraja radiata (Donovan, 1808)
## individualCount
## 1 1
## 2 1
## 3 1
E = ggplot() +
geom_point(data = DatCha12, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName), size = 3) +
geom_point(data = DatCha22, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName), size = 3) +
geom_sf(data = shp_LAM) +
coord_sf(xlim = c(1000000,2150290.9), ylim = c(1000000.5, 1592427.1)) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm")) +
annotation_scale() + labs(x = "Longitude", y = "Latitude", color = "Species")
E + theme(legend.title = element_text(hjust = 0.5), legend.title.align = 0.5, legend.text = element_text(face = "italic", size = 8))
Using the rgbif package, we obtain and create a data frame of occurrence data. Subsequently, we delete NA values in the columns decimalLongitude, decimalLatitude, and datasetName, while also filtering the data to include only occurrences in Iceland Iceland government website. The northern shrimp (Pandalus borealis) is the predominant species with a capture of approximately 70,000 tonnes. However, the occurrence data for this species consists of only 3. The second most caught species is (Chlamys islandica) with around 17,000 tonnes, and it has 11 records. On the other hand, the pelagic species with the highest capture in Iceland, as per the data provided by the Iceland government website ndicates that notable species include (Mallotus villosus), (Micromesistius poutassou) and (Melanogrammus aeglefinus). The first species has 2 records with catch data ranging from 150,000 to 500,000 tonnes, while the second has 46 records with captures ranging from 10,000 to 500,000 tonnes. In the case of (M. aeglefinus), the capture range is from 34,000 to 120,000 tonnes.
Pb = occ_data(scientificName = "Pandalus borealis",hasCoordinate = TRUE, limit = 20000, country = "IS")
Pb2 = data.frame(Pb$data)
head(Pb2[,1:4],3)
## key scientificName decimalLatitude decimalLongitude
## 1 3045941063 Pandalus borealis Krøyer, 1838 67.09567 -18.72317
## 2 3045941110 Pandalus borealis Krøyer, 1838 67.16650 -18.71667
## 3 3045941106 Pandalus borealis Krøyer, 1838 67.15467 -18.63783
dim(Pb2)
## [1] 10 90
Pb3 = Pb2[!is.na(Pb2$decimalLongitude) & !is.na(Pb2$decimalLatitude) & !is.na(Pb2$datasetName),]
Pb3 = Pb3[,c("decimalLongitude","decimalLatitude","scientificName")]
Pb3$individualCount = rep(1,3)
dim(Pb3)
## [1] 3 4
Ci = occ_data(scientificName = "Chlamys islandica",hasCoordinate = TRUE, limit = 20000, country = "IS")
Ci2 = data.frame(Ci$data)
head(Ci2[,1:4],3)
## key scientificName decimalLatitude
## 1 4460350348 Chlamys islandica (O.F.Müller, 1776) 65.52984
## 2 3759194104 Chlamys islandica (O.F.Müller, 1776) 64.10765
## 3 2862308324 Chlamys islandica (O.F.Müller, 1776) 64.15143
## decimalLongitude
## 1 -23.19484
## 2 -22.02966
## 3 -21.97469
dim(Ci2)
## [1] 63 137
Ci3 = Ci2[!is.na(Ci2$decimalLongitude) & !is.na(Ci2$decimalLatitude)& !is.na(Ci2$datasetName),]
Ci3 = Ci3[,c("decimalLongitude","decimalLatitude","scientificName")]
Ci3$individualCount = Ci3[is.na(Ci3$individualCount)] = 1
dim(Ci3)
## [1] 11 4
Mv = occ_data(scientificName = "Mallotus villosus",hasCoordinate = TRUE, limit = 20000, country = "IS")
Mv2 = data.frame(Mv$data)
head(Mv2[,1:4],3)
## key scientificName decimalLatitude decimalLongitude
## 1 1265360841 Mallotus villosus (Müller, 1776) 63.0000 -18.00000
## 2 1265360914 Mallotus villosus (Müller, 1776) 69.2025 -15.32611
## 3 1265360849 Mallotus villosus (Müller, 1776) 63.0000 -18.00000
dim(Mv2)
## [1] 171 111
Mv3 = Mv2[!is.na(Mv2$decimalLongitude) & !is.na(Mv2$decimalLatitude)& !is.na(Mv2$datasetName),]
Mv3 = Mv3[,c("decimalLongitude","decimalLatitude","scientificName","individualCount")]
Mv3$individualCount = Mv3$individualCount[is.na(Mv3$individualCount)] = 1
dim(Mv3)
## [1] 31 4
Mp = occ_data(scientificName = "Micromesistius poutassou",hasCoordinate = TRUE, limit = 20000, country = "IS")
Mp2 = data.frame(Mp$data)
head(Mp2[,1:4],5)
## key scientificName decimalLatitude
## 1 43212941 Micromesistius poutassou (Risso, 1827) 62.01667
## 2 863167055 Micromesistius poutassou (Risso, 1827) 62.03000
## decimalLongitude
## 1 -28.00
## 2 -28.01
dim(Mp2)
## [1] 2 94
Mp3 = Mp2[!is.na(Mp2$decimalLongitude) & !is.na(Mp2$decimalLatitude)& !is.na(Mp2$datasetName),]
Mp3 = Mp3[,c("decimalLongitude","decimalLatitude","scientificName","individualCount")]
Mp3$individualCount = Mp3$individualCount[is.na(Mp3$individualCount)] = 1
dim(Mp3)
## [1] 1 4
Ma = occ_data(scientificName = "Melanogrammus aeglefinus",hasCoordinate = TRUE, limit = 10000, country = "IS")
Ma2 = data.frame(Ma$data)
head(Ma2[,1:4],3)
## key scientificName decimalLatitude
## 1 4461187990 Melanogrammus aeglefinus (Linnaeus, 1758) 65.03445
## 2 4431049450 Melanogrammus aeglefinus (Linnaeus, 1758) 65.71860
## 3 4430605509 Melanogrammus aeglefinus (Linnaeus, 1758) 64.08925
## decimalLongitude
## 1 -14.04291
## 2 -18.18138
## 3 -21.44272
dim(Ma2)
## [1] 129 88
Ma3 = Ma2[!is.na(Ma2$decimalLongitude) & !is.na(Ma2$decimalLatitude)& !is.na(Ma2$datasetName),]
Ma3 = Ma3[,c("decimalLongitude","decimalLatitude","scientificName","individualCount")]
Ma3$individualCount = Ma3$individualCount[is.na(Ma3$individualCount)] = 1
dim(Ma3)
## [1] 2 4
In ggplot2, the map is obtained using the map_data
function, representing a general visualization. The species data are
displayed using the geom_point function. By employing the
labs function, it becomes possible to assign axis titles,
labels, and legends. The map is incorporated into the plot using
geom_polygon, where the ‘group’ argument selects points
that are joined to form a polygon.
Similar to the shapefile approach, it is necessary to search for the
projection that provides the best geographic representation. The use of
the annotation_north_arrow function results in a warning
indicating that the north arrow signage is incorrectly marked.
Therefore, the compass rose must be inclined manually. With this
approach, the scalebar is represented manually in the code due to the
specific location of Iceland and the chosen projection. In other cases,
it is recommended to use coord_sf with an appropriate value
of st_crs.
map_data = map_data("world")
Map = ggplot() +
geom_point(data = Pb3, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName)) +
geom_point(data = Ci3, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName)) +
geom_point(data = Mv3, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName)) +
geom_point(data = Mp3, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName)) +
geom_point(data = Ma3, aes(x = decimalLongitude, y = decimalLatitude,
color = scientificName)) +
labs(x = "Longitude", y = "Latitude", color = "Species") +
coord_map(projection = "lambert",
parameters = list(lat0 = 65, lat1 = 66),
xlim = c(-30,-7),
ylim = c(60, 70)) +
geom_polygon(data = map_data, aes(x = long, y = lat, group = group), fill = "grey",
color = "black", alpha = 0.5) +
annotation_north_arrow(location = "tl",which_north = "true",
pad_x = unit(0.2, "cm"), pad_y = unit(0.2, "cm"),
style = north_arrow_nautical,width = unit(1.5, "cm"),
height = unit(1.5, "cm"),
rotation = -15)
Map + geom_rect(aes(xmin = -30, xmax = -28, ymin = 60, ymax = 60.1), fill = "black") +
geom_text(aes(x = -29, y = 59.75, label = "200 km"), color = "black", size = 2.5, angle = -12) +
theme(legend.title = element_text(hjust = 0.5), legend.title.align = 0.5, legend.text = element_text(face = "italic", size = 8))
Regarding the occurrence data, it is surprising to note the low number of valid records for the most important species. This suggests the need to improve the registration of all capture data for these species to enable effective management and conservation. It is important to exercise caution, as there is a record in the middle of the country. Additionally, it might not be feasible to obtain a shapefile or one may lack sufficient time or patience to create it. For this reason, it is crucial to work with the available data, manage it effectively, and achieve the best possible output.
References
Chamberlain, S., Oldoni, D., Barve, V., Desmet, P., Geffert, L., Mcglinn, D., Ram, K., 2023. Interface to the Global Biodiversity Information Facility API. Package ‘rgbif.’ 3.7.8.
Dunnington, D., Thorne, B., Hernangómez, D., 2023. Spatial Data Framework for ggplot2, Package ‘ggspatial.’ 1.1.9.
https://www.government.is/topics/business-and-industry/fisheries-in-iceland/the-main-species/
https://patchwork.data-imaginist.com/index.html
Lin, T., 2024. The Composer of Plots, Package ‘patchwork.’ 1.2.0.
Pebesma, E., Bivand, R., Racine, E., Sumner, M., Cook, I., Keitt, T., Lovelace, R., Wickham, H., Ooms, J., Müller, K., Lin, T., Baston, D., Dunnington, D., 2023. Simple Features for R, Package ‘sf.’ 1.0-15.
Wickham, H., Chang, W., Henry, L., Lin-Pedersen, T., Takahashi, K., Wilke, C., Woo, W., Yutani, H., Dunnington, D., 2023. Create Elegant Data Visualisations Using the Grammar of Graphics. Package ‘ggplot2.’ 3.4.4.