1. Introduction to the sf Package

The sf package is the modern standard in R for working with spatial vector data. It represents spatial features (points, lines, polygons) in a special kind of data frame. This makes it intuitive to use if you are already familiar with dplyr and the tidyverse.

An sf object is essentially a data.frame with one crucial addition: a special list-column, usually named geometry, that holds the spatial information for each row.

2. Reading and Inspecting Vector Data

We use the st_read() function to load vector data from a file, such as a shapefile (.shp). Let’s load the administrative boundaries for Somalia’s regions. You can download this data from the Humanitarian Data Exchange (HDX).

library(sf)
library(ggplot2)
library(dplyr)

# Load the shapefile for Somalia's administrative regions (level 1)
# Make sure the file "som_admbnda_adm1_ocha_20250108.shp" is in your working directory
som_adm1 <- st_read("som_admbnda_adm1_ocha_20250108.shp", quiet = TRUE)

# Check the class of the object
class(som_adm1)
## [1] "sf"         "data.frame"
# Print the object to see its structure
print(som_adm1)
## Simple feature collection with 18 features and 10 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 40.99488 ymin: -1.664897 xmax: 51.41303 ymax: 11.9852
## Geodetic CRS:  WGS 84
## First 10 features:
##           ADM1_EN ADM1_PCODE ADM0_EN ADM0_PCODE       date    validOn validTo
## 1           Awdal       SO11 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 2          Bakool       SO25 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 3         Banadir       SO22 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 4            Bari       SO16 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 5             Bay       SO24 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 6       Galgaduud       SO19 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 7            Gedo       SO26 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 8          Hiraan       SO20 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 9      Lower Juba       SO28 Somalia         SO 2024-12-04 2025-01-08    <NA>
## 10 Lower Shabelle       SO23 Somalia         SO 2024-12-04 2025-01-08    <NA>
##    Shape_Leng Shape_Area  AREA_SQKM                       geometry
## 1    5.657071 1.31222441 15884.8143 MULTIPOLYGON (((43.46189 11...
## 2    6.650225 2.10138869 25797.6141 MULTIPOLYGON (((44.03028 4....
## 3    1.068762 0.02661158   327.3517 MULTIPOLYGON (((45.55389 2....
## 4   12.218762 5.61806371 68077.0071 MULTIPOLYGON (((50.79877 11...
## 5    8.086913 3.57292945 43931.5260 MULTIPOLYGON (((44.3111 3.5...
## 6    9.101881 4.01935108 49279.5284 MULTIPOLYGON (((47.07738 6....
## 7    9.042089 3.66528190 45056.5014 MULTIPOLYGON (((42.88542 4....
## 8    7.046882 2.75764693 33853.5458 MULTIPOLYGON (((45.71684 5....
## 9   11.338459 3.89035579 47883.1992 MULTIPOLYGON (((41.9267 -1....
## 10   8.115243 2.08196012 25611.8031 MULTIPOLYGON (((45.33112 2....

The output shows us key information: - Geometry type: MULTIPOLYGON (since regions are areas) - Dimension: XY (2D data) - Bounding box: The geographic extent of the data - CRS: The Coordinate Reference System (WGS 84) - Attributes: A standard data frame with columns like ADM1_EN (region name), ADM0_EN (country name), etc. - geometry column: The special column holding the polygon data.

We can quickly plot the object to see the map.

plot(som_adm1)

3. Subsetting and Manipulating sf Objects

Since an sf object is like a data frame, we can use standard dplyr or base R methods to filter and select data.

Example: Isolate the Banadir region (where Mogadishu is).

banadir <- som_adm1 %>%
  filter(ADM1_EN == "Banadir")

# Plot just the Banadir region
ggplot() +
  geom_sf(data = banadir, fill = "lightblue", color = "black") +
  ggtitle("Map of Banaadir Region") +
  theme_bw()

Example: Isolate the regions of Somaliland. Somaliland’s regions in this dataset are Awdal, Woqooyi Galbeed, Togdheer, Sanaag, and Sool.

somaliland_regions <- som_adm1 %>%
  filter(ADM1_EN %in% c("Awdal", "Woqooyi Galbeed", "Togdheer", "Sanaag", "Sool"))

# Plot Somaliland regions
ggplot() +
  geom_sf(data = somaliland_regions, aes(fill = ADM1_EN)) +
  ggtitle("Regions of Somaliland") +
  theme_bw() +
  labs(fill = "Region Name")

4. Joining Attribute Data to Spatial Data

A very common task is joining non-spatial data (like statistics) to a spatial object for mapping. Let’s create a dummy dataset of Internally Displaced Person (IDP) counts for a few regions and join it to our som_adm1 map.

# 1. Create a simple data frame with IDP data
idp_data <- data.frame(
  region_name = c("Banaadir", "Lower Shabelle", "Gedo", "Woqooyi Galbeed"),
  idp_count = c(500000, 120000, 85000, 95000)
)

# 2. Join the IDP data to the sf object
# We join by the region name column in both datasets.
som_with_idps <- left_join(som_adm1, idp_data, by = c("ADM1_EN" = "region_name"))

# 3. Map the results (a choropleth map)
ggplot(data = som_with_idps) +
  geom_sf(aes(fill = idp_count)) +
  scale_fill_viridis_c(name = "IDP Count", na.value = "grey90") +
  ggtitle("Estimated IDP Counts by Region in Somalia") +
  theme_bw()

Regions without data in our idp_data frame are colored grey. This is a powerful way to visualize spatial patterns in your data.

5. Creating sf Objects from Coordinates

We can also create our own sf objects from a table of coordinates. This is useful for mapping specific locations. Let’s create an sf object for the capital cities of Somalia and Somaliland.

# 1. Create a data frame with city names and coordinates
cities <- data.frame(
  name = c("Mogadishu", "Hargeisa"),
  longitude = c(45.333333, 44.066667),
  latitude = c(2.033333, 9.566667)
)

# 2. Convert the data frame to an sf object
# We specify the coordinate columns and the CRS (4326 is the code for WGS 84)
cities_sf <- st_as_sf(cities, coords = c("longitude", "latitude"), crs = 4326)

class(cities_sf)
## [1] "sf"         "data.frame"
print(cities_sf)
## Simple feature collection with 2 features and 1 field
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 44.06667 ymin: 2.033333 xmax: 45.33333 ymax: 9.566667
## Geodetic CRS:  WGS 84
##        name                  geometry
## 1 Mogadishu POINT (45.33333 2.033333)
## 2  Hargeisa POINT (44.06667 9.566667)
# 3. Plot the cities on top of the country map
ggplot() +
  geom_sf(data = som_adm1) + # Base map
  geom_sf(data = cities_sf, color = "red", size = 4) + # City points
  ggtitle("Capitals on the Map of Somalia") +
  theme_bw()

This lecture covered the fundamentals of reading, manipulating, and creating vector data with sf, using real-world examples from Somalia and Somaliland. ```


LS0tDQp0aXRsZTogJ01vZHVsZSBJSTogV29ya2luZyB3aXRoIFZlY3RvciBEYXRhIGluIFIgdXNpbmcgdGhlIGBzZmAgUGFja2FnZScNCmF1dGhvcjogIkFiZGlzYWxhbSBIYXNzYW4gTXVzZSwgUGhEIg0KZGF0ZTogImByIFN5cy5EYXRlKClgIg0Kb3V0cHV0Og0KICAgaHRtbF9kb2N1bWVudDoNCiAgICB0b2M6IHRydWUNCiAgICB0b2NfZmxvYXQ6IHRydWUNCiAgICB0aGVtZTogdW5pdGVkDQogICAgaGlnaGxpZ2h0OiB0YW5nbw0KICAgIGNvZGVfZG93bmxvYWQ6IHRydWUNCi0tLQ0KDQpgYGB7ciBzZXR1cCwgaW5jbHVkZT1GQUxTRX0NCmtuaXRyOjpvcHRzX2NodW5rJHNldChlY2hvID0gVFJVRSwgbWVzc2FnZSA9IEZBTFNFLCB3YXJuaW5nID0gRkFMU0UpDQpgYGANCg0KIyMjIDEuIEludHJvZHVjdGlvbiB0byB0aGUgYHNmYCBQYWNrYWdlDQoNClRoZSBgc2ZgIHBhY2thZ2UgaXMgdGhlIG1vZGVybiBzdGFuZGFyZCBpbiBSIGZvciB3b3JraW5nIHdpdGggKipzcGF0aWFsIHZlY3RvciBkYXRhKiouIEl0IHJlcHJlc2VudHMgc3BhdGlhbCBmZWF0dXJlcyAocG9pbnRzLCBsaW5lcywgcG9seWdvbnMpIGluIGEgc3BlY2lhbCBraW5kIG9mIGRhdGEgZnJhbWUuIFRoaXMgbWFrZXMgaXQgaW50dWl0aXZlIHRvIHVzZSBpZiB5b3UgYXJlIGFscmVhZHkgZmFtaWxpYXIgd2l0aCBgZHBseXJgIGFuZCB0aGUgdGlkeXZlcnNlLg0KDQpBbiBgc2ZgIG9iamVjdCBpcyBlc3NlbnRpYWxseSBhIGBkYXRhLmZyYW1lYCB3aXRoIG9uZSBjcnVjaWFsIGFkZGl0aW9uOiBhIHNwZWNpYWwgbGlzdC1jb2x1bW4sIHVzdWFsbHkgbmFtZWQgYGdlb21ldHJ5YCwgdGhhdCBob2xkcyB0aGUgc3BhdGlhbCBpbmZvcm1hdGlvbiBmb3IgZWFjaCByb3cuDQoNCiMjIyAyLiBSZWFkaW5nIGFuZCBJbnNwZWN0aW5nIFZlY3RvciBEYXRhDQoNCldlIHVzZSB0aGUgYHN0X3JlYWQoKWAgZnVuY3Rpb24gdG8gbG9hZCB2ZWN0b3IgZGF0YSBmcm9tIGEgZmlsZSwgc3VjaCBhcyBhIHNoYXBlZmlsZSAoYC5zaHBgKS4gTGV0J3MgbG9hZCB0aGUgYWRtaW5pc3RyYXRpdmUgYm91bmRhcmllcyBmb3IgU29tYWxpYSdzIHJlZ2lvbnMuIFlvdSBjYW4gZG93bmxvYWQgdGhpcyBkYXRhIGZyb20gdGhlIFtIdW1hbml0YXJpYW4gRGF0YSBFeGNoYW5nZSAoSERYKV0oaHR0cHM6Ly9kYXRhLmh1bWRhdGEub3JnL2RhdGFzZXQvc29tYWxpYS1hZG1pbmlzdHJhdGl2ZS1ib3VuZGFyaWVzLWNvZCkuDQoNCmBgYHtyIGxvYWRfc2Z9DQpsaWJyYXJ5KHNmKQ0KbGlicmFyeShnZ3Bsb3QyKQ0KbGlicmFyeShkcGx5cikNCg0KIyBMb2FkIHRoZSBzaGFwZWZpbGUgZm9yIFNvbWFsaWEncyBhZG1pbmlzdHJhdGl2ZSByZWdpb25zIChsZXZlbCAxKQ0KIyBNYWtlIHN1cmUgdGhlIGZpbGUgInNvbV9hZG1ibmRhX2FkbTFfb2NoYV8yMDI1MDEwOC5zaHAiIGlzIGluIHlvdXIgd29ya2luZyBkaXJlY3RvcnkNCnNvbV9hZG0xIDwtIHN0X3JlYWQoInNvbV9hZG1ibmRhX2FkbTFfb2NoYV8yMDI1MDEwOC5zaHAiLCBxdWlldCA9IFRSVUUpDQoNCiMgQ2hlY2sgdGhlIGNsYXNzIG9mIHRoZSBvYmplY3QNCmNsYXNzKHNvbV9hZG0xKQ0KDQojIFByaW50IHRoZSBvYmplY3QgdG8gc2VlIGl0cyBzdHJ1Y3R1cmUNCnByaW50KHNvbV9hZG0xKQ0KYGBgDQoNClRoZSBvdXRwdXQgc2hvd3MgdXMga2V5IGluZm9ybWF0aW9uOg0KLSAqKkdlb21ldHJ5IHR5cGU6KiogTVVMVElQT0xZR09OIChzaW5jZSByZWdpb25zIGFyZSBhcmVhcykNCi0gKipEaW1lbnNpb246KiogWFkgKDJEIGRhdGEpDQotICoqQm91bmRpbmcgYm94OioqIFRoZSBnZW9ncmFwaGljIGV4dGVudCBvZiB0aGUgZGF0YQ0KLSAqKkNSUzoqKiBUaGUgQ29vcmRpbmF0ZSBSZWZlcmVuY2UgU3lzdGVtIChXR1MgODQpDQotICoqQXR0cmlidXRlczoqKiBBIHN0YW5kYXJkIGRhdGEgZnJhbWUgd2l0aCBjb2x1bW5zIGxpa2UgYEFETTFfRU5gIChyZWdpb24gbmFtZSksIGBBRE0wX0VOYCAoY291bnRyeSBuYW1lKSwgZXRjLg0KLSAqKmdlb21ldHJ5IGNvbHVtbjoqKiBUaGUgc3BlY2lhbCBjb2x1bW4gaG9sZGluZyB0aGUgcG9seWdvbiBkYXRhLg0KDQpXZSBjYW4gcXVpY2tseSBwbG90IHRoZSBvYmplY3QgdG8gc2VlIHRoZSBtYXAuDQoNCmBgYHtyIHBsb3Rfc2Z9DQpwbG90KHNvbV9hZG0xKQ0KYGBgDQoNCiMjIyAzLiBTdWJzZXR0aW5nIGFuZCBNYW5pcHVsYXRpbmcgYHNmYCBPYmplY3RzDQoNClNpbmNlIGFuIGBzZmAgb2JqZWN0IGlzIGxpa2UgYSBkYXRhIGZyYW1lLCB3ZSBjYW4gdXNlIHN0YW5kYXJkIGBkcGx5cmAgb3IgYmFzZSBSIG1ldGhvZHMgdG8gZmlsdGVyIGFuZCBzZWxlY3QgZGF0YS4NCg0KKipFeGFtcGxlOiBJc29sYXRlIHRoZSBCYW5hZGlyIHJlZ2lvbiAod2hlcmUgTW9nYWRpc2h1IGlzKS4qKg0KDQpgYGB7ciBzdWJzZXRfc2Z9DQpiYW5hZGlyIDwtIHNvbV9hZG0xICU+JQ0KICBmaWx0ZXIoQURNMV9FTiA9PSAiQmFuYWRpciIpDQoNCiMgUGxvdCBqdXN0IHRoZSBCYW5hZGlyIHJlZ2lvbg0KZ2dwbG90KCkgKw0KICBnZW9tX3NmKGRhdGEgPSBiYW5hZGlyLCBmaWxsID0gImxpZ2h0Ymx1ZSIsIGNvbG9yID0gImJsYWNrIikgKw0KICBnZ3RpdGxlKCJNYXAgb2YgQmFuYWFkaXIgUmVnaW9uIikgKw0KICB0aGVtZV9idygpDQpgYGANCg0KKipFeGFtcGxlOiBJc29sYXRlIHRoZSByZWdpb25zIG9mIFNvbWFsaWxhbmQuKioNClNvbWFsaWxhbmQncyByZWdpb25zIGluIHRoaXMgZGF0YXNldCBhcmUgQXdkYWwsIFdvcW9veWkgR2FsYmVlZCwgVG9nZGhlZXIsIFNhbmFhZywgYW5kIFNvb2wuDQoNCmBgYHtyIHN1YnNldF9zb21hbGlsYW5kfQ0Kc29tYWxpbGFuZF9yZWdpb25zIDwtIHNvbV9hZG0xICU+JQ0KICBmaWx0ZXIoQURNMV9FTiAlaW4lIGMoIkF3ZGFsIiwgIldvcW9veWkgR2FsYmVlZCIsICJUb2dkaGVlciIsICJTYW5hYWciLCAiU29vbCIpKQ0KDQojIFBsb3QgU29tYWxpbGFuZCByZWdpb25zDQpnZ3Bsb3QoKSArDQogIGdlb21fc2YoZGF0YSA9IHNvbWFsaWxhbmRfcmVnaW9ucywgYWVzKGZpbGwgPSBBRE0xX0VOKSkgKw0KICBnZ3RpdGxlKCJSZWdpb25zIG9mIFNvbWFsaWxhbmQiKSArDQogIHRoZW1lX2J3KCkgKw0KICBsYWJzKGZpbGwgPSAiUmVnaW9uIE5hbWUiKQ0KYGBgDQoNCiMjIyA0LiBKb2luaW5nIEF0dHJpYnV0ZSBEYXRhIHRvIFNwYXRpYWwgRGF0YQ0KDQpBIHZlcnkgY29tbW9uIHRhc2sgaXMgam9pbmluZyBub24tc3BhdGlhbCBkYXRhIChsaWtlIHN0YXRpc3RpY3MpIHRvIGEgc3BhdGlhbCBvYmplY3QgZm9yIG1hcHBpbmcuIExldCdzIGNyZWF0ZSBhIGR1bW15IGRhdGFzZXQgb2YgSW50ZXJuYWxseSBEaXNwbGFjZWQgUGVyc29uIChJRFApIGNvdW50cyBmb3IgYSBmZXcgcmVnaW9ucyBhbmQgam9pbiBpdCB0byBvdXIgYHNvbV9hZG0xYCBtYXAuDQoNCmBgYHtyIGpvaW5fZGF0YX0NCiMgMS4gQ3JlYXRlIGEgc2ltcGxlIGRhdGEgZnJhbWUgd2l0aCBJRFAgZGF0YQ0KaWRwX2RhdGEgPC0gZGF0YS5mcmFtZSgNCiAgcmVnaW9uX25hbWUgPSBjKCJCYW5hYWRpciIsICJMb3dlciBTaGFiZWxsZSIsICJHZWRvIiwgIldvcW9veWkgR2FsYmVlZCIpLA0KICBpZHBfY291bnQgPSBjKDUwMDAwMCwgMTIwMDAwLCA4NTAwMCwgOTUwMDApDQopDQoNCiMgMi4gSm9pbiB0aGUgSURQIGRhdGEgdG8gdGhlIHNmIG9iamVjdA0KIyBXZSBqb2luIGJ5IHRoZSByZWdpb24gbmFtZSBjb2x1bW4gaW4gYm90aCBkYXRhc2V0cy4NCnNvbV93aXRoX2lkcHMgPC0gbGVmdF9qb2luKHNvbV9hZG0xLCBpZHBfZGF0YSwgYnkgPSBjKCJBRE0xX0VOIiA9ICJyZWdpb25fbmFtZSIpKQ0KDQojIDMuIE1hcCB0aGUgcmVzdWx0cyAoYSBjaG9yb3BsZXRoIG1hcCkNCmdncGxvdChkYXRhID0gc29tX3dpdGhfaWRwcykgKw0KICBnZW9tX3NmKGFlcyhmaWxsID0gaWRwX2NvdW50KSkgKw0KICBzY2FsZV9maWxsX3ZpcmlkaXNfYyhuYW1lID0gIklEUCBDb3VudCIsIG5hLnZhbHVlID0gImdyZXk5MCIpICsNCiAgZ2d0aXRsZSgiRXN0aW1hdGVkIElEUCBDb3VudHMgYnkgUmVnaW9uIGluIFNvbWFsaWEiKSArDQogIHRoZW1lX2J3KCkNCmBgYA0KUmVnaW9ucyB3aXRob3V0IGRhdGEgaW4gb3VyIGBpZHBfZGF0YWAgZnJhbWUgYXJlIGNvbG9yZWQgZ3JleS4gVGhpcyBpcyBhIHBvd2VyZnVsIHdheSB0byB2aXN1YWxpemUgc3BhdGlhbCBwYXR0ZXJucyBpbiB5b3VyIGRhdGEuDQoNCiMjIyA1LiBDcmVhdGluZyBgc2ZgIE9iamVjdHMgZnJvbSBDb29yZGluYXRlcw0KDQpXZSBjYW4gYWxzbyBjcmVhdGUgb3VyIG93biBgc2ZgIG9iamVjdHMgZnJvbSBhIHRhYmxlIG9mIGNvb3JkaW5hdGVzLiBUaGlzIGlzIHVzZWZ1bCBmb3IgbWFwcGluZyBzcGVjaWZpYyBsb2NhdGlvbnMuIExldCdzIGNyZWF0ZSBhbiBgc2ZgIG9iamVjdCBmb3IgdGhlIGNhcGl0YWwgY2l0aWVzIG9mIFNvbWFsaWEgYW5kIFNvbWFsaWxhbmQuDQoNCmBgYHtyIGNyZWF0ZV9zZn0NCiMgMS4gQ3JlYXRlIGEgZGF0YSBmcmFtZSB3aXRoIGNpdHkgbmFtZXMgYW5kIGNvb3JkaW5hdGVzDQpjaXRpZXMgPC0gZGF0YS5mcmFtZSgNCiAgbmFtZSA9IGMoIk1vZ2FkaXNodSIsICJIYXJnZWlzYSIpLA0KICBsb25naXR1ZGUgPSBjKDQ1LjMzMzMzMywgNDQuMDY2NjY3KSwNCiAgbGF0aXR1ZGUgPSBjKDIuMDMzMzMzLCA5LjU2NjY2NykNCikNCg0KIyAyLiBDb252ZXJ0IHRoZSBkYXRhIGZyYW1lIHRvIGFuIHNmIG9iamVjdA0KIyBXZSBzcGVjaWZ5IHRoZSBjb29yZGluYXRlIGNvbHVtbnMgYW5kIHRoZSBDUlMgKDQzMjYgaXMgdGhlIGNvZGUgZm9yIFdHUyA4NCkNCmNpdGllc19zZiA8LSBzdF9hc19zZihjaXRpZXMsIGNvb3JkcyA9IGMoImxvbmdpdHVkZSIsICJsYXRpdHVkZSIpLCBjcnMgPSA0MzI2KQ0KDQpjbGFzcyhjaXRpZXNfc2YpDQpwcmludChjaXRpZXNfc2YpDQoNCiMgMy4gUGxvdCB0aGUgY2l0aWVzIG9uIHRvcCBvZiB0aGUgY291bnRyeSBtYXANCmdncGxvdCgpICsNCiAgZ2VvbV9zZihkYXRhID0gc29tX2FkbTEpICsgIyBCYXNlIG1hcA0KICBnZW9tX3NmKGRhdGEgPSBjaXRpZXNfc2YsIGNvbG9yID0gInJlZCIsIHNpemUgPSA0KSArICMgQ2l0eSBwb2ludHMNCiAgZ2d0aXRsZSgiQ2FwaXRhbHMgb24gdGhlIE1hcCBvZiBTb21hbGlhIikgKw0KICB0aGVtZV9idygpDQpgYGANCg0KVGhpcyBsZWN0dXJlIGNvdmVyZWQgdGhlIGZ1bmRhbWVudGFscyBvZiByZWFkaW5nLCBtYW5pcHVsYXRpbmcsIGFuZCBjcmVhdGluZyB2ZWN0b3IgZGF0YSB3aXRoIGBzZmAsIHVzaW5nIHJlYWwtd29ybGQgZXhhbXBsZXMgZnJvbSBTb21hbGlhIGFuZCBTb21hbGlsYW5kLg0KYGBgDQoNCi0tLQ0K