Spatial Calculations

Change projection

Projection is critical for spatial calculations.

We will:

  • calculate the area of each school district
  • perform a spatial join.

These require special attention to the projection to ensure accurate calculations.

Notice

Note

Before doing any spatial calculations:

  • Consider what you need your projection to preserve
  • Determine the best projection for your purpose: google or epsg.io
  • Check the current projection: st_crs()
  • Transform the projection, if necessary: st_transform()

In-class exercise: transform GA school districts for area calculation

Step 1.

Consider what you need your projection to accomplish or preserve



Projection that preserves area, called an equal-area projection

Step 2.

Determine the best projection for your purpose: google or epsg.io

  • In the projections lesson, we learned about Albers-Equal Area (EPSG: 102008). This is a great choice for any state in United States.

  • Other areas, google it! (try: best projection for area calculation Europe)

  • Then look up the details at epsg.io or google the projection + “esri”

Step 3.

Check the current projection:

st_crs(ga_sd_pop)
Coordinate Reference System:
  User input: NAD83 
  wkt:
GEOGCRS["NAD83",
    DATUM["North American Datum 1983",
        ELLIPSOID["GRS 1980",6378137,298.257222101,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4269]]

Step 4.

Transform the projection

ga_sd_pop_102008 <- st_transform(ga_sd_pop, crs='ESRI:102008')

Notice

Note

You can keep the same dataframe name and transform the projection. I like to be explicit by renaming.

ggplot(data = ga_sd_pop) +
  geom_sf(color = "blue", fill = NA) 

ggplot(data = ga_sd_pop_102008) +
  geom_sf(color = "blue", fill = NA) 

NAD83

Albers Equal-Area

Calculate area

To calculate area:

  • always need to go through the steps above
  • transform projection if needed
  • verify the units that the projection use - You can google, or look a the projection in R with st_crs()
  • use the st_area() function

In-class exercise: calculate the area of each school district

Step 1.

Check the current projection:

st_crs(ga_sd_pop_102008)
Coordinate Reference System:
  User input: ESRI:102008 
  wkt:
PROJCRS["North_America_Albers_Equal_Area_Conic",
    BASEGEOGCRS["NAD83",
        DATUM["North American Datum 1983",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["North_America_Albers_Equal_Area_Conic",
        METHOD["Albers Equal Area",
            ID["EPSG",9822]],
        PARAMETER["Latitude of false origin",40,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8821]],
        PARAMETER["Longitude of false origin",-96,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8822]],
        PARAMETER["Latitude of 1st standard parallel",20,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8823]],
        PARAMETER["Latitude of 2nd standard parallel",60,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8824]],
        PARAMETER["Easting at false origin",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8826]],
        PARAMETER["Northing at false origin",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8827]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Not known."],
        AREA["North America - onshore and offshore: Canada - Alberta; British Columbia; Manitoba; New Brunswick; Newfoundland and Labrador; Northwest Territories; Nova Scotia; Nunavut; Ontario; Prince Edward Island; Quebec; Saskatchewan; Yukon. United States (USA) - Alabama; Alaska (mainland); Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming."],
        BBOX[23.81,-172.54,86.46,-47.74]],
    ID["ESRI",102008]]

Notice

Notice

LENGTHUNIT[“metre”,1]]],

That means that length will be calculated in meters, and area will be calculated in meters-squared.

Step 2.

Calculate the area

ga_sd_pop_102008 <- ga_sd_pop_102008 %>% 
  mutate(area = st_area(.))
GEOID NAME pop white_o_pop pct_bipoc_pop geometry area
1301590 Dade County School District 15385 14786 0.039 MULTIPOLYGON (((898359.4 -5… 451027466 [m^2]
1303690 Mitchell County School District 17770 8932 0.497 MULTIPOLYGON (((1047714 -98… 1320330716 [m^2]

Notice

Notice

  • I used the tidyverse pipe operator. The new, base R operator doesn’t have the dot syntax functionality, so I’m going old-school for convenience.

  • The units are included in the area calculation. Great!!! Except now we have to remove it for them to use it in any calculation.

Step 3.

Drop the units

Calculate as square miles

ga_sd_pop_102008 <- ga_sd_pop_102008 %>% 
  mutate(area = st_area(.),
         sq_miles = round((units::drop_units(area))*3.86102e-7, 1))
GEOID NAME pop white_o_pop pct_bipoc_pop geometry area sq_miles
1301590 Dade County School District 15385 14786 0.039 MULTIPOLYGON (((898359.4 -5… 451027466 [m^2] 174.1
1303690 Mitchell County School District 17770 8932 0.497 MULTIPOLYGON (((1047714 -98… 1320330716 [m^2] 509.8

Spatial join

You can join spatial datasets based on their spatial location with the sf function st_join(), just like you can based on a common key in a left_join.

st_join

st_join(

  • x,
  • y,
  • join = st_intersects,
  • …,
  • left = TRUE,
  • largest = FALSE )

To perform spatial operation between to spatial dataframes, they must be in the same projection.

In-class exercise: create a dataframe that lists the schools in each district

Steps 1-3.

  1. check the projection of the spatial dataframes to ensure that they match

  2. transform them to the same projection

  3. use st_join() to join the schools to the school districts

schools_102008 <- st_transform(ga_schools_shp, 'ESRI:102008')

sd_schools <- ga_sd_pop_102008 |>
  st_join(schools_102008) 
GEOID NAME pop white_o_pop pct_bipoc_pop area sq_miles school state school_id district_id schoolwide_title1 title1_eligible title1_school_status national_school_lunch_program total_enroll two_or_more_race hawaiin_pi white black hispanic aapi native geometry
1 1301590 Dade County School District 15385 14786 0.039 451027466 [m^2] 174.1 DADE COUNTY HIGH SCHOOL Georgia 1.30159e+11 1301590 NA No 6-Not a Title I school Yes participating without using any Provision or the CEO 630 12 0 587 7 18 5 1 MULTIPOLYGON (((898359.4 -5…
1.1 1301590 Dade County School District 15385 14786 0.039 451027466 [m^2] 174.1 DADE ELEMENTARY SCHOOL Georgia 1.30159e+11 1301590 Yes Yes 5-Title I schoolwide school Yes participating without using any Provision or the CEO 748 17 0 702 3 14 12 0 MULTIPOLYGON (((898359.4 -5…
1.2 1301590 Dade County School District 15385 14786 0.039 451027466 [m^2] 174.1 DADE MIDDLE SCHOOL Georgia 1.30159e+11 1301590 NA No 6-Not a Title I school Yes participating without using any Provision or the CEO 465 11 0 428 3 17 3 3 MULTIPOLYGON (((898359.4 -5…

Notice

Notice

  • There is a row for each school - this is a one-to-many join.
  • That means that there are now multiple polygons for each school district.

Summarize the data from the spatial join

Just like a regular dataframe, you can group_by() and summarise() to create summary statistics.

In-class exercise: list the schools in each district

Steps 1-3.

  1. group_by() school districts

  2. summarise() to calculate the number of schools, the enrollment, and list the schools

sd_schools <- ga_sd_pop_102008 |>
  st_join(schools_102008) |>
  group_by(GEOID) |> 
  summarise(count = n(),
            schools = toString(school),
            enroll = sum(total_enroll, na.rm=T))
GEOID count schools enroll geometry
1300001 19 BERTA WEATHERSBEE ELEMENTARY SCHOOL, BRADFIELD CENTER - AULT ACADEMY, CALLAWAY ELEMENTARY SCHOOL, CALLAWAY HIGH SCHOOL, CALLAWAY MIDDLE SCHOOL, CLEARVIEW ELEMENTARY SCHOOL, ETHEL W. KIGHT ELEMENTARY SCHOOL, FRANKLIN FOREST ELEMENTARY, GARDNER-NEWMAN MIDDLE SCHOOL, HILLCREST ELEMENTARY SCHOOL, HOGANSVILLE ELEMENTARY SCHOOL, HOLLIS HAND ELEMENTARY SCHOOL, LAGRANGE HIGH SCHOOL, LONG CANE ELEMENTARY SCHOOL, LONG CANE MIDDLE SCHOOL, ROSEMONT ELEMENTARY SCHOOL, THE HOPE ACADEMY SCHOOL, TROUP COUNTY HIGH SCHOOL, WEST POINT ELEMENTARY SCHOOL 12160 MULTIPOLYGON (((954616 -751…
1300003 1 NA 0 MULTIPOLYGON (((1269878 -84…
1300060 6 ALTAMAHA ELEMENTARY SCHOOL, APPLING COUNTY ELEMENTARY SCHOOL, APPLING COUNTY HIGH SCHOOL, APPLING COUNTY MIDDLE SCHOOL, APPLING COUNTY PRIMARY SCHOOL, FOURTH DISTRICT ELEMENTARY SCHOOL 3469 MULTIPOLYGON (((1214068 -88…

Convert to non-spatial dataframe

Sometimes you want to remove the geometry and go back to a regular dataframe.

st_drop_geometry() to the rescue!

sd_schools_list <- st_drop_geometry(sd_schools)
GEOID count schools enroll
1300001 19 BERTA WEATHERSBEE ELEMENTARY SCHOOL, BRADFIELD CENTER - AULT ACADEMY, CALLAWAY ELEMENTARY SCHOOL, CALLAWAY HIGH SCHOOL, CALLAWAY MIDDLE SCHOOL, CLEARVIEW ELEMENTARY SCHOOL, ETHEL W. KIGHT ELEMENTARY SCHOOL, FRANKLIN FOREST ELEMENTARY, GARDNER-NEWMAN MIDDLE SCHOOL, HILLCREST ELEMENTARY SCHOOL, HOGANSVILLE ELEMENTARY SCHOOL, HOLLIS HAND ELEMENTARY SCHOOL, LAGRANGE HIGH SCHOOL, LONG CANE ELEMENTARY SCHOOL, LONG CANE MIDDLE SCHOOL, ROSEMONT ELEMENTARY SCHOOL, THE HOPE ACADEMY SCHOOL, TROUP COUNTY HIGH SCHOOL, WEST POINT ELEMENTARY SCHOOL 12160
1300003 1 NA 0
1300060 6 ALTAMAHA ELEMENTARY SCHOOL, APPLING COUNTY ELEMENTARY SCHOOL, APPLING COUNTY HIGH SCHOOL, APPLING COUNTY MIDDLE SCHOOL, APPLING COUNTY PRIMARY SCHOOL, FOURTH DISTRICT ELEMENTARY SCHOOL 3469

Notice

Notice

  • You might think you can just remove the geometry column to make it a regular dataframe. Nope!
  • Even if you select(-geometry) the geometry column stays and the spatial slots remain

There are many other spatial operations that you might want to do using the sf package.

These 3 examples will give you the tools to wade into more methods on your own.

Want to try more now?

  • For 1 state:

  • Use st_intersection() to intersect the school districts and congressional districts and list the school districts that are in Congressional District.

  • you can download the new congressional spatial dataframes with tidycensus too!*