December 2019

Spatial Statistics


  • Motivation
  • Waldo Tobler’s First Law of Geography
    • “everything is related to everything else, but near things are more related than distant things”.
  • It was meant to be tongue-in-cheek but think of
    • Diffusion processes
    • Migration
    • Crime (e.g. Cohen and Felson’s routine activity theory)
    • Cell kinetics (Ok, that isn’t geography!)
  • All of these support the idea well
  • But it conflicts with the assumption, made by many statistical models, that observations are independent


  • “Data of geographic units are tied together, like bunches of grapes, not separate, like balls in an urn” - Frederick F. Stephan (1934)
  • Analogous to time series, where observations close together in time tend to be related

  • We need to adjust existing approaches to data analysis, or consider new ones accordingly
    • Summary statistics
    • Visualisation
    • Modelling

  • For the different kinds of spatial data seen earlier

Plan for rest of session

  • Focus on
    1. Area data with associated values
    2. Point data with associated values
    3. A few quick examples of other types of analysis

Summary Statistics
The absence of independence

Join Count Statistic

library(dplyr)  # provides %>% and mutate()
age_cty <- age_cty %>% mutate(Age_gt_37 = Mean_Age > 37)  # TRUE where county mean age exceeds 37

  • For binary (or categorical) variables
  • In the left-hand map, do counties of the same kind clump together?
  • Count the cross-border joins:
    Combination   Count
    Red-Red           8
    Red-Blue         17
    Blue-Blue        33

  • Is that what one might expect if Red/Blue were randomly distributed across counties?
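The question above can be explored with a permutation test. What follows is a minimal base-R sketch, not the method used in the session: it assumes a hypothetical 4 x 4 grid of "counties" with rook contiguity (shared edges) rather than the real county map, and the names `adj`, `is_red` and `join_counts` are illustrative. It counts each kind of join once, then shuffles the colours many times to see how often a random arrangement produces as few Red-Blue joins as the observed pattern.

```r
# Toy data: a 4 x 4 grid of hypothetical "counties" (NOT the county data from the slides)
set.seed(1)
n_side <- 4
row_id <- rep(seq_len(n_side), times = n_side)
col_id <- rep(seq_len(n_side), each = n_side)

# Rook contiguity: two cells are neighbours if they share an edge
adj <- (abs(outer(row_id, row_id, "-")) + abs(outer(col_id, col_id, "-"))) == 1

# Count joins of each kind for a logical vector (TRUE = red, FALSE = blue)
join_counts <- function(is_red, adj) {
  pairs <- which(adj & upper.tri(adj), arr.ind = TRUE)  # each join counted once
  a <- is_red[pairs[, 1]]
  b <- is_red[pairs[, 2]]
  c(red_red = sum(a & b), red_blue = sum(a != b), blue_blue = sum(!a & !b))
}

# A deliberately clumped pattern: the two left-hand columns are red
is_red <- col_id <= 2
observed <- join_counts(is_red, adj)

# Reference distribution: shuffle the colours across cells, keeping the map fixed
n_sim <- 999
sim_red_blue <- replicate(n_sim, join_counts(sample(is_red), adj)[["red_blue"]])

# One-sided pseudo p-value: few Red-Blue joins suggests like values clump together
p_value <- (sum(sim_red_blue <= observed[["red_blue"]]) + 1) / (n_sim + 1)

print(observed)
print(p_value)
```

A small p-value here indicates fewer Red-Blue joins than random labelling would typically give, i.e. positive spatial association. For real area data the neighbour structure would come from the polygon boundaries rather than a grid.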

A Crowdsourced Hypothesis Test