Establishing the Idea
Why Geocomputation?
- … and not
  - GI Science
  - Geography
  - Spatial Statistics
  - … for example?
Need more than ‘standard’ GIS:
“GIS was, for some, a backwards step because the data models and analysis methods provided were simply not rich enough in geographical concepts and understanding to meet their needs.”
http://www.geocomputation.org/what.html
Needs such as:
- Fitting new (but more appropriate) models
- Searching for spatial pattern
- Visualisation
- Knowledge discovery
- Exploratory data analysis?
Fitting new models
It's not just about the models - it's also about the approaches
‘Classical’ spatial statistics:
Assume \(y = X\beta + \epsilon\) where \(\epsilon = \lambda W\epsilon + \nu\) and \(\nu_i \sim N(0,\sigma^2)\)
- But why? Where does this model come from?
- Why is it linear?
- Why does it depend on a specific set of spatial units? What about the MAUP?
“This lack of a well-defined link between process and form is commonplace in spatial analysis, and is well-documented in fields such as point set clustering and fractal analysis. That it also applies here, in spatial regression modeling, should come as no surprise.” de Smith, Goodchild and Longley (2007) Geospatial Analysis: A Comprehensive Guide to Principles, Techniques and Software Tools, p. 243
Process-oriented approaches
- Some excellent ones exist
- Cellular automata
- Agent-based models
- Microsimulation
- All more grounded in reality
- But some issues not fully addressed
- Calibration
- Model selection
- Hypothesis testing
- Maybe these are done better by classical approaches
A geocomputation solution?
- Approximate Bayesian Computation (ABC)
- See e.g. Marjoram, P., Molitor, J., Plagnol, V. and Tavaré, S. (2003) Markov chain Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences USA 100: 15324–15328.
- Simplifying massively:
- this allows you to make Bayesian inferences about processes you can simulate
- even if the likelihood is intractable
A quick overview
- Draw parameter values from a prior distribution
- Use these to run the simulation
- Keep them in a set of successful parameters if the simulated output is sufficiently ‘near’ the real data
- Repeat these steps LOTS of times
- The successful parameters have a distribution that should approximate the Bayesian posterior
- Throw mud at the wall and see what sticks - see the sketch below
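As a rough, purely illustrative sketch of the recipe above (not code from the talk), here is ABC rejection sampling for a deliberately simple case - inferring the mean of a normal distribution using only the ability to simulate from it, never evaluating a likelihood. All names and tuning values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
observed = rng.normal(loc=3.0, scale=1.0, size=100)   # stand-in for 'real' data
s_obs = observed.mean()                               # summary statistic

accepted = []
for _ in range(100_000):                              # repeat LOTS of times
    mu = rng.uniform(-10, 10)                         # draw from the prior
    simulated = rng.normal(loc=mu, scale=1.0, size=100)
    if abs(simulated.mean() - s_obs) < 0.1:           # sufficiently 'near'?
        accepted.append(mu)                           # keep the parameter

# The accepted draws approximate the Bayesian posterior for mu
print(f"{len(accepted)} kept; posterior mean ~ {np.mean(accepted):.2f}")
```

Tighter tolerances and better-chosen summary statistics give a closer approximation to the true posterior, at the cost of more rejected simulations.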
Example - 2D hard-core point process

- Random points, but
  - always separated by at least a distance \(d\)
- Models
  - Locations of coins on a fairground game
  - Locations of settlements?
  - Locations of animal nests?
- Easy to simulate
- Hard to handle analytically
  - How to estimate \(d\)? See the sketch below
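A minimal, purely illustrative sketch (not code from the talk): simulate a hard-core pattern on the unit square by rejection (‘dart throwing’), then estimate \(d\) with the ABC recipe above, using the minimum inter-point distance as the summary statistic. Function names, the prior range and the tolerance are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_hardcore(n, d, max_tries=5000):
    """Place n uniform points on the unit square, rejecting any candidate
    within d of an already-accepted point; None if it cannot be completed."""
    pts = np.empty((0, 2))
    for _ in range(max_tries):
        p = rng.uniform(0, 1, 2)
        if len(pts) == 0 or np.hypot(*(pts - p).T).min() >= d:
            pts = np.vstack([pts, p])
            if len(pts) == n:
                return pts
    return None

def min_distance(pts):
    """Summary statistic: the smallest inter-point distance in the pattern."""
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    np.fill_diagonal(dist, np.inf)
    return dist.min()

# An 'observed' pattern generated with a known d, so the answer can be checked
true_d = 0.05
s_obs = min_distance(simulate_hardcore(50, true_d))

# ABC rejection: uniform prior on d; keep draws whose simulated pattern has
# a summary statistic close to the observed one
accepted = []
while len(accepted) < 200:
    d = rng.uniform(0.0, 0.09)                 # draw from the prior
    sim = simulate_hardcore(50, d)
    if sim is not None and abs(min_distance(sim) - s_obs) < 0.005:
        accepted.append(d)

print(f"estimated d ~ {np.mean(accepted):.3f}  (true value {true_d})")
```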
It is certainly computation
- Based on code-based simulation
- Embarrassingly parallel
- Suited to cloud-based computing
- Suited to a map-reduce approach - see the sketch below
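A minimal sketch of why this parallelises so cleanly (the structure, not any particular library choice, is the point): every prior draw can be simulated and scored independently, so the work maps directly onto processes, a cluster, or a map-reduce job. The toy simulator simply reuses the normal-mean example from earlier; everything here is an assumed illustration:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def simulate_and_score(seed):
    """One independent ABC trial: draw a parameter, simulate,
    and return (parameter, distance from the observed summary)."""
    rng = np.random.default_rng(seed)
    mu = rng.uniform(-10, 10)
    sim = rng.normal(loc=mu, scale=1.0, size=100)
    return mu, abs(sim.mean() - 3.0)   # 3.0 stands in for the observed summary

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:                        # the 'map' step
        trials = list(pool.map(simulate_and_score, range(20_000), chunksize=500))
    kept = [mu for mu, dist in trials if dist < 0.1]           # the 'reduce' step
    print(f"{len(kept)} accepted; posterior mean ~ {np.mean(kept):.2f}")
```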
It is certainly geography
- A very specific part
- But we want to be able to
- Answer geography-led questions
- Assess geography-led models
- Microsimulation
- ABMs
- time-evolving models
- IT HAS SOMETHING TO OFFER!!
Visualisation
Visual Explanation
- ‘Honest mapping’
- Draw crisp lines with caution!
- Issues of fuzziness and uncertainty don’t go away with ‘big data’
- How can these be conveyed?
Fuzzy Travel To Work Areas
Sketchy Map of House Price Estimates
Why is this geocomputation?
- Needs to consider computer graphics techniques
- Needs an understanding of fuzzy algorithms
- Needs to put these in a spatial context
- Has a geographical interpretation
- Although not seen here, needs to consider interaction as well
- No ‘off the shelf’ solution
  - Finding a solution and implementing it is research in itself (a toy sketch follows below)
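Purely as an illustration of the flavour of such a solution (this is not the technique behind the maps above), one simple way to avoid a crisp line is to render many perturbed, semi-transparent versions of a boundary, so the ink itself conveys the uncertainty. The boundary and perturbation model below are arbitrary assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)

# An arbitrary closed 'boundary' - in practice this might be a TTWA edge
t = np.linspace(0, 2 * np.pi, 300)

fig, ax = plt.subplots(figsize=(5, 4))
for _ in range(40):
    # each pass draws a smoothly perturbed version of the boundary
    amp = rng.normal(scale=0.05)
    phase = rng.uniform(0, 2 * np.pi)
    r = 1 + amp * np.sin(3 * t + phase)
    ax.plot(r * np.cos(t), 0.6 * r * np.sin(t),
            color="steelblue", alpha=0.1, lw=2)

ax.set_aspect("equal")
ax.set_axis_off()
ax.set_title("Many faint lines, no false precision")
plt.show()
```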
Making the future happen…
Exercise Caution
A Worrying Statement
“Petabytes allow us to say ‘Correlation is enough’. We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot …
Correlation supersedes causation, and science can advance without coherent models, unified theories, or really any mechanistic explanation at all.”
Chris Anderson, Wired, June 23rd 2008
Hopefully not in Geocomputation!
- Any observed correlations depend heavily on the data collection process
- Simpson’s paradox
  - Patterns that appear within subsets of the data can disappear when the subsets are merged
  - Different patterns can appear in the merged data set - see the numerical sketch below
- A notable shift is away from the designed experiment
- Data has to be taken as given - little control over its collection
- Inference based on statistical models assumes some kind of survey design or experimental design
- Ignore the data collection process at your peril.
- GIGO still holds!
Brunsdon, C. (2014) Spatial Science - Looking Outward. Dialogues in Human Geography, 4(1), 45-49.
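As a numerical aside on Simpson's paradox (the figures are the well-known kidney-stone treatment data often used to illustrate it, not anything from the slides): treatment A does better within every subgroup, yet B looks better once the subgroups are merged.

```python
groups = {
    # subgroup: (A successes, A total, B successes, B total)
    "small stones": (81, 87, 234, 270),
    "large stones": (192, 263, 55, 80),
}

totals = [0, 0, 0, 0]
for name, counts in groups.items():
    a_s, a_n, b_s, b_n = counts
    print(f"{name:>12}:  A {a_s / a_n:.0%}   B {b_s / b_n:.0%}")
    totals = [t + c for t, c in zip(totals, counts)]

# Merge the subgroups: the direction of the comparison reverses
a_s, a_n, b_s, b_n = totals
print(f"{'merged':>12}:  A {a_s / a_n:.0%}   B {b_s / b_n:.0%}")
```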
When Venn Diagrams go Bad