1 Retail catchments

A retail catchment can be defined as the areal extent from which the main patrons of a store or retail centre will typically be found. There are numerous ways in which catchments can be delineated, depending on the requirements for a particular study, available data, software used or the analytical capability of a practitioner. The simplest technique might be to draw buffer rings around a store or retail centre; however, such a technique is naive as it doesn’t consider geographical barriers or competition. More advanced methods referred to as ‘Gravity’ and ‘Spatial Interaction Models’ delineate catchment areas by considering the spatial distribution of competing locations and evaluating their relative attractiveness to different groups of the population.

In the remainder of this practical session, we will create retail catchment areas using both basic and more advanced methods. We’ll start by creating buffer rings around retail centres in Liverpool, followed by drive distance polygons. Lastly, we will examine catchment areas based on a gravity model (the Huff model in our case). The latter method incorporates an estimate of retail centre attractiveness to depict the possible impact of retail hierarchy on the catchment extents. In this case we will use the predefined index values in ‘Liv_Attractiveness.csv’, available from the data pack.

1.1 Buffer rings (primary and secondary catchments)

  • First, load retail centres point data for the entire country GB_RetailCentres (located in the retail folder)
  • From the Vector menu choose Geoprocessing tools and then Clip
  • Fill in the window so that you have the GB_RetailCentres layer as your input, IMD2015 (or any other layer with Liverpool boundaries) as your Clip (Overlay) layer and save the new shapefile to your output folder. Name it RetailCentres_Liverpool
  • Click OK; the new layer should be automatically added to the Table of Contents (Layers Panel)
  • Remove the GB_RetailCentres layer
    Clip_Retail Centres

Next, we’ll define primary catchments, which are usually areas within a short distance from the retail centre and account for over 50% of patronage, and secondary catchments, which generally cover larger distances and represent patronage levels between 20% and 50%. We’ll use a straightforward approach — buffer rings — to create these catchments. Follow the steps below:

  • Create the Primary catchments by using a buffer distance of 2000m (go to Vector/Geoprocessing Tools/Buffer)

  • Fill in the dialog box as follows: select RetailCentres_Liverpool as your Input layer and type 2000 in the Distance field

  • Save the Output shapefile to your working directory; name it Buffer2000 (see the picture below)
    Buffers Window

  • Then create the secondary catchments by using Buffer distance of 4000m (repeat the above steps and name the new layer: Buffer4000)

  • Render the image by adjusting colours, transparency etc.

  • Add labels by going to Properties of the RetailCentres_Liverpool layer

    • Choose the Single Labels tab and use ‘NAME’ from the drop down menu
    • Explore the labels tab (e.g. change the font to Arial, size to 10, add buffer and shadow)

Note: if a name is too long you can shorten it in editing mode:

  • Click Toggle editing mode
  • Go to the Attribute Table and delete part of the name, e.g. Wavertree

You should now have created a map that looks similar to the one below:
Buffers2_4k

Comment on the retail catchments computed by using the buffer rings method:
- Can the primary catchments be distinguished easily from the secondary ones?
- Is the hierarchy within the retail centres accounted for in any way?
- How do you think these representations could be improved?

1.1.1 Catchment estimation and retail hierarchy

Although the distinction between the primary and secondary catchment areas is reasonably clear, their extents are far from realistic. One major reason is that the so-called hierarchy of retail centres has not been accounted for. Typically, such a hierarchy relates to the size and attractiveness of retail centres and the geographical extent of their influence, with centres towards the upper end of the hierarchy typically offering a ‘multi-purpose shopping’ experience and, as such, drawing consumers from a wider area. Conversely, smaller town or district centres typically serve a different function and are therefore patronised mostly by local communities. Based on the Index of Retail Centres Attractiveness developed by Dolega et al. (2016) [https://www.sciencedirect.com/science/article/pii/S0969698915300412] and the retail hierarchies developed by Macdonald et al. (2022) [https://www.nature.com/articles/s41597-022-01556-3], at least three types of town/retail centres can be distinguished in Liverpool: Regional Centre (Liverpool), District Centres (Allerton Rd, Old Swan, Kirkdale), and Local Centres (the remaining centres). Note: this classification and data do not include Retail and Leisure Parks.

We can add information on retail hierarchy in Liverpool to the attribute table of the relevant shapefile. We will have to create a new variable that will depict the hierarchy. We can do this by using the “CASE” conditional function, which allows us to specify different values for each retail centre type directly within the attribute table. This approach enables efficient recoding, with each centre classified as either “Regional Centre,” “District Centre,” or “Local Centre” according to the hierarchy.

  • Go to the Attribute Table of RetailCentres_Liverpool. Go to Field Calculator. Put Hierarchy as the Output field name

  • Create a statement of the form CASE WHEN condition THEN result ELSE result END. In our case, we want to assign a value of 1 to Liverpool; a value of 2 to Allerton Rd, Old Swan and Kirkdale; and a value of 3 to the remaining centres. Give it a go; if you get stuck, try the following statement (type it with straight quotes: double quotes around the field name, single quotes around text values):
    CASE
    WHEN "NAME" = 'Liverpool' THEN '1'
    WHEN "NAME" = 'Allerton Road' THEN '2'
    WHEN "NAME" = 'Old Swan' THEN '2'
    WHEN "NAME" = 'Kirkdale' THEN '2'
    ELSE '3'
    END

  • Next, go to Vector > Geoprocessing Tools > Buffer

  • In the buffer window select the RetailCentres_Liverpool layer as your Input vector layer and this time click on Data defined override > Edit (next to the Distance field)

  • Create another CASE conditional statement (similar to the one above although the ‘condition’ will be different this time)

  • Create a buffer distance based on the hierarchy: 5,000 meters for Hierarchy 1; then 2,000 meters for Hierarchy 2; and 1,000 meters for Hierarchy 3

  • Save the new Buffer rings to your working directory as Buffer_Hierarchy

  • The new buffers should be added automatically to the Layers Panel

  • Since you need to create a map of Buffer rings taking into account the hierarchy of retail centres you should make your map very clear. Try the following:

  • Adjust transparency of the newly created layer and try different colours and transparency, so the overlaps are visible

  • Label the centres and adjust the size of your points to reflect the hierarchy, as shown on the map below. Use a CASE WHEN condition THEN result ELSE result END statement in the layer’s Properties > Size > Data defined override > Edit field to display different point sizes, e.g. point size 5 for Hierarchy 1, point size 3 for Hierarchy 2 and point size 2 for Hierarchy 3. Similarly, you could use the CASE conditional function to adjust the size of labels.
    BuffersHierarchy
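
The recoding logic used in the two CASE statements above can also be checked outside QGIS. The following is a minimal sketch in base R (the language used later in this practical), not part of the QGIS workflow. The centre names and buffer distances are taken from the text above; the vector names (centre_names, district, hierarchy, buffer_m) are chosen here purely for illustration.

```r
# Illustrative subset of centre names mentioned in this practical
centre_names <- c("Liverpool", "Allerton Road", "Old Swan", "Kirkdale", "Wavertree")
district <- c("Allerton Road", "Old Swan", "Kirkdale")

# Step 1: recode names into hierarchy levels (1 = regional, 2 = district, 3 = local)
hierarchy <- ifelse(centre_names == "Liverpool", 1,
                    ifelse(centre_names %in% district, 2, 3))

# Step 2: look up the buffer distance (metres) for each hierarchy level
buffer_m <- c(5000, 2000, 1000)[hierarchy]

print(data.frame(centre_names, hierarchy, buffer_m))
```

Any centre not named explicitly falls through to level 3, which mirrors the ELSE branch of the CASE statement.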

1.2 Drive distances

Despite the consideration of retail centre hierarchies, there are still significant limitations to the buffer approach, as it does not account for real-world barriers such as rivers, lakes and railway tracks. A more accurate approach is to use road distances or drive times. Both techniques are still popular amongst the major retailers, so we will make use of them too. This is relatively easy to do in some GIS packages such as ArcGIS Pro; in QGIS, however, you would need an extension or plugin. It could be done with the pgRouting extension, but that isn’t straightforward and is beyond the scope of this tutorial, so instead we will use ArcGIS Pro. Follow the steps below:

1.2.1 Setting up ArcGIS Pro

First we need to set up ArcGIS Pro; then we’ll do the network analysis.

  • Open ArcGIS by going to Start menu > ArcGIS Pro
  • Click on the Map icon; a new map/project will open. Give it a name and save it to your working directory.

  • Go to Project > Licensing and make sure that the Network Analyst is enabled
  • Then go back to your ArcGIS Project/Map and, in the Catalog window on the right-hand side, right click on Folders and then Add Folder Connection
  • Browse to your Practical 4 data > Retail folder, click OK

The Retail folder should appear now in your Catalog window.

1.2.2 Creating service areas

The ArcGIS setup is done now, so you’re ready to start your analysis

  • Add the RetailCentres_Liverpool layer by dragging and dropping the .shp file to the Table of Contents
  • Then add the RoadLiv layer to the Table of Contents. (If you use a different version of ArcGIS Pro than 3.3.0, you’ll need to add the RoadLiv_ND.nd layer, which is your Liverpool road network).

Note: If the RetailCentres_Liverpool has been saved as a geopackage, resave the layer in QGIS as a shapefile first, and then add it to your Practical 4 data > Retail folder

  • Click on the Analysis tab and then Network Analysis > Service Area
    Service Area

  • A new layer called Service Area will be added to the Table of Contents

  • Click first on the Facilities and then select the Service Area Layer tab located at the top of ArcGIS toolbar

  • Add the retail centre locations by clicking on the Import Facilities icon and choosing RetailCentres_Liverpool as your Input Locations from the drop-down menu; click Apply and then OK

  • Once the retail centres locations have been added to the Facilities, set your Cutoffs to 1 (the distance units are preset to km), Direction: Away from facilities and set your Mode as Driving Distance

  • By now, the NetworkAnalyst > Service Area Layer window should look something like the picture below:
    Service Area Layer

  • Play with other settings e.g. use Polygons or Polygons and Lines under the Polygons tab drop down menu

  • Click the Run button

  • This will generate 10 service areas (retail catchments), each delineated for 1km drive distance

  • You can also generate multiple drive times/distances by specifying two or more different distances in the Cutoffs window (e.g. type 1, 2 and hit the Run button again)

  • The output below shows the map of service areas for Liverpool Retail Centres delineated for 1km and 2km
    ServiceAreas

So if you want to export for example the 1km service areas, do the following:

  • Right click on the Polygons layer (automatically named Cutoff) > Data > Export Features and save it to your Retail or any other folder you save your work to.
  • Name it Liv_DriveDist.shp.
  • Now, you should be able to use these drive polygons in QGIS, just drag and drop the newly created file to your QGIS Practical 4 project

1.2.3 Creating hierarchical drive distances - (optional)

So this is how the drive or distance polygons are generated. At the moment we have generated 10 polygons of 1km and 2km, however if we wanted to account for the retail hierarchy, we would also need to generate a new service area of 5km for Liverpool city centre, three service areas of 2km for the district centres with the hierarchy of 2 (Old Swan, Allerton Rd and Kirkdale) and six service areas of 1km for the remaining centres.

  • Using the above guidelines generate variable distance service areas to account for the hierarchy of Liverpool’s town centres and save them to your working directory.
  • You may need to create a new column (integer) with a distance in km, name it Dist
  • Then in the Import facilities under the Field name use the newly created Dist values. Also select Breaks-Kilometers under the Fields Mapping Property
  • Click Apply, then OK, and then Run again to generate the new hierarchical service areas
  • Name the output Liv_Hierarchy_DriveDist
  • Save it to your working directory, so you can use it in QGIS
    Hier_Drdist
  • Next, overlay the generated service areas (Liv_Hierarchy_DriveDist) onto the Buffer_Hierarchy layer
  • Create a printable map showing the spatial extent of both types of retail catchments
  • Begin with opening a new “Print composer”, add a north arrow, scale bar and legend.
  • Once ready, export your map as a picture image and answer the questions below:

- What are the limitations of these approaches?
- How could these be addressed?

1.3 Huff catchments

Another way of estimating retail catchments (much more complex and arguably more accurate) is to use a gravity model, also referred to as a Spatial Interaction Model (SIM). We will use a probabilistic SIM, more specifically the Huff model. It estimates the probability that residents of each origin zone (LSOA in our case) will use a particular retail centre, taking into account road distances and competition between centres as well as their attractiveness and position within the retail hierarchy. This information can then be used to delineate primary, secondary and tertiary catchment areas for each retail centre at the LSOA level of granularity. The results, referred to as Huff probabilities, are available in the Practical 4 data package; however, if you wish to compute them yourself, the code is given in the optional section below.
Note: You won’t need Huff catchments for your assignment
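
To make the idea concrete, here is a minimal sketch in base R of the standard Huff formulation for a single origin zone: each centre's utility is its attractiveness raised to alpha divided by distance raised to beta, and the probability is that utility divided by the sum over all competing centres. The helper huff_toy and all numbers are invented for illustration; this is not the huff_basic function from huff-tools, whose internals may differ.

```r
# Hypothetical helper: Huff probabilities for ONE origin zone
huff_toy <- function(attractiveness, distance, alpha = 1, beta = 2) {
  # Utility of each centre: attractiveness^alpha / distance^beta
  utility <- attractiveness^alpha / distance^beta
  # Probability = share of total utility across all competing centres
  utility / sum(utility)
}

# Invented example: three centres seen from one LSOA
centres  <- c("Centre A", "Centre B", "Centre C")
attr_toy <- c(100, 40, 10)      # attractiveness scores (made up)
dist_toy <- c(5, 2, 1)          # road distances in km (made up)
beta_toy <- c(1.4, 1.6, 1.8)    # one distance decay exponent per centre

probs <- huff_toy(attr_toy, dist_toy, alpha = 1, beta = beta_toy)
names(probs) <- centres
print(round(probs, 3))
```

Because the probabilities are shares of a total, they always sum to 1 for a given origin, so making one centre more attractive necessarily reduces the share of the others.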

1.3.1 Calculating Huff probabilities (optional)

I recommend that you attempt this optional section only if you have strong R language skills and plenty of spare time. Some of the R packages used have been removed from the CRAN repository, so you’d need to obtain them from the archives (https://cran.r-project.org/src/contrib/Archive/rgeos/; https://cran.r-project.org/src/contrib/Archive/rgdal/).

1.3.1.1 Load Huff tools and data

The Huff model is applied with the huff_basic function, which is part of huff-tools, so we first need to load huff-tools into R. The “huff-tools.r” script is available in the retail folder, so you should download the ‘Huff’ file and save it in your working directory. Once you are ready, follow the steps below:

  • Open RStudio from All programmes
  • Set your working directory to where the Huff tool and your data are stored. (You can copy and paste the code below into RStudio; however, you will have to change the path to your own working directory.)
  • Click Run
# Set your working directory
setwd("M:/Liverpool/Teaching/Teaching 2023-24/ENVS609/Practical 4/Practical 4 data/Practical 4 data/retail")
# Import the huff-tools library
source("huff-tools.r")

The following libraries should be loaded automatically:

library(rgdal)
library(rgeos)
library(igraph)
library(FNN)
library(dplyr)

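Note: If you get an error saying that packages are unavailable, e.g. Error in import.packages(“rgdal”) : could not find function “import.packages”, then you will need to copy and run the following code. As noted earlier, some of these packages are no longer on CRAN; if install.packages fails for rgdal or rgeos, obtain them from the archive links given at the start of this section.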

install.packages("rgdal")
install.packages("rgeos")
install.packages("igraph")
install.packages("FNN")
install.packages("dplyr")

Next, we will use the ‘read.csv’ function to import the comma-delimited files (.csv table) into R. In most cases, the function can be applied by specifying just the name of the csv file as it assumes by default that the csv file has a header, and that the field separator character is ‘,’. These and other parameters can be modified if required (type: ‘?read.csv’ for more details). The function returns a data frame object and can be used as follows:

# load the data from your retail folder
attr_score <- read.csv('Liv_Attractiveness.csv')
distances <- read.csv('Liv_distances.csv')

We have now created two data frame objects. The “attr_score” data frame stores the attractiveness score of the retail centres (including ranking based on the hierarchy) while the “distances” object has the pre-calculated road distances between the centroids of each LSOA and the boundary of the retail centres.

A summary output of the data can be obtained with the ‘str’ and ‘summary’ functions. The former function outputs the name and type of each variable as well as the number of rows, while the latter function produces summaries depending on the type of the variable (e.g. quartiles, minimum and maximum values for continuous variables).

str(attr_score)
summary(attr_score)
str(distances)
summary(distances)

1.3.1.2 Join tables

The next step is to join the two tables we have created: distances and attr_score. In R there are two commonly used functions that perform a data join: merge and inner_join. In this practical we will use the latter. The inner_join function uses a common field to merge the tables; in our case it is the destinations_name field. Use the code below (name the output huff_input):

huff_input <- inner_join(distances, attr_score)
# Note: if the key column names differ, use the 'merge' function with its 'by.x' and 'by.y' arguments to specify them
str(huff_input)

So the output is a long table with 328,430 rows and 6 columns.
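
As a rough illustration of what the join does, the sketch below repeats it on two invented toy tables using base R's merge, so that it runs without dplyr. The column names destinations_name, origins_name, distance, AttrScore and Rank follow those used in this practical; the toy values and the alternative key name centre are assumptions made for the example.

```r
# Toy distance table: one row per origin-destination pair (values invented)
distances_toy <- data.frame(
  origins_name      = c("LSOA 1", "LSOA 1", "LSOA 2", "LSOA 2"),
  destinations_name = c("Centre A", "Centre B", "Centre A", "Centre B"),
  distance          = c(1.2, 3.4, 2.5, 0.8),
  stringsAsFactors  = FALSE
)

# Toy attractiveness table: one row per destination (values invented)
attr_toy <- data.frame(
  destinations_name = c("Centre A", "Centre B"),
  AttrScore         = c(100, 40),
  Rank              = c(1, 2),
  stringsAsFactors  = FALSE
)

# Join on the shared key; each distance row picks up its centre's attributes
joined <- merge(distances_toy, attr_toy, by = "destinations_name")

# If the key were named differently in the second table (hypothetical name 'centre')
attr_renamed <- attr_toy
names(attr_renamed)[1] <- "centre"
joined2 <- merge(distances_toy, attr_renamed,
                 by.x = "destinations_name", by.y = "centre")

print(joined)
```

The joined table keeps one row per origin-destination pair, which is why the real huff_input table is so long: its row count is the number of pairs, not the number of centres.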

1.3.1.3 Assign beta values

The next step is to assign different values to the beta parameter based on the attractiveness rank.

This is easily done with an ifelse function. This function returns an object which is filled with elements according to whether or not a condition is met. It is also possible to nest two or more ‘ifelse’ statements to evaluate more complex conditions.

In this practical we’ll use the retail centre hierarchy to develop the beta parameter (distance decay exponent) of the Huff model, assuming that retail centres at the top of the hierarchy should have a lower distance decay parameter (i.e. their attractiveness is reduced with distance but at a slower rate than in the case of the small centres).

For the major retail centres, which serve extensive catchments (Rank 1) use beta = 1.4, for the secondary retail centres, in our case the district centres (Rank 2) beta = 1.6 and for the smallest centres (Rank 3) beta = 1.8.

huff_input$beta <- with(huff_input, ifelse(Rank == 1, 1.4,
                                           ifelse(Rank == 2, 1.6,1.8)))

Check whether the beta values have been allocated as required

# Display the first six values for Rank and beta fields
head(huff_input[,c("Rank", "beta")])
# Display the last six values for Rank and beta fields
tail(huff_input[,c("Rank", "beta")])
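
head and tail only show a few rows, so a cross-tabulation can be a more thorough check. The sketch below uses an invented vector of ranks (rank_toy) and also shows a named lookup vector as a possible alternative to nested ifelse; both names are illustrative and not part of the practical's data.

```r
# Invented ranks standing in for huff_input$Rank
rank_toy <- c(1, 3, 2, 3, 1)

# Nested ifelse, as used above: anything that is not 1 or 2 gets 1.8
beta_nested <- ifelse(rank_toy == 1, 1.4,
                      ifelse(rank_toy == 2, 1.6, 1.8))

# Alternative: a named lookup vector, indexed by the rank as text
beta_lookup <- c("1" = 1.4, "2" = 1.6, "3" = 1.8)
beta_vec <- unname(beta_lookup[as.character(rank_toy)])

# Cross-tabulate rank against beta: each rank should map to exactly one beta
print(table(Rank = rank_toy, beta = beta_vec))
```

One practical difference is that an unexpected rank (say 4) silently receives 1.8 with the nested ifelse but becomes NA with the lookup vector, which makes data problems easier to spot.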

It should be noted that the beta values can be altered, depending on the requirements, estimation technique etc. The values used above were found to produce the most appropriate catchment areas at the national level. They may, of course, need to differ at the city level for a number of reasons: a) the number of competitors is smaller within a city, b) the assumption of boundary-free modelling is violated, and c) the retail hierarchy within a single city may differ from the national one.

1.3.1.4 Run Huff model

We will use the huff_basic function to estimate Huff patronage probabilities. The huff_basic function takes six arguments, namely:

  1. A list of unique names for the destination locations.
  2. A list of attractiveness scores for the destination locations.
  3. A list of unique names for the origin locations.
  4. A list of pairwise distances between origins and destinations.
  5. A list or scalar for the beta exponent of distance.
  6. A list or scalar for the alpha exponent of the attractiveness score.

The first four arguments are required, while the last two are optional; if they are not provided, default values are used (i.e. alpha = 1, beta = 2). If you examine the huff_input data frame that we created earlier, you’ll see that it already holds five of these inputs (names for destinations and origins, attractiveness score, distance and beta values), so only alpha remains. In this practical we’ll use the default value of 1.

1.3.1.5 Calculate Huff probabilities

huff_probs <- huff_basic (huff_input$destinations_name,
                          huff_input$AttrScore,
                          huff_input$origins_name,
                          huff_input$distance,
                          huff_input$beta,
                          alpha = 1
                        )
# display data summary
str(huff_probs)

The output of the huff_basic function is a table in long format (all pairwise probabilities between origins and destinations).

1.3.1.6 Extract the highest Huff probabilities for each LSOA

To extract the highest probabilities for each LSOA we can use the select_by_probs function. The function essentially groups the data by LSOA name, sorts the entries in each group by Huff probability (from highest to lowest) and then extracts the top entries, where the number of entries is given as the second argument. So to extract the highest entry (number = 1) for each LSOA, we can run the following R snippet:

# Extract probabilities
sele_probs <- select_by_probs(huff_probs, 1)
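
The group, sort and take-the-top logic described above can be sketched in base R as follows. This is not the source of select_by_probs; top_n_by_origin is a hypothetical stand-in, and the column names origins_name, destinations_name and huff_probability are assumptions about the shape of the table, with invented values.

```r
# Hypothetical stand-in for the described behaviour of select_by_probs
top_n_by_origin <- function(df, n = 1) {
  # Sort by origin, then by probability from highest to lowest
  ordered <- df[order(df$origins_name, -df$huff_probability), ]
  # Position of each row within its origin group (1 = highest probability)
  position <- ave(seq_len(nrow(ordered)), ordered$origins_name, FUN = seq_along)
  # Keep the top n rows of every group
  ordered[position <= n, ]
}

# Invented long-format probabilities for two LSOAs and two centres
probs_toy <- data.frame(
  origins_name      = c("LSOA 1", "LSOA 1", "LSOA 2", "LSOA 2"),
  destinations_name = c("Centre A", "Centre B", "Centre A", "Centre B"),
  huff_probability  = c(0.7, 0.3, 0.2, 0.8),
  stringsAsFactors  = FALSE
)

top1 <- top_n_by_origin(probs_toy, 1)
print(top1)
```

With n = 1 each LSOA is allocated to its single most likely centre; a larger n would keep the runners-up as well, which is useful for inspecting overlapping catchments.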

1.3.1.7 Output the result as shapefile

In order to display the extracted Huff probabilities, we will create a shapefile by merging our sele_probs data frame with a shapefile of LSOAs provided for our study area. First we import the shapefile into R with the readOGR function. The function accepts as first argument the directory where the vector dataset is located and as second argument the name of the dataset - note, there is no need to add a file extension.

# Use YOUR working directory in the readOGR call
origins <- readOGR("M:/Documents/Liverpool/Practicals2017/retail", "Liv_lsoas")

Then merge the spatial data object with our data frame.

# Merge with spatial object of LSOAs
origins@data <- data.frame(origins@data, sele_probs[match(origins@data$lsoa_cd, sele_probs$origins_name), ])

#delete column 1 as it is a duplicate of the `origin locations`
origins@data[,1] <- NULL
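
The match-based merge above keeps the rows of the spatial object in their original order, which matters because the attribute rows must stay aligned with the geometries. The toy sketch below shows the mechanism on plain data frames; lsoa_table, probs_toy and the codes E01 to E03 are invented, and the column names are assumptions for illustration.

```r
# Stand-in for origins@data: one row per LSOA, in geometry order
lsoa_table <- data.frame(lsoa_cd = c("E01", "E02", "E03"),
                         stringsAsFactors = FALSE)

# Stand-in for sele_probs: different order, and E02 is missing
probs_toy <- data.frame(
  origins_name     = c("E03", "E01"),
  huff_probability = c(0.6, 0.9),
  stringsAsFactors = FALSE
)

# For each LSOA code, find the row of probs_toy that holds it (NA if absent)
idx <- match(lsoa_table$lsoa_cd, probs_toy$origins_name)

# Reorder probs_toy by idx and bind it column-wise to the LSOA table
merged <- data.frame(lsoa_table, probs_toy[idx, ])
print(merged)
```

An LSOA with no matching probability ends up with NA values rather than being dropped, so the number of rows always equals the number of polygons.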

The final step of the modelling is to save the data as a shapefile in your working directory. This can be done with the writeOGR function. First create a new folder with the ‘dir.create’ function, then save the results.

dir.create("Huff_results") # Create a new folder named Huff_results
# Save the origins shapefile in the "Huff_results" folder
writeOGR(origins, "Huff_results", "huff_probs", driver = "ESRI Shapefile")

# If you get a warning message, you can ignore it.

1.4 Mapping Huff catchments

If you managed to calculate Huff probabilities, well done, as this was really advanced spatial analysis in action! Otherwise you can use the Huff probabilities from the Practical 4 data pack. You will now, first, display the Huff probabilities using various thresholds and, second, create catchments by allocating the LSOAs to each retail centre based on these probabilities.

  • Open the huff_probs shapefile in QGIS
  • Explore Attribute table of the layer
  • Navigate to Properties > Symbology window and display the hff_prb column (Graduated) as two different classes using the following thresholds: below 0.50 (secondary catchments) and above 0.50 (primary catchments)
  • In order to see the extent of a particular centre run a simple query in the Attribute table
  • Click on Select features using an expression and create an expression by specifying the destination name (“dstntn_” = ‘name of a town centre’)

The picture below shows the catchment for Smithdown Rd, which is highlighted in yellow:
    SmithdownRd Catchment
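
For those working in R, the two-class threshold used in the Symbology step above can be sketched as below. The probabilities are invented, and treating a value of exactly 0.50 as primary is an assumption, since the thresholds above do not say which class it belongs to.

```r
# Invented Huff probabilities for four LSOAs
probs <- c(0.82, 0.50, 0.49, 0.21)

# Assumption: exactly 0.50 counts as primary
catchment <- ifelse(probs >= 0.5, "primary", "secondary")

print(data.frame(huff_probability = probs, catchment = catchment))
```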

1.4.1 Creating Huff polygons

  • Next, create polygons showing the extent of catchments for individual retail centres in Liverpool
  • This can be easily done by using the Dissolve function from Geoprocessing tools
  • Use huff_probs as your Input vector layer
  • Click … next to Dissolve field(s) and select dstntn_ and click OK
  • Save it to your working directory and name your output: Huff_catchments and click Save and then Run
  • Inspect the newly added layer; render the image (adjust transparency, change line colour and thickness)
  • Overlay the Huff_catchments onto the buffer rings and compare the extents
  • Create a map showing these differences
    Buffer_Huff

Answer the following questions:

  • What are the main differences in the catchment extents compared to the previous method (buffer rings)?
  • How do the differences affect total patronage levels?
  • What are the strengths and limitations of each of these techniques?