tinyForestR is a set of tools designed to extract,
manipulate and analyse data relevant to the location of Tiny Forests in
the UK.
Specifically it extracts and processes landcover and biodiversity data from a range of sources for a given area around Tiny Forest locations, and provides a set of tools for analysing citizen science data derived directly from Tiny Forests.
The package is hosted on GitHub and is a work in progress. It can be
installed by running devtools::install_github("julianflowers/tinyForestR").
The package makes use of a number of Application Programming Interfaces (APIs), some of which require API keys that must be applied for separately. This is outlined in the relevant sections of this vignette.
It also uses a range of Python packages to access some datasets (in
some cases Python packages are better developed than R). For this reason
the first step is to run initialise_tf() to initialise the package.
This:
- Loads reticulate, which R uses to talk to Python.
- Sets up a Python virtual environment (VE) called tinyforest, and ensures R uses the correct version of Python by specifying the RETICULATE_PYTHON environment variable. It may be necessary to restart R to make sure R correctly uses this.
- Asks whether you want to remove the tinyforest environment. Say “no” unless you have a previous installation and are having trouble. Removing it creates a clean install of the VE.
- Installs a number of Python packages to the tinyforest environment. These include:
  - earthengine-api - enables access to Google Earth Engine (GEE)
  - geemap - a set of added value tools to extract and manipulate GEE data
  - osdatahub - access to Ordnance Survey National Geographic Database data
  - OSGridConverter - converts lat-longs to UK Grid references
- Imports the relevant modules for use in other packages
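# Install tinyForestR from GitHub if it is not already available
if (!require("tinyForestR"))
  devtools::install_github("julianflowers/tinyForestR", force = TRUE)
if (!require("pacman")) install.packages("pacman")
library(tinyForestR)
# pacman::p_load() also works when pacman has only just been installed
pacman::p_load(leaflet.extras2, tidyverse, mapview, sf, ggmap, lubridate)
initialise_tf()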
#> Virtual environment 'tinyforest' removed.
#> Using Python: /opt/homebrew/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/bin/python3.11
#> Creating virtual environment 'tinyforest' ...
#> Done!
#> Installing packages: 'pip', 'wheel', 'setuptools', 'numpy'
#> Virtual environment 'tinyforest' successfully created.
#> Using virtual environment 'tinyforest' ...
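The set-up steps described above can be approximated with reticulate directly. The sketch below is not the source of initialise_tf(): the function name setup_tf_env_sketch and its arguments are hypothetical, and it relies only on reticulate's documented virtualenv helpers. It is defined but not called here, because creating the environment needs a local Python installation.

```r
# Hypothetical helper: an approximation of the set-up steps described above,
# NOT the actual implementation of initialise_tf().
setup_tf_env_sketch <- function(envname = "tinyforest",
                                packages = c("earthengine-api", "geemap",
                                             "osdatahub", "OSGridConverter"),
                                clean = FALSE) {
  # Optionally start again from a clean environment
  if (clean && reticulate::virtualenv_exists(envname)) {
    reticulate::virtualenv_remove(envname, confirm = FALSE)
  }
  # Create the environment if needed, then install the Python packages
  if (!reticulate::virtualenv_exists(envname)) {
    reticulate::virtualenv_create(envname)
  }
  reticulate::virtualenv_install(envname, packages = packages)
  # Point reticulate at the environment's interpreter
  Sys.setenv(RETICULATE_PYTHON = reticulate::virtualenv_python(envname))
  reticulate::use_virtualenv(envname, required = TRUE)
  invisible(envname)
}
```

Calling setup_tf_env_sketch(clean = TRUE) would correspond to answering "yes" to the removal prompt.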
The next step is to load Tiny Forest (TF) data. Because this only
exists in a series of web pages, the get_tf_data function
identifies the relevant pages and iterates over them to extract name,
id, location, area, planters, and types of tree planted (as a list
column) for those TFs planted at the time of extraction. It does
include TFs planted outside the UK. The function takes about 30 seconds
to iterate over all the relevant pages.
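# Read a saved copy of the TF data from CSV
tf <- read_csv("data/tf_data.csv")
# One row per TF and tree species, with planting year and month added
tf_df <- tf |>
  unnest("trees") |>
  mutate(year = year(date),
         month = month(date))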
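Because tree species are held as a list column, unnest("trees") turns one row per TF into one row per TF and species. A minimal base R sketch of the same reshaping follows; the two-row table is made up for illustration and is not taken from the TF data.

```r
# Illustrative data only: two hypothetical TFs with a list column of species
tf_toy <- data.frame(tf_id = c(1, 2))
tf_toy$trees <- list(c("Oak", "Hazel", "Rowan"), c("Oak", "Birch"))

# Repeat each TF row once per species, then flatten the list column
n_species <- lengths(tf_toy$trees)
tf_long <- data.frame(
  tf_id = rep(tf_toy$tf_id, times = n_species),
  trees = unlist(tf_toy$trees)
)

tf_long
```

Counting rows per tf_id in the long table then gives the number of species planted in each TF, which is what the histogram later in this section summarises.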
Once the data is loaded we can save it as a csv file and get some high level information on planting, timings, size and so on.
As of 2023-05-26 there are 190 planted TFs.
library(patchwork)  # combines ggplot objects with `+`

## annual planting
tf_year <- tf_df |>
  select(-trees) |>
  distinct() |>
  count(year) |>
  ggplot(aes(year, n)) +
  geom_col() +
  labs(title = "TFs planted per year")

## area distribution
tf_area <- tf_df |>
  select(-trees) |>
  distinct() |>
  ggplot(aes(factor(year), area)) +
  geom_boxplot() +
  labs(title = "TF area by year")

## number of tree species per TF
tf_trees <- tf_df |>
  group_by(tf_id) |>
  summarise(n_trees = n()) |>
  ggplot() +
  geom_histogram(aes(n_trees)) +
  labs(title = "Distribution of tree species",
       x = "Number of tree species")

tf_year + tf_area + tf_trees
It is now straightforward to map locations of TFs using sf and mapview.
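A minimal sketch of such a map is given here. It assumes the coordinates are held in columns named lon and lat (as in the tf1 snapshot printed later in this vignette) and uses three of those rows; the object names tf_points, tf_sf and tf_map are illustrative.

```r
# Coordinates copied from the tf1 snapshot printed later in this vignette
tf_points <- data.frame(
  tfid = c(85, 86, 87),
  lat  = c(51.77789, 51.76914, 51.73478),
  lon  = c(-1.469139, -1.215472, -1.239861)
)

# Only attempt the map when both packages are installed
if (requireNamespace("sf", quietly = TRUE) &&
    requireNamespace("mapview", quietly = TRUE)) {
  # WGS84 longitude/latitude is EPSG:4326; x (lon) is given before y (lat)
  tf_sf <- sf::st_as_sf(tf_points, coords = c("lon", "lat"), crs = 4326)
  tf_map <- mapview::mapview(tf_sf)
}
```

Printing tf_map in an interactive session displays the points on a zoomable base map.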
We can also look at planting frequency for different tree species.
tf_df |>
  ungroup() |>
  unnest("trees") |>
  count(trees, sort = TRUE) |>
  top_n(25) |>
  ggplot() +
  geom_col(aes(n, reorder(trees, n))) +
  labs(y = "",
       title = "25 most commonly planted tree species") +
  ggthemes::theme_base() +
  theme(plot.title.position = "plot")
The get_nbn_buffer function downloads occurrence data from the
NBN Atlas in a set buffer around a given longitude and latitude. For
example, we can download 10000 records around lat=51.777889,
lon=-1.469139 (Witney TF).[1]
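# Coordinates of the Witney TF quoted above
lat <- 51.777889
lon <- -1.469139
# safely() returns list(result, error), so a timeout does not stop the script
safe_buff <- safely(tinyForestR::get_nbn_buffer)
nbn_data <- safe_buff(lon, lat, n = 10000)
#> 2.644 sec elapsed
nbn_data$result |>
  head()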
#> kingdom phylum classs order family genus
#> 1 Animalia Chordata Aves Anseriformes Anatidae Cygnus
#> 2 Plantae Tracheophyta Magnoliopsida Dipsacales Adoxaceae Sambucus
#> 3 Animalia Chordata Aves Passeriformes Aegithalidae Aegithalos
#> 4 Plantae Tracheophyta Magnoliopsida Rosales Ulmaceae Ulmus
#> 5 Animalia Chordata Aves Passeriformes Paridae Parus
#> 6 Animalia Chordata Aves Passeriformes Muscicapidae Erithacus
#> decimalLatitude decimalLongitude year month
#> 1 51.78298 -1.472293 1968 04
#> 2 51.76903 -1.470282 1996 <NA>
#> 3 51.78298 -1.472293 2019 12
#> 4 51.78316 -1.475597 2017 04
#> 5 51.78298 -1.472293 2012 04
#> 6 51.78298 -1.472293 2015 10
#> dataProviderName speciesGroups
#> 1 British Trust for Ornithology Animals, Birds
#> 2 Botanical Society of Britain & Ireland Plants, FloweringPlants
#> 3 British Trust for Ornithology Animals, Birds
#> 4 Botanical Society of Britain & Ireland Plants, FloweringPlants
#> 5 British Trust for Ornithology Animals, Birds
#> 6 British Trust for Ornithology Animals, Birds
#> vernacularName species
#> 1 Mute Swan Cygnus olor
#> 2 Elder Sambucus nigra
#> 3 Long-tailed Tit Aegithalos caudatus
#> 4 English Elm Ulmus procera
#> 5 Great Tit Parus major
#> 6 Robin Erithacus rubecula
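The download above is wrapped in purrr's safely() because the web service can time out. As a rough illustration of what that wrapper does, here is a base R sketch of the same idea. The names safely_sketch and flaky_download are hypothetical, and flaky_download only stands in for a web request; it does not contact the NBN Atlas.

```r
# Base R sketch of the idea behind purrr::safely(): capture errors
# instead of stopping, and always return list(result, error)
safely_sketch <- function(f) {
  function(...) {
    tryCatch(
      list(result = f(...), error = NULL),
      error = function(e) list(result = NULL, error = e)
    )
  }
}

# Hypothetical stand-in for a web request that can time out
flaky_download <- function(ok) {
  if (!ok) stop("request timed out")
  data.frame(species = "Erithacus rubecula")
}

safe_download <- safely_sketch(flaky_download)
good <- safe_download(ok = TRUE)
bad  <- safe_download(ok = FALSE)

is.null(good$error)          # TRUE: the data are in good$result
conditionMessage(bad$error)  # the captured error message
```

Checking the error element before using result makes it easy to retry a failed request later rather than abandoning a loop over many TFs.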
I have also included functions to extract data for the 2020 Botanical Society of Britain and Ireland survey. This is publicly available for UK National Grid 1 km squares, so lat-longs must first be converted to UK grid references.
grid_ref <- tinyForestR::os_lat_lon_to_grid(lat = lat, lon = lon)
#> Using virtual environment 'tinyforest' ...
grid_ref$grid
#> [1] "SP3672"
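For readers unfamiliar with grid references, the sketch below shows how a 1 km (four-figure) reference is assembled from Ordnance Survey eastings and northings in metres. It covers only that final lettering step and is not the method used by os_lat_lon_to_grid(), which must also convert latitude and longitude to eastings and northings. The function name en_to_grid_1km is hypothetical, and the example coordinates are a commonly quoted Ordnance Survey reference point rather than a TF location.

```r
# Convert OSGB eastings/northings (metres) to a 1 km (four-figure) grid reference
en_to_grid_1km <- function(easting, northing) {
  # Which 100 km square the point falls in
  e100k <- easting %/% 100000
  n100k <- northing %/% 100000
  # Indices into the 5 x 5 lettering scheme (A-Z without I); the explicit
  # parentheses matter because %% binds more tightly than * in R
  l1 <- (19 - n100k) - ((19 - n100k) %% 5) + ((e100k + 10) %/% 5)
  l2 <- (((19 - n100k) * 5) %% 25) + (e100k %% 5)
  grid_letters <- LETTERS[LETTERS != "I"]
  square <- paste0(grid_letters[l1 + 1], grid_letters[l2 + 1])
  # Kilometre offsets within the 100 km square, zero padded to two digits
  km_e <- as.integer((easting %% 100000) %/% 1000)
  km_n <- as.integer((northing %% 100000) %/% 1000)
  sprintf("%s%02d%02d", square, km_e, km_n)
}

en_to_grid_1km(651409, 313177)  # "TG5113"
```

The two letters identify a 100 km square and the four digits give the kilometre offsets east and north within it.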
bsbi_data <- tinyForestR::get_bsbi_data(grid_ref = grid_ref$grid)
# `i` is the row index of the TF of interest in tf_df and must be
# defined before this chunk is run
bsbi_data |>
  enframe() |>
  unnest("value") |>
  unnest("value") |>
  # slice(-c(168:173)) |>
  mutate(year = str_extract(value, "20\\d{2}"),
         value = str_remove(value, year),
         count = parse_number(value),
         value = str_remove(value, as.character(count)),
         value = str_remove(value, "\\d{1,}"),
         grid = grid_ref$grid,
         tf_id = tf_df$tf_id[i]) |>
  arrange(value) |>
  drop_na()
#> # A tibble: 7 × 6
#> name value year count grid tf_id
#> <chr> <chr> <chr> <dbl> <chr> <dbl>
#> 1 records "Clinopodium ascendens() " 2018 1 SP3672 85
#> 2 records "Cochlearia danica() " 2016 1 SP3672 85
#> 3 records "Crataegus monogyna() " 2015 1 SP3672 85
#> 4 records "Cupressus × leylandii() " 2016 1 SP3672 85
#> 5 records "Galeopsis bifida() " 2015 1 SP3672 85
#> 6 records "Hesperis matronalis() " 2017 1 SP3672 85
#> 7 records "Oenothera stricta() " 2017 1 SP3672 85
The calc_bd_metrics function takes an output from get_nbn_buffer or
get_bsbi_data, converts the data from long to wide format, creates a
species matrix for a specified class (for get_nbn_buffer data), and
outputs a list whose elements include a table of monthly metrics
(metrics) and a plot (plot):
metrics <- calc_bd_metrics(df = nbn_data$result, class = "Aves")
metrics$metrics
#> month richness N ratio diversity
#> 1 01 17 60 0.2833333 0.9294444
#> 2 02 17 58 0.2931034 0.9256837
#> 3 03 24 74 0.3243243 0.9382761
#> 4 04 16 56 0.2857143 0.9266582
#> 5 05 43 104 0.4134615 0.9613536
#> 6 06 16 57 0.2807018 0.9258233
#> 7 07 19 53 0.3584906 0.9284443
#> 8 08 14 41 0.3414634 0.9125521
#> 9 09 24 59 0.4067797 0.9399598
#> 10 10 16 48 0.3333333 0.9253472
#> 11 11 15 60 0.2500000 0.9205556
#> 12 12 17 58 0.2931034 0.9304400
metrics$plot +
  labs(title = "Monthly species richness for ",
       subtitle = paste("Aves", tf1$stub[i]),
       y = "Richness",
       x = "Month") +
  ggthemes::theme_base() +
  theme(plot.title.position = "plot")
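To make the columns of the metrics table concrete, the sketch below rebuilds them in base R from a made-up set of records. In the output above, ratio equals richness / N (for example 17 / 60 = 0.283). The text does not say how diversity is defined, so the Gini-Simpson index used here is an assumption, and this is not the code inside calc_bd_metrics.

```r
# Illustrative records only (not NBN Atlas data): one row per observation
records <- data.frame(
  month   = c("01", "01", "01", "02"),
  species = c("Robin", "Robin", "Great Tit", "Robin")
)

# Long to wide: a month x species matrix of record counts
species_matrix <- unclass(table(records$month, records$species))

n_records <- rowSums(species_matrix)      # N: records per month
richness  <- rowSums(species_matrix > 0)  # species recorded at least once
p         <- species_matrix / n_records   # row-wise proportions

metrics_sketch <- data.frame(
  month     = rownames(species_matrix),
  richness  = richness,
  N         = n_records,
  ratio     = richness / n_records,
  # Gini-Simpson index, 1 - sum(p^2): an assumed choice of diversity index
  diversity = 1 - rowSums(p^2)
)

metrics_sketch
```

A month with a single species has diversity 0 under this index, and the value approaches 1 as records are spread evenly over many species.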
The package includes a calc_ndvi_buff function to enable
the calculation of the normalised difference vegetation index (NDVI) for
the buffer area around a given point. It uses Sentinel-2 surface reflectance
satellite images which are available at 10m resolution and are regularly
updated. The function extracts images via the Google Earth Engine API
and requires registration and authentication prior to use (see…).
The function returns a list including image dates, NDVI statistics for the image, an interactive map and a raster. Note it may take a few minutes to run.
The code chunk below calculates the NDVI for each image containing the buffer around the Witney TF for 2019 and 2022 and maps them side-by-side. (Note, the function selects only those S2 images with cloud cover < 10%).
ndvi_2019 <- calc_ndvi_buff(lat = lat, lon = lon, dist = 1000, start_date = "2019-01-01", end_date = "2019-12-31")
ndvi_2022 <- calc_ndvi_buff(lat = lat, lon = lon, dist = 1000, start_date = "2022-01-01", end_date = "2022-12-31")
bind_rows(ndvi_2019$ndvi_stats, ndvi_2022$ndvi_stats)
ndvi_2019$map | ndvi_2022$map
There are 25 images in 2019 with a median NDVI of 0.62, and 23 in 2022 with median NDVI of 0.64.
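For reference, NDVI is computed per pixel as (NIR - Red) / (NIR + Red), which for Sentinel-2 corresponds to bands B8 and B4. The sketch below applies that formula to made-up reflectance values. It illustrates the index only and is not the Earth Engine code used by calc_ndvi_buff.

```r
# NDVI from near-infrared (NIR) and red reflectance; for Sentinel-2 these
# are bands B8 and B4. Values range from -1 to 1.
ndvi <- function(nir, red) {
  (nir - red) / (nir + red)
}

# Illustrative reflectances for four pixels (made-up values)
nir <- c(0.50, 0.45, 0.30, 0.10)
red <- c(0.10, 0.15, 0.10, 0.10)

pixel_ndvi <- ndvi(nir, red)
median(pixel_ndvi)
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so higher values indicate more vegetation cover.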
I have included a snapshot with environmental variables for the TF dataset. This includes the age of the TF (in days), the rural-urban classification of the lower super output area containing the TF centroid, the area (m²) of public parks and allotments in the buffer taken from OS Open Greenspace data, and the area (m²) of deciduous woodland, taken from Priority Habitat Inventory data.
data("tf1")
tf1 |>
head()
#> tfid deciduous_woodland public_park_or_garden
#> 1: 85 110671.51 227896.2
#> 2: 86 82356.98 72937.2
#> 3: 87 232792.13 171525.2
#> 4: 88 83893.85 247971.8
#> 5: 89 0.00 557434.9
#> 6: 91 162195.72 914805.6
#> allotments_or_community_growing_spaces age when.x
#> 1: 0.00 1113 2020-03-14
#> 2: 47611.61 793 2021-01-28
#> 3: 129832.54 786 2021-02-04
#> 4: 12434.15 786 2021-02-04
#> 5: 0.00 771 2021-02-19
#> 6: 93438.37 764 2021-02-26
#> stub
#> 1: 85-tychwood-witney
#> 2: 86-foxwell-drive-oxford
#> 3: 87-meadow-lane-oxford
#> 4: 88-trym-valley-bristol
#> 5: 89-avenue-end-glasgow
#> 6: 91-queensmead-playing-field-leicester
#> Rural Urban Classification 2011 (10 fold)
#> 1: Rural town and fringe
#> 2: Urban city and town
#> 3: Urban city and town
#> 4: Urban city and town
#> 5: <NA>
#> 6: Urban city and town
#> who when.y area lat
#> 1: Volunteer Group, Local Residents, Earthwatch 2020-03-14 207 51.77789
#> 2: Volunteer Group, Landscape contractors 2021-01-28 201 51.76914
#> 3: Landscape contractors 2021-02-04 206 51.73478
#> 4: Volunteer Group 2021-02-04 205 51.50028
#> 5: Landscape contractors 2021-02-19 206 55.87525
#> 6: School Children 2021-02-26 205 52.63214
#> lon year year_month
#> 1: -1.469139 2020 Mar 2020
#> 2: -1.215472 2021 Jan 2021
#> 3: -1.239861 2021 Feb 2021
#> 4: -2.600333 2021 Feb 2021
#> 5: -4.160222 2021 Feb 2021
#> 6: -1.174139 2021 Feb 2021
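The age column is consistent with the number of days between the planting date and a fixed snapshot date. The sketch below reproduces the first two ages shown above. The snapshot date of 2023-04-01 is inferred from those values and is an assumption rather than something stated in the package.

```r
# Planting dates of the first two TFs in the snapshot above
planted <- as.Date(c("2020-03-14", "2021-01-28"))
# Assumed snapshot date, inferred from the ages shown above
snapshot <- as.Date("2023-04-01")

age_days <- as.numeric(snapshot - planted)
age_days  # 1113 793
```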
[1] Note this may sometimes time out depending on traffic on the NBN Atlas web service.