Lock data covering shipping traffic from the 90s to 2017. Spatial coverage is Wolters data
RICOMEX data is contemporary (2022). Spaital coverage is pan-European.
Goal is to link them spatially and see how well the RICOMEX data does in representing shipping data. We want to use this because it is pan-European and can give us detailed data (ship speed, size, frequency) that the lock data cannot provide. Additionally, the RICOMEX data has a finer spatial grain than the lock data for a majority of Europe in areas where locks are less dense.
# Packages
library(dplyr)
library(tidyverse)
library(sf)
library(ggplot2)
library(mapview)
Within the Large River Databse (LRDB) there are several files relevant to lock data
Locks_WSV - shipping frequency for each lock across entire temporal range
Locks_Stabu - multiple names for each lock and each year
Locks_Birk_300km - number of and type of ships passing each site/lock for a 300m stretch. Currently an issue with lock name
Locks_Birk_150km - same as above, at 150km
Locks_Birk_10km - same as above, at 10km
Occasion - biodiveristy data that lists each collection effort, what
the
site code is, and what lock is associated with that site
Site - collection sites - has data on each siteID, collection ID, and lockID
Need to link them all - Site says where spatially the collection is. Then can link the collection to a lock. Then we can plot out shipping intensity
# Lets start with the shipping data for each lock provided
lockswsv <- read.csv("../NAVIDIV/Data/Wolters/Locks_WSV.csv")
str(lockswsv)
## 'data.frame': 814 obs. of 16 variables:
## $ lock_WSV : chr "Abwinden" "Abwinden" "Abwinden" "Abwinden" ...
## $ river : chr "Danube" "Danube" "Danube" "Danube" ...
## $ year : int 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 ...
## $ FRA : int 14523 14523 14523 14523 14523 14523 14523 14523 14523 14523 ...
## $ Lad_t : int 9986502 9986502 9986502 9986502 9986502 9986502 10979904 11633673 12316450 10737354 ...
## $ Tragf_t : int 16804264 16804264 16804264 16804264 16804264 16804264 16804264 16804264 16804264 16804264 ...
## $ FTK : int 4392 4392 4392 4392 4392 4392 4392 4392 4392 4392 ...
## $ SpB : int 786 786 786 786 786 786 786 786 786 786 ...
## $ Son : int 221 221 221 221 221 221 221 221 221 221 ...
## $ SpB_Son : int 880 880 880 880 880 880 880 880 880 880 ...
## $ FTK_SpB_Son : int 5273 5273 5273 5273 5273 5273 5273 5273 5273 5273 ...
## $ FRA_FTK_SpB_Son: int 19796 19796 19796 19796 19796 19796 19796 19796 19796 19796 ...
## $ lock_ID_WSV : chr "lo_01" "lo_01" "lo_01" "lo_01" ...
## $ WSV_ID : chr "lo_01_1994" "lo_01_1995" "lo_01_1996" "lo_01_1997" ...
## $ Efficiency : int NA NA NA NA NA NA NA NA NA NA ...
## $ n_empty : int NA NA NA NA NA NA NA NA NA NA ...
# How many locks do we have data on?
length(unique(lockswsv$lock_WSV))
## [1] 37
# Each lock has data from 90s to today
# Lets take a look at how shipping traffic changed across the years
ggplot(lockswsv, aes(year, FRA_FTK_SpB_Son)) +
geom_point()
Remarkably stable over time
# Lets color code it by river
ggplot(lockswsv, aes(year, FRA_FTK_SpB_Son, group = lock_WSV)) +
geom_point() +
geom_point(aes(colour = river)) +
geom_line()
Again we can see it is stable across time even within
rivers.
Only difference really is spatially.
# Plotting the same thing with just lines for each lock
ggplot(lockswsv, aes(year, FRA_FTK_SpB_Son, color = lock_WSV)) +
geom_line(size = 1.5) +
theme(legend.position = "none")
Lets plot out the biodiversity collection sites spatially
# This is the sites package trimmed down to a geopackage
woltersites <- st_read("../NAVIDIV/GIS/Creation/outputs/wolter_pts.gpkg")
## Reading layer `wolter_pts' from data source
## `C:\Users\Aaron Sexton\Desktop\NAVIDIV\GIS\Creation\outputs\wolter_pts.gpkg'
## using driver `GPKG'
## Simple feature collection with 826 features and 2 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 3936968 ymin: 2396709 xmax: 5847602 ymax: 3483496
## Projected CRS: ETRS89-extended / LAEA Europe
# Simple ggplot
sitesgg <- ggplot() + geom_sf(data = woltersites) + aes()
sitesgg
# interactive map
mapview(woltersites)
# Fish Collection Occasions
occasion <- read.csv("../NAVIDIV/Data/Wolters/Occassion.csv")
str(occasion)
## 'data.frame': 2737 obs. of 68 variables:
## $ River_name_copy : chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_name_copy : chr "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" ...
## $ Wetted_width : int 60 60 60 60 280 280 280 280 280 280 ...
## $ Site_code_LR : chr "DELR0399N" "DELR0399N" "DELR0399N" "DELR0399N" ...
## $ Occ_laufnr : int 3608 3609 3610 3611 3849 3846 3847 3848 3835 3840 ...
## $ Dist_ocean_copy : num 383 383 383 383 237 ...
## $ Sampling_location : chr "Mixed" "Mixed" "Mixed" "Mixed" ...
## $ Date : chr "01/09/2000" "11/10/2001" "25/10/2002" "17/10/2003" ...
## $ Yearocc : int 2000 2001 2002 2003 1999 1997 1997 1998 1997 1998 ...
## $ Monthocc : int 9 10 10 10 6 6 9 6 9 8 ...
## $ Method : chr "Electro" "Electro" "Electro" "Electro" ...
## $ Fishing_time : chr "Day" "Day" "Day" "Day" ...
## $ species_caught : int 11 11 10 10 8 9 11 7 15 15 ...
## $ specimen_caught : int 427 266 159 563 100 310 5424 54 1445 1132 ...
## $ n_Ind_100m2 : chr "142.33" "8.87" "3.18" "11.26" ...
## $ n_Ind_ha : chr "14,233.33" "886.67" "318.00" "1,126.00" ...
## $ Fished_area : chr "300.00" "3,000.00" "5,000.00" "5,000.00" ...
## $ Fished_length : int 100 1000 1000 1000 -999 -999 -999 -999 980 1150 ...
## $ Fished_width : num 3 3 5 5 -999 -999 -999 -999 3 3 ...
## $ Average_depth : num -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 ...
## $ Equipment : chr "Boat" "Boat" "Boat" "Boat" ...
## $ Sampling_strategy : chr "Partial" "Partial" "Partial" "Partial" ...
## $ Runs_separated : chr "Separated" "Separated" "Separated" "Separated" ...
## $ Number_of_runs : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Dominating_substrate : chr NA NA NA NA ...
## $ Stop_nets_used : chr NA NA NA NA ...
## $ No_of_anodes : chr NA NA NA NA ...
## $ Type_of_anode : chr "Ring" "Ring" "Stripe" "Stripe" ...
## $ Size_of_anode : num -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 ...
## $ Type_of_current : chr "DC" "DC" "DC" "DC" ...
## $ Voltage_used : num 300 300 300 300 300 300 300 300 300 300 ...
## $ Country : chr "Germany" "Germany" "Germany" "Germany" ...
## $ Equivalent : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Reporter_code : chr "DEIGB" "DEIGB" "DEIGB" "DEIGB" ...
## $ Fishdata_owner_code : chr "DE002" "DE002" "DE002" "DE002" ...
## $ lock_ID : chr "lo_35" "lo_35" "lo_35" "lo_35" ...
## $ WSV_ID : chr "lo_35_2000" "lo_35_2001" "lo_35_2002" "lo_35_2003" ...
## $ Stabu_ID : chr "lo_99_2000" "lo_99_2001" "lo_99_2002" "lo_99_2003" ...
## $ Barriers_catchment_down : chr "Yes" "Yes" "Yes" "Yes" ...
## $ Barriers_segment_up : chr "No" "No" "No" "No" ...
## $ Barriers_segment_down : chr "No" "No" "No" "No" ...
## $ Barriers_no_segment_up : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Barriers_no_segment_down : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Barriers_dist_segment_up : num -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 ...
## $ Barriers_dist_segment_down: num -999 -999 -999 -999 -999 -999 -999 -999 -999 -999 ...
## $ Impoundment : chr "No" "No" "No" "No" ...
## $ Hydropeaking : chr "No" "No" "No" "No" ...
## $ Water_abstraction : chr "Weak" "Weak" "Weak" "Weak" ...
## $ Water_use : chr "No" "No" "No" "No" ...
## $ Hydro_mod : chr "No" "No" "No" "No" ...
## $ Temperature_impact : chr "No" "No" "No" "No" ...
## $ Velocity_increase : chr "Yes" "Yes" "Yes" "Yes" ...
## $ Reservoir_flushing : chr "No" "No" "No" "No" ...
## $ Sedimentation : chr "No" "No" "No" "No" ...
## $ Channelisation : chr "Straightened" "Straightened" "Straightened" "Straightened" ...
## $ Cross_section : chr "Technical crossec/U-profile" "Technical crossec/U-profile" "Technical crossec/U-profile" "Technical crossec/U-profile" ...
## $ Instream_habitat : chr "High" "High" "High" "High" ...
## $ Riparian_vegetation : chr "High" "High" "High" "High" ...
## $ Embankment : chr "Continous no permeability" "Continous no permeability" "Continous no permeability" "Continous no permeability" ...
## $ Floodprotection : chr "Yes" "Yes" "Yes" "Yes" ...
## $ Floodplain : chr "No" "No" "No" "No" ...
## $ Toxic_substances : chr "Intermediate" "Intermediate" "Intermediate" "Intermediate" ...
## $ Acidification : chr "No" "No" "No" "No" ...
## $ Water_quality_index : int 3 3 3 3 3 3 3 3 3 3 ...
## $ Eutrophication : chr "Low" "Low" "Low" "Low" ...
## $ Organic_pollution : chr "Weak" "Weak" "Weak" "Weak" ...
## $ Organic_siltation : chr "Yes" "Yes" "Yes" "Yes" ...
## $ Navigation : chr "Strong" "Strong" "Strong" "Strong" ...
# How many locks here?
length(unique(occasion$lock_ID))
## [1] 37
# So the column lock_ID inside the df occasions matches the column
# lock_WSV column in the df Locks_WSV (which has traffic data)
# Lets trim Occasions for ease of plotting
occasion2 <- dplyr::select(occasion, River_name_copy, Site_code_LR, Occ_laufnr,
Yearocc, lock_ID, WSV_ID, Stabu_ID)
str(occasion2)
## 'data.frame': 2737 obs. of 7 variables:
## $ River_name_copy: chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_code_LR : chr "DELR0399N" "DELR0399N" "DELR0399N" "DELR0399N" ...
## $ Occ_laufnr : int 3608 3609 3610 3611 3849 3846 3847 3848 3835 3840 ...
## $ Yearocc : int 2000 2001 2002 2003 1999 1997 1997 1998 1997 1998 ...
## $ lock_ID : chr "lo_35" "lo_35" "lo_35" "lo_35" ...
## $ WSV_ID : chr "lo_35_2000" "lo_35_2001" "lo_35_2002" "lo_35_2003" ...
## $ Stabu_ID : chr "lo_99_2000" "lo_99_2001" "lo_99_2002" "lo_99_2003" ...
length(unique(occasion2$Site_code_LR))
## [1] 842
length(unique(occasion2$WSV_ID)) # Value of locks x year (multiple years per lock)
## [1] 189
# Study sites (same as woltersites above, but a csv here)
studysites <- read.csv("../NAVIDIV/Data/Wolters/Sites_trimmed.csv")
# Combine studysites name, lat and long AND occasion2 locks
linking <- left_join(occasion2, studysites, c("Site_code_LR" = "Site_code_LR"))
str(linking)
## 'data.frame': 2737 obs. of 12 variables:
## $ River_name_copy: chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_code_LR : chr "DELR0399N" "DELR0399N" "DELR0399N" "DELR0399N" ...
## $ Occ_laufnr : int 3608 3609 3610 3611 3849 3846 3847 3848 3835 3840 ...
## $ Yearocc : int 2000 2001 2002 2003 1999 1997 1997 1998 1997 1998 ...
## $ lock_ID : chr "lo_35" "lo_35" "lo_35" "lo_35" ...
## $ WSV_ID : chr "lo_35_2000" "lo_35_2001" "lo_35_2002" "lo_35_2003" ...
## $ Stabu_ID : chr "lo_99_2000" "lo_99_2001" "lo_99_2002" "lo_99_2003" ...
## $ River_name : chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_name : chr "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" ...
## $ Y_Latitude : num 52.2 52.2 52.2 52.2 53 ...
## $ X_Longitude : num 11.7 11.7 11.7 11.7 11.6 ...
## $ Strahler : int 7 7 7 7 7 7 7 7 7 7 ...
# Now lets add the column "FRA_FTK_SpB_Son" from Locks_WSV which gives
# cumulative shipping per year per lock
sumshipdf <- dplyr::select(lockswsv, FRA_FTK_SpB_Son, lock_ID_WSV, WSV_ID)
str(sumshipdf)
## 'data.frame': 814 obs. of 3 variables:
## $ FRA_FTK_SpB_Son: int 19796 19796 19796 19796 19796 19796 19796 19796 19796 19796 ...
## $ lock_ID_WSV : chr "lo_01" "lo_01" "lo_01" "lo_01" ...
## $ WSV_ID : chr "lo_01_1994" "lo_01_1995" "lo_01_1996" "lo_01_1997" ...
linking <- left_join(linking, sumshipdf, c("lock_ID" = "lock_ID_WSV",
"WSV_ID" = "WSV_ID"))
str(linking)
## 'data.frame': 2737 obs. of 13 variables:
## $ River_name_copy: chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_code_LR : chr "DELR0399N" "DELR0399N" "DELR0399N" "DELR0399N" ...
## $ Occ_laufnr : int 3608 3609 3610 3611 3849 3846 3847 3848 3835 3840 ...
## $ Yearocc : int 2000 2001 2002 2003 1999 1997 1997 1998 1997 1998 ...
## $ lock_ID : chr "lo_35" "lo_35" "lo_35" "lo_35" ...
## $ WSV_ID : chr "lo_35_2000" "lo_35_2001" "lo_35_2002" "lo_35_2003" ...
## $ Stabu_ID : chr "lo_99_2000" "lo_99_2001" "lo_99_2002" "lo_99_2003" ...
## $ River_name : chr "Elbe" "Elbe" "Elbe" "Elbe" ...
## $ Site_name : chr "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" "Abstiegskanal" ...
## $ Y_Latitude : num 52.2 52.2 52.2 52.2 53 ...
## $ X_Longitude : num 11.7 11.7 11.7 11.7 11.6 ...
## $ Strahler : int 7 7 7 7 7 7 7 7 7 7 ...
## $ FRA_FTK_SpB_Son: int 17645 15078 14514 15436 37873 29557 29557 36800 29557 36800 ...
# Now lets save it as a spatial feature so that we can plot it
linking_sf <- st_as_sf(linking, coords = c("X_Longitude", "Y_Latitude"), crs = 4326)
# Lets change the name of that column to something sensible
linking_sf <- rename(linking_sf, Cumulative_Annual_Ships = FRA_FTK_SpB_Son)
# Lets plot it out
cumship_locks <- mapview(linking_sf, zcol = "Cumulative_Annual_Ships", legend = TRUE)
cumship_locks
Whats nice about this is that it links the biodiveristy data with the lock shipping data from that year of collection.
# Can play around with colors if we want
mapviewOptions(vector.palette = c("plum1", "firebrick4", "black"))
mapview(linking_sf, zcol = "Cumulative_Annual_Ships", legend = TRUE)
For example, when we zoom in on two biodiversity collection sites and look at their metadata, we see that one uses data from a 2007 lock and the other from 1999.
.
However, there is currently an issue with missing data Locks lo_99, lo_88, and lo_30 in the “lock_ID_WSV” column have 0 values for all years. This comes from the Locks_WSV csv file.
lockswsv <- read.csv("../NAVIDIV/Data/Wolters/Locks_WSV.csv")
missinglocks <- dplyr::filter(lockswsv, lock_ID_WSV == c("lo_99", "lo_88",
"lo_30"))
## Warning in lock_ID_WSV == c("lo_99", "lo_88", "lo_30"): longer object length is
## not a multiple of shorter object length
These locks account for roughly ~30% of all the observations…
length(occasion$lock_ID)
## [1] 2737
length(occasion$lock_ID[occasion$lock_ID == 'lo_99'])
## [1] 646
length(occasion$lock_ID[occasion$lock_ID == 'lo_88'])
## [1] 105
length(occasion$lock_ID[occasion$lock_ID == 'lo_30'])
## [1] 38
(645 + 105 + 38) / 2737
## [1] 0.2879065
If you zoom in on the Danube, you can see a good amount of Zero values.
Currently working with CW to figure this out. Will update.
.
For now, we can plot out March just to see spatial cover.
# For the sake of visualization, I'll just plot out one hour of one day.
marchpos1 <- st_read("../NAVIDIV/GIS/Shipping/March2022_Positions/28_03_22/28_03_22_16h.shp")
## Reading layer `28_03_22_16h' from data source
## `C:\Users\Aaron Sexton\Desktop\NAVIDIV\GIS\Shipping\March2022_Positions\28_03_22\28_03_22_16h.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 6660 features and 17 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -2.898483 ymin: 42.26117 xmax: 8.499942 ymax: 51.19984
## Geodetic CRS: WGS 84
mapview(marchpos1)
Now lets plot them together.
mapview(marchpos1, color = "black", col.regions = "grey", lwd = 2, layer.name = "RICOMEX") +
mapview(linking_sf, color = "black", col.regions = "pink", lwd = 2, layer.name = "CW")