In order to make an exmaple of the problem I will use a variation of the example used in the SF website
b0 = st_polygon(list(rbind(c(-1,-1), c(1,-1), c(1,1), c(-1,1), c(-1,-1))))
b1 = b0 + 2
b2 = b0 + c(-0.2, 2)
x = st_sfc(b0, b1, b2)
red_sf <- x %>% st_as_sf() %>% mutate(Red = 1)
red_sf$NameRed <- c("b0", "b1", "b2")
a0 = b0 * 0.8
a1 = a0 * 0.5 + c(2, 0.7)
a2 = a0 + 0.5
a3 = b0 * 0.5 + c(2, -0.5)
y = st_sfc(a1,a2,a3)
green_sf <- y %>% st_as_sf() %>% mutate(Green = 1)
green_sf$NameGreen <- c("a1", "a2", "a3")
This creates 2 groups of polygons that we can see here:
As it can be seen there is some intersection between both groups of polygons, that compromises the 3 polygons in the red group, and 2 of the polygons in the green group. Since the solution is going to be later used for several groups each with over 300.000 polygons, the idea is that the solution generates a pre calculated sf, where all the polygons that are not intersecting with the other group are left us such.
If I add the green and read areas I should always get the following area:
st_area(st_union(st_union(red_sf), st_union(green_sf)))
## [1] 14.01
Which is the area of the following polygon:
This should always get you:
st_area(st_union(red_sf))
## [1] 12
Which is the area of the following polygon:
This should always get you:
st_area(st_union(green_sf))
## [1] 4.2
Which is the area of the following polygon:
x=1;y=2;z=x+y;z
## [1] 3
And then the difference of both layers with the intersection
I found that with just terra’s function union, it works super fast, and it is very simple:
red_vect <- red_sf %>% vect()
green_vect <- green_sf %>% vect()
All_Terra <- terra::union(red_vect, green_vect)
All_Terra$Area <- All_Terra %>% terra::expanse()
if we saw this as a table it looks like this:
Red | NameRed | Green | NameGreen | Area |
---|---|---|---|---|
1 | b0 | 1 | a2 | 1.69 |
1 | b1 | 1 | a1 | 0.08 |
1 | b1 | 1 | a2 | 0.09 |
1 | b2 | 1 | a2 | 0.33 |
1 | b0 | NaN | NA | 2.31 |
1 | b1 | NaN | NA | 3.83 |
1 | b2 | NaN | NA | 3.67 |
NaN | NA | 1 | a1 | 0.56 |
NaN | NA | 1 | a2 | 0.45 |
NaN | NA | 1 | a3 | 1.00 |
if we visualize this:
Total Area:
sum(All_Terra$Area)
## [1] 14.01
Green Area:
sum(All_Terra[All_Terra$Green == 1,]$Area)
## [1] 4.2
Red Area:
sum(All_Terra[All_Terra$Red == 1,]$Area)
## [1] 12
When we try with this:
Natura_Small <- sf::read_sf("/vsicurl/https://github.com/Sustainscapes/BiodiversityCounsil/raw/master/ForQuestion/Natura_Small.shp") %>%
terra::vect() %>% aggregate(by = "temanavn")
markblokkort_small <- sf::read_sf("/vsicurl/https://github.com/Sustainscapes/BiodiversityCounsil/raw/master/ForQuestion/markblokkort_small.shp") %>%
terra::vect() %>% aggregate(by='MB_TYPE')
We can see that the area of Natura is:
terra::expanse(Natura_Small) %>% prettyNum(big.mark = ",")
## [1] "755,664,778"
however when we try this:
Mark_Natura <- terra::union(markblokkort_small,Natura_Small)
Mark_Natura$Area <- expanse(Mark_Natura)
Mark_Natura_DF <- as.data.frame(Mark_Natura)
MB_TYPE | agg_n | temanavn | agg_n | Area |
---|---|---|---|---|
OMD | 26504 | Natura2000-områder | 20 | 68133354.3 |
PGR | 12738 | Natura2000-områder | 20 | 56607322.2 |
OMD | 26504 | NA | NA | 1881661298.0 |
PGR | 12738 | NA | NA | 237778740.2 |
NA | 20 | Natura2000-områder | NA | 927270.8 |
And we calculate the total area of natura 2000
sum(Mark_Natura[!is.na(Mark_Natura$temanavn),]$Area) %>% prettyNum(big.mark = ",")
## [1] "125,667,947"
So we lost 629,996,831 Sq meters
if we visualize this:
Natura_Smallest <- sf::read_sf("/vsicurl/https://github.com/Sustainscapes/BiodiversityCounsil/raw/master/ForQuestion/Natura_Smallest.shp") %>%
terra::vect() %>% aggregate(by = "temanavn")
markblokkort_smallest <- sf::read_sf("/vsicurl/https://github.com/Sustainscapes/BiodiversityCounsil/raw/master/ForQuestion/markblokkort_smallest.shp") %>%
terra::vect() %>% aggregate(by='MB_TYPE')
We can see that the area of Natura is:
terra::expanse(Natura_Smallest) %>% prettyNum(big.mark = ",")
## [1] "82,220,915"
Now when we try this:
Mark_Natura_smallest <- terra::union(markblokkort_smallest,Natura_Smallest)
Mark_Natura_smallest$Area <- expanse(Mark_Natura_smallest)
Mark_Natura_DF_smallest <- as.data.frame(Mark_Natura_smallest)
MB_TYPE | mean_agg_n | agg_n | temanavn | mean_agg_n | agg_n | Area |
---|---|---|---|---|---|---|
OMD | 26504 | 1 | Natura2000-områder | 250 | 1 | 15376765 |
PGR | 12738 | 1 | Natura2000-områder | 250 | 1 | 7975296 |
OMD | 26504 | 1 | NA | NaN | NA | 91141740 |
PGR | 12738 | 1 | NA | NaN | NA | 18286329 |
NA | 250 | 1 | Natura2000-områder | NaN | NA | 58868853 |
And we calculate the total area of natura 2000
sum(Mark_Natura_smallest[!is.na(Mark_Natura_smallest$temanavn),]$Area) %>% prettyNum(big.mark = ",")
## [1] "82,220,915"
So we lost 0.1291361 Sq meters
if we visualize this: