EEB4200 Lab 5

Load in packages! (And install as needed)

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(leaflet)
#install.packages("leaflet.providers")
library(leaflet.providers)
#install.packages("V8")
library(V8)
## Using V8 engine 13.6.233.17
## 
## Attaching package: 'V8'
## 
## The following object is masked from 'package:leaflet':
## 
##     JS
#install.packages("leaflet.extras2")
library(leaflet.extras2)
#install.packages("leaflegend")
library(leaflegend)
library(readr)
library(sf)
## Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
#install.packages("rgbif")
library(rgbif)
library(RColorBrewer)
#install.packages("elevatr")
library(elevatr)
## elevatr v0.99.0 NOTE: Version 0.99.0 of 'elevatr' uses 'sf' and 'terra'.  Use 
## of the 'sp', 'raster', and underlying 'rgdal' packages by 'elevatr' is being 
## deprecated; however, get_elev_raster continues to return a RasterLayer.  This 
## will be dropped in future versions, so please plan accordingly.
#install.packages("terra")
library(terra)
## terra 1.8.93
## 
## Attaching package: 'terra'
## 
## The following object is masked from 'package:tidyr':
## 
##     extract
#install.packages("raster")
library(raster)
## Loading required package: sp
## 
## Attaching package: 'raster'
## 
## The following object is masked from 'package:dplyr':
## 
##     select
#install.packages("geodata")
library(geodata)

Connect with data housed in gbif (https://www.gbif.org/) to examine migration patterns of a specific species of bird (4 pts).

I chose to examine the migration patterns of Branta canadensis, the Canada goose. It is a well known migratory bird, primarily found across North America and northern Europe. It had a very large number of occurrences, so I decided to look at recent migratory patterns from 2025 in Europe.

#importing the data
goose_gbif<- occ_download_get(key = "0016597-260226173443078", overwrite = TRUE) %>% 
    occ_download_import(kenya_gbif_download, na.strings = c("", NA))
## Download file size: 15.89 MB
## On disk at /Users/katring07/Documents/R Directory/0016597-260226173443078.zip
#getting a look at the data
summary(goose_gbif)
##      gbifID            datasetKey        occurrenceID         kingdom         
##  Min.   :4070288353   Length:151333      Length:151333      Length:151333     
##  1st Qu.:5104984848   Class :character   Class :character   Class :character  
##  Median :5171547764   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :5323988601                                                           
##  3rd Qu.:5287926073                                                           
##  Max.   :6163321222                                                           
##                                                                               
##     phylum             class              order              family         
##  Length:151333      Length:151333      Length:151333      Length:151333     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     genus             species          infraspecificEpithet  taxonRank        
##  Length:151333      Length:151333      Length:151333        Length:151333     
##  Class :character   Class :character   Class :character     Class :character  
##  Mode  :character   Mode  :character   Mode  :character     Mode  :character  
##                                                                               
##                                                                               
##                                                                               
##                                                                               
##  scientificName     verbatimScientificName verbatimScientificNameAuthorship
##  Length:151333      Length:151333          Length:151333                   
##  Class :character   Class :character       Class :character                
##  Mode  :character   Mode  :character       Mode  :character                
##                                                                            
##                                                                            
##                                                                            
##                                                                            
##  countryCode          locality         stateProvince      occurrenceStatus  
##  Length:151333      Length:151333      Length:151333      Length:151333     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  individualCount   publishingOrgKey   decimalLatitude decimalLongitude 
##  Min.   :   1.00   Length:151333      Min.   :38.73   Min.   :-23.822  
##  1st Qu.:   1.00   Class :character   1st Qu.:51.30   1st Qu.:  4.395  
##  Median :   3.00   Mode  :character   Median :52.79   Median :  6.250  
##  Mean   :  23.49                      Mean   :54.72   Mean   :  8.233  
##  3rd Qu.:  13.00                      3rd Qu.:58.20   3rd Qu.: 12.691  
##  Max.   :9000.00                      Max.   :71.09   Max.   : 38.085  
##  NA's   :29044                                                         
##  coordinateUncertaintyInMeters coordinatePrecision   elevation     
##  Min.   :     0.5              Mode:logical        Min.   :  0.0   
##  1st Qu.:    25.0              NA's:151333         1st Qu.:140.0   
##  Median :   300.0                                  Median :217.5   
##  Mean   :  1430.8                                  Mean   :220.9   
##  3rd Qu.:  1760.0                                  3rd Qu.:236.2   
##  Max.   :532114.0                                  Max.   :480.0   
##  NA's   :20039                                     NA's   :151322  
##  elevationAccuracy  depth         depthAccuracy   eventDate        
##  Min.   : 15.00    Mode:logical   Mode:logical   Length:151333     
##  1st Qu.: 20.00    NA's:151333    NA's:151333    Class :character  
##  Median : 35.00                                  Mode  :character  
##  Mean   : 45.00                                                    
##  3rd Qu.: 59.38                                                    
##  Max.   :120.00                                                    
##  NA's   :151323                                                    
##       day            month             year         taxonKey      
##  Min.   : 1.00   Min.   : 1.000   Min.   :2025   Min.   :5232437  
##  1st Qu.: 8.00   1st Qu.: 3.000   1st Qu.:2025   1st Qu.:5232437  
##  Median :15.00   Median : 5.000   Median :2025   Median :5232437  
##  Mean   :15.28   Mean   : 5.261   Mean   :2025   Mean   :5233887  
##  3rd Qu.:23.00   3rd Qu.: 7.000   3rd Qu.:2025   3rd Qu.:5232437  
##  Max.   :31.00   Max.   :12.000   Max.   :2025   Max.   :7191098  
##  NA's   :150     NA's   :52                                       
##    speciesKey      basisOfRecord      institutionCode    collectionCode    
##  Min.   :5232437   Length:151333      Length:151333      Length:151333     
##  1st Qu.:5232437   Class :character   Class :character   Class :character  
##  Median :5232437   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :5232437                                                           
##  3rd Qu.:5232437                                                           
##  Max.   :5232437                                                           
##                                                                            
##  catalogNumber      recordNumber       identifiedBy      
##  Length:151333      Length:151333      Length:151333     
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##  dateIdentified                  license          rightsHolder      
##  Min.   :2025-01-01 20:05:10   Length:151333      Length:151333     
##  1st Qu.:2025-04-10 19:37:17   Class :character   Class :character  
##  Median :2025-05-30 13:06:30   Mode  :character   Mode  :character  
##  Mean   :2025-06-21 17:03:40                                        
##  3rd Qu.:2025-08-30 19:56:57                                        
##  Max.   :2026-02-23 12:14:31                                        
##  NA's   :143322                                                     
##   recordedBy        typeStatus     establishmentMeans
##  Length:151333      Mode:logical   Length:151333     
##  Class :character   NA's:151333    Class :character  
##  Mode  :character                  Mode  :character  
##                                                      
##                                                      
##                                                      
##                                                      
##  lastInterpreted                mediaType            issue          
##  Min.   :2025-11-04 18:07:09   Length:151333      Length:151333     
##  1st Qu.:2026-02-18 02:01:52   Class :character   Class :character  
##  Median :2026-02-28 15:03:50   Mode  :character   Mode  :character  
##  Mean   :2026-02-17 09:12:38                                        
##  3rd Qu.:2026-03-03 22:43:23                                        
##  Max.   :2026-03-04 23:15:53                                        
## 
head(goose_gbif)
## # A tibble: 6 × 50
##   gbifID datasetKey occurrenceID kingdom phylum class order family genus species
##   <int6> <chr>      <chr>        <chr>   <chr>  <chr> <chr> <chr>  <chr> <chr>  
## 1    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## 2    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## 3    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## 4    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## 5    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## 6    5e9 a4d53b24-… http://tun.… Animal… Chord… Aves  Anse… Anati… Bran… Branta…
## # ℹ 40 more variables: infraspecificEpithet <chr>, taxonRank <chr>,
## #   scientificName <chr>, verbatimScientificName <chr>,
## #   verbatimScientificNameAuthorship <chr>, countryCode <chr>, locality <chr>,
## #   stateProvince <chr>, occurrenceStatus <chr>, individualCount <int>,
## #   publishingOrgKey <chr>, decimalLatitude <dbl>, decimalLongitude <dbl>,
## #   coordinateUncertaintyInMeters <dbl>, coordinatePrecision <lgl>,
## #   elevation <dbl>, elevationAccuracy <dbl>, depth <lgl>, …
#pared down some of the variables to just the ones I need
goose_slim<-goose_gbif%>%
  drop_na(individualCount)%>%
  dplyr::select(gbifID,individualCount,countryCode,stateProvince,decimalLatitude,decimalLongitude,elevation,month,year)%>%
  rename(latitude=decimalLatitude,longitude=decimalLongitude)

#smaller data sets to play around with to begin with
goose_200<-goose_slim[1:200, ] #first 200
goose_2000<-goose_slim[1:2000, ] #first 2000
goose_fl2000<-goose_slim[c(1:1000,121000:122000), ] #first and last 1000, so 2000 combined

#looking at some data that pulls from the start and middle-end of the total data (I wanna make sure it has varying months and locations for experimenting with colors in leaflet)
head(goose_fl2000)
## # A tibble: 6 × 9
##    gbifID individualCount countryCode stateProvince latitude longitude elevation
##   <int64>           <int> <chr>       <chr>            <dbl>     <dbl>     <dbl>
## 1     5e9               4 FI          Uusimaa           60.2      24.6        NA
## 2     5e9              45 FI          Varsinais-Su…     60.4      22.4        NA
## 3     5e9              85 FI          Uusimaa           60.2      24.7        NA
## 4     5e9              30 FI          Päijät-Häme       60.8      26.1        NA
## 5     5e9               2 FI          Uusimaa           60.3      25.2        NA
## 6     5e9             235 FI          Varsinais-Su…     60.5      22.0        NA
## # ℹ 2 more variables: month <int>, year <int>
tail(goose_fl2000)
## # A tibble: 6 × 9
##    gbifID individualCount countryCode stateProvince latitude longitude elevation
##   <int64>           <int> <chr>       <chr>            <dbl>     <dbl>     <dbl>
## 1     5e9               6 BE          Brussels Cap…     50.8      4.43        NA
## 2     5e9               2 BE          Brussels Cap…     50.8      4.44        NA
## 3     5e9               7 BE          Limburg           50.8      5.40        NA
## 4     5e9               2 BE          Antwerp           51.0      4.43        NA
## 5     5e9               1 BE          Limburg           51.0      5.69        NA
## 6     5e9              20 BE          Brussels Cap…     50.8      4.30        NA
## # ℹ 2 more variables: month <int>, year <int>
#color palette according to the months
month<-c(1,2,3,4,5,6,7,8,9,10,11,12)
factpal<-colorFactor("Set3",levels=month)
#test to get a feel for leaflet
map1<-leaflet(goose_fl2000)%>%
  addTiles()%>%
  addCircleMarkers(~longitude,~latitude,popup=goose_fl2000$individualCount,radius=1,fill=TRUE,fillOpacity=0.3,color=~factpal(month))
map1

Display this information (bird observations and migration) on a map via leaflet - include layers for temperature and elevation (10 pts).

Label the three most populous cities near the largest set of observations for your target species (4 pts)

Provide appropriate legends to describe the data (2 pts). Describe the choices you made on the map design and representation.

This first chunk is mostly setting up my data, objects, palettes, etc. so I don’t have to rerun all of this each time I iterate the chunk with my map(s).

# Canada geese typically migrate for winter between September to November, and return for spring between late February to May. So I considered the months December to February, their 'winter' habitat, and June to August, their 'summer' habitat.

#assigning season based on month to the data
winter<-c(1,2,12)
summer<-c(6,7,8)
goose_winter<-goose_slim%>%
  drop_na(individualCount)%>%
  mutate(month=factor(month),season="winter", )%>%
  filter(month%in%winter)
goose_summer<-goose_slim%>%
  drop_na(individualCount)%>%
  mutate(month=factor(month),season="summer")%>%
  filter(month%in%summer)
#putting the season data back together
goose_season<-rbind(goose_winter,goose_summer)

#color palettes for each season
winterpal<-colorFactor("Blues",levels=winter)
summerpal<-colorFactor("Oranges",levels=summer)

#none of the below commented out code really worked, but I thought i would keep it in honor of the blood, sweat, and tears drained into figuring out raster layers or wms tiles or how to get any of that into leaflet

#tavg_data<-worldclim_global(var="tavg", res=10, path=tempdir())
#tavg_data
#lontavg_data_jan<-as.tif(tavg_data$wc2.1_10m_tavg_01)
#rast_tavg_jan<-rast(tavg_data_jan)
#tavg_data_jan
#rast_tavg_jan
#tpal <- colorNumeric(
  #palette = "Spectral",
  #domain = c(min(tavg_data_jan$wc2.1_10m_tavg_01),max(tavg_data_jan$wc2.1_10m_tavg_01)),
  #na.color = "transparent")

#tmap<-leaflet()%>%
  #addWMSTiles(
    #baseUrl = "https://api.meteomatics.com/wms?VERSION=1.3.0&REQUEST=GetMap&LAYERS=mix:t_2m:C&STYLE=JET&CRS=EPSG:4326&BBOX=-90,-180,90,180&WIDTH=1024&HEIGHT=512&FORMAT=image/png",
    #layers = wms_layer,
    #options = WMSTileOptions(format = "image/png", transparent = TRUE), # Common options
    #attribution = "Some Attribution")
#nlyr(tavg_data_jan)

#for question 3
#find highest counts of observations
high_count<-goose_gbif%>%
  group_by(stateProvince)%>%
  summarise(individualCount=sum(individualCount))%>%
  arrange(desc(individualCount))
head(high_count)
## # A tibble: 6 × 2
##   stateProvince   individualCount
##   <chr>                     <int>
## 1 Antwerp                   78538
## 2 East Flanders             63426
## 3 West Flanders             57776
## 4 Flemish Brabant           32728
## 5 Hainaut                   23264
## 6 Namur                     18420
#largest set of observations is near Antwerp. Mechelen and Lier are two other populous cities in the area.
Antwerp_coord<-c(51.2199,4.4150)
Mechelen_coord<-c(51.0267,4.4777)
Lier_coord<-c(51.1333,4.5667)

The below chunk has 2 maps for questions 2-4. I made two maps because I think the slight variations in marker size have their own pros and cons.

#map with previously defined winter or summer observations are marked. this first map has size proportional to the individual count.
map2<-leaflet()%>%
  addTiles()%>%
  addProviderTiles(providers$OpenTopoMap,group="Elevation")%>%
  addProviderTiles(providers$OpenWeatherMap.Temperature,options = providerTileOptions(apiKey = "9cce9e25d3ac85975490885938ddd794"),group="Temperature")%>%
  addOpenweatherTiles(apikey="9cce9e25d3ac85975490885938ddd794",layers = "temperature",group="Temperature")%>%
  addCircleMarkers(data=filter(goose_season,season=="summer"),~longitude,~latitude,popup=goose_season$individualCount,radius=log(goose_season$individualCount),fill=TRUE,opacity=0.3,color=~summerpal(month),group="Summer")%>%
  addCircleMarkers(data=filter(goose_season,season=="winter"),~longitude,~latitude,popup=goose_season$individualCount,radius=log(goose_season$individualCount),fill=TRUE,opacity=0.3,color=~winterpal(month),group="Winter")%>%
  addMarkers(lng = Antwerp_coord[2], lat = Antwerp_coord[1], label = "Antwerp", labelOptions = labelOptions(noHide = TRUE),group="Cities") %>%
  addMarkers(lng = Mechelen_coord[2], lat = Mechelen_coord[1], label = "Mechelen", labelOptions = labelOptions(noHide = TRUE),group="Cities") %>%
  addMarkers(lng = Lier_coord[2], lat = Lier_coord[1], label = "Lier", labelOptions = labelOptions(noHide = TRUE),group="Cities")%>%
  addLegend(title = "Month",
            pal=winterpal,
            values = c(1,2,12),
            labels = num_month,opacity = 1,group="Winter",labFormat=labelFormat())%>%
  addLegend(title = "Month",
            pal=summerpal,
            values = c(6,7,8),
            labels = c("June","July","August"),opacity = 1,group="Summer",labFormat=labelFormat())%>%
  addLegend(title = "Elevation",
            pal = colorNumeric(palette = c("mediumblue","steelblue","mediumseagreen", "limegreen", "khaki1","lightgoldenrod3", "peru", "brown","white"), domain = c(0, 4000)),
            values = c(0, 4000),
            labels = c("0-500m", "501-1000m", "1001-1500m", "1501-2000m", "2001-2500m", "2501-3000m","3001-3500m","3501-4000m"),opacity = 1,group="Elevation")%>%
  addLayersControl(position="topleft",overlayGroups = c("Winter","Summer","Elevation","Temperature","Cities"),options = layersControlOptions(collapsed = FALSE))%>%
  fitBounds(min(goose_season$longitude), min(goose_season$latitude), max(goose_season$longitude), max(goose_season$latitude))
map2
#then for visibility/clarity i made another map where each marker is the same size, so as to obscure less of the map.
map3<-leaflet()%>%
  addTiles()%>%
  addProviderTiles(providers$OpenTopoMap,group="Elevation")%>%
  addProviderTiles(providers$OpenWeatherMap.Temperature,options = providerTileOptions(apiKey = "9cce9e25d3ac85975490885938ddd794"),group="Temperature")%>%
  addOpenweatherTiles(apikey="9cce9e25d3ac85975490885938ddd794",layers = "temperature",group="Temperature")%>%
  addCircleMarkers(data=filter(goose_season,season=="summer"),~longitude,~latitude,popup=goose_season$individualCount,radius=0.2,fill=TRUE,opacity=0.3,color=~summerpal(month),group="Summer")%>%
  addCircleMarkers(data=filter(goose_season,season=="winter"),~longitude,~latitude,popup=goose_season$individualCount,radius=0.2,fill=TRUE,opacity=0.3,color=~winterpal(month),group="Winter")%>%
  addMarkers(lng = Antwerp_coord[2], lat = Antwerp_coord[1], label = "Antwerp", labelOptions = labelOptions(noHide = TRUE),group="Cities") %>%
  addMarkers(lng = Mechelen_coord[2], lat = Mechelen_coord[1], label = "Mechelen", labelOptions = labelOptions(noHide = TRUE),group="Cities") %>%
  addMarkers(lng = Lier_coord[2], lat = Lier_coord[1], label = "Lier", labelOptions = labelOptions(noHide = TRUE),group="Cities")%>%
  addLegend(title = "Month",
            pal=winterpal,
            values = c(1,2,12),
            labels = num_month,opacity = 1,group="Winter",labFormat=labelFormat())%>%
  addLegend(title = "Month",
            pal=summerpal,
            values = c(6,7,8),
            labels = c("June","July","August"),opacity = 1,group="Summer",labFormat=labelFormat())%>%
  addLegend(title = "Elevation",
            pal = colorNumeric(palette = c("mediumblue","steelblue","mediumseagreen", "limegreen", "khaki1","lightgoldenrod3", "peru", "brown","white"), domain = c(0, 4000)),
            values = c(0, 4000),
            labels = c("0-500m", "501-1000m", "1001-1500m", "1501-2000m", "2001-2500m", "2501-3000m","3001-3500m","3501-4000m"),opacity = 1,group="Elevation")%>%
  addLayersControl(position="topleft",overlayGroups = c("Winter","Summer","Elevation","Temperature","Cities"),options = layersControlOptions(collapsed = FALSE))%>%
  fitBounds(min(goose_season$longitude), min(goose_season$latitude), max(goose_season$longitude), max(goose_season$latitude))
map3
#a lot of overlap in the layers so the toggle/interactivity is very necessary, so this would not be well suited to a static image


#explanation of map design:
#base tiles: OpenStreetMap is the default and cannot be toggled, so the points remain in geographical context. OpenTopoMap was used for the elevation with colors matched for the legend. OpenWeatherMap.Temperature was used for the temperature, legend automatically generated for that. The latter two can be toggled, so as to reduce visual noise as needed, but still included for the information.
#markers: Circle Markers are used to show each occurance of Branta Canadensis in a given region. Their colors are from the "Blues" palette (associated with cold) for winter, and the "Oranges" palette (associated with warmth) for summer. Each season is added as its own group of markers so they can be toggled separately. As commented above, the first map has the size of each marker proportional to the individual count of each occurance, while the second map has set marker size. Then I used the default Marker for the three most populous cities, using each cities coordinates.
#legend: Legends for any color associated with a variable, indlucing: elevation, temperature, and month. Month is split between the seasons so they can be toggled with the markers.

#also some quick interpretation based on the visuals. toggling between winter and summer, there seems to be a much fewer observations of Canada Geese in the far north during the winter, but there does not seem to be much of a change in the southern regions during the summer. this makes sense since the more temperate regions may have more residential populations, while regions with more extreme temperature variation (like in the far north) may have more migratory populations (in this case migrating south away from the cold).