Source: Washington Dept. of Fish & Wildlife

Introduction

This project uses data from the 2019 HawkWatch International Golden Eagles study. The study contains information on 33 individual golden eagles (Aquila chrysaetos) captured and tracked between 1999 and 2008 with the goal of documenting migration routes via satellite trackers. The dataset records information on the eagles themselves (sex, date of birth) and their locations at different timestamps throughout their migratory journeys.

Source: https://datarepository.movebank.org/entities/datapackage/aca486d6-3d99-4aeb-9d23-2bb61d28a7eb

Key variables:
-Timestamp
-Latitude/longitude
-Animal ID
-Date of Birth and Life Stage
-Sex

With these variables I would like to explore the eagles’ distance traveled over time and the variables that might affect it. I would mainly like to focus on how seasonal migration patterns affect distance traveled. To achieve this, I will clean the data by joining the bird reference and location data frames, removing outliers, and adding columns for the distance traveled and the time elapsed between consecutive location points.

The eagles in the study were captured, recorded, fitted with a tracking harness, and released. Location information was tracked using Argos satellite telemetry.

I chose to study this dataset because I am a fan of birds and I enjoyed looking into the migration patterns of birds featured in the FAA wildlife collisions dataset from a previous project. I also liked that this dataset included location information that I could plot on a map.

Load libraries and dataset

library("tidyverse")
library("geosphere")
library("ggfortify")

eagles <- read_csv("golden_eagles.csv")
eagles_reference <- read_csv("golden_eagles_reference.csv")

Join data frames

colnames(eagles)[21] <- "animal-id"     #ensure id column names match
eagles2 <- left_join(eagles, eagles_reference, by = "animal-id")
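
Renaming by column position works only as long as the ID stays in column 21. The sketch below shows one way to rename by name and check for IDs with no match in the reference table before joining. It uses toy data and base R only; the original column name "individual-local-identifier" is an assumption for illustration and is not confirmed by the dataset.

```r
# Toy stand-ins for the two data frames. The original ID column name
# "individual-local-identifier" is an assumption, not taken from the dataset.
locations <- data.frame(
  timestamp = as.POSIXct(c("2001-03-01 00:00:00", "2001-03-02 00:00:00",
                           "2001-03-01 00:00:00"), tz = "UTC"),
  `individual-local-identifier` = c("A1", "A1", "B2"),
  check.names = FALSE
)
reference <- data.frame(
  `animal-id` = c("A1", "C3"),
  `animal-sex` = c("f", "m"),
  check.names = FALSE
)

# Rename by name, so a change in column order cannot rename the wrong column
names(locations)[names(locations) == "individual-local-identifier"] <- "animal-id"

# IDs that appear in the location data but are missing from the reference table
unmatched_ids <- setdiff(locations[["animal-id"]], reference[["animal-id"]])
print(unmatched_ids)

# Base-R counterpart of a left join: every location row is kept,
# and rows without a reference match get NA in the reference columns
joined <- merge(locations, reference, by = "animal-id", all.x = TRUE)
print(joined)
```

If unmatched_ids is empty, every location record has reference information and the join loses nothing.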

Clean

#Remove symbols from column names
names(eagles2) <- gsub("-", "_", names(eagles2))
names(eagles2) <- gsub("argos\\:", "", names(eagles2))

Filter and gather useful variables

eagles3 <- eagles2 |>
  filter(visible == TRUE) |>  #filter out outliers
  select(timestamp,    #select only useful variables
         location_long, 
         location_lat, 
         comments, 
         animal_id, 
         deploy_on_date, 
         deploy_off_date, 
         deployment_end_comments, 
         animal_exact_date_of_birth, 
         animal_life_stage, 
         animal_sex, 
         tag_mass) |>
  group_by(animal_id) |>    #group by bird to separate times/positions 
  mutate(last_latitude = lag(location_lat, n=1),   
         last_longitude = lag(location_long, n=1),
         last_time = lag(timestamp, n=1),
#source of lag: https://www.geeksforgeeks.org/r-language/how-to-create-a-lag-variable-within-each-group-in-r/
         age_days = as.numeric(difftime(timestamp, animal_exact_date_of_birth, units = "days")),
         sec_since_last = as.numeric(difftime(timestamp, last_time, units = "secs")),
#source of difftime: https://www.geeksforgeeks.org/r-language/calculate-time-difference-between-dates-in-r-programming-difftime-function/
         meters_traveled = distHaversine(matrix(c(location_long, location_lat), ncol = 2), matrix(c(last_longitude, last_latitude), ncol = 2)),
#source of geosphere: https://stackoverflow.com/questions/32363998/function-to-calculate-geospatial-distance-between-two-points-lat-long-using-r
         travel_rate = meters_traveled/sec_since_last,
         season = case_when(
           month(timestamp) %in% 9:11 ~ "Autumn",
           month(timestamp) %in% c(12, 1, 2) ~ "Winter",
           month(timestamp) %in% 3:5 ~ "Spring",
           month(timestamp) %in% 6:8 ~ "Summer"
         )) |>
  filter(travel_rate <= 35)  #from readme "max plausible speed = 35m/sec"

For each location record, I used lag() to retrieve the bird’s previous position and timestamp, then used the geosphere library’s distHaversine function to find how far the bird traveled since that previous position update. Because lag() takes the previous row within each group, this assumes the rows for each bird are in chronological order. I then divided the distance by the time elapsed between the two recordings to find the bird’s average speed over the interval in meters per second.
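
As a worked illustration of the calculation described above, here is a minimal sketch for a single pair of position fixes. It reimplements the haversine formula in base R so that it runs without extra packages; the function name haversine_m and the two fixes are hypothetical, and the radius of 6378137 m is the default documented for geosphere's distHaversine.

```r
# Haversine great-circle distance in meters between two points given in degrees.
# haversine_m is a hypothetical helper; r = 6378137 m is the default radius
# documented for geosphere::distHaversine.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Two made-up fixes for one bird, one hour apart and 0.1 degrees of latitude apart
previous_fix <- list(lon = -110.0, lat = 45.0,
                     time = as.POSIXct("2001-03-01 12:00:00", tz = "UTC"))
current_fix <- list(lon = -110.0, lat = 45.1,
                    time = as.POSIXct("2001-03-01 13:00:00", tz = "UTC"))

meters_traveled <- haversine_m(previous_fix$lon, previous_fix$lat,
                               current_fix$lon, current_fix$lat)
sec_since_last <- as.numeric(difftime(current_fix$time, previous_fix$time,
                                      units = "secs"))

# Average speed over the interval, in meters per second
travel_rate <- meters_traveled / sec_since_last
print(c(meters = meters_traveled, seconds = sec_since_last, rate = travel_rate))
```

Here the bird covers roughly 11 km in 3600 seconds, an average of about 3.1 m/s, well under the 35 m/s plausibility cutoff.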

Multiple linear regression model

eagles_model <- lm(travel_rate ~ season + age_days, data = eagles3)
#animal_sex and tag_mass dropped due to statistical insignificance, animal_life_stage dropped due to multicollinearity
summary(eagles_model)
## 
## Call:
## lm(formula = travel_rate ~ season + age_days, data = eagles3)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.961 -1.989 -1.356 -0.007 32.931 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.097e+00  6.498e-02  32.266  < 2e-16 ***
## seasonSpring -2.137e-01  8.038e-02  -2.659  0.00784 ** 
## seasonSummer  1.062e+00  8.156e-02  13.020  < 2e-16 ***
## seasonWinter -7.162e-01  7.901e-02  -9.065  < 2e-16 ***
## age_days      4.844e-04  6.659e-05   7.273 3.62e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.364 on 22804 degrees of freedom
## Multiple R-squared:  0.02375,    Adjusted R-squared:  0.02358 
## F-statistic: 138.7 on 4 and 22804 DF,  p-value: < 2.2e-16

Diagnostic plots to check assumptions

#autoplot(eagles_model, 1:4, nrow=2, ncol=2)

Equation: travel_rate = 2.097 - 0.2137(Spring) + 1.062(Summer) - 0.7162(Winter) + 0.0004844(age_days)

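Autumn is the reference season, so each seasonal coefficient is a difference from the autumn average. This suggests that golden eagles travel fastest in the summer and slowest in the winter, and that they travel slightly faster as they age.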
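
To make the equation concrete, the sketch below plugs values into the fitted coefficients reported in the summary output. The function name predict_rate and the example age of 365 days are illustrative choices, not part of the original analysis.

```r
# Coefficients copied from the model summary; Autumn is the reference level,
# so it has no coefficient of its own
coefs <- c(intercept = 2.097, Spring = -0.2137, Summer = 1.062,
           Winter = -0.7162, age_days = 0.0004844)

# predict_rate is a hypothetical helper that evaluates the regression equation
predict_rate <- function(season, age_days) {
  season_effect <- if (season == "Autumn") 0 else coefs[[season]]
  coefs[["intercept"]] + season_effect + coefs[["age_days"]] * age_days
}

# Predicted average speed (m/s) for a 365-day-old eagle in each season
summer_rate <- predict_rate("Summer", 365)
winter_rate <- predict_rate("Winter", 365)
autumn_rate <- predict_rate("Autumn", 365)
print(c(summer = summer_rate, autumn = autumn_rate, winter = winter_rate))
```

For a one-year-old eagle this gives about 3.34 m/s in summer and 1.56 m/s in winter. The gap of about 1.78 m/s is the difference between the Summer and Winter coefficients and does not depend on age.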

P-value: < 2.2e-16, which indicates statistical significance, although a p-value this small is likely driven in part by the large sample size (over 22,000 observations).

Adjusted R^2: 0.02358, meaning ~2.4% of variance in speed can be explained by the model.

Diagnostic plots:
The Normal Q-Q plot’s right tail deviates from the line, indicating that the residuals are not normally distributed.
The scale-location plot’s line increases with fitted values, which suggests that the residual variance is not constant. I believe some of the recorded location data may be inaccurate, causing outliers in the speed calculation.

Overall, this model shows that season and age have a statistically significant association with travel speed, but it explains too little of the variance to be useful for predicting speed.

Prepare data for first plot

eagles4 <- eagles3 |>
  group_by(animal_life_stage, month(timestamp)) |>
  summarize(avg_velocity = mean(travel_rate))

#Clean data for plotting, make names more clear
colnames(eagles4)[2] <- "month"
eagles4$animal_life_stage[eagles4$animal_life_stage == "ATY"] <- "After Third Year"
eagles4$animal_life_stage[eagles4$animal_life_stage == "HY"] <- "Hatch Year"
eagles4$animal_life_stage[eagles4$animal_life_stage == "SY"] <- "Second Year" 
#Separated due to renamed month column
#Initially meant to fill line plot by season (unsuccessful) 
eagles4 <- eagles4 |>
  mutate(season = case_when(
           month %in% 9:11 ~ "Autumn",
           month %in% c(12, 1, 2) ~ "Winter",
           month %in% 3:5 ~ "Spring",
           month %in% 6:8 ~ "Summer"))

#Reorder for facet plot by factoring
eagles4$animal_life_stage <- factor(eagles4$animal_life_stage,
                                       levels = c("Hatch Year",
                                                  "Second Year",
                                                  "After Third Year"))
#head(eagles4)
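
The month-to-season mapping is written out twice in the code above. One way to keep the two copies from drifting apart is a small lookup helper, sketched here in base R. The name month_to_season is hypothetical, and the season boundaries are the same ones used in the case_when calls.

```r
# month_to_season is a hypothetical helper: it maps month numbers (1-12)
# to the season labels used in this analysis
month_to_season <- function(month) {
  season_by_month <- c("Winter", "Winter",            # Jan, Feb
                       "Spring", "Spring", "Spring",  # Mar-May
                       "Summer", "Summer", "Summer",  # Jun-Aug
                       "Autumn", "Autumn", "Autumn",  # Sep-Nov
                       "Winter")                      # Dec
  season_by_month[month]
}

seasons <- month_to_season(c(12, 1, 6, 9, 3))
print(seasons)
```

The helper is vectorized, so it could be called inside mutate() on a month column in place of each case_when block.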

Plot 1

ggplot(eagles4, aes(x = month, y = avg_velocity)) +
  geom_line() + 
  geom_ribbon(data = subset(eagles4, month %in% 6:9),
              aes(ymin = 0, ymax = avg_velocity),
              fill = "darkgreen") +
  geom_ribbon(data = subset(eagles4, month %in% 9:12),
              aes(ymin = 0, ymax = avg_velocity),
              fill = "darkorange") +
  geom_ribbon(data = subset(eagles4, month %in% 12),
              aes(ymin = 0, ymax = avg_velocity),
              fill = "lightcyan") +
  geom_ribbon(data = subset(eagles4, month %in% 1:3),
              aes(ymin = 0, ymax = avg_velocity),
              fill = "lightcyan") +
  geom_ribbon(data = subset(eagles4, month %in% 3:6),
              aes(ymin = 0, ymax = avg_velocity),
              fill = "lightgreen") +
#source for color filling: https://stackoverflow.com/questions/28730083/filling-in-the-area-under-a-line-graph-in-ggplot2-geom-area and help from google ai overview
  labs(title = "Golden Eagle Travel Velocity by Age Group",
       x = "Month",
       y = "Average Velocity (m/s)",
       caption = "Color indicates season \nSource: HawkWatch International Golden Eagles") +
  facet_wrap(~animal_life_stage) + 
  scale_x_continuous(
    breaks = 1:12,
    labels = month.abb
  ) + 
  theme_bw() + 
  theme(axis.text.x = element_text(angle = 90))

These faceted line plots show average golden eagle travel speed in m/s for each month of the year, separated by eagle age group. The area under the line is colored by season to show how season affects travel speed, while the facets show the differences in speed between younger and older eagles.

Plot 2

https://public.tableau.com/views/Book3_17658443372950/Sheet1?:language=en-US&:sid=&:redirect=auth&:display_count=n&:origin=viz_share_link

This map was created using Tableau Pages to animate golden eagle movement over time. Each point indicates the location of an eagle at the timestamp displayed under the title. The animal ID is displayed near each point to differentiate the birds at a glance. The size of each point grows in proportion to the eagle’s age in days, and sex (m/f) is indicated by point shape. Darker points indicate a higher travel speed. Hovering over a point shows a tooltip with additional information such as the exact time the location was captured, date of birth, and coordinates. The eagle IDs can be filtered to focus on the movement of specific birds.

Conclusion

Through my exploration of the data and the subsequent visualizations, I have found that season and age affect the speed at which golden eagles travel, while sex seems to have little to no impact. From my line plots, it is apparent that the eagles cover the most distance over time from late spring to summer, and from late summer to early fall. This does not surprise me, as it is consistent with the migration patterns of golden eagles. Golden eagles migrate from northern breeding grounds like Alaska and Canada to the southwest US towards winter, and they migrate north to higher elevations in the summer. My map is also consistent with this pattern, as the animation shows northward migration to Canada and Alaska in the summer and southward migration to the southern US towards winter. It also shows that some birds don’t move much, which could be explained by the fact that some eagles, especially those living at high elevations like the Rocky Mountains, tend to stay local year-round.

I initially wanted to measure total distance traveled by the eagles, but the time intervals between measurements were very inconsistent, so I measured average speed (distance over time) instead. I also wanted to see how speed changes at different times of the day (or night), but I could not analyze this accurately for the same reason: some gaps between measurements span multiple days.

Citations

Katzner, Todd E., et al. “Golden Eagle (Aquila chrysaetos), Version 2.0.” Birds of the World, 2020, birdsoftheworld.org/bow/species/goleag/cur/introduction, https://doi.org/10.2173/bow.goleag.02.

Smith JP. 2019. Data from: Study “HawkWatch International Golden Eagles”. Movebank Data Repository. https://doi.org/10.5441/001/1.95r77m9k