GPS Tracking of Eagle Data in Iowa

How Do Eagles in Iowa Move Around?

We wanted to see how eagles move around in the “Hawkeye” State!

Background

  • Biologging is the use of sensors attached to animals to monitor their movement, heart rate, and other behaviors

  • In this large-scale observational study, over 2 million data points were collected from 57 eagles over a 4-year period

Our variables

  • Overall, 9 numerical variables were drawn from this study to chart out GPS points

  • We work with only six of these variables to analyze what our tracking points say about 2 research questions:

  1. Can we use the movement variables KPH, Sn, AGL, |Angle|, Vertical rate, and |VR| to differentiate in-flight points from perching points?
  2. What are the main takeaways from analyzing and visualizing the in-flight points?

Data

We were given a massive dataset (2 million rows), so we may need to sample it down to make it smaller and easier to work with.

# A tibble: 6 × 15
  Animal_ID TimeDiff segment_id segment_length LocalTime           Latitude Longitude       X        Y   KPH    Sn   AGL VerticalRate abs_angle absVR
      <int>    <int>      <dbl>          <int> <dttm>                 <dbl>     <dbl>   <dbl>    <dbl> <dbl> <dbl> <dbl>        <dbl>     <dbl> <dbl>
1       105        5          1             10 2019-06-25 16:44:10     41.6     -92.9 509745. 4609596.  0.07  0.52  9.55        -0.99      0.45  0.99
2       105        6          1             10 2019-06-25 16:44:16     41.6     -92.9 509746. 4609597.  0.07  0.17  9.55         0         0.21  0   
3       105        5          1             10 2019-06-25 16:44:21     41.6     -92.9 509747. 4609597.  0.2   0.14  9.55         0         0.91  0   
4       105        6          1             10 2019-06-25 16:44:27     41.6     -92.9 509747. 4609599.  0.16  0.22  9.55         0         3.14  0   
5       105        5          1             10 2019-06-25 16:44:32     41.6     -92.9 509747. 4609598.  0.11  0.08 11.6          0.39      2.69  0.39
6       105        6          1             10 2019-06-25 16:44:38     41.6     -92.9 509746. 4609599.  0.07  0.22 11.6          0         0.53  0   

Methods

  • We want to reduce the dataset, since 2 million rows is a lot. I grouped by Animal_ID and randomly sampled from each eagle to get a smaller dataset that is faster to work with.
  • We want to distinguish points from each other before we look at a PCA biplot.
  • Clustering methods will help pinpoint which points should be grouped together.

Is our data normal?

  • What does the distribution of each variable look like?
  • We see that abs_angle, absVR, AGL, and Sn are all strongly positively skewed, which is not good.

Fixing our data first

  • Before we do any work with the data, we want to make sure that it is not heavily skewed

  • I used a square root transformation to reduce the skew in the focal variables
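
To show why a square root transformation helps with positive skew, here is a minimal sketch on simulated data (not the eagle data). The `skewness` helper is defined here for illustration and is not a base R function; the exponential sample is only an assumed stand-in for a right-skewed variable such as AGL.

```r
# Simulated stand-in for a right-skewed variable (not the eagle data)
set.seed(123)
x <- rexp(1000, rate = 0.1)

# Moment-based sample skewness; a helper defined here, not part of base R
skewness <- function(v) {
  v <- v[!is.na(v)]
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}

skew_raw  <- skewness(x)        # skewness before the transformation
skew_sqrt <- skewness(sqrt(x))  # skewness after the transformation

# Values closer to 0 indicate a more symmetric distribution
print(c(raw = skew_raw, sqrt = skew_sqrt))
```

Comparing the two numbers gives a quick check alongside the histograms: the transformation is doing its job if the second value is closer to 0 than the first.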

Clustering our data

  • K-means is an efficient way to cluster large datasets, such as the movement data analyzed here

Plots for choosing k: 1st.) Silhouette, 2nd.) WSS (within-cluster sum of squares)
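
The silhouette and WSS diagnostics can be sketched as follows. This is an illustration on simulated data with two well-separated groups, not the code behind the slides; the object names (`sim`, `ks`, `avg_sil`, `wss`, `best_k`) are assumptions. The `factoextra` function `fviz_nbclust()` can draw the same two diagnostics as plots, though whether the slides used it is not stated.

```r
library(cluster)  # provides silhouette()

set.seed(123)
# Simulated stand-in: two well-separated groups measured on two variables
sim <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 4, sd = 0.5), ncol = 2)
)
sim <- scale(sim)  # put the variables on a common scale before k-means
d <- dist(sim)     # pairwise distances, needed for silhouette widths

ks <- 2:6
wss <- numeric(length(ks))      # total within-cluster sum of squares per k
avg_sil <- numeric(length(ks))  # average silhouette width per k

for (i in seq_along(ks)) {
  km <- kmeans(sim, centers = ks[i], nstart = 25)
  wss[i] <- km$tot.withinss
  sil <- silhouette(km$cluster, d)
  avg_sil[i] <- mean(sil[, "sil_width"])
}

# Silhouette rule: choose the k with the largest average silhouette width
best_k <- ks[which.max(avg_sil)]
print(data.frame(k = ks, wss = wss, avg_sil = avg_sil))
```

For WSS, the usual reading is the "elbow": the k after which adding more clusters stops reducing the sum of squares by much.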

Clustered PCA biplot k=2

  • This PCA biplot is used to separate GPS tracking points where the eagle was moving from points where it was perched, so we used 2 clusters.

PCA biplot of the data, k = 2

Takeaways from the PCA Biplot where k=2

  • In-flight data points are likely overrepresented in the first cluster, where the AGL, KPH, Sn, and absVR vectors project furthest across the observations.

  • Perched data points are likely overrepresented in the second cluster, where the VerticalRate and abs_angle vectors project further across the second group.

  • The strongest variables were KPH, Sn, and AGL

  • The weakest variable was VerticalRate

  • The first two principal components explain 67% of the variance in the covariates
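
The "variance explained" figure and the k = 2 clustering can be sketched like this. The data are simulated with the six focal variable names and a made-up latent `flying` state, so the numbers will not match the slides; every object name here is an assumption for illustration.

```r
set.seed(123)
n <- 300
flying <- rep(c(0, 1), each = n / 2)  # hypothetical latent state: 0 = perched, 1 = in flight

# Simulated stand-in using the six focal variable names (values are made up)
vertical_rate <- rnorm(n, mean = 0, sd = 1 + flying)
sim <- data.frame(
  KPH          = abs(rnorm(n, mean = 40 * flying, sd = 5)),
  Sn           = abs(rnorm(n, mean = 10 * flying, sd = 2)),
  AGL          = abs(rnorm(n, mean = 100 * flying, sd = 20)),
  abs_angle    = runif(n, min = 0, max = pi),
  VerticalRate = vertical_rate,
  absVR        = abs(vertical_rate)
)

# PCA on centered, scaled variables
pca <- prcomp(sim, center = TRUE, scale. = TRUE)

# Share of total variance carried by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
pc12_share <- sum(var_explained[1:2])  # the figure quoted under a biplot

# k-means with 2 clusters on the same scaled variables
km <- kmeans(scale(sim), centers = 2, nstart = 25)

print(round(pc12_share, 3))
print(table(cluster = km$cluster, flying = flying))
```

The cross-table is only possible here because the simulated state is known; with the real tracking points the clusters have to be interpreted from the biplot vectors instead.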

Clustered PCA biplot k=4

This PCA biplot is used to distinguish among the in-flight movement types (ascending, flapping, descending) and to look for patterns

Takeaways from the PCA Biplot where k=4

  • The cluster that was highest in KPH and Sn relative to the other clusters was the gliding/descending cluster. It also had a much lower association with VerticalRate and abs_angle.

  • The cluster strong in AGL, absVR, VerticalRate, and abs_angle was the ascending cluster. It had a lower association with KPH and Sn.

  • The cluster stronger in KPH and Sn was the flapping cluster, which had a lower association with AGL, absVR, VerticalRate, and abs_angle.

  • This biplot accounts for 68.7% of the variance in the covariates

Looking at In-flight Behaviors

  • We want to view ‘in-flight’ behavior by randomly selecting 1 eagle and visualizing its KPH over time (11-second time intervals)

  • This will allow us to deduce which of the four behaviors are exhibited at certain times of its journey

  • Behaviors that I deduced: descending had the highest KPH, followed by ascending and flapping. Perching had next to none.
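
Picking one eagle at random and plotting its KPH over time can be sketched as below. The data frame is a simulated stand-in that reuses the column names Animal_ID, LocalTime, and KPH; the IDs 106 and 107, the KPH values, and the object names are made up for illustration.

```r
set.seed(123)
# Simulated stand-in with only the columns needed here (values are made up)
start <- as.POSIXct("2019-06-25 16:44:10", tz = "UTC")
sim <- data.frame(
  Animal_ID = rep(c(105, 106, 107), each = 50),
  LocalTime = rep(start + 11 * (0:49), times = 3),  # 11-second intervals
  KPH       = abs(rnorm(150, mean = 30, sd = 15))
)

# Randomly select one eagle (index into the IDs rather than sampling them directly)
ids <- unique(sim$Animal_ID)
one_id <- ids[sample.int(length(ids), 1)]

# Keep that eagle's points and put them in time order
one_eagle <- sim[sim$Animal_ID == one_id, ]
one_eagle <- one_eagle[order(one_eagle$LocalTime), ]

# KPH over time for the selected eagle
plot(one_eagle$LocalTime, one_eagle$KPH, type = "l",
     xlab = "Local time", ylab = "KPH",
     main = paste("KPH over time, eagle", one_id))
```

Stretches of near-zero KPH in such a plot would correspond to perching, with the higher-speed stretches left to be matched against the in-flight clusters.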

Appendix

# Loading libraries
library(cluster)
library(dbscan)
library(factoextra)
library(tidyverse)
library(patchwork)
library(ggrepel)
options(width = 10000)  # wide console so every tibble column prints

# Loading our dataset and previewing the first rows
load('eagle_data.Rdata')
eagle_data %>%
  as_tibble() %>%
  head()

Appendix 2

set.seed(123)

# Sampling the same fraction of rows from each eagle, for roughly 1000 rows in total
sample_1000 <- eagle_data %>%
  group_by(Animal_ID) %>%
  sample_frac(size = 1000 / nrow(eagle_data)) %>%
  ungroup()

#Selecting just the important variables of the dataset
vars <- sample_1000 %>%
  select(KPH, Sn, AGL, abs_angle, VerticalRate, absVR)

# Pivoting those variables into long format for faceting
vars_long <- vars %>%
  pivot_longer(cols = c(Sn, KPH, AGL, VerticalRate, absVR, abs_angle),
               names_to = "variable",
               values_to = "value")

# Faceted histogram of each variable
ggplot(vars_long, aes(x = value)) +
  geom_histogram(bins = 30, color = "white") +
  facet_wrap(~ variable, scales = "free") +
  theme_minimal() +
  labs(
    title = "Distribution of Selected Variables",
    x = "Value",
    y = "Count"
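  )

# Square-rooting the positively skewed variables
# (sqrt() returns NaN for negative inputs, so this assumes these four are non-negative)
vars_fixed <- vars %>%
  mutate(
    AGL = sqrt(AGL),
    absVR = sqrt(absVR),
    abs_angle = sqrt(abs_angle),
    Sn = sqrt(Sn)
  )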


#pivoting the variables to fit them into a faceted data
vars_long <- vars_fixed %>%
  pivot_longer(cols = c(Sn, KPH, AGL, VerticalRate, absVR, abs_angle),
               names_to = "variable",
               values_to = "value")


# Graphing the faceted histogram for the transformed values
ggplot(vars_long, aes(x = value)) +
  geom_histogram(bins = 30, color = "white") +
  facet_wrap(~ variable, scales = "free") +
  theme_minimal() +
  labs(
    title = "Histograms of Focal Variables after Sq. Root Transformation",
    x = "Value",
    y = "Count"
  )

Appendix 3