We want to shrink the dataset, since 2 million rows is a lot. I grouped by Animal ID and drew a random sample within each group so the analysis runs faster.
We want to distinguish points from each other before we look at a PCA biplot.
Clustering methods will help pinpoint which points should be grouped together.
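The grouped sampling idea can be sketched in base R as follows. This is a minimal illustration on a toy data frame (the IDs, row counts, and KPH values are made up), not the sampling code used for the real dataset. It shows that drawing the same fraction within each animal keeps every animal's share of the data.

```r
# Toy stand-in for eagle_data: Animal_ID values and row counts are made up.
set.seed(123)
toy <- data.frame(
  Animal_ID = rep(c("A", "B", "C"), times = c(6000, 3000, 1000)),
  KPH = runif(10000, min = 0, max = 80)
)

# Target about 1000 rows overall, drawn within each animal.
frac <- 1000 / nrow(toy)

# split() makes one data frame per animal; sample the same fraction from each.
sampled <- do.call(rbind, lapply(split(toy, toy$Animal_ID), function(d) {
  d[sample(nrow(d), size = round(frac * nrow(d))), , drop = FALSE]
}))

# Rows kept per animal, proportional to each animal's original row count.
table(sampled$Animal_ID)
```

Sampling within groups, rather than from the whole table at once, guarantees that animals with few rows are not dropped by chance.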
Is our data normal?
What does the distribution of each variable look like?
We see that abs_angle, absVR, AGL, and Sn are all strongly positively skewed, which is not good.
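Besides reading the histograms, skew can be checked with a number. The sketch below computes moment-based sample skewness on simulated stand-ins (not the real eagle data); the `skewness` helper is written here for illustration and is not part of the original analysis. Values near 0 suggest symmetry, and large positive values indicate a long right tail.

```r
# Sample skewness: third central moment divided by the cubed standard deviation.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

set.seed(123)
# Simulated stand-ins, not real eagle data.
sim <- data.frame(
  AGL = rexp(1000, rate = 1 / 50),              # right-skewed by construction
  VerticalRate = rnorm(1000, mean = 0, sd = 1)  # symmetric by construction
)

# One skewness value per column.
sk <- sapply(sim, skewness)
sk
```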
Fixing our data first
Before we do any work with the data, we want to make sure that it is not heavily skewed.
I used a square root transformation to normalize the focal variables.
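To see what the transformation does, here is a small sketch on a simulated right-skewed variable (a made-up stand-in for AGL, not real data). It compares skewness before and after taking the square root. The `skew` helper is defined here for illustration only.

```r
set.seed(123)
# Simulated right-skewed variable standing in for AGL; not real eagle data.
agl <- rexp(1000, rate = 1 / 50)

# Moment-based sample skewness (0 for a symmetric distribution).
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# sqrt() returns NaN for negative inputs, so check the range first.
stopifnot(all(agl >= 0))

# Skewness of the raw and the square-root-transformed values.
before_after <- c(raw = skew(agl), sqrt = skew(sqrt(agl)))
before_after
```

The square root compresses large values more than small ones, which pulls in the right tail. It only applies to non-negative variables, which is one reason a signed variable such as VerticalRate would be left untransformed.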
Clustering our data
K-means is an effective way to cluster large movement-analysis datasets.
1st.) Silhouette method 2nd.) Within-cluster sum of squares (WSS)
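Both diagnostics for choosing k can be computed directly. The sketch below uses simulated data with two well-separated groups (not the eagle data), `kmeans()` from base R, and `silhouette()` from the cluster package. The range of k values is an arbitrary choice for illustration.

```r
library(cluster)  # silhouette()

set.seed(123)
# Simulated data with two well-separated groups; not real eagle data.
x <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 6), ncol = 2)
)
x <- scale(x)
d <- dist(x)

ks <- 2:6

# Average silhouette width for each k (higher is better).
avg_sil <- sapply(ks, function(k) {
  km <- kmeans(x, centers = k, nstart = 25)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})

# Total within-cluster sum of squares for each k (look for the elbow).
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)

data.frame(k = ks, avg_sil = round(avg_sil, 3), wss = round(wss, 1))

# k with the highest average silhouette width.
best_k <- ks[which.max(avg_sil)]
best_k
```

With factoextra loaded, `fviz_nbclust(x, kmeans, method = "silhouette")` and `fviz_nbclust(x, kmeans, method = "wss")` plot the same two diagnostics.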
Clustered PCA biplot k=2
This PCA biplot is used to distinguish GPS tracking points that were moving from those that were perched, so we used 2 clusters.
Takeaways from the PCA Biplot where k=2
In-flight data points are likely overrepresented in the first cluster, where the vectors AGL, KPH, Sn, and absVR project far across what are primarily first-cluster observations.
Perched data points are likely overrepresented in the second cluster, where the vectors VerticalRate and abs_angle project further across the second group.
The strongest variables were KPH, Sn, and AGL.
The weakest variable was VerticalRate.
The first two PCs explain 67% of the variance in the covariates.
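The workflow behind a clustered PCA plot, and the "variance explained" figure, can be sketched with base R. The data below are simulated stand-ins for a moving and a perched group with made-up means, so the numbers it prints will not match the ones reported above.

```r
set.seed(123)
n <- 300
# Simulated stand-ins for a moving and a perched group; not real eagle data.
moving  <- data.frame(KPH = rnorm(n, 40, 8),  Sn = rnorm(n, 10, 2),    AGL = rnorm(n, 12, 3))
perched <- data.frame(KPH = rnorm(n, 1, 0.5), Sn = rnorm(n, 0.5, 0.2), AGL = rnorm(n, 2, 1))
vars <- rbind(moving, perched)

# Scale so no single variable dominates the distances or the components.
vars_scaled <- scale(vars)

km  <- kmeans(vars_scaled, centers = 2, nstart = 25)
pca <- prcomp(vars_scaled)

# Proportion of variance explained by each PC, and by the first two together.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
sum(var_explained[1:2])

# Loadings: how strongly each variable contributes to PC1 and PC2.
round(pca$rotation[, 1:2], 2)

# Scores on the first two PCs, colored by k-means cluster.
plot(pca$x[, 1:2], col = km$cluster, pch = 16)
```

Variables with long loading vectors along PC1 or PC2 are the "strong" ones in the biplot; variables with short vectors contribute little to the first two components.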
Clustered PCA biplot k=4
This PCA biplot is used to distinguish between the in-flight movements (ascending, flapping, descending) and to look for patterns.
Takeaways from the PCA Biplot where k=4
The cluster highest in KPH and Sn (variables that were also strong in other clusters) was the gliding/descending cluster. It also had a much lower association with VerticalRate and abs_angle.
The cluster strong in AGL, absVR, VerticalRate, and abs_angle was the ascending movement. It had a lower association with KPH and Sn.
The cluster stronger in KPH and Sn was the flapping movement, which had a lower association with AGL, absVR, VerticalRate, and abs_angle.
The biplot accounts for 68.7% of the variance in the covariates.
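One way to support this kind of cluster labeling is to look at the mean of each variable within each cluster. The sketch below does that on four simulated groups whose means are made up for illustration; it is not the original analysis and the groups are deliberately left unnamed.

```r
set.seed(123)
n <- 200
# Four simulated groups with made-up means; not real eagle data.
centers <- data.frame(
  KPH = c(1, 35, 50, 60),
  VerticalRate = c(0, 2, 0, -2)
)
vars <- do.call(rbind, lapply(seq_len(nrow(centers)), function(i) {
  data.frame(
    KPH = rnorm(n, centers$KPH[i], 3),
    VerticalRate = rnorm(n, centers$VerticalRate[i], 0.3)
  )
}))

km <- kmeans(scale(vars), centers = 4, nstart = 25)

# Mean of each original (unscaled) variable within each cluster.
profile <- aggregate(vars, by = list(cluster = km$cluster), FUN = mean)
profile
```

Reading the profile table row by row (for example, which cluster has the highest KPH or the most negative VerticalRate) gives a check on the names assigned from the biplot.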
Looking at In-flight Behaviors
We want to view ‘in-flight’ behavior by randomly selecting 1 eagle and visualizing its KPH over time (11-second intervals).
This will allow us to deduce which of the four behaviors are exhibited at certain times of its journey.
Behaviors that I deduced: descending had the highest KPH, followed by ascending and flapping. Perching had next to none.
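Selecting one eagle at random and plotting its speed over time might look like the sketch below. The data are simulated, and the column name `LocalTime` is a hypothetical placeholder for whatever timestamp column the real dataset has.

```r
set.seed(123)
# Simulated track data: IDs, timestamps, and speeds are made up.
# LocalTime is a hypothetical column name for the fix timestamp.
toy <- data.frame(
  Animal_ID = rep(c("A", "B", "C"), each = 50),
  LocalTime = rep(as.POSIXct("2020-01-01 08:00:00", tz = "UTC") + 11 * (0:49), times = 3),
  KPH = abs(rnorm(150, mean = 30, sd = 15))
)

# Pick one eagle at random and keep its fixes in time order.
id  <- sample(unique(toy$Animal_ID), 1)
one <- toy[toy$Animal_ID == id, ]
one <- one[order(one$LocalTime), ]

# Line plot of speed against time for the selected eagle.
plot(one$LocalTime, one$KPH, type = "l",
     xlab = "Time", ylab = "KPH", main = paste("KPH over time, eagle", id))
```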
Appendix
#loading libraries and our dataset
library(cluster)
library(dbscan)
library(factoextra)
library(tidyverse)
library(patchwork)
library(ggrepel)
options(width = 10000)

#loading our dataset
load('eagle_data.Rdata')
(eagle_data %>% as_tibble()) %>% head
Appendix 2
set.seed(123)

#Getting a rough grouped sample
sample_1000 <- eagle_data %>%
  group_by(Animal_ID) %>%
  sample_frac(size = 1000 / nrow(eagle_data)) %>%
  ungroup()

#Selecting just the important variables of the dataset
vars <- sample_1000 %>%
  select(KPH, Sn, AGL, abs_angle, VerticalRate, absVR)

#pivoting those variables into long format
vars_long <- vars %>%
  pivot_longer(
    cols = c(Sn, KPH, AGL, VerticalRate, absVR, abs_angle),
    names_to = "variable",
    values_to = "value"
  )

#faceted histogram of each variable
ggplot(vars_long, aes(x = value)) +
  geom_histogram(bins = 30, color = "white") +
  facet_wrap(~ variable, scales = "free") +
  theme_minimal() +
  labs(title = "Distribution of Selected Variables", x = "Value", y = "Count")

#square rooting the skewed variables
vars_fixed <- vars %>%
  mutate(
    AGL = sqrt(AGL),
    absVR = sqrt(absVR),
    abs_angle = sqrt(abs_angle),
    Sn = sqrt(Sn)
  )

#pivoting the transformed variables into long format for faceting
vars_long <- vars_fixed %>%
  pivot_longer(
    cols = c(Sn, KPH, AGL, VerticalRate, absVR, abs_angle),
    names_to = "variable",
    values_to = "value"
  )

#graphing the faceted histogram for the transformed values
ggplot(vars_long, aes(x = value)) +
  geom_histogram(bins = 30, color = "white") +
  facet_wrap(~ variable, scales = "free") +
  theme_minimal() +
  labs(title = "Histograms of Focal Variables after Sq. Root Transformation", x = "Value", y = "Count")