Let’s start by loading packages and reading the data:

library(tidyverse)
library(rgdal)
library(mapview)
library(mgcv)
library(tidymv)

frogs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-08-02/frogs.csv')

head(frogs)
## # A tibble: 6 x 16
##   Site  Subsite HabType SurveyDate Ordinal Frequency UTME_83 UTMN_83 Interval
##   <chr> <chr>   <chr>   <chr>        <dbl>     <dbl>   <dbl>   <dbl>    <dbl>
## 1 Cran~ SE Pond Pond    9/25/2018      268      164.  597369 4846486        0
## 2 Cran~ SE Pond Pond    10/2/2018      275      164.  597352 4846487        1
## 3 Cran~ SE Pond Pond    10/9/2018      282      164.  597345 4846458        2
## 4 Cran~ SE Pond Pond    10/15/2018     288      164.  597340 4846464        3
## 5 Cran~ SE Pond Pond    10/22/2018     295      164.  597344 4846460        4
## 6 Cran~ SE Pond Pond    11/1/2018      305      164.  597410 4846451        5
## # ... with 7 more variables: Female <dbl>, Water <chr>, Type <chr>,
## #   Structure <chr>, Substrate <chr>, Beaver <chr>, Detection <chr>

Things are relatively clean this week! Let’s start by replicating the image in GitHub. There are a few things worth note. First “Frequency” identifies unique frogs while “UTME_83” and “UTMN_83” indicate east and north coordinate locations. So let’s calculate distance from starting location, defined as interval = 0. Because the geographical scale isn’t particularly large, and so curvature of the Earth need not be taken into account, we will use the Euclidean distance:

dist <- function(x0,x1,y0,y1){
  sqrt((x0-x1)^2 + (y0-y1)^2)
}

To make sure we are on the right track, let’s take a look at the raw locations:

frogs %>% 
  ggplot(aes(x = UTME_83, y = UTMN_83, col = as.factor(Frequency))) +
  geom_point() +
  theme_bw() +
  theme(legend.position = "none",
        aspect.ratio = 1) +
  geom_text(aes(label = Subsite),
            nudge_y = 750, check_overlap = T)

We weren’t provided with any spatial files to re-create maps of this area, but the above comports well with the map provided in the paper.

We can get a little bit closer to that with a bit of spatial wrangling:

frogs %>% 
  dplyr::select(x = UTME_83, y = UTMN_83) -> point

coordinates(point) <- c("x","y")
proj4string(point)=CRS("+init=epsg:26910") 
coord <-spTransform(point,CRS("+init=epsg:4326"))

mapview(coord, xcol = "x", ycol = "y",  crs = 4326)

But that is about as far as I want to go with mapping the data (although someone did a cool dynamic visualization tracking frog movements…). Anyway, back to the main task. Let’s establish the start coordinates for each of our froggy-boiis and calculate their distances:

frogs %>% 
  filter(Interval == 0) %>% 
  dplyr::select(Frequency, x0 = UTME_83, y0 = UTMN_83) %>% 
  merge(.,frogs,by="Frequency") -> frogs

frogs$dist <- dist(x0 = frogs$x0,
                   x1 = frogs$UTME_83,
                   y0 = frogs$y0,
                   y1 = frogs$UTMN_83)

Excellent. Now let’s replicate the graphic. The description in the paper states that they only look at frogs with eight or more locations, so let’s take a gander:

as.Date(frogs$SurveyDate, format = "%m/%d/%Y") %>% 
  format(., "%b %d %Y") %>% 
  as.Date(.,"%b %d %Y") -> frogs$new_date

frogs %>% 
  group_by(Frequency) %>% 
  mutate(obs = max(Interval)) %>% 
  filter(obs >= 8) -> less_frogs

less_frogs %>% 
  ggplot(aes(x = new_date, y = dist, col = as.factor(Frequency))) +
    facet_grid(~HabType) +
    geom_point() +
    geom_line() +
    theme_bw() +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1),
          axis.line = element_line(color='black'),
          plot.background = element_blank(),
          panel.grid.minor = element_blank(),
          panel.grid.major = element_blank()) +
    scale_x_date(date_labels = "%b %d",
                 date_breaks = "7 days") +
    ylab("Distance from Release Site (m)") +
    xlab("")

Good enough for government work; the image is identical to the published image other than (1) coloring by individual frog rather than Habitat Type, which is already indicated by the facet and (2) minor differences in the x-axis.

Now what more can we do? Instead of looking at the distance from the release site, we might instead be interested in the total movement the frog. This could differ from the above if, say, a frog was hopping in a circle equal distance from their capture site.

frogs %>% 
  group_by(Frequency) %>% 
  mutate(x_lag = lag(UTME_83),
         y_lag = lag(UTMN_83),
         move = dist(x0 = x_lag,
                     x1 = UTME_83,
                     y0 = y_lag,
                     y1 = UTMN_83)) -> frogs

frogs %>% 
  group_by(Frequency) %>% 
  mutate(obs = max(Interval)) %>% 
  filter(obs >= 8) -> less_frogs

less_frogs %>% 
  ggplot(aes(x = new_date, y = move, col = as.factor(Frequency))) +
    facet_grid(~HabType) +
    geom_point() +
    geom_line() +
    theme_bw() +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1),
          axis.line = element_line(color='black'),
          plot.background = element_blank(),
          panel.grid.minor = element_blank(),
          panel.grid.major = element_blank()) +
    scale_x_date(date_labels = "%b %d",
                 date_breaks = "7 days") +
    ylab("Distance from Previous Location (m)") +
    xlab("") -> f1
f1
## Warning: Removed 27 rows containing missing values (geom_point).
## Warning: Removed 27 row(s) containing missing values (geom_path).

One might say that the frogs hop around a bit. Let’s do a bit better by rescaling the y axis.

f1 +
  scale_y_continuous(trans = "log")
## Warning: Transformation introduced infinite values in continuous y-axis

## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 27 rows containing missing values (geom_point).
## Warning: Removed 27 row(s) containing missing values (geom_path).

Two things are worth note. First is that there appear to be slight downward trend for each group. Let’s apply a cyclic smoother to these individual curves. Noting that “Ordinal” is the day of the year we will take the quick brutish approach to the problem, estimating a smooth curve for each frog in separate models. To make it a bit more interesting, we will only look at the top-hoppers.

less_frogs$id <- as.factor(less_frogs$Frequency)
ids <- unique(less_frogs$id)


lapply(ids, function(i){
  less_frogs %>% 
    filter(id == i) %>% 
    gam(log(move + 1) ~ s(Ordinal, bs = "cc", k = 5), data = .)
}) -> mods

pred_base <- data.frame(Ordinal = unique(less_frogs$Ordinal))

preds <- list()
for(i in 1:length(mods)){
  mod <- mods[[i]]
  pred_frame <- pred_base
  pred_frame$id <- ids[i]
  pred_frame$p <- predict(mod,pred_frame)
  preds[[i]] <- pred_frame
}

preds <- do.call("rbind",preds)

preds %>% 
  group_by(id) %>% 
  mutate(var = sd(p)) -> preds

preds %>%
  filter(var > 0.5) %>% 
  ggplot(aes(x = Ordinal, y = p, color = id)) +
  geom_line(lwd = 1.01) +
  theme_bw() +
  theme(legend.position = "none") +
  xlab("Time") +
  ylab("Distance") +
  ggtitle("Some Hoppy Boiis")

These frogs all, more or less, hopped away after capture, found a location they liked for a bit, then kept on hopping.

That’s all I am doing with this data. I still think it would be fun to play around with the image of this hoppy boii, but will leave that for another day.

---
title: "TidyTuesday 2022, Week 31"
author: "Christopher Schwarz"
date: "8/02/2022"
pages:
  extra: true
output: 
  html_document:
    code_download: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Let's start by loading packages and reading the data:

```{r, warning=F,message=FALSE}
library(tidyverse)
library(rgdal)
library(mapview)
library(mgcv)
library(tidymv)

frogs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-08-02/frogs.csv')

head(frogs)
```
Things are relatively clean this week!  Let's start by replicating the image in GitHub.  There are a few things worth note.  First "Frequency" identifies unique frogs while "UTME_83" and "UTMN_83" indicate east and north coordinate locations.  So let's calculate distance from starting location, defined as interval = 0.  Because the geographical scale isn't particularly large, and so curvature of the Earth need not be taken into account, we will use the Euclidean distance:

```{r}
dist <- function(x0,x1,y0,y1){
  sqrt((x0-x1)^2 + (y0-y1)^2)
}
```

To make sure we are on the right track, let's take a look at the raw locations:

```{r, fig.align = 'center'}
frogs %>% 
  ggplot(aes(x = UTME_83, y = UTMN_83, col = as.factor(Frequency))) +
  geom_point() +
  theme_bw() +
  theme(legend.position = "none",
        aspect.ratio = 1) +
  geom_text(aes(label = Subsite),
            nudge_y = 750, check_overlap = T)
```


We weren't provided with any spatial files to re-create maps of this area, but the above comports well with the map provided in the paper.


```{r, echo=FALSE}
# Define variable containing url
url <- "https://bioone.org/ContentImages/Journals/hpet/56/1/20-060/graphic/WebImages/img-z2-1.jpg_08.jpg"
```

<center><img src="`r url`"></center>


We can get a little bit closer to that with a bit of spatial wrangling:

```{r, warning=FALSE, fig.align = 'center'}

frogs %>% 
  dplyr::select(x = UTME_83, y = UTMN_83) -> point

coordinates(point) <- c("x","y")
proj4string(point)=CRS("+init=epsg:26910") 
coord <-spTransform(point,CRS("+init=epsg:4326"))

mapview(coord, xcol = "x", ycol = "y",  crs = 4326)
```

But that is about as far as I want to go with mapping the data (although someone did a cool dynamic visualization tracking frog movements...).  Anyway, back to the main task.  Let's establish the start coordinates for each of our froggy-boiis and calculate their distances:

```{r}

frogs %>% 
  filter(Interval == 0) %>% 
  dplyr::select(Frequency, x0 = UTME_83, y0 = UTMN_83) %>% 
  merge(.,frogs,by="Frequency") -> frogs

frogs$dist <- dist(x0 = frogs$x0,
                   x1 = frogs$UTME_83,
                   y0 = frogs$y0,
                   y1 = frogs$UTMN_83)
```

Excellent.  Now let's replicate the graphic.  The description in the paper states that they only look at frogs with eight or more locations, so let's take a gander:

```{r, fig.align = 'center'}

as.Date(frogs$SurveyDate, format = "%m/%d/%Y") %>% 
  format(., "%b %d %Y") %>% 
  as.Date(.,"%b %d %Y") -> frogs$new_date

frogs %>% 
  group_by(Frequency) %>% 
  mutate(obs = max(Interval)) %>% 
  filter(obs >= 8) -> less_frogs

less_frogs %>% 
  ggplot(aes(x = new_date, y = dist, col = as.factor(Frequency))) +
    facet_grid(~HabType) +
    geom_point() +
    geom_line() +
    theme_bw() +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1),
          axis.line = element_line(color='black'),
          plot.background = element_blank(),
          panel.grid.minor = element_blank(),
          panel.grid.major = element_blank()) +
    scale_x_date(date_labels = "%b %d",
                 date_breaks = "7 days") +
    ylab("Distance from Release Site (m)") +
    xlab("")

```

Good enough for government work; the image is identical to the published image other than (1) coloring by individual frog rather than Habitat Type, which is already indicated by the facet and (2) minor differences in the x-axis.

Now what more can we do?  Instead of looking at the distance from the release site, we might instead be interested in the total movement the frog.  This could differ from the above if, say, a frog was hopping in a circle equal distance from their capture site.

```{r, fig.align = 'center'}

frogs %>% 
  group_by(Frequency) %>% 
  mutate(x_lag = lag(UTME_83),
         y_lag = lag(UTMN_83),
         move = dist(x0 = x_lag,
                     x1 = UTME_83,
                     y0 = y_lag,
                     y1 = UTMN_83)) -> frogs

frogs %>% 
  group_by(Frequency) %>% 
  mutate(obs = max(Interval)) %>% 
  filter(obs >= 8) -> less_frogs

less_frogs %>% 
  ggplot(aes(x = new_date, y = move, col = as.factor(Frequency))) +
    facet_grid(~HabType) +
    geom_point() +
    geom_line() +
    theme_bw() +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, hjust = 1),
          axis.line = element_line(color='black'),
          plot.background = element_blank(),
          panel.grid.minor = element_blank(),
          panel.grid.major = element_blank()) +
    scale_x_date(date_labels = "%b %d",
                 date_breaks = "7 days") +
    ylab("Distance from Previous Location (m)") +
    xlab("") -> f1
f1

```

One might say that the frogs hop around a bit.  Let's do a bit better by rescaling the y axis.

```{r, fig.align = 'center'}
f1 +
  scale_y_continuous(trans = "log")
```

Two things are worth note.  First is that there appear to be slight downward trend for each group.  Let's apply a cyclic smoother to these individual curves.  Noting that "Ordinal" is the day of the year we will take the quick brutish approach to the problem, estimating a smooth curve for each frog in separate models.  To make it a bit more interesting, we will only look at the top-hoppers.

```{r, fig.align = 'center'}
less_frogs$id <- as.factor(less_frogs$Frequency)
ids <- unique(less_frogs$id)


lapply(ids, function(i){
  less_frogs %>% 
    filter(id == i) %>% 
    gam(log(move + 1) ~ s(Ordinal, bs = "cc", k = 5), data = .)
}) -> mods

pred_base <- data.frame(Ordinal = unique(less_frogs$Ordinal))

preds <- list()
for(i in 1:length(mods)){
  mod <- mods[[i]]
  pred_frame <- pred_base
  pred_frame$id <- ids[i]
  pred_frame$p <- predict(mod,pred_frame)
  preds[[i]] <- pred_frame
}

preds <- do.call("rbind",preds)

preds %>% 
  group_by(id) %>% 
  mutate(var = sd(p)) -> preds

preds %>%
  filter(var > 0.5) %>% 
  ggplot(aes(x = Ordinal, y = p, color = id)) +
  geom_line(lwd = 1.01) +
  theme_bw() +
  theme(legend.position = "none") +
  xlab("Time") +
  ylab("Distance") +
  ggtitle("Some Hoppy Boiis")
```

These frogs all, more or less, hopped away after capture, found a location they liked for a bit, then kept on hopping.  

That's all I am doing with this data.  I still think it would be fun to play around with the image of this hoppy boii, but will leave that for another day.

```{r, echo=FALSE, fig.align = 'center'}
# Define variable containing url
url <- "https://github.com/rfordatascience/tidytuesday/blob/master/data/2022/2022-08-02/pic1.png?raw=true"
```

<center><img src="`r url`"></center>

