Exercise_A_Segmentation

Author

Céline Spitzli

Exercise A: Segmentation

Libraries laden

library(readr)
library(sf)
library(dplyr)
library(ggplot2)

Read Data

wildschwein <- read_delim("Daten/wildschwein_BE_2056.csv", ",")

wildschwein |> 
  group_by(TierName) |> 
  summarise(
    max(DatetimeUTC),
    min(DatetimeUTC)
  )

# A tibble: 3 × 3
  TierName `max(DatetimeUTC)`  `min(DatetimeUTC)` 
  <chr>    <dttm>              <dttm>             
1 Rosa     2015-06-29 23:45:11 2014-11-07 07:45:44
2 Ruth     2015-07-27 09:45:15 2014-11-07 18:00:43
3 Sabi     2015-07-27 11:00:14 2014-08-22 21:00:12

# Filter for Ruth between 26.11.2014 and 30.11.2014

ruth <- wildschwein |> 
  st_as_sf(coords = c("E", "N"), crs = 2056, remove = FALSE) |> 
  filter(
    TierName == "Ruth",
    DatetimeUTC >= "2014-11-26",
    DatetimeUTC < "2014-11-30"
  )

# Visualize
ggplot(ruth) +
  geom_sf() +
  geom_path(aes(E,N))

Task 1: Calculate distances

Now, you can Step a): Specify a temporal window v and Step b): Measure the distance to every point within v, which you had used with sabi, on on your own movement data or to a different wild boar using different sampling intervals.

Step a): temporal window v: 1. pos[n-2] to pos[n] 2. pos[n-1] to pos[n] 3. pos[n] to pos[n+1] 4. pos[n] to pos[n+2]

Step b):

distance_by_element <- function(later, now) {
  as.numeric(
    st_distance(later, now, by_element = TRUE)
  )
}


ruth <- ruth |>
    mutate(
        nMinus2 = distance_by_element(lag(geometry, 2), geometry),  # distance to pos -30 minutes
        nMinus1 = distance_by_element(lag(geometry, 1), geometry),  # distance to pos -15 minutes
        nPlus1  = distance_by_element(geometry, lead(geometry, 1)), # distance to pos +15 mintues
        nPlus2  = distance_by_element(geometry, lead(geometry, 2))  # distance to pos +30 minutes
    )

Task 2: Specify and apply threshold d

After calculating the Euclidean distances to positions within the temporal window v in task 1, you can explore these values (we stored them in the column stepMean) using summary statistics (histograms, boxplot, summary()): This way we can define a reasonable threshold value to differentiate between stops and moves. There is no “correct” way of doing this, specifying a threshold always depends on data as well as the question that needs to be answered. In this exercise, use the mean of all stepMean values.

Store the new information (boolean to differentiate between stops (TRUE) and moves (FALSE)) in a new column named static.

ruth <- ruth |> 
  rowwise() |> 
  mutate(stepMean = mean(c(nMinus1, nMinus2, nPlus1, nPlus2))) |> 
  ungroup()

# setup a treshold for stops
treshold = mean(ruth$stepMean, na.rm = TRUE)

# apply the treshold to the data
ruth <- ruth |> 
  mutate(static = stepMean < treshold)

Task 3: Visualize segmented trajectories

Now visualize the segmented trajectory spatially. Just like last week, you can use ggplot with geom_path(), geom_point() and coord_equal(). Assign colour = static within aes() to distinguish between segments with “movement” and without.

ruth |> 
  ggplot(aes(E,N, colour = static)) +
  geom_path() +
  geom_point() +
  coord_equal() +
  theme(legend.position = "top")

# exclude NA in the plot
ruth_clean <- ruth |> 
filter(!is.na(static))  # Remove rows where 'static' is NA

ggplot(ruth_clean, aes(E, N)) + 
  geom_path(aes(colour = static)) +  
  geom_point(aes(colour = ifelse(static, "static", "moving"))) +  
  scale_colour_manual(values = c("static" = "red", "moving" = "blue")) +  
  coord_equal() + 
  theme(legend.position = "top")

Task 4: Segment-based analysis

In applying Laube and Purves (2011), we’ve come as far as step b) in Figure 16.1. In order to complete the last steps (c and d), we need a unique ID for each segment that we can use as a grouping variable. The following function does just that (it assigns unique IDs based on the column static which you created in Task 2). You will learn about functions next week. For now, just copy the following code chunk into your script and run it.

You can use the newly created function rle_id to assign unique IDs to subtrajectories (as shown below). Visualize the moving segments by colourizing them by segment_ID.

rle_id <- function(vec) {
    x <- rle(vec)$lengths
    as.factor(rep(seq_along(x), times = x))
}

ruth <- ruth |> 
  mutate(segment_id = rle_id(static)) |> 
  mutate(timestamp = as.POSIXct(DatetimeUTC))

# filter static segments from the data
ruth_moving <- ruth |> 
  filter(!static)

# visualize moving segments
ggplot(ruth_moving) +
  geom_sf() +
  geom_path(aes(E,N, colour = segment_id))

Then use segment_ID as a grouping variable to determine the segments duration and remove short segments (e.g. segments with a duration < 5 Minutes)

ruth_moving_seg <- ruth_moving |>
  group_by(segment_id) |>
  summarise(
    min_timestamp = min(timestamp),
    max_timestamp = max(timestamp)
    )

# takes only duration longer than 5 minutes
ruth_long_moving_seg <- ruth_moving_seg |>
  mutate(segment_length = max_timestamp - min_timestamp) |>
  filter(segment_length > 5 * 60 * 60)

ruth_long_moving <- ruth_moving |>
  filter(segment_id %in% ruth_moving_seg$segment_id)

ggplot(ruth_long_moving, aes(colour = segment_id)) + 
  geom_sf() +
  geom_path(aes(E, N))