With the skills from Input: Segmentation you can now implement the segmentation algorithm described in Laube and Purves (2011) to either your own movement data or to a different wild boar using different sampling intervals.
library(tidyverse)library(sf)library(tmap)# load data into R. I use the one from the whales to see if it works. whales <-read_delim("Movements of Australia's east coast humpback whales.csv")unique(whales$'individual-local-identifier')
# I choose the first whale (53348) for this exercisesample_whale <- whales |>st_as_sf(coords =c("location-long", "location-lat"), crs =4326, remove =FALSE) |>filter(`individual-local-identifier`=="53348", timestamp >="2008-10-23", timestamp <"2011-07-26" )#delete rows that are doublesample_whale <- sample_whale |>filter(!is.na(`argos:lat`))
Error in `stopifnot()`:
ℹ In argument: `!is.na(`argos:lat`)`.
Caused by error:
! object 'argos:lat' not found
# show with ggplotggplot(sample_whale) +geom_sf() +labs(x ="Longitude",y ="Latitude",title ="Whale GPS Track" ) +theme_minimal()
# convert to a spatial objectRosa <- wildschwein |>st_as_sf(coords =c("E", "N"), crs =2056, remove =FALSE) |>filter( TierName =="Rosa", DatetimeUTC >="2014-11-07", DatetimeUTC <"2014-12-31" )# visualise dataggplot(data = Rosa) +geom_sf() +geom_path(aes(E, N)) # also connect dots with lines. This respects the order the data is in. geom_line would simply go from left to right.
Task 1: Calculate distances
Now, you can Step a): Specify a temporal window v and Step b): Measure the distance to every point within v, which you had used with sabi, on your own movement data or to a different wild boar using different sampling intervals.
distance_by_element <-function(later, now) {as.numeric(st_distance(later, now, by_element =TRUE) )}Rosa <- Rosa |>mutate(nMinus2 =distance_by_element(lag(geometry, 2), geometry)) |># distance to pos -30 minutesmutate(nMinus1 =distance_by_element(lag(geometry, 1), geometry)) |># distance to pos -15 minutesmutate(nPlus1 =distance_by_element(geometry, lead(geometry, 1))) |># distance to pos +15 minutesmutate(nPlus2 =distance_by_element(geometry, lead(geometry, 2))) # distance to pos +30 minutes# now calculate distances Rosa <- Rosa |>rowwise() |>mutate(stepMean =mean(c(nMinus2, nMinus1, nPlus1, nPlus2), na.rm =FALSE)) |>ungroup() #
Task 2: Specify and apply threshold d
After calculating the Euclidean distances to positions within the temporal window v in task 1, you can explore these values (we stored them in the column stepMean) using summary statistics (histograms, boxplot, summary()): This way we can define a reasonable threshold value to differentiate between stops and moves. There is no “correct” way of doing this, specifying a threshold always depends on data as well as the question that needs to be answered. In this exercise, use the mean of all stepMean values.
Store the new information (boolean to differentiate between stops (TRUE) and moves (FALSE)) in a new column named static.
# set a tresholdtreshold <-mean(Rosa$stepMean)head(Rosa)
Simple feature collection with 6 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 2570375 ymin: 1204787 xmax: 2570389 ymax: 1204857
Projected CRS: CH1903+ / LV95
# A tibble: 6 × 12
TierID TierName CollarID DatetimeUTC E N
<chr> <chr> <dbl> <dttm> <dbl> <dbl>
1 016A Rosa 13972 2014-11-07 07:45:44 2570385. 1204812.
2 016A Rosa 13972 2014-11-07 08:00:10 2570389. 1204821.
3 016A Rosa 13972 2014-11-07 08:15:18 2570375. 1204787.
4 016A Rosa 13972 2014-11-07 08:30:23 2570387. 1204828.
5 016A Rosa 13972 2014-11-07 08:46:08 2570387. 1204853.
6 016A Rosa 13972 2014-11-07 09:00:11 2570387. 1204857.
# ℹ 6 more variables: geometry <POINT [m]>, nMinus2 <dbl>, nMinus1 <dbl>,
# nPlus1 <dbl>, nPlus2 <dbl>, stepMean <dbl>
# we define the treshold here as the mean. Rosa <- Rosa |>mutate(static = stepMean <mean(stepMean, na.rm =TRUE))Rosa_filter <- Rosa |>filter(!static) # these points are not static!
Task 3: Visualize segmented trajectories
# shows static points in red Rosa_filter |>ggplot(aes(E, N)) +geom_point(data = Rosa, col ="red") +# this adds the static pointsgeom_path() +geom_point() +coord_fixed() +theme(legend.position ="bottom")
# show moving pointsggplot(Rosa_filter) +geom_path(aes(E, N)) +geom_sf(aes(color = static))
Rosa |>ggplot(aes(E, N, colour = static)) +geom_path() +geom_point() +coord_fixed() +theme(legend.position ="bottom")
Task 4: Segment-based analysis
In applying Laube and Purves (2011), we’ve come as far as step b) in Figure 16.1. In order to complete the last steps (c and d), we need a unique ID for each segment that we can use as a grouping variable. The following function does just that (it assigns unique IDs based on the column static which you created in Task 2). You will learn about functions next week. For now, just copy the following code chunk into your script and run it.
rle_id <-function(vec) { x <-rle(vec)$lengthsas.factor(rep(seq_along(x), times = x))}
You can use the newly created function rle_id to assign unique IDs to subtrajectories (as shown below).
Rosa <- Rosa |>mutate(segment_id =rle_id(static))
Visualize the moving segments by colourizing them by segment_ID. Then use segment_ID as a grouping variable to determine the segments duration and remove short segments (e.g. segments with a duration < 5 Minutes)
# visualise Rosa |>ggplot(aes(E, N, colour = segment_id)) +geom_path() +geom_point() +coord_fixed() +theme(legend.position ="right")
# group by segment ID and filter out duration more than 5mins.Valid_segments <- Rosa |>group_by(segment_id) |>summarise(start =min(DatetimeUTC),end =max(DatetimeUTC), duration =as.numeric(difftime(end, start, units ="mins"))) |>filter(duration >=5)Rosa <- Rosa |>filter(segment_id %in% Valid_segments$segment_id)# now plot again: Rosa |>ggplot(aes(E, N, colour = segment_id, group = segment_id)) +geom_path() +coord_fixed() +theme(legend.position ="")
Exercise B: Similarity
Task 1: Similarity measures
We will now calculate similarities between trajectories using a new dataset pedestrian.csv (available on moodle). Download an import this dataset as a data.frame or tibble. It it a set of six different but similar trajectories from pedestrians walking on a path.
For this task, explore the trajectories first and get an idea on how the pedestrians moved.
pedestrian$TrajID <-as.factor(pedestrian$TrajID)ggplot(pedestrian, aes(E, N, colour = TrajID)) +geom_point() +facet_wrap(~TrajID) +theme(axis.text.x =element_blank(),axis.text.y =element_blank(), ) +labs(x ="E", y ="N",title ="Visual comparison of the 6 trajectories",subtitle ="Each subplot highlights a trajectory")
Task 2: Calculate similarity
library(SimilarityMeasures)
Familiarize yourself with this package by skimming through the function descriptions help(package = “SimilarityMeasures”). Now compare trajectory 1 to trajectories 2-6 using different similarity measures from the package. Your options are. DTW, EditDist, Frechet and LCSS.
All functions in the package need matrices as input, with one trajectory per matrix.
LCSStakes very long to compute. The accuracy of the algorithm (pointSpacing = ,pointDistance = and errorMarg =) can be varied to provide faster calculations. Please see Vlachos et al. (2002) for more information.
Before visualizing your results think about the following: Which two trajectories to you perceive to be most similar, which are most dissimilar?
–> most similar: TrajID 1 and 6
–> most dissimilar: TrajID 1 and 4 or 2
Now visualize the results from the computed similarity measures. Which measure reflects your own intuition the closest?
library(patchwork)p1 + p2 + p3 + p4 +plot_layout(ncol =2, nrow =2) +plot_annotation(title ="Computed similarities using different measures between trajectory 1 and all other trajectories")