As part of an assignment for the Data Science Specialization offered on Coursera by Johns Hopkins University, I decided to practice with the leaflet package on my marathon data from June 2016.
GPS data and race metrics were recorded with a Garmin Forerunner 220. The data were downloaded from the Garmin Connect website.
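# Load the packages used below (all listed in sessionInfo() at the end)
library(rgdal)    # readOGR()
library(dplyr)    # select() and %>%
library(XML)      # htmlTreeParse(), xpathSApply()
library(leaflet)  # interactive map
library(ggplot2)  # plots
# Import the GPS coordinates of the race (GPX format)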
racetrack <- readOGR("activity_1198615565.gpx", layer = "tracks")
## OGR data source with driver: GPX
## Source: "activity_1198615565.gpx", layer: "tracks"
## with 1 features
## It has 12 fields
# Import the race split data
racedf <- read.csv("activity_1198615565.csv")
# Remove last line of summary data
racedf <- racedf[-nrow(racedf), ]
# To keep things manageable, only select the split number,
# average pace, average HR and running cadence
racedf <- racedf %>%
  select(Split, Avg.Pace, Avg.HR, Avg.Run.Cadence) %>%
  droplevels()
racedf$Split <- ordered(racedf$Split, levels(as.factor(1:43)))
# Output a summary of the data
summary(racedf)
##      Split       Avg.Pace        Avg.HR     Avg.Run.Cadence
##  1      : 1   0:05:36: 2     Min.   :140   Min.   : 65.69
##  2      : 1   0:05:38: 2     1st Qu.:155   1st Qu.:140.28
##  3      : 1   0:05:40: 2     Median :163   Median :170.03
##  4      : 1   0:05:41: 2     Mean   :161   Mean   :157.60
##  5      : 1   0:05:45: 2     3rd Qu.:167   3rd Qu.:172.78
##  6      : 1   0:05:49: 2     Max.   :173   Max.   :174.45
##  (Other):37   (Other):31
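The summary shows that `Avg.Pace` was read as text of the form `h:mm:ss`, so it cannot be plotted or averaged directly. A minimal sketch of one way to convert it to minutes per kilometre is given below. The helper name `pace_to_minutes` is hypothetical, and the code assumes every value has at most three colon-separated fields.

```r
# Hypothetical helper: convert "h:mm:ss" (or "mm:ss") pace strings,
# or a factor of them, to numeric minutes per km.
pace_to_minutes <- function(pace) {
  parts <- strsplit(as.character(pace), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    # Left-pad with zeros so the vector is always (hours, minutes, seconds)
    p <- c(rep(0, 3 - length(p)), p)
    p[1] * 60 + p[2] + p[3] / 60
  }, numeric(1))
}

# Two values in the format shown in the summary above
pace_to_minutes(c("0:05:36", "0:10:00"))  # 5.6 and 10 minutes per km
```

With the race data this could be applied as `racedf$pace_min <- pace_to_minutes(racedf$Avg.Pace)`, where `pace_min` is again an illustrative column name.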
# Extracting the GPS information
# Big thanks to R-bloggers:
# this great article shows how to easily extract the info from the GPX file.
# There is much more: https://www.r-bloggers.com/stay-on-track-plotting-gps-tracks-with-r/
# Parse the GPX file
pfile <- htmlTreeParse("activity_1198615565.gpx",
                       error = function(...) {}, useInternalNodes = TRUE)
# Get all elevations, times and coordinates via the respective xpath
elevations <- as.numeric(xpathSApply(pfile, path = "//trkpt/ele", xmlValue))
times <- xpathSApply(pfile, path = "//trkpt/time", xmlValue)
coords <- xpathSApply(pfile, path = "//trkpt", xmlAttrs)
# Extract latitude and longitude from the coordinates
lats <- as.numeric(coords["lat",])
lons <- as.numeric(coords["lon",])
# Put everything in a dataframe and get rid of old variables
geodf <- data.frame(lat = lats, lon = lons, ele = elevations, time = times)
rm(list = c("elevations", "lats", "lons", "pfile", "times", "coords"))
# Convert the ISO 8601 time stamps from text to date-times
geodf$time <- strptime(geodf$time, format = "%Y-%m-%dT%H:%M:%OS")
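The data frame built above holds one latitude/longitude pair per track point but no distance. As a rough sketch, the cumulative distance along the track can be estimated with the haversine formula. The function names `haversine_km` and `cumulative_km`, the mean Earth radius of 6371 km and the toy coordinates are all assumptions made for illustration; they are not part of the original analysis.

```r
# Great-circle distance in km between two points given in decimal degrees.
# radius_km = 6371 is an assumed mean Earth radius.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Cumulative distance along a track, starting at 0 for the first point
cumulative_km <- function(lat, lon) {
  n <- length(lat)
  if (n < 2) return(rep(0, n))
  steps <- haversine_km(lat[-n], lon[-n], lat[-1], lon[-1])
  c(0, cumsum(steps))
}

# Made-up track: three points along the equator, 0.01 degrees apart
toy <- data.frame(lat = c(0, 0, 0), lon = c(0, 0.01, 0.02))
toy$dist_km <- cumulative_km(toy$lat, toy$lon)
toy
```

On the real data the same call would be `cumulative_km(geodf$lat, geodf$lon)`. GPS jitter tends to inflate distances computed this way, so the total may not match the watch exactly.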
m <- leaflet() %>%
  addTiles() # Add default OpenStreetMap map tiles
# Draw the race track and mark the start/finish point
m %>% addPolylines(data = racetrack, color = "red", weight = 4) %>%
  addMarkers(lng = geodf$lon[1], lat = geodf$lat[1],
             popup = "Run Paradise - Starting and Finish line")
The elevation data extracted from the GPS measurements show a difficult race, with two major climbs and a few minor hills.
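One simple way to put a number on how hilly a course is would be to sum all the upward steps in the elevation series. The sketch below assumes a hypothetical helper `elevation_gain` and uses made-up elevations; the `min_step` threshold is an illustrative parameter for ignoring small GPS fluctuations.

```r
# Hypothetical helper: total ascent in the same unit as `ele`,
# counting only steps larger than `min_step`.
elevation_gain <- function(ele, min_step = 0) {
  steps <- diff(ele)
  sum(steps[steps > min_step])
}

# Made-up elevations in metres, not taken from the race
toy_ele <- c(10, 12, 11, 15, 15, 9)
elevation_gain(toy_ele)  # 2 + 4 = 6
```

For the race itself this would be `elevation_gain(geodf$ele)`, possibly with a `min_step` above zero, since raw GPS elevation is noisy.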
This first graph shows that the race went well until approximately the 27th kilometer. After that point, it is a great example of what hitting the wall does to race performance.
My body simply ran out of fuel, and after the 30th kilometer I was walking at a pace of about 10 min/km.
To further understand what happened during the race, here is the average heart rate for each split:
ggplot(racedf, aes(x = as.numeric(Split), y = Avg.HR)) +
  geom_point(col = "steelblue") +
  geom_path(col = "steelblue", alpha = 0.5) +
  geom_hline(yintercept = c(136, 146), col = "red", lty = 2, lwd = 1.2) +
  labs(x = "Splits (every km)", y = "Avg. heart rate (bpm)") +
  ggtitle("Phuket Marathon June 2016 - Split vs Average Heart Rate")
The zone between the two red lines is my aerobic zone, where my body burns fat for fuel. Fat is a very economical energy source and is the primary fuel for the body in endurance sports.
Above this zone, the energy comes from carbohydrates (sugar), which are in limited supply. In addition, this mechanism produces a lot of waste products, which lead to cramps and fatigue.
From this graph, we can clearly see that as the sugar is consumed, the heart rate increases until this source of fuel is exhausted. At about 30 km, the heart rate drops, which coincides with the end of the running activity (see above for the average pace).
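To see how much of the race was run inside the 136 to 146 bpm zone, each split's average heart rate can be labelled relative to those two limits. The sketch below is one possible way to do it; the function name `hr_zone` and the labels are hypothetical, and the example values are the minimum, quartiles and maximum of `Avg.HR` from the summary earlier in the post.

```r
# Hypothetical helper: label heart rates relative to an aerobic zone.
# Both limits are treated as part of the zone.
hr_zone <- function(avg_hr, lower = 136, upper = 146) {
  zone <- ifelse(avg_hr < lower, "below",
                 ifelse(avg_hr > upper, "above", "aerobic"))
  factor(zone, levels = c("below", "aerobic", "above"))
}

# Min, 1st quartile, median, 3rd quartile and max of Avg.HR from the summary
hr_values <- c(140, 155, 163, 167, 173)
table(hr_zone(hr_values))  # 0 below, 1 aerobic, 4 above
```

Applied to the full data, `table(hr_zone(racedf$Avg.HR))` would count the splits in each category.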
In this brief analysis of my marathon race last year, we observed a phenomenon called “the Wall”. We also learned that a sustainable long effort is not achieved at a high heart rate.
Despite 8 months of training for the event, I was clearly not prepared to face this challenge, as my aerobic system was not fully developed to cope with the effort.
If you are interested in developing your aerobic system to burn fat for fuel, check out the Maffetone Method: https://philmaffetone.com/method/
sessionInfo()
## R version 3.3.3 (2017-03-06)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 14393)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] XML_3.98-1.6 lubridate_1.6.0 rgdal_1.2-6 sp_1.2-4
## [5] dplyr_0.5.0 ggplot2_2.2.1 leaflet_1.1.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.10 plyr_1.8.4 tools_3.3.3 digest_0.6.12
## [5] jsonlite_1.4 evaluate_0.10 tibble_1.3.0 gtable_0.2.0
## [9] lattice_0.20-34 shiny_1.0.2 DBI_0.6-1 crosstalk_1.0.0
## [13] curl_2.5 yaml_2.1.14 stringr_1.2.0 knitr_1.15.1
## [17] htmlwidgets_0.8 rprojroot_1.2 grid_3.3.3 R6_2.2.0
## [21] rmarkdown_1.4 magrittr_1.5 backports_1.0.5 scales_0.4.1
## [25] htmltools_0.3.5 assertthat_0.2.0 mime_0.5 xtable_1.8-2
## [29] colorspace_1.3-2 httpuv_1.3.3 labeling_0.3 stringi_1.1.5
## [33] lazyeval_0.2.0 munsell_0.4.3