Background

Outdoor cycling is a fun way to work out. You can ride on a main road or in the mountains, and getting started is as simple as stepping outside your home. You can ride alone or with friends, enjoy the view, and get some fresh air (especially in the mountains). This time I will focus on road cycling, which has been my favourite sport for cardio workouts.

A bicycle can be fitted with additional components that measure speed, distance, cadence (pedal revolutions per minute), heart rate, power (the power put into pedalling), average speed, time, and so on. I will look more closely at power, which is widely regarded as the most accurate parameter for measuring a cyclist’s performance. The data collected from this activity will show us some interesting facts.

Problem

While restrictions on gathering in crowded places such as gyms and other indoor sports venues were still in place, people got bored and looked for a workout they could still do. Many found cycling to be a replacement sport that suits them well. During this period, demand for bicycles, road bikes included, rose sharply. As the number of cyclists grows, many newcomers and amateurs want to improve their performance but, sadly, are not equipped with the right measurements.

Many cyclists measure their performance only by the average speed they can reach and by their heart rate. They don’t know that a power meter is the most accurate way to measure performance, because its reading does not depend on speed, heart rate, or other outside factors. On top of that, a proper power meter is quite expensive (more than Rp 10.000.000 or $1000), which also plays a role in whether riders add this component to their bike.

Let me explore the data to give a glimpse of how useful a power meter is for measuring performance. A power meter is an electronic component built into the pedals or the crankset.

Garmin Vector 3 Dual Sensing & Favero Assioma Duo

Goals

    1. Determine the character of the cyclist from the power produced
    2. Measure how effectively the cyclist's workouts improve performance
    3. Determine the Functional Threshold Power (FTP), i.e. how much power can be sustained for an hour. FTP is the average power output a rider can produce for one hour; it is measured in watts and is often divided by body weight to give watts per kilogram. https://www.cyclingweekly.com/fitness/ftp-cycling-363865
    4. Cluster the cyclist's characteristics (by comparing them with an athlete or another cyclist)
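
The FTP definition in goal 3 comes down to simple arithmetic, which the following Python sketch illustrates. It is not the code behind this analysis: the function names, the 70 kg body mass, and the steady 210 W effort are all hypothetical, and it assumes one power sample per second from a full one-hour effort.

```python
from statistics import mean

SECONDS_PER_HOUR = 3600


def ftp_watts(hour_power_w):
    """Average power (watts) over the first hour of 1 Hz samples."""
    if len(hour_power_w) < SECONDS_PER_HOUR:
        raise ValueError("need at least 3600 one-second samples (one hour)")
    return mean(hour_power_w[:SECONDS_PER_HOUR])


def watts_per_kg(power_w, body_mass_kg):
    """Power relative to body mass, in watts per kilogram."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return power_w / body_mass_kg


# Hypothetical rider: a steady 210 W for one hour at 70 kg body mass.
samples = [210] * SECONDS_PER_HOUR
ftp = ftp_watts(samples)
relative_ftp = watts_per_kg(ftp, 70.0)
print(f"FTP: {ftp:.0f} W ({relative_ftp:.2f} W/kg)")
```

Dividing by body mass is what makes riders of different sizes comparable, which matters for the clustering goal.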

Methods

    1. Logistic regression, to determine the cyclist's character and workout progress.
    2. Unsupervised learning (clustering), to group the cyclist with other cyclists or athletes.

Predictors: time, cadence, power output, heart rate and speed.
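
To give an idea of what the clustering step could look like, here is a minimal k-means sketch in plain Python. It is only an illustration under stated assumptions: the two features per rider (FTP in W/kg and 5-second peak power in W/kg), the six riders, and the choice of k-means itself are hypothetical and not taken from the dataset.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical riders: (FTP in W/kg, 5-second peak power in W/kg).
riders = [(2.5, 9.0), (4.8, 17.5), (2.8, 10.0), (5.1, 18.5), (2.6, 9.5), (4.9, 18.0)]
labels, centroids = kmeans(riders, k=2)
print(labels)
```

With real data the features would normally be standardised first, so that a feature with a larger numeric range does not dominate the distance.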

Versatility of Methods

More specifically, the same methods can be applied to other workouts or activities such as golf or boxing (to measure the power of a swing or an impact, usually by wearing a smartwatch).

Disclaimer

The data was recorded in real-life conditions, so there are several external factors that we can’t control and therefore choose to ignore, such as:

  • Traffic Condition

  • Weather (Rain, Wind speed and the angle of attack)

  • Road Condition (Rolling resistance, Surface condition)

Scope and Limitation

  • Cyclist is in good shape (proper nutrition, no sleep deprivation)

  • No injuries

  • Properly fitted bike

  • No broken or worn bicycle components

  • Constant elevation (Jakarta)

  • Cyclist position (time trialists usually stay in an aero position the whole time and are not allowed to draft)

Quick description of Time Trial Cycling and Drafting

Load and inspect the data

Legend

  • timer.s == an ongoing timer (seconds). Stoppages are not recorded per se, but rather represented as breaks in the continuity of the timer.
  • timer.min == as above, but in units of minutes.
  • timestamp == “POSIXct” values, describing the actual time of day.
  • delta.t == delta time values (seconds).
  • lat == latitude values (degrees).
  • lon == longitude values (degrees).
  • distance.km == cumulative distance (kilometres).
  • speed.kmh == speed in kilometres per hour.
  • elevation.m == altitude in metres.
  • delta.elev == delta elevation (metres).
  • VAM == “vertical ascent metres per second”.
  • power.W == power readings (Watts).
  • power.smooth.W == an exponentially-weighted 25-second moving average of power values.
  • work.kJ == cumulative work (kilojoules).
  • Wexp.kJ == W’ expended in units of kilojoules. See ?Wbal and references therein.
  • cadence.rpm == pedalling cadence (revolutions per minute).
  • hr.bpm == Heart rate (beats per minute).
  • lap == a numeric vector of lap “levels”. Will only have values > 1 if lap data is available.
##        timer.s      timer.min      timestamp        delta.t            lat 
##              0              0              0              0              0 
##            lon    distance.km      speed.kmh    elevation.m     delta.elev 
##              0              0              0              0              0 
##            VAM        power.W power.smooth.W        work.kJ        Wexp.kJ 
##              0              0              0              0           5680 
##    cadence.rpm         hr.bpm            lap 
##              0              0              0
## Rows: 5,680
## Columns: 18
## $ timer.s        <int> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …
## $ timer.min      <dbl> 0.00000000, 0.01666667, 0.03333333, 0.05000000, 0.0666…
## $ timestamp      <dttm> 2020-06-13 23:24:58, 2020-06-13 23:24:59, 2020-06-13 …
## $ delta.t        <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ lat            <dbl> -6.197816, -6.197863, -6.197918, -6.197969, -6.198023,…
## $ lon            <dbl> 106.7691, 106.7691, 106.7691, 106.7691, 106.7691, 106.…
## $ distance.km    <dbl> 0.00000, 0.00517, 0.01127, 0.01692, 0.02284, 0.02891, …
## $ speed.kmh      <dbl> 17.0316, 18.9792, 19.5156, 20.8944, 21.2616, 21.8016, …
## $ elevation.m    <dbl> 28.4, 28.2, 28.6, 28.8, 29.0, 28.8, 28.8, 28.8, 28.8, …
## $ delta.elev     <dbl> 0.0, -0.2, 0.4, 0.2, 0.2, -0.2, 0.0, 0.0, 0.0, 0.0, 0.…
## $ VAM            <dbl> 0.00000000, 0.00000000, 0.20000000, 0.06666667, 0.0500…
## $ power.W        <int> 183, 181, 238, 188, 190, 190, 157, 162, 155, 148, 126,…
## $ power.smooth.W <dbl> 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, …
## $ work.kJ        <dbl> 0.000, 0.181, 0.419, 0.607, 0.797, 0.987, 1.144, 1.306…
## $ Wexp.kJ        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ cadence.rpm    <int> 59, 63, 66, 69, 69, 72, 72, 73, 76, 79, 79, 84, 84, 87…
## $ hr.bpm         <int> 130, 130, 130, 129, 129, 127, 128, 128, 127, 128, 129,…
## $ lap            <ord> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
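
The legend describes power.smooth.W as an exponentially weighted 25-second moving average of the power readings. The Python sketch below shows one common way to compute such an average. The smoothing factor alpha = 2 / (span + 1) and seeding with the first sample are assumptions, not necessarily what produced the column above (which starts at 0), and the input is just the first eleven power.W values from the output.

```python
def ewma(values, span=25):
    """Exponentially weighted moving average, seeded with the first value.

    alpha = 2 / (span + 1) is an assumed convention for turning a window
    length into a smoothing factor.
    """
    alpha = 2.0 / (span + 1)
    smoothed = []
    previous = None
    for value in values:
        if previous is None:
            previous = value
        else:
            previous = alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


# First eleven power.W readings from the output above (1 Hz samples).
power_w = [183, 181, 238, 188, 190, 190, 157, 162, 155, 148, 126]
power_smooth_w = ewma(power_w)
for raw, smooth in zip(power_w, power_smooth_w):
    print(f"{raw:4d} W -> {smooth:6.1f} W")
```

The smoothed series reacts slowly to the jump to 238 W, which is the point of smoothing a noisy power signal.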

Data Preprocessing

As we can see, some of the data types are still inappropriate and there are some unused columns. Let’s fix that.
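
As a rough illustration of the column clean-up, the Python sketch below counts missing values per column and drops any column that is entirely missing, as Wexp.kJ is in the output above. The helper names are hypothetical, the three rows are taken from the first values shown earlier, and this is not the preprocessing code actually used here.

```python
def missing_counts(rows):
    """Count missing (None) values per column in a list of dict rows."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (1 if value is None else 0)
    return counts


def drop_all_missing(rows):
    """Return copies of the rows without columns that are missing in every row."""
    counts = missing_counts(rows)
    empty = {name for name, n in counts.items() if n == len(rows)}
    return [{k: v for k, v in row.items() if k not in empty} for row in rows]


# First three records, using values from the output above (NA becomes None).
rows = [
    {"timer.s": 0, "power.W": 183, "hr.bpm": 130, "Wexp.kJ": None},
    {"timer.s": 1, "power.W": 181, "hr.bpm": 130, "Wexp.kJ": None},
    {"timer.s": 2, "power.W": 238, "hr.bpm": 130, "Wexp.kJ": None},
]
counts = missing_counts(rows)
cleaned = drop_all_missing(rows)
print(counts)
print(sorted(cleaned[0]))
```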

## Warning: as.hms() is deprecated, please use as_hms().
## This warning is displayed once per session.

Now we have a proper dataset that is ready to be explored.

Data Exploration

Visuals of Power Output, Speed, Heart Rate and Cadence over Time

Power Output By Time

## Warning: Removed 2805 rows containing missing values (position_stack).

In the graph above we can see a spike of 742 watts, but we can’t assume that the cyclist can sustain that much power over time.
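
One way to see the gap between a momentary spike and sustained power is to compute the best average power over windows of increasing length. The Python sketch below does this with prefix sums. The function name and the ride itself are hypothetical (a steady 180 W with a 5-second burst at the 742 W peak seen above), and it assumes 1 Hz samples with no gaps in the timer.

```python
from itertools import accumulate


def best_average_power(power_w, window_s):
    """Highest mean power over any run of window_s consecutive 1 Hz samples."""
    if not 0 < window_s <= len(power_w):
        raise ValueError("window must be between 1 and the number of samples")
    # prefix[i] holds the sum of the first i samples.
    prefix = [0, *accumulate(power_w)]
    best_sum = max(
        prefix[i + window_s] - prefix[i]
        for i in range(len(power_w) - window_s + 1)
    )
    return best_sum / window_s


# Hypothetical ride: steady 180 W with a 5-second sprint at 742 W.
ride = [180] * 300 + [742] * 5 + [180] * 300
for window in (1, 5, 60, 300):
    print(f"best {window:3d} s average: {best_average_power(ride, window):.1f} W")
```

The longer the window, the closer the best average falls back toward the steady effort, which is why a single peak reading says little about what a rider can hold.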

Speed - Cyclist speed during activity

## Warning: Removed 2805 rows containing missing values (position_stack).

Heart Rate

## Warning: Removed 2805 rows containing missing values (position_stack).

In cycling, heart rate is divided into 6 zones. From the zone, we can tell which kind of energy system we are using, aerobic or anaerobic.

Put simply, the faster your heart beats, the faster your body needs energy, and the easiest source for the body to draw on is blood glucose (from carbohydrates and other nutrients). During aerobic activity (low zones, let’s say 1-2), the body mostly uses fat and some protein for energy. To determine all 6 HR zones, we need to tailor them to our resting heart rate and maximum heart rate.

For reference: https://www.bikeradar.com/advice/fitness-and-training/heart-rate-monitor-training-for-cyclists/#
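
Tailoring the zones to resting and maximum heart rate can be sketched with the heart-rate-reserve approach, where each zone limit is resting HR plus a percentage of (maximum HR minus resting HR). In the Python sketch below, the six percentage limits, the 60 bpm resting rate, and the 190 bpm maximum are illustrative assumptions only; zone systems differ between coaches and sources.

```python
# Upper limit of zones 1-6 as a percentage of heart rate reserve.
# These percentages are illustrative assumptions, not a standard.
ZONE_UPPER_PERCENT = [60, 70, 80, 90, 95, 100]


def zone_limits(resting_hr, max_hr):
    """Upper heart-rate limit (bpm) of each zone, from heart rate reserve."""
    if max_hr <= resting_hr:
        raise ValueError("maximum heart rate must exceed resting heart rate")
    reserve = max_hr - resting_hr
    return [resting_hr + pct * reserve / 100 for pct in ZONE_UPPER_PERCENT]


def hr_zone(bpm, resting_hr, max_hr):
    """Zone number (1-6) for a heart-rate reading."""
    limits = zone_limits(resting_hr, max_hr)
    for zone, upper in enumerate(limits, start=1):
        if bpm <= upper:
            return zone
    return len(limits)  # readings above the stated maximum count as the top zone


# Hypothetical rider: resting 60 bpm, maximum 190 bpm.
for bpm in (130, 150, 170, 190):
    print(bpm, "bpm -> zone", hr_zone(bpm, resting_hr=60, max_hr=190))
```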

Cadence - pedal revolutions per minute

## Warning: Removed 2805 rows containing missing values (position_stack).
## Warning in ggcorr(cyclist): data in column(s) 'hourmin', 'date', 'text',
## 'textspeed', 'texthr', 'textcad' are not numeric and were ignored

Summary

From the correlation graph above, we can see that there is a high correlation between power.W and cadence.rpm.
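
For readers who want to check a correlation like this by hand, the Python sketch below computes a Pearson coefficient. It assumes the correlation plot uses Pearson correlation, and it uses only the first eleven power.W and cadence.rpm readings from the output shown earlier. Those samples cover the first seconds of the ride, so the result should not be expected to match the full-ride value.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# First eleven readings from the output above (ride start only).
power_w = [183, 181, 238, 188, 190, 190, 157, 162, 155, 148, 126]
cadence_rpm = [59, 63, 66, 69, 69, 72, 72, 73, 76, 79, 79]
r = pearson(power_w, cadence_rpm)
print(f"Pearson r over the first 11 seconds: {r:.2f}")
```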