I’ve been running regularly since last July with a new watch (TomTom). I have a script that retrieves the watch's binary files (ttbin) and converts them whenever the watch is connected over USB. The goal is to use these data to build my own visualizations instead of relying on the TomTom website.
When the running watch is set up and connected, it creates directories like:
/Users/jonathanbouchet/TomTom\ MySports/<date>/
where each ttbin file is stored.
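The directory layout above can also be walked from R. This is only a sketch, assuming one sub-directory per date and files ending in .ttbin; the helper name list_ttbin and its root argument are hypothetical, and the conversion of each file is not shown.

```r
# Hypothetical helper: list every .ttbin file under a root directory,
# together with the name of the directory that contains it.
list_ttbin <- function(root) {
  files <- list.files(root, pattern = "\\.ttbin$", recursive = TRUE,
                      full.names = TRUE)
  data.frame(
    folder = basename(dirname(files)),  # assumed to be the <date> directory
    file   = basename(files),
    path   = files,
    stringsAsFactors = FALSE
  )
}

# Usage (path taken from the example above):
# runs <- list_ttbin("/Users/jonathanbouchet/TomTom MySports")
```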
I then run a Python script that loops over all directories and summarizes each run into a csv file. The first rows and a summary of the columns are shown below:
library(ggplot2)
library(dplyr)
library(plotly)
df<-read.csv('foo.csv',sep=',')
head(df)
## date start end duration latitude longitude distance
## 1 2016-07-16 06:14:19 07:20:37 66 41.12115 -81.46048 7.026884
## 2 2016-07-17 06:18:00 07:28:28 70 41.12118 -81.46054 7.227120
## 3 2016-07-18 05:58:01 06:47:16 49 41.12115 -81.46054 5.103436
## 4 2016-07-20 15:33:06 16:03:41 30 0.00000 0.00000 2.019636
## 5 2016-07-21 06:13:15 06:53:28 40 41.12118 -81.46052 4.103205
## 6 2016-07-22 06:04:43 06:52:30 47 41.12110 -81.46052 5.008434
## type
## 1 running
## 2 running
## 3 running
## 4 treadmill
## 5 running
## 6 running
summary(df)
## date start end duration
## 2016-10-02 : 3 06:05:36 : 3 06:47:16 : 2 Min. : 10.00
## 2016-07-24 : 2 05:58:53 : 2 06:49:44 : 2 1st Qu.: 48.00
## 2016-08-11 : 2 05:59:35 : 2 06:50:57 : 2 Median : 52.00
## 2016-08-14 : 2 06:01:57 : 2 06:51:02 : 2 Mean : 52.72
## 2016-08-21 : 2 06:01:58 : 2 06:52:25 : 2 3rd Qu.: 61.00
## 2016-08-22 : 2 06:02:43 : 2 06:52:30 : 2 Max. :103.00
## (Other) :224 (Other) :224 (Other) :225
## latitude longitude distance type
## Min. : 0.00 Min. :-81.46 Min. : 1.020 running : 66
## 1st Qu.: 0.00 1st Qu.:-81.46 1st Qu.: 4.780 treadmill:171
## Median : 0.00 Median : 0.00 Median : 5.250
## Mean :11.45 Mean :-22.69 Mean : 5.337
## 3rd Qu.:41.12 3rd Qu.: 0.00 3rd Qu.: 6.150
## Max. :41.12 Max. : 0.00 Max. :10.677
##
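In the summary above the median latitude is 0, presumably because treadmill sessions have no GPS fix and are stored as 0 (row 4 of the head() output suggests this). A minimal sketch, assuming that a 0/0 pair always means "no position", replaces those zeros with NA so they do not distort the statistics. The toy data frame copies three rows from the output above.

```r
# Toy data: three rows copied from the head(df) output above
runs <- data.frame(
  latitude  = c(41.12115, 0.00000, 41.12118),
  longitude = c(-81.46048, 0.00000, -81.46052),
  type      = c("running", "treadmill", "running"),
  stringsAsFactors = FALSE
)

# Assumption: latitude == 0 and longitude == 0 together mean "no GPS fix"
no_fix <- runs$latitude == 0 & runs$longitude == 0
runs$latitude[no_fix]  <- NA
runs$longitude[no_fix] <- NA

summary(runs$latitude)  # NA rows are now counted separately
```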
The next step is to add a few features that make the plots more useful:
tmp<-as.Date(df$date, format="%Y-%m-%d")
#create columns for month (name, numeric) and hour
df$month<-as.numeric(format(tmp,'%m'))
df$month_name<-month.abb[df$month]
df$hour<-as.numeric(format(as.POSIXct(df$start,format="%H:%M:%S"),"%H"))
getAM<-function(x){
if(x<12){return('AM')}
else {return('PM')}
}
#create a new column for morning/evening runs
df$TimeInDay<-sapply(df$hour,getAM)
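The same column can also be built without sapply(), since comparisons in R are vectorized. A small sketch using ifelse(); the hours vector is made up for illustration.

```r
# Example start hours (illustrative values, 0-23)
hours <- c(6, 5, 15, 12, 0)

# ifelse() works element-wise, so no explicit loop or sapply() is needed
time_in_day <- ifelse(hours < 12, "AM", "PM")
print(time_in_day)
```

Applied to the data frame above, this would read df$TimeInDay <- ifelse(df$hour < 12, 'AM', 'PM').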
pl <- ggplot(data=df,aes(x=distance,y=duration)) +
geom_point(aes(color=type,shape=TimeInDay),size=5,alpha=.75) +
xlab('Distance [miles]') + ylab('Time [min]') + xlim(0,12) + ylim(0,110) +
ggtitle(' Time vs. Distance')
pl<- pl + geom_smooth(aes(group=1),method='lm',formula=y~x,color='black',size=.5) +
scale_colour_manual(values=c("#E2D200","#46ACC8"))
print(pl)
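The black line drawn by geom_smooth(method='lm') is an ordinary least-squares fit of duration against distance. As a sketch, the same fit can be obtained directly with lm(); the slope is then an average pace in minutes per mile. The six points below are copied from the head(df) output, so the coefficients describe only that small sample.

```r
# Six (distance, duration) pairs copied from the head(df) output above
runs <- data.frame(
  distance = c(7.026884, 7.227120, 5.103436, 2.019636, 4.103205, 5.008434),
  duration = c(66, 70, 49, 30, 40, 47)
)

# Same model as geom_smooth(method = 'lm', formula = y ~ x)
fit <- lm(duration ~ distance, data = runs)

coef(fit)  # intercept [min] and slope [min per mile]
```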
Recent runs are mostly type=treadmill (because it is now getting cold in Ohio).
The next step for the summary is to aggregate the data per month:
#total distance and time per month and run type
df2<-as.data.frame(df %>% group_by(month_name, type) %>% summarise(totDistance = sum(distance), totTime = sum(duration)))
lev<-c("Jul","Aug","Sep","Oct","Nov","Dec","Jan","Feb","Mar")
df2$MONTH <- factor(df2$month_name, levels = lev)
ggplot(data=df2,aes(x=MONTH,y=totDistance,fill=type)) +
geom_bar(stat='identity') + ylab('total distance [miles]') +
scale_fill_manual(values=c("#E2D200","#46ACC8"))
This plot is interesting because it clearly shows that winter is coming: I am now running more on the treadmill than outside.
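The bars in the plot above appear in training order (Jul to Mar) rather than alphabetically because MONTH is a factor with explicit levels. A small sketch of that mechanism, using made-up month labels:

```r
# Month labels in arbitrary order (illustrative values)
month_name <- c("Oct", "Jul", "Jan", "Aug")

# Without explicit levels, factor() sorts the labels alphabetically
levels(factor(month_name))

# With explicit levels, the order given in lev is preserved
lev <- c("Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar")
month <- factor(month_name, levels = lev)
sort(month)
```

ggplot2 places discrete values along the axis in level order, which is why the factor is created before plotting.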
1.3 Comments
type=running) because running on the treadmill is boring