I want to visualize two things using Fitbit sleep data:

  1. how much I sleep
  2. when I sleep

First, I want to open my Fitbit data. Once my entire archive export is complete, I download the zip file from the Fitbit website and rename the extracted folder fitbitdata. Within that folder there is a user-site-export subfolder with all the raw data I could want to use. For sleep, I focus on the files called sleep-yyyy-mm-dd.json. There is about one per month. To pull in the JSON files, I use jsonlite and then just bind them all together.

library(dplyr);library(jsonlite); library(stringr);

library(ggplot2);library(lubridate); library(tidyr)


sleep<-bind_rows(fromJSON("fitbitdata/user-site-export/sleep-2018-11-08.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2018-12-08.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-01-07.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-02-06.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-03-08.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-04-07.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-05-07.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-06-06.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-07-06.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-08-05.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-09-04.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-10-04.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-11-03.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2019-12-03.json", flatten=TRUE),
                 fromJSON("fitbitdata/user-site-export/sleep-2020-01-02.json", flatten=TRUE))
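
Typing out every file name gets tedious as the archive grows. Below is a minimal sketch of an alternative, assuming every sleep file in the folder matches the sleep-yyyy-mm-dd.json pattern. The helper name read_sleep_files is hypothetical (it is not part of jsonlite or the Fitbit export), and the demo runs on two tiny made-up files in a temp folder so it is self-contained.

```r
library(dplyr); library(jsonlite)

#hypothetical helper: read every sleep-yyyy-mm-dd.json in a folder and stack them
read_sleep_files<-function(dir){
  files<-list.files(dir, pattern="^sleep-[0-9]{4}-[0-9]{2}-[0-9]{2}\\.json$",
                    full.names=TRUE)
  bind_rows(lapply(files, fromJSON, flatten=TRUE))
}

#demo on made-up data written to a temp folder
demo_dir<-file.path(tempdir(), "sleep_demo")
dir.create(demo_dir, showWarnings=FALSE)
write_json(data.frame(logId=c(1, 2), minutesAsleep=c(430, 395)),
           file.path(demo_dir, "sleep-2019-01-07.json"))
write_json(data.frame(logId=c(2, 3), minutesAsleep=c(395, 460)),
           file.path(demo_dir, "sleep-2019-02-06.json"))

demo<-read_sleep_files(demo_dir)
```

With the real export, the call would presumably be read_sleep_files("fitbitdata/user-site-export"). Note that logId 2 appears in both demo files, which mirrors the overlap between real files.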

I realize some observations show up in multiple files: the sleep for the date in a file's name is often recorded in two JSONs. For example, the Jan 2, 2020 sleep is recorded twice in the combined data, so I remove duplicate instances via logId.

sleep<-sleep %>% distinct(logId, .keep_all=TRUE) %>% #keep one row per logId
  select(dateOfSleep, startTime, endTime, minutesAsleep, minutesAwake)


After removing duplicate sleep observations, there are 456 sleep observations.

[1] 425

There are 425 days but 456 sleep instances… So, there are 31 naps recorded. :)
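
The comparison between days and sleep instances can be sketched as follows. This runs on a made-up data frame (toy is hypothetical), and it assumes the 425 comes from counting distinct dateOfSleep values.

```r
library(dplyr)

#made-up sleep logs: logId 102 appears twice, and 2019-01-08 also has a nap
toy<-data.frame(logId=c(101, 102, 102, 103, 104),
                dateOfSleep=c("2019-01-07", "2019-01-08", "2019-01-08",
                              "2019-01-08", "2019-01-09"))

toy<-distinct(toy, logId, .keep_all=TRUE) #one row per logId

n_sleeps<-nrow(toy)                  #sleep instances
n_days<-n_distinct(toy$dateOfSleep)  #distinct days with any sleep
n_sleeps-n_days                      #instances beyond one per day
```

Here the difference is one extra instance, which is the nap on 2019-01-08.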

(1) How much?

Plot minutes asleep


[1] 7.533333
[1] 7.414784

I sleep an average of 7.41 hours per day.
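
A minimal sketch of how per-day totals like these can be computed. It assumes sleep_tot is the summed minutesAsleep per dateOfSleep divided by 60, and that the two printed numbers are a median and a mean; the data below is made up.

```r
library(dplyr)

#made-up data: the second day has a night of sleep plus a nap
toy<-data.frame(dateOfSleep=c("2019-01-07", "2019-01-08", "2019-01-08"),
                minutesAsleep=c(420, 390, 60))

#total hours asleep per day
sleep_day<-toy %>%
  group_by(dateOfSleep) %>%
  summarise(sleep_tot=sum(minutesAsleep)/60)

median(sleep_day$sleep_tot)
mean(sleep_day$sleep_tot)
```

Summing within a day means naps count toward that day's total rather than showing up as very short days.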

I can plot how many hours of sleep I get per day.

ggplot(sleep_day, aes(x=sleep_tot))+
  theme_minimal()+ theme(text=element_text(family="Palatino", size=13))+
  theme(plot.title = element_text(size = 20))+
  geom_histogram(binwidth = 0.5, fill="#20A387FF", alpha=0.7)+
  labs(x="", y="", caption="Purple line shows mean of 7.41.")+ #theme(axis.text.y=element_text(size=0))+
  geom_vline(xintercept = 7.41, color="#440154FF", linetype=5)+
  scale_x_continuous(breaks=seq(0,13,1), limits = c(0,13))+
  ggtitle("How Many Hours Do I Sleep Per Day?", 
          subtitle = "Data from 425 days of wearing a FitBit")

#save the image
ggsave("graphs/asleep_time.png", width=9, height=6, dpi=400)

(2) When?

Now, I want to look at the distribution of times when I fall asleep/wake up.

To do that, I need to extract the times from the date/times recorded by Fitbit. I also make rounded versions for graphing, since I want histograms with times rounded to the nearest 30 minutes.
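
A quick sketch of what the extraction and rounding do, on two made-up timestamps. It assumes startTime and endTime look like "2019-01-07T23:47:30.000"; the variable names here are only for illustration.

```r
library(lubridate); library(stringr)

#made-up timestamps in the assumed shape of startTime/endTime
stamp<-c("2019-01-07T23:47:30.000", "2019-01-08T07:10:00.000")
parsed<-ymd_hms(str_replace(stamp, "T", " "))

#clock time only, dropping the date
clock<-format(parsed, "%H:%M:%S")

#clock time rounded to the nearest half hour
clock_30<-format(round_date(parsed, "30 minutes"), "%H:%M")
```

A bedtime of 23:47 rounds up to 00:00 on the next day. That is fine here because only the clock time is kept.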

sleep$bed<-format(ymd_hms(str_replace(sleep$startTime, "T", " ")), "%H:%M:%S")
sleep$wake<-format(ymd_hms(str_replace(sleep$endTime, "T", " ")), "%H:%M:%S")

#create a rounded version for graphing
sleep$bed_30<-format(round_date(ymd_hms(str_replace(sleep$startTime, "T", " ")), 
                                "30 minutes"), "%H:%M")
sleep$wake_30<-format(round_date(ymd_hms(str_replace(sleep$endTime, "T", " ")), 
                                "30 minutes"), "%H:%M")

Prep data for graphing.

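#prep bedtime: count sleeps per 30-minute slot (reconstructed step)
sleep_bed<-sleep %>% count(bed_30, name="count") %>% rename(time=bed_30)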

#make df for all times (left_join needs a data frame, not a bare vector)
time<-data.frame(time=c("00:00", "00:30", "01:00", "01:30", "02:00",  "02:30", 
                        "03:00",  "03:30", 
                        "04:00",  "04:30", 
                        "05:00",  "05:30", 
                        "06:00",  "06:30", 
                        "07:00",  "07:30", 
                        "08:00",  "08:30", 
                        "09:00",  "09:30", 
                        "10:00",  "10:30", 
                        "11:00",  "11:30", 
                        "12:00",  "12:30", 
                        "13:00",  "13:30", 
                        "14:00",  "14:30", 
                        "15:00",  "15:30", 
                        "16:00",  "16:30", 
                        "17:00",  "17:30", 
                        "18:00",  "18:30", 
                        "19:00",  "19:30", 
                        "20:00",  "20:30", 
                        "21:00",  "21:30", 
                        "22:00",  "22:30", 
                        "23:00",  "23:30"))

bedtime<-left_join(time, sleep_bed, by="time")
bedtime<-bedtime %>% 
  mutate(count = replace_na(count, 0))

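#prep wake time: count sleeps per 30-minute slot (reconstructed step)
sleep_wake<-sleep %>% count(wake_30, name="count") %>% rename(time=wake_30)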

waketime<-left_join(time, sleep_wake, by="time")
waketime<-waketime %>% 
  mutate(count = replace_na(count, 0))
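
The count, join, and zero-fill pattern used for both bedtime and wake time can be sketched on made-up data. The shape of the counted table (one row per observed slot, with columns time and count) is an assumption, and slots, toy, toy_bed, and toy_bedtime are hypothetical names. The sketch also generates the 48 half-hour slots instead of typing them out.

```r
library(dplyr); library(tidyr)

#all 48 half-hour slots of the day, "00:00" through "23:30"
slots<-data.frame(time=format(seq(as.POSIXct("2000-01-01 00:00", tz="UTC"),
                                  by="30 min", length.out=48), "%H:%M"),
                  stringsAsFactors=FALSE)

#made-up rounded bedtimes
toy<-data.frame(bed_30=c("23:00", "23:30", "23:30", "00:00"),
                stringsAsFactors=FALSE)

#assumed shape of the counted table: one row per observed slot
toy_bed<-toy %>% count(bed_30, name="count") %>% rename(time=bed_30)

#slots with no bedtimes come back as NA after the join, so set them to 0
toy_bedtime<-left_join(slots, toy_bed, by="time") %>%
  mutate(count=replace_na(count, 0))
```

Joining onto the full list of slots keeps empty half hours in the data, so the polar plot has a position for every time of day even where nothing was recorded.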

Combine the bedtime and wake-time counts into one long data frame.

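bed_wake_time<-bedtime %>% 
  rename(bedcount=count) %>% #distinct names so both counts can sit side by side
  select(bedcount, time)%>%
  inner_join(rename(waketime, wakecount=count), by="time")%>%
  pivot_longer(cols=c(bedcount, wakecount), names_to="type", values_to="count")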

Note that I am now using pivot_longer() rather than gather() – thanks to Hadley’s RStudio::conf tidyverse talk.
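
For comparison, here is the same reshape written both ways on a tiny made-up table (wide, long_new, and long_old are hypothetical names).

```r
library(tidyr)

#made-up wide counts, one column per type
wide<-data.frame(time=c("23:00", "07:00"), bedcount=c(5, 0), wakecount=c(0, 7))

#newer interface
long_new<-pivot_longer(wide, cols=c(bedcount, wakecount),
                       names_to="type", values_to="count")

#older equivalent
long_old<-gather(wide, key="type", value="count", bedcount, wakecount)
```

Both results hold the same four rows, though the row order differs: pivot_longer() goes row by row through the wide table, while gather() stacks one column after the other.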

Plot bed/wake time distributions

Plot this out with polar coordinates (since time calls for that sort of thing!)

ggplot(bed_wake_time, aes(x=time, y=count, color=type,fill=type))+
  geom_bar(stat="identity", alpha=.8,
           data=subset(bed_wake_time, bed_wake_time$type=="wakecount"))+
  geom_bar(stat="identity", alpha=.8, 
           data=subset(bed_wake_time, bed_wake_time$type=="bedcount"))+
  scale_fill_manual(values = c("#440154FF", "#20A387FF"), name="",
                    labels=c("Fall Asleep", "Wake Up"))+
  scale_color_manual(values = c("#440154FF", "#20A387FF"), name="", 
                     labels=c("Fall Asleep", "Wake Up"))+
  labs(x="", y="")+
  theme_minimal()+ theme(text=element_text(family="Palatino", size=13))+
  theme(plot.title = element_text(size = 20))+
  theme(axis.text.y=element_text(size=0))+ theme(axis.text.x=element_text(size=13))+
  scale_x_discrete(labels=c("12am", "", "1am", "", "2am", "", "3am", "", 
                            "4am", "", "5am", "", "6am", "", "7am", "", 
                            "8am", "", "9am", "", "10am", "", "11am", "",
                            "12pm", "", "1pm", "", "2pm", "", "3pm", "",
                            "4pm", "", "5pm", "", "6pm", "", "7pm", "",
                            "8pm", "", "9pm", "", "10pm", "", "11pm", ""))+
  coord_polar()+theme(legend.position = "top")+
  ggtitle("When Do I Fall Asleep/Wake Up?", 
          subtitle = "Data from 425 days of wearing a FitBit")

#save the image
ggsave("graphs/times_polar.png", width=9, height=7, dpi=400)
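
The 48 axis labels passed to scale_x_discrete() above could also be generated rather than typed. This is a small base R sketch; labs_48 is a hypothetical name.

```r
#hour labels for the 48 half-hour slots, blank on the half hours
hours<-c(12, 1:11)
hour_labs<-paste0(hours, rep(c("am", "pm"), each=12)) #"12am", "1am", ..., "11pm"

#interleave each hour label with an empty string for the half hour
labs_48<-as.vector(rbind(hour_labs, ""))
```

The result can be passed as scale_x_discrete(labels=labs_48).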