Whelp Study

Fitbit Activity Metrics Worksheet

This is a worksheet for exploring activity metrics that can be drawn from whelp activity as measured by the Fitbit device.

The Fitbit is a very small wearable digital accelerometer that the pups wear unobtrusively on their collars. It is designed to measure physical activity in humans and is promoted as a fitness aid. Here we are adapting it for use in an animal study, so what it records as “steps” in terms of human movement may not translate directly to puppy steps. These counts have nonetheless been observed to be a good measure of relative activity for comparing a sample population of puppies.

Per-minute step counts from the Fitbit are loaded wirelessly to an encrypted data store on a personal computer, then uploaded to the Fitbit website. The Fitbit company has generously provided us with a research-only Web account and Application Programming Interface (API) for retrieving this minute-by-minute time series. Our Java program fetches the data through the Web API and loads it into a relational database. We are now using the statistical language R, with some add-on packages, to investigate activity metrics that can be calculated from the empirical data collected by the Fitbit devices.

Get Summarized data

Connect to the Fitbit DB:

library("RODBC")
conn <- odbcConnect("fitbit")

Get a summary of “steps” for one day of activity

# Build the query: total steps per whelp/day, restricted to days with full data
sql1 <- "select dd.DateStr, dd.FBDevice, dd.PupName, SUM(s.Steps)"
sql1 <- paste(sql1, "from DayDevice dd, Steps s ")
sql1 <- paste(sql1, "where dd.DateStr = s.DateStr and dd.FBDevice = s.FBDevice ")
sql1 <- paste(sql1, "and dd.PupName = s.PupName and dd.FullDayData = 1 ")
sql1 <- paste(sql1, "and dd.DateStr not in ('2014-02-16')")  # temp: exclude the second full day's data
sql1 <- paste(sql1, "group by dd.DateStr, dd.FBDevice, dd.PupName")
sql1 <- paste(sql1, "order by 1,2,3")
# Note: the result is named `sum`, which masks base::sum as a variable; calls to sum() still resolve to the function
sum <- sqlQuery(conn, sql1)

Summary activity levels

As of this run date we have 8 whelp/days of activity being measured.

Activity by whelp/day:

barplot(sum[, 4], col = "lightblue", ylim = c(0, 15000), ylab = "Fitbit 'steps'", 
    names.arg = as.character(sum[, 3]), las = 3)
mtext("Whelp name", side = 1, line = 4)
mtext("Activity per whelp/day", side = 3, line = 1)

[Figure: bar plot of activity per whelp/day]


# A simpler plot, same info:
# plot(sum[,3], sum[,4], type='h', col='red', ylim=c(0,15000), ylab='One day activity', xlab='', las=3)
# mtext('Whelp name', side=1, line=4)

Something to investigate

One possible source of measurement bias needs more attention, unless it turns out to be due to random chance: so far, the daily activity totals from Fitbit Device 2 are consistently lower than those from Device 1. This suggests running a control test that compares the two devices to evaluate whether and how their measurements differ:

plot(sum[, 4], type = "b", ylim = c(0, 15000), ylab = "Fitbit 'steps'", xlab = "", 
    xaxt = "n")
mtext("Fitbit Device Number", side = 1, line = 2.5)
mtext("A measurement bias by Device?", side = 3, line = 1)
# Label each point with the 7th character of the device name (the device number)
axis(1, at = seq_len(nrow(sum)), labels = substring(as.character(sum[, 2]), 
    7, 7))

[Figure: line plot of daily “steps” by Fitbit device number]
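
As a first look before any control test, the daily totals can be averaged by device. The following is a minimal sketch only: the data frame `toy`, its column names, and the device labels `Fitbit1` and `Fitbit2` are assumptions, and the numbers are invented for illustration rather than taken from the study.

```r
# Hypothetical toy data: invented numbers, NOT study results.
# Device labels are assumed to look like "Fitbit1" / "Fitbit2".
toy <- data.frame(
  FBDevice = c("Fitbit1", "Fitbit2", "Fitbit1", "Fitbit2"),
  Steps    = c(9000, 7000, 11000, 8000)
)

# Mean daily total per device
device_means <- tapply(toy$Steps, toy$FBDevice, mean)

# Gap between the device means (Device 1 minus Device 2)
device_gap <- unname(device_means["Fitbit1"] - device_means["Fitbit2"])

print(device_means)
print(device_gap)
```

With only a few whelp/days, a gap like this is confounded with differences between the pups themselves, so it can only motivate a control test (for example, both devices worn together), not replace one.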

Gather more detail

Get 24 hours of per-minute “steps” detail for each whelp/day (1440 minutes/day):

# Build the query: one row per minute for each whelp/day with full data
sql2 <- "select dd.DateStr, dd.FBDevice, dd.PupName, s.TimeStr, s.Steps"
sql2 <- paste(sql2, "from DayDevice dd, Steps s ")
sql2 <- paste(sql2, "where dd.DateStr = s.DateStr and dd.FBDevice = s.FBDevice ")
sql2 <- paste(sql2, "and dd.PupName = s.PupName and dd.FullDayData = 1 ")
sql2 <- paste(sql2, "and dd.DateStr not in ('2014-02-16')")  # temp: exclude the second full day's data
sql2 <- paste(sql2, "order by dd.DateStr, dd.FBDevice, dd.PupName, s.TimeStr")
dtl <- sqlQuery(conn, sql2)

library(TTR)
# Transform to n cases x 1440 steps matrix
ads <- as.matrix(xtabs(Steps ~ PupName + TimeStr, dtl))
source("C:\\R\\fitbit\\fitbit_lib.R")

Metric: Nap Count

Defined as the number of naps per day, where a nap is a period between 6am and 10pm with at least 10 consecutive minutes of no steps counted.
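
The study's own `napCount()` lives in `fitbit_lib.R` and is not shown here. As an illustration of the definition only, the sketch below finds runs of zero-step minutes with `rle()`. The function name `nap_lengths_sketch`, its argument names, and the mapping of minute indexes to clock time are all assumptions, and the toy day is invented.

```r
# Hypothetical helper, NOT the study's napCount()/napData() from fitbit_lib.R.
# steps:    numeric vector of 1440 per-minute step counts (element 1 = first minute after midnight)
# min_len:  minimum run of zero-step minutes that counts as a nap
# from, to: window in minutes after midnight (360 = 6am, 1320 = 10pm)
nap_lengths_sketch <- function(steps, min_len = 10, from = 360, to = 1320) {
  window <- steps[(from + 1):to]          # minutes inside the 6am-10pm window
  runs <- rle(window == 0)                # consecutive runs of zero / non-zero minutes
  zero_runs <- runs$lengths[runs$values]  # lengths of the zero-step runs
  zero_runs[zero_runs >= min_len]         # keep only runs long enough to be naps
}

# Toy day (invented): active all day except three quiet stretches
steps <- rep(5, 1440)
steps[401:415] <- 0   # 15 quiet minutes
steps[601:608] <- 0   # 8 quiet minutes, too short to count
steps[901:930] <- 0   # 30 quiet minutes

nap_lengths <- nap_lengths_sketch(steps)
print(nap_lengths)          # lengths of detected naps, in minutes
print(length(nap_lengths))  # nap count
```

In this sketch a quiet stretch that straddles 6am or 10pm is truncated at the window edge, which is one of several choices the real definition would need to settle.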

naps <- rep(0, nrow(ads))  # one nap count per whelp/day (row of ads)
for (i in seq_along(naps)) {
    naps[i] <- napCount(ads[i, ], 10, 360, 1320)  # params: 10=10minutes, 360=6am, 1320=10pm
}
barplot(naps, col = "lightblue", ylab = "Naps per Day", names.arg = as.character(sum[, 
    3]), las = 3)
mtext("Nap Count", side = 3, line = 1)

[Figure: bar plot of nap count per whelp/day]

More Nap Metrics

A better approach: get a ragged list of NapData vectors, from which more statistics can be calculated under a single set of assumptions about how a “Nap” is defined.

Daily Nap Sleep Time - defined as: Sum of daily Nap times in hours.

Average Nap Time - defined as: Average of daily Nap times in minutes.

Standard Deviation of Nap Times - defined as: Standard deviation of daily Nap times in minutes.
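
To make the three definitions concrete, here is a small sketch that computes them from a ragged list. The list `nap_list` and its values are invented for illustration. Returning `NA` for the mean of a day with no naps and for the standard deviation of a day with fewer than two naps is an assumption about how such days should be handled.

```r
# Hypothetical ragged list of nap lengths in minutes, one vector per whelp/day
# (invented values, NOT study data).
nap_list <- list(c(15, 30, 45), c(20), numeric(0))

nap_stats <- data.frame(
  ct   = sapply(nap_list, length),                 # nap count
  sum  = sapply(nap_list, function(x) sum(x)),     # total nap minutes
  mean = sapply(nap_list, function(x) if (length(x) > 0) mean(x) else NA_real_),
  sd   = sapply(nap_list, function(x) if (length(x) > 1) sd(x) else NA_real_)
)

nap_stats$hours <- nap_stats$sum / 60  # Daily Nap Sleep Time in hours
print(nap_stats)
```

Days with zero or one nap are worth keeping in mind when reading the plots, since their mean or standard deviation is undefined rather than zero.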

napdat <- list()
for (i in 1:length(ads[, 1])) {
    napdat[[i]] <- napData(ads[i, ], 10, 360, 1320)  # params: 10=10minutes, 360=6am, 1320=10pm
}
# Put nap stats into dataframe
napDf <- data.frame(ct = numeric(0), sum = numeric(0), mean = numeric(0), sd = numeric(0))
for (i in 1:length(napdat)) {
    napDf[i, "ct"] = length(napdat[[i]])
    napDf[i, "sum"] = sum(napdat[[i]])
    napDf[i, "mean"] = mean(napdat[[i]])
    napDf[i, "sd"] = sd(napdat[[i]])
}
# Bar plots
par(mfrow = c(2, 2))
barplot(napDf$ct, col = "lightblue", ylab = "Naps per Day", names.arg = as.character(sum[, 
    3]), las = 3)
mtext("Nap Count", side = 3, line = 1)
barplot(napDf$sum/60, col = "lightgreen", ylab = "Sum of Nap times in hours", 
    names.arg = as.character(sum[, 3]), las = 3)
mtext("Daily Nap Sleep Time", side = 3, line = 1)
barplot(napDf$mean, col = "pink", ylab = "Avg Nap time in minutes", names.arg = as.character(sum[, 
    3]), las = 3)
mtext("Average Nap Time", side = 3, line = 1)
barplot(napDf$sd, col = "orange", ylab = "Std Deviation in minutes", names.arg = as.character(sum[, 
    3]), las = 3)
mtext("Standard Deviation of Nap Times", side = 3, line = 1)

[Figure: 2 x 2 panel of bar plots: Nap Count, Daily Nap Sleep Time, Average Nap Time, Standard Deviation of Nap Times]

Ideas for Further Exploration

Since pups usually come in pairs of siblings, we could also look at times when a pup is active but the sibling is resting, and vice versa - that might be interesting too. Even when they're both active, it could be that one is consistently more active/feisty than the other. That way a confounding factor (the influence of the sibling) can also be an asset - another behavior to study!
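
One possible starting point for the sibling idea is to classify each minute by which pup is moving. This is a rough sketch under stated assumptions: the vectors `pup_a` and `pup_b` are invented, they are assumed to cover the same minutes, and any minute with a non-zero step count is treated as "active".

```r
# Hypothetical per-minute step counts for two siblings over the same minutes
# (invented values, NOT study data).
pup_a <- c(0, 0, 12, 30, 0, 8, 0, 0)
pup_b <- c(0, 5, 20,  0, 0, 0, 0, 9)

# Assumption: any minute with at least one step counts as active
active_a <- pup_a > 0
active_b <- pup_b > 0

both_active  <- sum(active_a & active_b)    # minutes when both are moving
only_a       <- sum(active_a & !active_b)   # A active while B rests
only_b       <- sum(!active_a & active_b)   # B active while A rests
both_resting <- sum(!active_a & !active_b)  # minutes when both are still

# Share of the steps taken by pup A during minutes when both are active
shared <- active_a & active_b
share_a <- sum(pup_a[shared]) / sum(pup_a[shared] + pup_b[shared])

print(c(both_active = both_active, only_a = only_a,
        only_b = only_b, both_resting = both_resting))
print(share_a)
```

A share well above or below 0.5 across many days would hint that one sibling is consistently the more active of the pair.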