---
title: "The Quantified Self"
subtitle: "A project for ANLY 512: Data Visualization"
author: "Noble Verma"
date: "April 20, 2019"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    social: menu
    source_code: embed
---
```{r setup, include=FALSE, message=FALSE}
# list of libraries we are using in this dashboard
library(flexdashboard)
library(quantmod)
library(plyr)
library(dygraphs)
library(lubridate)
library(RColorBrewer)
library(DT)
library(XML)
library(tidyverse)
library(scales)
library(ggthemes)
# Parse the unzipped Apple Health export; change the path to wherever your export.xml lives
xml <- xmlParse("C:\\Users\\noble_000\\Desktop\\apple_health_data\\export.xml")
summary(xml)

# Build one data frame per node type from the node attributes.
# xmlAttrsToDataFrame() is an internal helper of the XML package, hence the `:::`.
df_record <- XML:::xmlAttrsToDataFrame(xml["//Record"])
str(df_record)
df_activity <- XML:::xmlAttrsToDataFrame(xml["//ActivitySummary"])
str(df_activity)
df_workout <- XML:::xmlAttrsToDataFrame(xml["//Workout"])
str(df_workout)
df_location <- XML:::xmlAttrsToDataFrame(xml["//Location"]) %>%
  mutate(latitude = as.numeric(as.character(latitude)),
         longitude = as.numeric(as.character(longitude)))
str(df_location)

# Clean the records: tidy device and type names, convert values, split the timestamp
df <- df_record %>%
  mutate(device = gsub(".*(name:)|,.*", "", device),
         value = as.numeric(as.character(value)),
         endDate = ymd_hms(endDate, tz = "America/New_York"),
         date = date(endDate),
         year = year(endDate),
         month = month(endDate),
         day = day(endDate),
         yday = yday(endDate),
         wday = wday(endDate),
         hour = hour(endDate),
         minute = minute(endDate),
         type = str_remove(type, "HKQuantityTypeIdentifier"))
```
# Project Overview
### Getting the data and general description of the project
* This dashboard shows an analysis, done in R, of the data from my iPhone Health app and Apple Watch. The Health app and the watch store a lot of data points. I have become more serious about my workouts and wellness and now rely on the Health app a lot. The goal is to visualise how active I am and to find where I am lacking, so that I can improve my health.
* Getting the data out of the Health app is pretty easy. From Apple Health, you can export it anywhere. The export is saved as a zip file, which I unzipped to start working on the actual data.
* First we create an XML object containing all of the data as a starting point.
* Your mileage may vary based on the amount of data you store in Apple Health and the number of Apple devices that use Apple Health (iPhone alone or iPhone + Apple Watch). A brief rundown of what the export contains:
    + Record is the main place where the data is stored. Weight, height, blood pressure, steps, nutrition data and heart rate are all stored here.
    + ActivitySummary is your Apple Watch daily Activity stats: Move, Exercise and Stand data.
    + Workout is your Apple Watch workout activity, one entry per workout logged.
    + Location is your location logged during your Apple Watch workouts (useful for runs/hikes).
    + InstantaneousBeatsPerMinute is exactly that: instantaneous heart rate as measured by the Apple Watch.
    + ExportDate is useful for checking which export you are looking at.
* I have divided this dashboard into 3 parts based on the data collected by the Apple Health app.
    + First is the record data frame, which records all my basic activity data such as heart rate and step count.
    + Second is the energy data frame, which provides data on calories consumed and burned.
    + Last is the activity data frame, which holds the data from my Apple Watch and gives details on my Move, Exercise and Stand goals.
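
To make the node types listed above more concrete, here is a minimal sketch that parses a tiny, made-up stand-in for export.xml and counts the nodes of each kind. The XML snippet, its attribute values and the names `toy_export`, `toy_xml`, `node_counts` and `toy_records` are all assumptions for illustration; a real export is far larger and carries many more attributes. It uses only exported functions from the XML package, so it is not the same route as the internal helper used in the setup chunk.

```r
library(XML)

# Made-up miniature of export.xml (illustrative values only)
toy_export <- '<HealthData locale="en_US">
  <ExportDate value="2019-04-20 10:00:00 -0400"/>
  <Record type="HKQuantityTypeIdentifierStepCount" value="120" endDate="2019-04-19 09:05:00 -0400"/>
  <Record type="HKQuantityTypeIdentifierHeartRate" value="72" endDate="2019-04-19 09:06:00 -0400"/>
  <ActivitySummary dateComponents="2019-04-19" activeEnergyBurned="410" activeEnergyBurnedGoal="400"/>
</HealthData>'

toy_xml <- xmlParse(toy_export, asText = TRUE)

# Count the element nodes sitting directly under the root, by node name
top_nodes <- getNodeSet(toy_xml, "/HealthData/*")
node_counts <- table(sapply(top_nodes, xmlName))
print(node_counts)

# Collect the attributes of every Record node into a data frame
records <- getNodeSet(toy_xml, "//Record")
toy_records <- as.data.frame(do.call(rbind, lapply(records, xmlAttrs)),
                             stringsAsFactors = FALSE)
print(toy_records)
```

Running the same count against a real export is a quick way to see which of the node types your own devices actually produce before building any charts.
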
# Record Data Frame
Column {.tabset}
-----------------------------------------------------------------------
### Apple Health Data - Heart Rate & Step Count
```{r message=FALSE, warning=FALSE}
# Average heart rate and step count for each hour of the day
df %>%
  filter(type %in% c('HeartRate', 'StepCount')) %>%
  group_by(type, hour) %>%
  summarise(value = mean(value)) %>%
  ggplot(aes(x = hour, y = value, fill = value)) +
  geom_col() +
  scale_fill_continuous(low = 'grey70', high = "#008FD5") +
  scale_x_continuous(
    breaks = c(0, 6, 12, 18),
    labels = c("Midnight", "6 AM", "Midday", "6 PM")
  ) +
  labs(title = "Apple Health Data",
       subtitle = "Hourly Data") +
  facet_wrap(~ type) +
  guides(fill = "none")  # "none" replaces the deprecated fill = FALSE
```
### Apple Health Data - Weekly Step Count Heat Map
```{r message=FALSE, warning=FALSE}
# Total steps for each weekday/hour cell (wday: 1 = Sunday ... 7 = Saturday)
df %>%
  filter(type == 'StepCount') %>%
  group_by(date, wday, hour) %>%
  summarize(steps = sum(value)) %>%
  group_by(hour, wday) %>%
  summarize(steps = sum(steps)) %>%
  arrange(desc(steps)) %>%
  ggplot(aes(x = hour, y = wday, fill = steps)) +
  geom_tile(col = 'grey40') +
  scale_fill_continuous(labels = scales::comma, low = 'grey95', high = '#008FD5') +
  theme(panel.grid.major = element_blank()) +
  scale_x_continuous(
    breaks = c(0, 6, 12, 18),
    labels = c("Midnight", "6 AM", "Midday", "6 PM")
  ) +
  scale_y_reverse(
    breaks = c(1, 2, 3, 4, 5, 6, 7),
    labels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
  ) +
  labs(title = "Weekly Step Count Heatmap") +
  guides(fill = "none") +
  coord_equal()
```
# Energy Data Frame
Column {.tabset}
-----------------------------------------------------------------------
### Calories Consumed & Burned
```{r message=FALSE, warning=FALSE}
# Daily totals of energy burned (basal + active) and energy consumed
energy <- df %>%
  filter(endDate >= '2018/11/24' &
           date < '2018/12/30' &
           type %in% c('BasalEnergyBurned', 'ActiveEnergyBurned', 'DietaryEnergyConsumed')) %>%
  select(date, type, value) %>%
  group_by(date, type) %>%
  summarise(value = sum(value)) %>%
  ungroup()

# Stacked bars for energy burned, with a narrower translucent bar for energy consumed on top
ggplot() +
  geom_col(data = energy %>%
             filter(type != "DietaryEnergyConsumed"),
           aes(x = date, y = value, fill = type)) +
  scale_fill_manual(values = c("BasalEnergyBurned" = "#3182bd",
                               "ActiveEnergyBurned" = "#9ecae1",
                               "DietaryEnergyConsumed" = "grey30")) +
  geom_col(data = energy %>%
             filter(type == "DietaryEnergyConsumed"),
           aes(x = date, y = value, fill = type), width = 0.7, alpha = 0.6) +
  labs(title = "Calories Consumed & Burned")
ggsave('energy1.png', width = 8, height = 6, units = "in")
```
# Activity Data Frame
Column {.tabset}
-----------------------------------------------------------------------
### Apple Watch Goals Completion (Move, Exercise & Stand Goals)
```{r, message=FALSE, warning=FALSE, results='hide'}
str(df_activity)
# Some data clean up and renaming here
df_activity_tidy <- df_activity %>%
  select(-activeEnergyBurnedUnit) %>%
  mutate_all(as.character) %>%
  mutate(date = as.Date(dateComponents)) %>%
  filter(date >= "2018-11-13") %>%
  select(-dateComponents) %>%
  mutate_if(is.character, as.numeric) %>%
  rename(move = activeEnergyBurned,
         exercise = appleExerciseTime,
         stand = appleStandHours,
         move_goal = activeEnergyBurnedGoal,
         exercise_goal = appleExerciseTimeGoal,
         stand_goal = appleStandHoursGoal) %>%
  # Now, create 2 new metrics: percent of goal and a TRUE/FALSE "goal met" flag.
  mutate(move_pct = move / move_goal,
         exercise_pct = exercise / exercise_goal,
         stand_pct = stand / stand_goal,
         move_bool = move_pct >= 1,
         exercise_bool = exercise_pct >= 1,
         stand_bool = stand_pct >= 1)

# Reshape each metric to long format: one row per date and category
df_activity_tall_value <- df_activity_tidy %>%
  select(date, Move = move, Exercise = exercise, Stand = stand) %>%
  gather(category, value, -date)
df_activity_tall_pct <- df_activity_tidy %>%
  select(date, Move = move_pct, Exercise = exercise_pct, Stand = stand_pct) %>%
  gather(category, pct, -date)
df_activity_tall_bool <- df_activity_tidy %>%
  select(date, Move = move_bool, Exercise = exercise_bool, Stand = stand_bool) %>%
  gather(category, boolean, -date)

# Join the three long tables and add calendar helpers for plotting
df_activity_tall <- df_activity_tall_value %>%
  left_join(df_activity_tall_pct, by = c("date", "category")) %>%
  left_join(df_activity_tall_bool, by = c("date", "category")) %>%
  # factor() is used because forcats::as_factor() has no `levels` argument
  mutate(category = factor(category, levels = c("Move", "Exercise", "Stand")),
         month = ymd(paste(year(date), month(date), 1, sep = "-")),
         week = date - wday(date) + 1,  # Sunday that starts the week
         wday = wday(date),
         day = day(date))
```
```{r}
# One tile per day: green when the goal was met, grey when it was not
df_activity_tall %>%
  ggplot(aes(x = wday, y = week, fill = boolean)) +
  geom_tile(col = "grey30", na.rm = FALSE) +
  theme(panel.grid.major = element_blank()) +
  scale_fill_manual(values = c("grey80", "#1a9641")) +
  facet_wrap(~ category) +
  coord_fixed(ratio = 0.15) +
  guides(fill = "none") +
  labs(title = "Apple Watch goals completion") +
  theme(axis.text.x = element_blank())
```
### Apple Watch Goals Met (Move, Exercise & Stand Goals)
```{r message=FALSE, warning=FALSE}
# Same calendar layout, showing only the days on which each goal was met
df_activity_tall %>%
  filter(boolean == TRUE) %>%
  ggplot(aes(x = wday, y = week, fill = category)) +
  geom_tile(col = "grey50", na.rm = FALSE) +
  theme(panel.grid.major = element_blank()) +
  facet_wrap(~ category) +
  coord_fixed(ratio = 0.15) +
  guides(fill = "none") +
  labs(title = "Apple Watch goals completion")
```
### Streak Count
```{r message=FALSE, warning=FALSE}
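df_activity_streak <- df_activity_tall_bool %>%
  # factor() is used because forcats::as_factor() has no `levels` argument
  mutate(category = factor(category, levels = c("Move", "Exercise", "Stand"))) %>%
  arrange(category, date) %>%
  # x starts a new group whenever a streak ends (TRUE -> FALSE);
  # y starts a new group on every change, so each run of equal values is its own group
  group_by(category,
           x = cumsum(c(TRUE, diff(boolean) %in% c(-1))),
           y = cumsum(c(TRUE, diff(boolean) %in% c(-1, 1)))) %>%
  # Count up within a run of met goals; days with a missed goal get 0
  mutate(streak = if_else(boolean == FALSE, 0L, row_number())) %>%
  ungroup()
ggplot(df_activity_streak, aes(x = date, y = streak, group = x, col = category)) +
  geom_line() +
  facet_grid(category ~ .) +
  guides(colour = "none") +  # the facets already label each category
  labs(title = "Streak Count")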
```
# Conclusion
### Observations from the visualisations
* The heart rate and step count analysis reveals that I have a sedentary job. I walk to work, walk around during my lunch break, and work out after work (hence the slightly higher step count and heart rate around 6 PM).
* The weekly heat map breaks the step count down further by day of the week. It shows that on weekends I move a little throughout the day, whereas on weekdays my activity is more concentrated.
* My calories consumed and burned chart shows that I burned a lot of energy on New Year's Eve, as I was partying and hopping between places in NYC.
* Based on my Apple Watch results, I try to stay active six days out of seven. One day is my cheat day, when I try to relax. This data could be visualized in many different ways, but I like the boolean metric, so I have used that.
* My streak count shows that I mostly stay active, which is why I used a streak count chart: to see how long a streak I can get. My longest streak is 15 days for the move goal and 14 days for the exercise and stand goals.
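
The boolean metric and the streak figures can be cross-checked with a few lines of base R. The sketch below is not the code behind the charts: it uses `rle()` instead of the `cumsum()`/`diff()` grouping, and the data frame `toy`, its values and the names `met`, `streak` and `longest_streak` are invented for illustration.

```r
# Hypothetical daily totals for one goal; the values are made up
toy <- data.frame(
  date = as.Date("2019-01-01") + 0:9,
  move = c(420, 380, 450, 510, 400, 90, 405, 430, 300, 410),
  move_goal = 400
)

# Boolean metric: TRUE on days where the goal was reached
toy$met <- toy$move / toy$move_goal >= 1

# rle() collapses the flag into runs of identical values
runs <- rle(toy$met)

# Running streak: count up within TRUE runs, stay at 0 within FALSE runs
toy$streak <- unlist(Map(
  function(len, val) if (val) seq_len(len) else rep(0L, len),
  runs$lengths, runs$values
))

# Longest streak: the longest run of TRUE values (0 if the goal was never met)
longest_streak <- max(c(0L, runs$lengths[runs$values]))

print(toy[, c("date", "met", "streak")])
print(longest_streak)
```

Applied per category to the real goal flags, the same `rle()` step would give a longest-streak number to compare against the peaks in the streak chart.
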