---
title: "The Quantified Self"
subtitle: Analyzing Wearables and Social Media Data
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    source: embed
    orientation: columns
    vertical_layout: fill
---
```{r setup, include=FALSE}
# Load packages
library(httr)
library(tidyverse)
library(stringr)
library(xml2)
library(viridis)
library(knitr)
library(flexdashboard)
library(reshape2)
library(latticeExtra)
library(readxl)
library(lubridate)
knitr::opts_chunk$set(echo = FALSE)
```
### The Quantified Self
``` {r}
knitr::include_graphics("C:/Users/sowmyapk/Desktop/ANLY 512/Final_Proj.jpg")
```
***
Purpose
The Quantified Self (QS) is a movement that aims to leverage the synergy of wearables, analytics, and "Big Data". It exploits the ease and convenience of data acquisition through the Internet of Things (IoT) to feed the growing obsession with personal informatics and quotidian data.
Wanting to learn more about myself, I've used my Fitbit data and my Goodreads app data to derive some insights into two activities I spend considerable time on outside work: working out and reading.
I was a voracious reader growing up, sometimes reading almost one book a day, but I've slacked on reading for a couple of years. So I wanted to analyze my Goodreads data to figure out a few things:
1) Have I started reading a lot more over the years
2) Have I enjoyed reading the books I've read
3) What genre of books do I enjoy the most
As with reading, I've been slacking on my workouts. Now that I'm running a 5K in a couple of days, I wanted to explore my workout routine. I've also been sleeping poorly since moving to the city I now live in, so I wanted to gain some insights into my sleep as well, since it's a good indicator of my overall health:
4) How active have I been in the last month in the run up to my 5K
5) Are there any patterns in my sleep over the week
6) On days that I'm active, is my body getting sufficient sleep to repair itself
### Goodreads - Books read over the years
```{r,echo = FALSE, message = FALSE}
#Read goodreads credentials
goodreads_key <- read_xlsx("C:/Users/sowmyapk/Desktop/ANLY 512/goodreads_key.xlsx")
API_KEY <- goodreads_key$API_Key
GR_ID <- goodreads_key$GRI_ID
URL <- "https://www.goodreads.com/review/list?"
#Parse my goodreads Read list of books
get_shelf <- function(GR_ID) {
  shelf <- GET(URL, query = list(
    v = 2, key = API_KEY, id = GR_ID, shelf = "read", per_page = 200))
  shelf_contents <- content(shelf, as = "parsed")
  return(shelf_contents)
}
shelf <- get_shelf(GR_ID)
#Parse details of my books read
get_df <- function(shelf) {
  title <- shelf %>%
    xml_find_all("//title") %>%
    xml_text()
  rating <- shelf %>%
    xml_find_all("//rating") %>%
    xml_text()
  added <- shelf %>%
    xml_find_all("//date_added") %>%
    xml_text()
  started <- shelf %>%
    xml_find_all("//started_at") %>%
    xml_text()
  read <- shelf %>%
    xml_find_all("//read_at") %>%
    xml_text()
  df <- tibble(
    title, rating, added, started, read)
  return(df)
}
df <- get_df(shelf)
# get the dates and the year read along with title, rating etc
get_books <- function(df) {
  books <- df %>%
    # One row per (book, date type), then split the timestamp into its parts
    gather(date_type, date, -title, -rating) %>%
    separate(date,
             into = c("weekday", "month", "day", "time", "zone", "year"),
             sep = "\\s", fill = "right") %>%
    mutate(date = str_c(year, "-", month, "-", day)) %>%
    select(title, rating, date_type, date) %>%
    mutate(date = as.Date(date, format = "%Y-%b-%d")) %>%
    spread(date_type, date) %>%
    # Drop subtitles and series info from the title
    mutate(title = str_replace(title, "\\:.*$|\\(.*$|\\-.*$", "")) %>%
    # Fall back to the date added when no start date was recorded
    mutate(started = ifelse(
      is.na(started), as.character(added), as.character(started))) %>%
    # Parse the full date; format = "%Y" would keep only the year
    mutate(started = as.Date(started)) %>%
    mutate(rating = as.integer(rating))
  return(books)
}
books <- get_books(df)
books %>%
  filter(year(read) >= 2016) %>%
  ggplot(aes(year(read))) +
  geom_bar(stat = "count", fill = "lightgreen") +
  labs(title = "Books read per year: 2016 - 2019") + xlab("Year") + ylab("# of Books Read")
```
***
1) From the graph, the number of books I've read has been on an upward trend since 2016, when I started using the Goodreads app. Since I still have half the year left in 2019, I'm confident I'll beat my 2018 total!
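The date handling in the chunk above assumes that each Goodreads timestamp splits on whitespace into weekday, month, day, time, zone and year. Here is a minimal base R sketch of that one step. The `parse_gr_date` helper is hypothetical (it is not part of any package) and the timestamp is a made-up value in the assumed layout, not one taken from my shelf.

```r
# Hypothetical helper: convert a timestamp laid out as
# "weekday month day time zone year" into a Date.
parse_gr_date <- function(stamp) {
  parts <- strsplit(stamp, "\\s+")[[1]]
  # month.abb holds the English abbreviations "Jan".."Dec",
  # so this lookup does not depend on the system locale
  month <- match(parts[2], month.abb)
  as.Date(sprintf("%s-%02d-%s", parts[6], month, parts[3]))
}

# Made-up example value in the assumed layout
example_date <- parse_gr_date("Tue Mar 05 10:20:30 -0800 2019")
format(example_date, "%Y")
```

Looking the month up in `month.abb` sidesteps the locale dependence of `%b`, which the chunk above relies on.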
### Goodreads - Have I enjoyed reading the books I've read?
```{r}
# Keep only books read from 2016 onward, matching the previous chart
books %>%
  filter(year(read) >= 2016) %>%
  ggplot(aes(read, rating, color = rating)) +
  geom_point() + stat_ellipse() + scale_color_viridis_c() +
  labs(title = "Did I like the books I've read over the years?") + xlab("Year") + ylab("Ratings")
```
***
2) From my ratings, it appears that the books I chose in 2018 and 2019 made me happier than the ones I read in 2016 and 2017. That makes sense to me, because I started relying on my friends' ratings on the app before picking up a book, especially last year.
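One way to put a number on that impression would be to average the ratings per year. The sketch below uses base R and made-up ratings purely for illustration; `toy_books` is a hypothetical stand-in for the real `books` data.

```r
# Made-up ratings for illustration only; the real values live in `books`
toy_books <- data.frame(
  year_read = c(2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019),
  rating    = c(3, 4, 3, 3, 4, 5, 5, 4)
)

# Average rating for each year read
mean_by_year <- tapply(toy_books$rating, toy_books$year_read, mean)
mean_by_year
```

A rising yearly mean would support what the scatter plot suggests. With only a handful of books per year, it should be read as a rough indication rather than a firm result.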
### Goodreads - Genre of books I liked reading
```{r}
goodreads_genre <- read_xlsx("C:/Users/sowmyapk/Desktop/ANLY 512/goodreads.xlsx")
goodreads_genre$genre <- factor(goodreads_genre$genre)
# Keep only the top rated books (4 stars and above). The bar layer must
# inherit this filtered data rather than being handed the full table again.
goodreads_genre %>%
  filter(rating >= 4) %>%
  ggplot(aes(genre)) +
  geom_bar(aes(y = after_stat(count) / sum(after_stat(count)), fill = genre)) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  labs(title = "Which Genre did I like to read most in the top rated books?", x = "Genre", y = "Percentage of Books", fill = "Genre Type")
```
***
3) The genre I've enjoyed the most is Drama, with Crime fiction a close second. What surprised me about this graph was the percentage of self-help books, humor books and memoirs I've been reading. I only started reading these genres after coming to the US, so I'm very wary of picking one up unless it comes highly recommended.
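The percentages in the chart are shares among the top rated books only. A minimal base R sketch of that calculation follows, on made-up data; `toy_genres` is hypothetical and only mirrors the `genre` and `rating` columns used above.

```r
# Made-up genre and rating values, for illustration only
toy_genres <- data.frame(
  genre  = c("Drama", "Drama", "Crime", "Humor", "Memoir", "Drama"),
  rating = c(5, 4, 4, 3, 5, 2)
)

# Keep the books rated 4 or higher, then compute each genre's share
top_rated   <- toy_genres[toy_genres$rating >= 4, ]
genre_share <- prop.table(table(top_rated$genre))
round(100 * genre_share, 1)
```

The filter has to be applied before the shares are computed. Otherwise low rated books count toward a genre's percentage too.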
### Fitbit - Activity in the last month
```{r}
# Import my cleaned Fitbit data, covering Mar 20 to Apr 20, 2019 only
daily <- read_xlsx("C:/Users/sowmyapk/Desktop/ANLY 512/fitbit_clean.xlsx")
#converting to numbers
daily$Steps<-as.numeric(daily$Steps)
daily$hr_Asleep<-as.numeric(daily$Min_Asleep)/60
daily$hr_Awake<-as.numeric(daily$Min_Awake)/60
daily$hr_Deep<-as.numeric(daily$Min_Deep)/60
daily$hr_REM<-as.numeric(daily$Min_REM)/60
daily$Date<-as.Date(daily$Date)
# Labelled weekday (Sun, Mon, ...) for each date
daily$Weekday <- wday(daily$Date, label = TRUE)
# Daily steps with a linear trend line
ggplot(daily, aes(Date, Steps)) +
  geom_line(stat = "identity", color = "#3ec1c1") +
  scale_y_continuous() +
  labs(title = "30 Days Of Fitbit - 2019 (Day of the Week)") + geom_smooth(method = "lm")
# Steps per day, coloured by weekday, against the 10,000 step goal.
# geom_hline() takes the intercept directly; it needs no data argument.
ggplot(daily, aes(Date, Steps, color = Weekday)) +
  geom_point(size = 12) + scale_colour_hue(l = 50) +
  geom_hline(yintercept = 10000, color = "black") + labs(title = "Activity over the week - Goal 10,000 steps")
```
***
4) I'm measuring my activity level by step count, since I'm training for my 5K and therefore running a lot. The trend slopes upward, which is comforting.
Looking at the days I'm most active against my daily goal of 10,000 steps, I seem to be lethargic mostly on Thursdays and Fridays. I wonder if it's because I'm exhausted from work and classes Monday to Wednesday and don't have the stamina to work out!
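To check the weekday pattern without eyeballing the dots, I could compute the share of days on which I hit the goal, per weekday. Here is a minimal base R sketch on made-up step counts; `toy_daily` is hypothetical and the numbers are not from my Fitbit export.

```r
# Made-up step counts, for illustration only
toy_daily <- data.frame(
  weekday = c("Mon", "Mon", "Thu", "Thu", "Fri", "Fri"),
  steps   = c(12000, 10500, 8000, 11000, 7000, 9500)
)

goal <- 10000

# Fraction of days per weekday on which the goal was met
# (the mean of a logical vector is the share of TRUE values)
hit_rate <- tapply(toy_daily$steps >= goal, toy_daily$weekday, mean)
hit_rate
```

With about a month of data there are only four or five observations per weekday, so these rates would be rough.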
### Fitbit - Sleep activity last month
```{r}
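# Distribution of hours asleep per weekday; the star marks the mean.
# `fun` replaces the deprecated `fun.y` argument of stat_summary().
ggplot(daily, aes(x = Weekday, y = hr_Asleep, fill = Weekday)) +
  geom_violin() +
  stat_summary(fun = "mean", geom = "point", shape = 8, size = 3, color = "midnightblue") + coord_flip()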
```
***
5) The violin plots give some interesting insights into my sleep patterns. I seem to start my week with fewer hours of sleep. Similar to my workout activity, Monday to Wednesday is when I seem to get less rest. As the week progresses, I sleep longer on average. Most of my data is also distributed around the average, which means I'm consistently getting close to my average hours of sleep. But my sleep schedule is out of whack on Sundays and Thursdays. I need to keep track of my sleep on those days to understand why.
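The width of each violin reflects how much my sleep varies on that weekday. The same idea can be sketched as a mean and standard deviation per weekday. The values below are made up for illustration, and `toy_sleep` is a hypothetical stand-in for `daily`.

```r
# Made-up hours asleep, for illustration only
toy_sleep <- data.frame(
  weekday   = c("Sun", "Sun", "Sun", "Wed", "Wed", "Wed"),
  hr_asleep = c(5, 7, 9, 6.5, 7, 7.5)
)

# Same average on both days, but very different spread
sleep_mean <- tapply(toy_sleep$hr_asleep, toy_sleep$weekday, mean)
sleep_sd   <- tapply(toy_sleep$hr_asleep, toy_sleep$weekday, sd)
rbind(mean = sleep_mean, sd = sleep_sd)
```

A day with a large standard deviation relative to the others is one where my schedule is irregular, even if the average looks fine.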
### Fitbit - Tracking Sleep quality
```{r, echo = FALSE}
# Collect the sleep measures (in hours) alongside the weekday
sleep_data <- data.frame(daily$hr_Deep, daily$hr_REM, daily$hr_Asleep, daily$Weekday)
sleep1 <- melt(sleep_data, id.var = "daily.Weekday")
# Bar height is the mean of each sleep measure per weekday
ggplot(sleep1, aes(x = daily.Weekday, y = value, fill = variable)) +
  geom_bar(stat = "summary", fun = "mean") +
  labs(title = "Deep and REM Sleep activity over the week")
# Overlay steps and deep sleep over time on two y-axes
obj1 <- xyplot(Steps ~ Date, daily, type = "l", main = "Steps vs DEEP Sleep activity", ylab = "Steps")
obj2 <- xyplot(hr_Deep ~ Date, daily, type = "l", ylab = "DEEP in Hrs")
doubleYScale(obj1, obj2, add.ylab2 = TRUE)
```
***
6) I also wanted to measure the overall quality of my sleep. REM sleep is when your brain becomes more active and you start dreaming. It plays an important role in mood regulation, memory and learning. Deep sleep is when the body promotes physical recovery. From the graph, it seems that on average I get 7 hours of sleep, of which about 1 hr is Deep sleep and ~1.5 hrs is REM sleep. It is believed that a healthy adult should get about 1-2 hrs of Deep sleep for every 8 hours of sleep. That puts me near the bottom of that range, so I have to work on my sleep. Maybe meditate before going to bed!
I also wanted to check whether, on days that I work out more, I get deeper sleep so my body can recover sufficiently. The graph suggests that my Deep sleep is in line with my physical activity: on days I work out more, I mostly get deeper sleep too.
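Reading two overlaid lines is subjective, so a correlation coefficient would be a simple way to quantify "in line with". The base R sketch below uses made-up values; `toy_days` is hypothetical and only mirrors the `Steps` and `hr_Deep` columns.

```r
# Made-up daily values, for illustration only
toy_days <- data.frame(
  steps   = c(6000, 8000, 10000, 12000, 14000),
  hr_deep = c(0.8, 0.9, 1.1, 1.2, 1.4)
)

# Pearson correlation between daily steps and hours of deep sleep
steps_deep_cor <- cor(toy_days$steps, toy_days$hr_deep)
steps_deep_cor
```

A value close to 1 would mean deep sleep rises with step count. A value near 0 would mean the apparent match in the plot is mostly coincidence. Either way, a month of data can only show association, not that the workouts cause the deeper sleep.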