---
title: "The Quantified Self (Sleep Data)"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    source: embed
---
```{r}
#rm(list = ls())
```
```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(readr)      # read_csv()
library(lubridate)  # floor_date()
library(chron)      # times()
library(dplyr)
# Load the sleep data
sleepdata_fixed <- read_csv("C:\\Users\\Administrator\\Desktop\\sleepdata_fixed.csv")
# Fixing date columns: parse Start and End as date-times
sleepdata_fixed$Start <- as.POSIXct(sleepdata_fixed$Start, format = "%m/%d/%Y %H:%M")
sleepdata_fixed$End <- as.POSIXct(sleepdata_fixed$End, format = "%m/%d/%Y %H:%M")
```
# Synopsis
Row
-------------------------------------
### Case Scenario
The primary objective of this project is to build a visualization dashboard from data about our own lives. We gathered sleep data and analyze several aspects of it. This approach is known as the Quantified Self (QS), a movement that combines wearables, analytics, and “Big Data”. It takes advantage of how easily data can be acquired through the Internet of Things (IoT) to feed the growing interest in personal informatics and everyday data, and it produces a wealth of data in a variety of forms.
References:
- https://developers.google.com/fit/scenarios/read-sleep-data
- https://www.cdc.gov/sleep/data_statistics.html
- https://www.kaggle.com/danagerous/sleep-data
- https://www.sleepfoundation.org/articles/caffeine-and-sleep
- https://www.webmd.com
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4341978/
### Conclusion
All of the research questions aim to observe the relationship between sleep and physical activity, along with the impact of daily lifestyle habits on sleep quality. It is important to note that caffeine cannot replace sleep; it only makes us feel more alert temporarily by blocking sleep-inducing chemicals in the brain and increasing adrenaline production. The fourth visualization in the dashboard supports this. Looking at the sleep quality trend, it is evident that sleep quality is affected by food habits and by varying sleep start times.
We can also see a strong correlation between sleep time and heart rate, as well as between sleep time and sleep quality. Good-quality sleep decreases the work of the heart, as blood pressure and heart rate go down at night. People who are sleep-deprived show less variability in their heart rate: instead of fluctuating normally, it usually stays elevated. In our graphs, more time spent in bed (that is, good-quality sleep) is associated with a normalized heart rate over time.
Physical activity also has a major impact on sleep quality: more active days result in better sleep quality. Exercise has long been associated with better sleep, and there is a robust bidirectional relationship between the two. The final visualization in the report supports this insight.
Row {.tabset .tabset-fade}
-------------------------------------
### Objective
The objective of the Quantified Self project is to develop a visualization dashboard based on data about our own lives. Here we analyze sleep data, but the data for such a project can range from daily sleep regimes, TV shows watched, types of food eaten, spending habits, commute times, and travel habits to blood pressure and nutrient intake. Designing the project around the QS movement makes sense because it lets us be the data and question provider, the data analyst, the visualization designer, and the end user at once. This puts us in the unique position of being able to provide feedback and direction at every point in the data visualization and analysis life cycle.
### Data Description
The sleep data was collected between 2014 and 2018 with the Sleep Cycle iOS app from Northcube. The data set has 8 variables and 887 records. The variables are as follows:
- Start – Date and time at which the sleep period started
- End – Date and time at which the sleep period ended
- Sleep Quality – Sleep quality as a percentage
- Time In Bed – Total sleep time
- Wake up – Mental state after waking up in the morning
- Sleep Notes – Notable events during the day before going to sleep
- Heart Rate – Heart rate measurement
- Activity (steps) – Physical activity in number of steps
### Project Approach
For the Quantified Self project on sleep data, we produce a series of dashboard pages that give different insights and highlight correlations between sleep and other activities. The dashboard is built with the flexdashboard package in R, and the charts use R libraries such as ggplot2, plotly, and dygraphs. After data collection, we performed exploratory data analysis and found missing values in important columns such as heart rate. We filled these with rolling averages to make the data usable and to improve accuracy when answering the research questions below.
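
The rolling-average fill mentioned above could look roughly like the sketch below. It is not the code used to prepare `sleepdata_fixed.csv`: the helper name `fill_rolling_mean`, the window size `k`, and the sample values are assumptions made for illustration.

```r
# Hypothetical helper (not part of the dashboard): replace each NA with the
# mean of the non-missing values among the k preceding observations.
# Values filled earlier in the pass are reused by later windows.
fill_rolling_mean <- function(x, k = 7) {
  filled <- x
  for (i in which(is.na(x))) {
    idx <- tail(seq_len(i - 1), k)      # up to k positions before i
    window <- filled[idx]
    window <- window[!is.na(window)]
    if (length(window) > 0) {
      filled[i] <- mean(window)         # leading NAs stay NA
    }
  }
  filled
}

# Made-up heart-rate readings with gaps
heart_rate <- c(60, 62, NA, 64, NA, NA, 58)
filled_rate <- fill_rolling_mean(heart_rate, k = 2)
```

A trailing window is only one possible choice; a centered window or linear interpolation would fill the same gaps with somewhat different values.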
# Research Question-1
Inputs {.sidebar}
-------------------------------------
### Data Exploration
What is the change in sleep quality over the observed period?
Here we look at the trend in sleep quality and how it changes over the observed period. The bar chart and pie chart show that "medium" sleep quality dominates the other two categories. The following research questions examine which factors contribute to this.
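
The three quality categories can be derived from the percentage score with `cut()`. The sketch below assumes the bands used in this dashboard (low up to 50, medium from 51 to 89, high from 90) and uses a made-up `quality_pct` vector rather than the real data.

```r
# Hypothetical sleep-quality scores in percent (illustrative values only)
quality_pct <- c(45, 72, 88, 93, 67, 100, 50)

# With the default right-closed intervals the breaks give
# (-Inf, 50] -> low, (50, 89] -> medium, (89, Inf) -> high
quality_band <- cut(quality_pct,
                    breaks = c(-Inf, 50, 89, Inf),
                    labels = c("low", "medium", "high"))

# Number of nights in each band
band_counts <- table(quality_band)
```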
Row {.tabset .tabset-fade}
-------------------------------------
### Data Visualization - 1
```{r}
library(ggthemes)
# Parse "85%"-style strings into a numeric percentage (0-100),
# the scale assumed by the thresholds in the next chunk
sleepdata_fixed$Sleep.quality_new <- as.numeric(sub("%", "", sleepdata_fixed$`Sleep quality`))
# Bar chart: number of nights observed at each sleep quality value
rq1 <- ggplot(data = sleepdata_fixed, aes(Sleep.quality_new)) +
  geom_bar(fill = "#0073C2FF") +
  ggtitle("Sleep Quality over Observed Period") +
  xlab("Sleep Quality (%)") +
  ylab("Number of nights") +
  theme_economist()
rq1
```
### Data Visualization - 2
```{r}
# Parse the percentage here so the bands below are on a 0-100 scale
quality_pct <- as.numeric(sub("%", "", sleepdata_fixed$`Sleep quality`))
# Count the nights in each band instead of hard-coding the totals
n_low    <- sum(quality_pct <= 50, na.rm = TRUE)
n_medium <- sum(quality_pct > 50 & quality_pct < 90, na.rm = TRUE)
n_high   <- sum(quality_pct >= 90, na.rm = TRUE)
d <- data.frame(range = c("low", "medium", "high"),
                value = c(n_low, n_medium, n_high))
RQ <- ggplot(d, aes(x = "", y = value, fill = range)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  ggtitle("Sleep Quality Distribution") +
  theme_economist()
RQ
```
# Research Question-2
Inputs {.sidebar}
-------------------------------------
### Data Exploration
What is the heart rate change during the sleep cycle?
Here we try to understand the correlation between heart rate and the sleep cycle. We produced two graphs of the same data: a scatter plot with a regression line and an area plot. Both show that heart rate normalizes as more time is spent in bed.
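
One way to put a number on this relationship is to fit the same kind of linear model that `geom_smooth(method = "lm")` draws and to compute the Pearson correlation. The sketch below uses made-up monthly averages; the data frame `monthly` and its values are assumptions, not results from the sleep data.

```r
# Hypothetical monthly averages (illustrative values only)
monthly <- data.frame(
  tbm            = c(380, 400, 420, 440, 460, 480),  # mean minutes in bed
  avg_heart_rate = c(66, 65, 63, 62, 60, 59)         # mean heart rate
)

# Linear regression of heart rate on time in bed
fit <- lm(avg_heart_rate ~ tbm, data = monthly)
slope <- unname(coef(fit)["tbm"])   # change in heart rate per extra minute

# Pearson correlation between the two series
r <- cor(monthly$tbm, monthly$avg_heart_rate)
```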
Row {.tabset .tabset-fade}
-------------------------------------
### Data Visualization - 1
```{r}
# Time in bed in minutes (chron::times stores a time as a fraction of a day)
rq3 <- sleepdata_fixed
rq3$tbm <- 60 * 24 * as.numeric(times(rq3$`Time in bed`))
# Monthly means of heart rate and time in bed for 2015-2016
rq3 <- rq3 %>%
  select(Start, End, `Sleep quality`, `Time in bed`, `Wake up`, `Sleep Notes`,
         avg_steps_by_week, avg_heart_rate, tbm) %>%
  filter(Start >= "2015-01-01 00:00:00" & End <= "2016-12-31 00:00:00") %>%
  group_by(Start = floor_date(Start, "month")) %>%
  summarize(avg_heart_rate = mean(avg_heart_rate), tbm = mean(tbm))
# Scatter plot with a fitted linear regression line
ggplot(rq3, aes(x = tbm, y = avg_heart_rate)) +
  geom_point() +
  geom_smooth(method = "lm", color = "black") +
  labs(title = "Heart Rate Change",
       x = "Average Time in bed over month",
       y = "Average Heart rate") +
  theme_economist()
```
### Data Visualization - 2
```{r}
# Area plot of the same monthly averages
ggplot(rq3, aes(x = tbm, y = avg_heart_rate)) +
  geom_area(fill = "lightblue", color = "black") +
  labs(title = "Heart Rate Change",
       x = "Average Time in bed over month",
       y = "Average Heart rate") +
  theme_economist()
```
# Research Question-3
Inputs {.sidebar}
-------------------------------------
### Data Exploration
What is the relationship between sleep quality and time in bed?
Here we analyze how sleep quality changes with the total time spent in bed, using a scatter plot to show the relationship. The points accumulate towards the top right corner of the graph, indicating that sleep quality improves with more time spent in bed.
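
The relationship can also be summarized with a rank correlation. The sketch below assumes that time in bed is recorded as an "H:MM" string; the helper `to_minutes` and the sample values are hypothetical and not taken from the data set.

```r
# Hypothetical helper: convert "H:MM" strings to minutes
to_minutes <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
         numeric(1))
}

# Made-up nights: time in bed and sleep quality in percent
time_in_bed <- c("8:32", "6:05", "7:45", "5:10")
quality_pct <- c(91, 62, 80, 48)

minutes <- to_minutes(time_in_bed)

# Spearman correlation: values close to 1 mean that longer nights
# tend to come with higher quality scores
rho <- cor(minutes, quality_pct, method = "spearman")
```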
Row {.tabset .tabset-fade}
-------------------------------------
### Data Visualization
```{r}
# Sleep quality as a numeric percentage and time in bed in minutes
sleepdata_fixed$sq_sp <- as.numeric(sub("%", "", sleepdata_fixed$`Sleep quality`))
sleepdata_fixed$tbm_sp <- 60 * 24 * as.numeric(times(sleepdata_fixed$`Time in bed`))
# Scatter plot with a loess smoother
scatter.smooth(x = sleepdata_fixed$sq_sp, y = sleepdata_fixed$tbm_sp,
               xlab = "Sleep Quality (%)", ylab = "Time in bed (minutes)")
```
# Research Question-4
Inputs {.sidebar}
-------------------------------------
### Data Exploration
Which factors cause different sleep patterns?
This visualization answers the research question using the notes recorded by the individual, which may point to factors that affect sleep. Based on our analysis, we conclude that coffee, tea, and eating late are likely the largest contributors to poor sleep.
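
To see which individual factors appear most often, the notes can be split into their parts and tallied. The sketch below assumes the ":" separator used by the `strsplit()` call in this dashboard; the example notes are made up.

```r
# Hypothetical sleep notes; several factors in one note are joined with ":"
notes <- c("Drank coffee:Ate late",
           "Drank tea",
           NA,
           "Drank coffee",
           "Ate late:Drank coffee:Worked out")

# Drop nights without notes, split on ":" and flatten into one vector
factors <- unlist(strsplit(notes[!is.na(notes)], ":", fixed = TRUE))

# Frequency of each factor, most common first
factor_counts <- sort(table(factors), decreasing = TRUE)
```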
Row {.tabset .tabset-fade}
-------------------------------------
### Data Visualization
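```{r}
# One row per night: the note, the number of ":"-separated factors, and time in bed
rq5 <- data.frame(sn = sleepdata_fixed$`Sleep Notes`,
                  cnt = sapply(strsplit(sleepdata_fixed$`Sleep Notes`, ":"), length),
                  st = sleepdata_fixed$`Time in bed`)
# Time in bed rounded to whole hours
rq5$tbm <- round(24 * as.numeric(times(rq5$st)))
# Total factor count and total hours in bed for each distinct note
rq5 <- rq5 %>%
  filter(tbm > 0) %>%
  group_by(sn) %>%
  summarise(cnt = sum(cnt), st_cnt = sum(tbm))
# x holds the notes and y the hours; coord_flip() draws the notes on the vertical axis
ggplot(rq5, aes(fill = cnt, y = st_cnt, x = sn)) +
  geom_bar(position = "stack", stat = "identity") +
  xlab("Factors of Sleep") + ylab("Hours of Sleep over Notes") +
  guides(fill = guide_legend(title = "Count of Factors")) +
  ggtitle("Factors that change sleep pattern") +
  coord_flip()
```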
# Research Question-5
Inputs {.sidebar}
-------------------------------------
### Data Exploration
Does exercise increase sleep time and improve sleep quality?
This visualization shows the individual's pattern of exercise (in steps) per month alongside the amount of sleep in that month. It is clear from the visualization that the individual slept better in January and February, when the number of steps taken was more than 150K.
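
The monthly totals behind such a comparison can be computed by grouping nightly records by month. The sketch below uses base R and a made-up data frame `nights`; the column names and values are assumptions for illustration.

```r
# Hypothetical nightly records (illustrative values only)
nights <- data.frame(
  date    = as.Date(c("2015-01-03", "2015-01-17", "2015-02-02",
                      "2015-02-20", "2015-03-05")),
  steps   = c(6000, 8000, 7000, 9000, 3000),
  minutes = c(450, 470, 480, 460, 390)   # time in bed
)

# Year-month key, which sorts chronologically as text
nights$month <- format(nights$date, "%Y-%m")

# Total steps and total minutes in bed per month
monthly <- aggregate(cbind(steps, minutes) ~ month, data = nights, FUN = sum)
```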
Row {.tabset .tabset-fade}
-------------------------------------
### Data Visualization
```{r}
# Time in bed in minutes
sleepdata_fixed$tbm <- 60 * 24 * as.numeric(times(sleepdata_fixed$`Time in bed`))
# Monthly totals of steps and minutes in bed for 2015
rq6 <- sleepdata_fixed %>%
  select(Start, End, `Sleep quality`, `Time in bed`, `Wake up`, `Sleep Notes`, avg_steps_by_week, avg_heart_rate, tbm) %>%
  filter(Start >= "2015-01-01 00:00:00" & End <= "2015-12-31 00:00:00") %>%
  group_by(Start = floor_date(Start, "month")) %>%
  summarize(avg_steps_by_week = sum(avg_steps_by_week), tbm = sum(tbm))
# Keep the months in calendar order on the x axis instead of alphabetical order
month_names <- months(as.Date(rq6$Start))
rq6$month <- factor(month_names, levels = unique(month_names))
ggplot(rq6) +
  geom_col(aes(x = month, y = avg_steps_by_week), size = 0.5, color = "darkblue", fill = "white") +
  geom_line(aes(x = month, y = tbm), size = 2, color = "red", group = 1) +
  xlab("Month") + ylab("Steps vs Sleep time (minutes)") + ggtitle("Sleep Time vs Exercise")
```