| ActiveEnergyBurned | BasalEnergyBurned | ExerciseTime | FlightsClimbed | StandHours | HeartRateAvg | HeartRateVariabilitySDNNAvg | DistanceWalkingRunning | RestingHeartRate | StepCount | Date | Weekday | Month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 395.922 | 1175.431 | 27 | 6 | 13 | 79.05125 | 71.62108 | 6.872239 | 67 | 15553 | 2017-09-22 | Fri | Sep |
| 391.762 | 1910.015 | 3 | 1 | 15 | 69.85714 | 36.24424 | 5.789980 | 58 | 12995 | 2017-09-23 | Sat | Sep |
| 324.977 | 1849.694 | 4 | 1 | 13 | 71.44550 | 37.75742 | 5.927916 | 58 | 12851 | 2017-09-24 | Sun | Sep |
| 493.752 | 1886.428 | 23 | 2 | 12 | 71.47097 | 52.62551 | 7.389529 | 62 | 15654 | 2017-09-25 | Mon | Sep |
| 429.513 | 1906.631 | 14 | 9 | 15 | 73.57714 | 27.45601 | 5.069960 | 62 | 10806 | 2017-09-26 | Tue | Sep |
| 381.145 | 1890.194 | 16 | 6 | 15 | 72.61140 | 31.66821 | 4.338681 | 63 | 9097 | 2017-09-27 | Wed | Sep |
| 426.209 | 1896.761 | 13 | 6 | 14 | 67.82251 | 37.62457 | 6.359798 | 55 | 13167 | 2017-09-28 | Thu | Sep |
| 470.841 | 1893.244 | 22 | 11 | 14 | 73.50307 | 34.12455 | 7.974337 | 58 | 18134 | 2017-09-29 | Fri | Sep |
| 243.413 | 1856.843 | 3 | 2 | 9 | 87.73333 | 35.72139 | 4.012460 | 65 | 9294 | 2017-09-30 | Sat | Sep |
| 666.719 | 1968.482 | 49 | 2 | 18 | 78.90476 | 20.24070 | 7.968635 | 61 | 16106 | 2017-10-01 | Sun | Oct |
| 488.845 | 1918.167 | 29 | 8 | 13 | 76.36937 | 28.23230 | 8.687404 | 65 | 18709 | 2017-10-02 | Mon | Oct |
| 480.031 | 1934.775 | 23 | 11 | 15 | 70.93333 | 44.84705 | 6.920436 | 62 | 15369 | 2017-10-03 | Tue | Oct |
| 344.546 | 1860.248 | 31 | 11 | 11 | 74.56637 | 25.92660 | 5.761536 | 63 | 12555 | 2017-10-04 | Wed | Oct |
| 486.282 | 1907.010 | 21 | 18 | 14 | 71.79558 | 60.14456 | 7.622318 | 59 | 16444 | 2017-10-05 | Thu | Oct |
| 426.467 | 1864.736 | 22 | 4 | 11 | 79.12766 | 26.87350 | 7.244894 | 65 | 16323 | 2017-10-06 | Fri | Oct |
| 335.503 | 1887.975 | 9 | 1 | 10 | 77.61283 | 32.07349 | 4.027449 | 63 | 9124 | 2017-10-07 | Sat | Oct |
| 324.360 | 1901.741 | 9 | 14 | 9 | 79.88618 | 31.78827 | 8.803723 | 65 | 19639 | 2017-10-08 | Sun | Oct |
| 662.124 | 2039.800 | 47 | 33 | 13 | 82.03125 | 25.12737 | 17.238487 | 66 | 39978 | 2017-10-09 | Mon | Oct |
| 854.783 | 1914.751 | 48 | 19 | 16 | 77.15730 | 27.83121 | 8.007189 | 67 | 18387 | 2017-10-10 | Tue | Oct |
| 548.088 | 2009.856 | 29 | 25 | 10 | 80.27536 | 32.44913 | 10.552136 | 67 | 23567 | 2017-10-11 | Wed | Oct |
| 547.892 | 1872.035 | 35 | 11 | 9 | 81.12857 | 35.48722 | 9.841215 | 70 | 22573 | 2017-10-12 | Thu | Oct |
| 484.527 | 1821.395 | 22 | 22 | 11 | 83.43750 | 43.86646 | 8.641040 | 65 | 20961 | 2017-10-13 | Fri | Oct |
| 377.086 | 1933.525 | 26 | 21 | 10 | 82.57447 | 43.42205 | 7.916310 | 140 | 18753 | 2017-10-14 | Sat | Oct |
| 289.644 | 1855.911 | 10 | 12 | 8 | 71.73585 | 37.78603 | 4.104124 | 62 | 9423 | 2017-10-15 | Sun | Oct |
| 309.415 | 1853.242 | 17 | 16 | 11 | 67.69159 | 44.24653 | 5.289809 | 65 | 11829 | 2017-10-16 | Mon | Oct |
---
title: "Quantified Self - Data Visualization Project"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
# Dashboard, reporting, and plotting packages (tidyverse also attaches ggplot2 and dplyr)
library(flexdashboard)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(lubridate)
library(plotly)
library(ggthemes)
library(metricsgraphics)
# Absolute path on the author's machine; point it at your own copy of the workbook
activity_data <- read_excel("C:\\Users\\sparth\\Downloads\\Imp Docs\\Documents\\Parth Documents\\Harrisburg U\\Academics\\Trimester 2\\Data Visualization\\Project\\Final Project Data.xlsx")
```
### Project Overview
```{r}
p <- ggplot(data = activity_data, aes(x = Month)) +
  geom_bar(position = "dodge", fill = "darkgreen", color = "gold") +
  labs(title = "Project Data Summary", x = "Month", y = "Number of observations") +
  theme_dark()
p
```
***
The objective of this project is to analyze my daily activities. The title 'Quantified Self' describes the narrative of the project: a quantitative analysis of my health and activity data that tells interesting stories through compelling visualizations.
The analysis is based on collecting, cleaning, and analyzing approximately 5-6 months of data from my Apple Watch 3. I used existing tools in R to perform the necessary data cleaning, analysis, and visualization. I use the results to answer the following five questions about my personal data:
1. Does the distance travelled vary by Weekday and Month?
2. What is the trend of my highest step counts by Weekday and Month?
3. Analyze the distribution of Active Energy spent over this time
4. Does the average Heart Rate differ based on Exercise Time?
5. Describe the relationship between Stand Hours and Basal Energy spent
### Data preparation
```{r}
kable(activity_data[1:25,], caption="Top 25 Rows of Dataset")
```
***
The Apple Watch data is stored in XML format. I collected this data and parsed it into Excel, then used Excel and R to clean, sort, and group it according to my project objectives. This page shows a sample of the dataset: its first 25 rows.
The data contains heart rate, heart rate variability, energy spent by type, distance, step and flight counts, and activity for each date. It is stored in a data frame within R. I used ggplot2 and other R libraries to build a visual narrative around the five questions about my personal health and fitness data.
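
The parsing and grouping steps described above could look roughly like the sketch below. It is not the workflow used for this project, which went through Excel. It assumes the `xml2` package and an export whose `Record` elements carry `type`, `startDate`, and `value` attributes; the inline XML and its values are a made-up stand-in for the real file.

```r
library(xml2)

# Hypothetical stand-in for the watch export: three step-count records over two days.
export <- read_xml('<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" startDate="2017-09-22 08:00:00 -0400" value="120"/>
  <Record type="HKQuantityTypeIdentifierStepCount" startDate="2017-09-22 09:00:00 -0400" value="200"/>
  <Record type="HKQuantityTypeIdentifierStepCount" startDate="2017-09-23 10:00:00 -0400" value="75"/>
</HealthData>')

# Pull the attributes of every record into a plain data frame.
records <- xml_find_all(export, "//Record")
raw <- data.frame(
  type  = xml_attr(records, "type"),
  day   = substr(xml_attr(records, "startDate"), 1, 10),  # keep only YYYY-MM-DD
  value = as.numeric(xml_attr(records, "value")),
  stringsAsFactors = FALSE
)

# One row per day: sum the step records that fall on the same date.
steps  <- raw[raw$type == "HKQuantityTypeIdentifierStepCount", ]
totals <- tapply(steps$value, steps$day, sum)
daily  <- data.frame(Date = as.Date(names(totals)), StepCount = as.numeric(totals))

# Weekday and Month labels built from fixed English names, so they do not depend on the locale.
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
daily$Weekday <- factor(day_names[as.POSIXlt(daily$Date)$wday + 1], levels = day_names)
daily$Month   <- month.abb[as.POSIXlt(daily$Date)$mon + 1]
daily
```

Storing `Weekday` as a factor with explicit levels keeps the days in calendar order when they are plotted.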
### Question 1
```{r}
fill <- "orange"
line <- "black"
# Distance by weekday, with the days in calendar order
p0 <- ggplot(activity_data, aes(x = Weekday, y = DistanceWalkingRunning)) +
  geom_boxplot(aes(group = Weekday), fill = fill, colour = line) +
  scale_x_discrete(limits = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")) +
  labs(title = "Distance Covered Per Week Day", x = "Weekdays", y = "Miles") +
  theme_minimal()
p0
# Distance by month, in chronological order of the collection period
p1 <- ggplot(activity_data, aes(x = Month, y = DistanceWalkingRunning)) +
  geom_boxplot(aes(group = Month), fill = fill, colour = line) +
  scale_x_discrete(limits = c("Sep", "Oct", "Nov", "Dec", "Jan", "Feb")) +
  labs(title = "Distance Covered Per Month", x = "Months", y = "Miles") +
  theme_minimal()
p1
```
***
Objective:
Does the distance travelled vary by Weekday and Month?
Visualization Method:
Side-by-Side Box Plot
Summary:
The graphs compare the distribution of distance covered side by side, grouped by weekday and by month. Each box marks the median and the interquartile range, so the plots show how the typical distance shifts from one weekday or month to the next. Outliers are visible at a glance and indicate that I did some extra activity on a few days.
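
The outliers drawn by a box plot follow a simple rule: a point is flagged when it lies more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule to the distances in the 25-row sample from the Data preparation page, pooled rather than split by weekday or month; `flag_outliers` is a helper name introduced here for illustration.

```r
# Distances (miles) from the 25-row dataset sample.
distance <- c(6.872239, 5.789980, 5.927916, 7.389529, 5.069960,
              4.338681, 6.359798, 7.974337, 4.012460, 7.968635,
              8.687404, 6.920436, 5.761536, 7.622318, 7.244894,
              4.027449, 8.803723, 17.238487, 8.007189, 10.552136,
              9.841215, 8.641040, 7.916310, 4.104124, 5.289809)

# TRUE for values more than 1.5 * IQR below the first or above the third quartile.
flag_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

median(distance)                    # centre line of the box
distance[flag_outliers(distance)]   # points drawn beyond the whiskers
```

On this sample the only flagged value is the 17.24-mile day, 2017-10-09, which is also the day with 39978 steps.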
### Question 2
```{r}
p2 <- ggplot(data = activity_data, aes(x = Month, y = StepCount, fill = Weekday)) +
  geom_bar(position = "dodge", stat = "identity") +
  labs(title = "Steps Count per Month per Weekday", x = "Month", y = "Number of Steps")
ggplotly(p2)
```
***
Objective:
What is the trend of my highest step counts by Weekday and Month?
Visualization Method:
Grouped Bar-Chart
Summary:
This graph shows, as bars, the steps I took on each Weekday of each Month and how those step counts trend across months. Because the bars for days that share a Weekday and Month are drawn on top of one another, the visible height of each bar is the highest step count for that combination. The chart makes it easy to compare a Weekday in one Month with the same Weekday in another, and it shows the overall rise or fall in steps from Month to Month. The graph is also interactive, with tools to zoom and click on data points.
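
The "highest step count" per Weekday and Month can also be computed explicitly before plotting, which makes the intent visible in the code. The sketch below is one possible way to do that, not the chunk used in this dashboard. It uses eight Friday and Saturday rows from the dataset sample; `sample_days`, `top_steps`, and `p_top` are names introduced here.

```r
library(ggplot2)

# Fridays and Saturdays taken from the 25-row dataset sample.
sample_days <- data.frame(
  Month     = c("Sep", "Sep", "Sep", "Sep", "Oct", "Oct", "Oct", "Oct"),
  Weekday   = c("Fri", "Sat", "Fri", "Sat", "Fri", "Sat", "Fri", "Sat"),
  StepCount = c(15553, 12995, 18134, 9294, 16323, 9124, 20961, 18753)
)

# Keep only the highest step count for each Month and Weekday combination.
top_steps <- aggregate(StepCount ~ Month + Weekday, data = sample_days, FUN = max)

# Order the months chronologically instead of alphabetically.
top_steps$Month <- factor(top_steps$Month,
                          levels = c("Sep", "Oct", "Nov", "Dec", "Jan", "Feb"))

# One bar per Month and Weekday; geom_col() plots the values as given.
p_top <- ggplot(top_steps, aes(x = Month, y = StepCount, fill = Weekday)) +
  geom_col(position = "dodge") +
  labs(x = "Month", y = "Highest number of steps")
top_steps
```

Setting the factor levels is what keeps September ahead of October on the x axis; without it, character months are sorted alphabetically.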
### Question 3
```{r}
p3 <- ggplot(data = activity_data, aes(x = Month, y = Weekday)) +
  geom_tile(aes(fill = ActiveEnergyBurned)) +
  labs(title = "Active Energy Burned v/s Weekday and Month", x = "Month", y = "Weekday")
p3
```
***
Objective:
Analyze the distribution of Active Energy spent over this time
Visualization Method:
Heat-Map
Summary:
This graph shows the level of active energy burned on each Weekday of each Month. It helps to see in which Month I burned the most calories through activity and which Weekday contributed the highest energy expenditure, and it gives an overall picture of the level across the whole timeframe.
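
Several days fall into each Weekday and Month cell, and `geom_tile()` draws one tile per row of data, so tiles for the same cell overlap. A possible refinement, sketched below under that assumption, is to reduce each cell to a single number first. Here the mean is used, which is a choice made for this example rather than something the dashboard does. The values are eight Friday and Saturday rows from the dataset sample.

```r
# Fridays and Saturdays taken from the 25-row dataset sample.
weekday <- factor(c("Fri", "Sat", "Fri", "Sat", "Fri", "Sat", "Fri", "Sat"),
                  levels = c("Fri", "Sat"))
month   <- factor(c("Sep", "Sep", "Sep", "Sep", "Oct", "Oct", "Oct", "Oct"),
                  levels = c("Sep", "Oct"))
active_energy <- c(395.922, 391.762, 470.841, 243.413,
                   426.467, 335.503, 484.527, 377.086)

# Weekday x Month matrix of mean active energy: the numbers behind a heat map.
energy <- tapply(active_energy, list(Weekday = weekday, Month = month), mean)
energy

# Cell with the highest mean, reported as row (Weekday) and column (Month) indices.
which(energy == max(energy), arr.ind = TRUE)
```

The same matrix can be turned back into a long data frame with `as.data.frame(as.table(energy))` and passed to `geom_tile()`, giving exactly one tile per cell.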
### Question 4
```{r}
p4 <- ggplot(data = activity_data, aes(x = ExerciseTime, y = HeartRateAvg)) +
  geom_point() +
  # Linear regression line with its confidence band
  stat_smooth(method = "lm") +
  ggtitle("Relationship Between Exercise Time and Heart Rate") +
  labs(x = "Exercise Time (minutes)", y = "Avg. Heart Rate (per minute)") +
  theme_minimal()
p4
```
***
Objective:
Does the average Heart Rate differ based on Exercise Time?
Visualization Method:
Scatter-plot
Summary:
This graph illustrates how average Heart Rate changes with Exercise Time. The trend line is a linear regression fit (`method = "lm"`), so it is straight by construction; its slope shows the direction of the relationship rather than proving that the relationship is linear. The shaded band around the line and the spread of the points show where observations cluster along the line and where they are sparse.
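
The trend line in the scatter plot can be backed by numbers by fitting the same linear model directly. The sketch below does this on the 25-row dataset sample only, so its coefficients describe that sample and not the full 5-6 months of data; the variable names are introduced here for illustration.

```r
# Exercise time (minutes) and average heart rate from the 25-row dataset sample.
exercise_time <- c(27, 3, 4, 23, 14, 16, 13, 22, 3, 49, 29, 23, 31,
                   21, 22, 9, 9, 47, 48, 29, 35, 22, 26, 10, 17)
heart_rate_avg <- c(79.05125, 69.85714, 71.44550, 71.47097, 73.57714,
                    72.61140, 67.82251, 73.50307, 87.73333, 78.90476,
                    76.36937, 70.93333, 74.56637, 71.79558, 79.12766,
                    77.61283, 79.88618, 82.03125, 77.15730, 80.27536,
                    81.12857, 83.43750, 82.57447, 71.73585, 67.69159)

# Same model that stat_smooth(method = "lm") draws: heart rate as a linear function of exercise time.
fit <- lm(heart_rate_avg ~ exercise_time)

coef(fit)                              # intercept and slope (beats per minute per exercise minute)
r_squared <- summary(fit)$r.squared    # share of heart-rate variance explained by the line
r_squared
cor(exercise_time, heart_rate_avg)     # Pearson correlation; its square equals r_squared
```

A low `r_squared` would mean the straight line captures little of the day-to-day variation, even if the slope is positive.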
### Question 5
```{r}
p5 <- ggplot(activity_data, aes(x = StandHours, y = BasalEnergyBurned)) +
  # alpha is a constant, so it is set outside aes() instead of being mapped to the data
  geom_jitter(aes(colour = Weekday, size = Month), alpha = 0.5) +
  labs(title = "Basal Energy Change w.r.t Hours Standing", x = "Number of Hours", y = "Basal Energy Burned (kcal)") +
  theme_economist_white()
ggplotly(p5)
```
***
Objective:
Describe the relationship between Stand Hours and Basal Energy spent
Visualization Method:
Bubble-Chart
Summary:
This graph uses several variables to show the relationship between Basal Energy spent and the hours I stood during the day. It helps map out that relationship and adds further detail through the color and size of the bubbles, which represent the Weekday and Month in this plot. This makes it possible to visually distinguish and group observations by category and to find the outliers. The graph is also interactive, with tools to zoom and click on data points.