Project Overview


The objective of this project is to analyze my daily activities. The title ‘Quantified Self’ describes the narrative of this project: a quantitative analysis of my health and activity data that tells interesting stories through compelling visualizations.

The analysis is based on collecting, cleaning and analyzing approximately 5-6 months of data from my Apple Watch Series 3. I used existing tools in R to perform the necessary data cleaning, analysis and visualization. I use the results to answer the following 5 questions about my personal data:

  1. Does the distance travelled vary by Weekday and Month?
  2. What is the trend of my highest step counts based on Weekday and Month?
  3. Analyze the distribution of Active Energy spent over this time
  4. Does the average Heart Rate differ based on Exercise Time?
  5. Describe the relationship between Stand Hours and Basal Energy spent

Data preparation

Top 25 Rows of Dataset

| ActiveEnergyBurned | BasalEnergyBurned | ExerciseTime | FlightsClimbed | StandHours | HeartRateAvg | HeartRateVariabilitySDNNAvg | DistanceWalkingRunning | RestingHeartRate | StepCount | Date | Weekday | Month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 395.922 | 1175.431 | 27 | 6 | 13 | 79.05125 | 71.62108 | 6.872239 | 67 | 15553 | 2017-09-22 | Fri | Sep |
| 391.762 | 1910.015 | 3 | 1 | 15 | 69.85714 | 36.24424 | 5.789980 | 58 | 12995 | 2017-09-23 | Sat | Sep |
| 324.977 | 1849.694 | 4 | 1 | 13 | 71.44550 | 37.75742 | 5.927916 | 58 | 12851 | 2017-09-24 | Sun | Sep |
| 493.752 | 1886.428 | 23 | 2 | 12 | 71.47097 | 52.62551 | 7.389529 | 62 | 15654 | 2017-09-25 | Mon | Sep |
| 429.513 | 1906.631 | 14 | 9 | 15 | 73.57714 | 27.45601 | 5.069960 | 62 | 10806 | 2017-09-26 | Tue | Sep |
| 381.145 | 1890.194 | 16 | 6 | 15 | 72.61140 | 31.66821 | 4.338681 | 63 | 9097 | 2017-09-27 | Wed | Sep |
| 426.209 | 1896.761 | 13 | 6 | 14 | 67.82251 | 37.62457 | 6.359798 | 55 | 13167 | 2017-09-28 | Thu | Sep |
| 470.841 | 1893.244 | 22 | 11 | 14 | 73.50307 | 34.12455 | 7.974337 | 58 | 18134 | 2017-09-29 | Fri | Sep |
| 243.413 | 1856.843 | 3 | 2 | 9 | 87.73333 | 35.72139 | 4.012460 | 65 | 9294 | 2017-09-30 | Sat | Sep |
| 666.719 | 1968.482 | 49 | 2 | 18 | 78.90476 | 20.24070 | 7.968635 | 61 | 16106 | 2017-10-01 | Sun | Oct |
| 488.845 | 1918.167 | 29 | 8 | 13 | 76.36937 | 28.23230 | 8.687404 | 65 | 18709 | 2017-10-02 | Mon | Oct |
| 480.031 | 1934.775 | 23 | 11 | 15 | 70.93333 | 44.84705 | 6.920436 | 62 | 15369 | 2017-10-03 | Tue | Oct |
| 344.546 | 1860.248 | 31 | 11 | 11 | 74.56637 | 25.92660 | 5.761536 | 63 | 12555 | 2017-10-04 | Wed | Oct |
| 486.282 | 1907.010 | 21 | 18 | 14 | 71.79558 | 60.14456 | 7.622318 | 59 | 16444 | 2017-10-05 | Thu | Oct |
| 426.467 | 1864.736 | 22 | 4 | 11 | 79.12766 | 26.87350 | 7.244894 | 65 | 16323 | 2017-10-06 | Fri | Oct |
| 335.503 | 1887.975 | 9 | 1 | 10 | 77.61283 | 32.07349 | 4.027449 | 63 | 9124 | 2017-10-07 | Sat | Oct |
| 324.360 | 1901.741 | 9 | 14 | 9 | 79.88618 | 31.78827 | 8.803723 | 65 | 19639 | 2017-10-08 | Sun | Oct |
| 662.124 | 2039.800 | 47 | 33 | 13 | 82.03125 | 25.12737 | 17.238487 | 66 | 39978 | 2017-10-09 | Mon | Oct |
| 854.783 | 1914.751 | 48 | 19 | 16 | 77.15730 | 27.83121 | 8.007189 | 67 | 18387 | 2017-10-10 | Tue | Oct |
| 548.088 | 2009.856 | 29 | 25 | 10 | 80.27536 | 32.44913 | 10.552136 | 67 | 23567 | 2017-10-11 | Wed | Oct |
| 547.892 | 1872.035 | 35 | 11 | 9 | 81.12857 | 35.48722 | 9.841215 | 70 | 22573 | 2017-10-12 | Thu | Oct |
| 484.527 | 1821.395 | 22 | 22 | 11 | 83.43750 | 43.86646 | 8.641040 | 65 | 20961 | 2017-10-13 | Fri | Oct |
| 377.086 | 1933.525 | 26 | 21 | 10 | 82.57447 | 43.42205 | 7.916310 | 140 | 18753 | 2017-10-14 | Sat | Oct |
| 289.644 | 1855.911 | 10 | 12 | 8 | 71.73585 | 37.78603 | 4.104124 | 62 | 9423 | 2017-10-15 | Sun | Oct |
| 309.415 | 1853.242 | 17 | 16 | 11 | 67.69159 | 44.24653 | 5.289809 | 65 | 11829 | 2017-10-16 | Mon | Oct |

The Apple Watch data is stored in XML format. I collected this data and parsed it into Excel, then used Excel and R to clean, sort and group the data according to my project objectives. This page displays the first 25 rows of my dataset as a sample.

The data contains heart rate, heart rate variability, energy spent by type (active and basal), distance, step count and activity for each date. This data is stored in a data frame within R. I used ggplot2 and other R libraries to build a visual narrative around the 5 questions about my personal health and fitness data.

Question 1


Objective: Does the distance travelled vary by Weekday and Month?

Visualization Method: Side-by-Side Box Plot

Summary: The graphs show a side-by-side comparison of the distance covered per day, grouped by weekday and by month. Each box marks the median and the interquartile range, so the plots show how the typical distance changes from one weekday to the next and from one month to the next. They also show outliers at a glance, indicating that I did some extra activity on a few days.

Question 2


Objective: What is the trend of my highest step counts based on Weekday and Month?

Visualization Method: Grouped Bar-Chart

Summary: This graph shows, in the form of bars, the steps I took on each Weekday of each Month and the trend of these step counts across months. It provides visually compelling evidence of how the pattern changes from a Weekday in one Month to the same Weekday in another. It also shows the overall increase or decrease in steps from Month to Month. The graph also has an interactive interface with tools to zoom and to click on data points.

Question 3


Objective: Analyze the distribution of Active Energy spent over this time

Visualization Method: Heat-Map

Summary: This graph shows the level of active energy burned on each Weekday of each Month. It helps to see at a glance in which Month I burned the most calories through activity and which Weekday contributed the highest energy expenditure. It also maps out the overall level across the timeframe.

Question 4


Objective: Does the average Heart Rate differ based on Exercise Time?

Visualization Method: Scatter-plot

Summary: This graph illustrates how the average Heart Rate changes with Exercise Time. The graph overlays a trend line fitted with a linear model, which summarizes the direction of the relationship. The additional features on the graph help highlight where observations cluster along the line and where they are sparse.

Question 5


Objective: Describe the relationship between Stand Hours and Basal Energy spent

Visualization Method: Bubble-Chart

Summary: This graph uses multiple variables to show the relationship between Basal Energy spent and the hours I stood during the day. It helps map out the relationship and adds further detail through the color and size of the bubbles, which represent the Weekday and Month in this plot. This makes it possible to visually distinguish and group observations by category and to find the outliers. The graph also has an interactive interface with tools to zoom and to click on data points.

---
title: "Quantified Self - Data Visualization Project"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(flexdashboard)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(lubridate)
library(plotly)
library(ggthemes)
library(metricsgraphics)

activity_data <- read_excel("C:\\Users\\sparth\\Downloads\\Imp Docs\\Documents\\Parth Documents\\Harrisburg U\\Academics\\Trimester 2\\Data Visualization\\Project\\Final Project Data.xlsx")

```

### Project Overview

```{r}
p <- ggplot(data = activity_data, aes(x = Month)) +
            geom_bar(position = "dodge", fill = "darkgreen", color = "gold") +
  labs(title = "Project Data Summary", x = "Month", y = "Number of observations")+
  theme_dark()
p
```

***

The objective of this project is to analyze my daily activities. The title 'Quantified Self' describes the narrative of this project: a quantitative analysis of my health and activity data that tells interesting stories through compelling visualizations.

The analysis is based on collecting, cleaning and analyzing approximately 5-6 months of data from my Apple Watch Series 3. I used existing tools in R to perform the necessary data cleaning, analysis and visualization. I use the results to answer the following 5 questions about my personal data:

1. Does the distance travelled vary by Weekday and Month?
2. What is the trend of my highest step counts based on Weekday and Month?
3. Analyze the distribution of Active Energy spent over this time
4. Does the average Heart Rate differ based on Exercise Time?
5. Describe the relationship between Stand Hours and Basal Energy spent



### Data preparation

```{r}
kable(activity_data[1:25,], caption="Top 25 Rows of Dataset")

```

***

The Apple Watch data is stored in XML format. I collected this data and parsed it into Excel, then used Excel and R to clean, sort and group the data according to my project objectives. This page displays the first 25 rows of my dataset as a sample.

The data contains heart rate, heart rate variability, energy spent by type (active and basal), distance, step count and activity for each date. This data is stored in a data frame within R. I used ggplot2 and other R libraries to build a visual narrative around the 5 questions about my personal health and fitness data.
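
One part of the grouping step can be made explicit in R. The sketch below is a minimal illustration, assuming `Weekday` and `Month` arrive as plain character columns with the abbreviations shown in the sample table. The small `sample_days` data frame is a hypothetical stand-in for `activity_data`, and the level vectors are illustrative names. Converting both columns to factors with explicit levels keeps weekdays and months in calendar order, instead of the alphabetical order R uses for character columns.

```r
# Hypothetical stand-in for activity_data, using four rows from the sample table
sample_days <- data.frame(
  StepCount = c(15553, 12995, 16106, 18709),
  Weekday   = c("Fri", "Sat", "Sun", "Mon"),
  Month     = c("Sep", "Sep", "Oct", "Oct"),
  stringsAsFactors = FALSE
)

# Calendar order for the categories (months as covered by the project)
weekday_levels <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
month_levels   <- c("Sep", "Oct", "Nov", "Dec", "Jan", "Feb")

sample_days$Weekday <- factor(sample_days$Weekday, levels = weekday_levels)
sample_days$Month   <- factor(sample_days$Month, levels = month_levels)

# Sorting now follows the calendar rather than the alphabet
ordered_days <- sample_days[order(sample_days$Month, sample_days$Weekday), ]
print(ordered_days)
```

ggplot2 places factor levels on a discrete axis in level order, so a conversion like this would apply the same ordering to every chart without setting axis limits plot by plot.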


### Question 1

```{r}
fill <- "orange"
line <- "black"

p0 <- ggplot(activity_data, aes(x = Weekday, y=DistanceWalkingRunning)) + 
  geom_boxplot(aes(group = Weekday), fill = fill, colour = line) +
  scale_x_discrete(limits=c("Sun","Mon","Tue","Wed","Thu","Fri","Sat")) +
  labs(title = "Distance Covered Per Week Day", x = "Weekdays", y = "Miles") +
  theme_minimal()
p0

p1 <- ggplot(activity_data, aes(x = Month, y=DistanceWalkingRunning)) + 
  geom_boxplot(aes(group = Month), fill = fill, colour = line) +
  scale_x_discrete(limits=c("Sep","Oct","Nov","Dec","Jan","Feb")) +
  labs(title = "Distance Covered Per Month", x = "Month", y = "Miles") +
  theme_minimal()
p1

```

***
Objective:
Does the distance travelled vary by Weekday and Month?

Visualization Method: 
Side-by-Side Box Plot

Summary:
The graphs show a side-by-side comparison of the distance covered per day, grouped by weekday and by month. Each box marks the median and the interquartile range, so the plots show how the typical distance changes from one weekday to the next and from one month to the next. They also show outliers at a glance, indicating that I did some extra activity on a few days.
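
The quantities a box plot draws can also be computed directly. The following is a minimal sketch that uses only the 25 distance values from the sample table rather than the full dataset: `median()` gives the line inside the box, and base R's `boxplot.stats()` applies the 1.5 × IQR rule commonly used to flag outliers. The variable names are illustrative and not part of the project code.

```r
# Daily distance (miles) for the 25 days shown in the sample table
distance <- c(6.872239, 5.789980, 5.927916, 7.389529, 5.069960,
              4.338681, 6.359798, 7.974337, 4.012460, 7.968635,
              8.687404, 6.920436, 5.761536, 7.622318, 7.244894,
              4.027449, 8.803723, 17.238487, 8.007189, 10.552136,
              9.841215, 8.641040, 7.916310, 4.104124, 5.289809)

# The line drawn inside the box
median_distance <- median(distance)

# Values more than 1.5 * IQR beyond the hinges are reported as outliers
outlier_days <- boxplot.stats(distance)$out

print(median_distance)
print(outlier_days)
```

In this sample the only value flagged is the 17.24-mile day on 2017-10-09, which matches the kind of extra-activity day the box plots surface.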


### Question 2

```{r}
p2 <- ggplot(data = activity_data, aes(x = Month, y = StepCount, fill = Weekday)) +
  geom_bar(position = "dodge",stat = "identity")+
  labs(title = "Steps Count per Month per Weekday", x = "Month", y = "Number of Steps")
ggplotly(p2)
```

***

Objective:
What is the trend of my highest step counts based on Weekday and Month?

Visualization Method: 
Grouped Bar-Chart 

Summary:
This graph shows, in the form of bars, the steps I took on each Weekday of each Month and the trend of these step counts across months. It provides visually compelling evidence of how the pattern changes from a Weekday in one Month to the same Weekday in another. It also shows the overall increase or decrease in steps from Month to Month. The graph also has an interactive interface with tools to zoom and to click on data points.
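
Because the question asks about the highest step counts, the maximum per Month and Weekday can also be computed explicitly before plotting. The following is a minimal sketch with base R's `aggregate()`, using only the September rows of the sample table. The names `sep_days` and `max_steps` are illustrative, not objects from the project.

```r
# Hypothetical stand-in for activity_data: the September rows of the sample table
sep_days <- data.frame(
  Month     = "Sep",
  Weekday   = c("Fri", "Sat", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
  StepCount = c(15553, 12995, 12851, 15654, 10806, 9097, 13167, 18134, 9294)
)

# One row per Month/Weekday pair, keeping the highest daily step count
max_steps <- aggregate(StepCount ~ Month + Weekday, data = sep_days, FUN = max)
print(max_steps)
```

Passing a summarised data frame such as `max_steps` to `geom_bar(stat = "identity")` gives exactly one bar per Month and Weekday, so each bar height is the maximum by construction.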

### Question 3

```{r}
p3 <- ggplot(data = activity_data, aes(x = Month, y = Weekday)) +
  geom_tile(aes(fill = ActiveEnergyBurned)) +
  labs(title = "Active Energy Burned v/s Weekday and Month", x = "Month", y = "Weekday")
p3
```

***

Objective:
Analyze the distribution of Active Energy spent over this time

Visualization Method: 
Heat-Map

Summary:
This graph shows the level of active energy burned on each Weekday of each Month. It helps to see at a glance in which Month I burned the most calories through activity and which Weekday contributed the highest energy expenditure. It also maps out the overall level across the timeframe.
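
Each Month and Weekday combination covers several days, so a heat map cell needs one summary value. One option, sketched below under the assumption that the mean is the statistic of interest, is to average `ActiveEnergyBurned` per cell with base R's `tapply()`. The `oct_days` data frame is a hypothetical stand-in built from the October Sundays and Mondays in the sample table.

```r
# Hypothetical stand-in for activity_data: October Sundays and Mondays from the sample table
oct_days <- data.frame(
  Month   = "Oct",
  Weekday = c("Sun", "Mon", "Sun", "Mon", "Sun", "Mon"),
  ActiveEnergyBurned = c(666.719, 488.845, 324.360, 662.124, 289.644, 309.415)
)

# Weekday x Month matrix of mean active energy, one value per heat map cell
cell_means <- with(oct_days,
                   tapply(ActiveEnergyBurned, list(Weekday, Month), mean))
print(cell_means)
```

The resulting matrix has one row per Weekday and one column per Month, which is the same grid the heat map displays.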


### Question 4

```{r}

p4 <- ggplot(data = activity_data, aes(x = ExerciseTime, y = HeartRateAvg)) +
  geom_point() +
  stat_smooth(method = "lm") +
  ggtitle("Relationship Between Exercise Time and Heart Rate") +
  labs(x = "Exercise Time (minutes)", y = "Avg. Heart Rate (beats per minute)") +
  theme_minimal()
p4
```

***

Objective:
Does the average Heart Rate differ based on Exercise Time?

Visualization Method: 
Scatter-plot

Summary:
This graph illustrates how the average Heart Rate changes with Exercise Time. The graph overlays a trend line fitted with a linear model, which summarizes the direction of the relationship. The additional features on the graph help highlight where observations cluster along the line and where they are sparse.
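
The trend line can be backed with numbers. The sketch below fits the same kind of model that `stat_smooth(method = "lm")` draws, using only the first ten rows of the sample table, so the values it prints describe that sample and not the full dataset. The variable names are illustrative, and treating `ExerciseTime` as minutes is an assumption.

```r
# ExerciseTime (assumed minutes) and HeartRateAvg for the first ten sample rows
exercise_time <- c(27, 3, 4, 23, 14, 16, 13, 22, 3, 49)
heart_rate    <- c(79.05125, 69.85714, 71.44550, 71.47097, 73.57714,
                   72.61140, 67.82251, 73.50307, 87.73333, 78.90476)

# Simple linear regression of heart rate on exercise time
fit <- lm(heart_rate ~ exercise_time)

# Slope: change in average heart rate per extra unit of exercise time
slope <- coef(fit)[["exercise_time"]]

# Pearson correlation: strength of the linear association, between -1 and 1
r <- cor(exercise_time, heart_rate)

print(c(slope = slope, correlation = r))
```

A correlation close to zero would suggest that the fitted line explains little of the day-to-day variation, even though a line can always be drawn through the points.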


### Question 5

```{r}
p5 <- ggplot(activity_data, aes(x = StandHours, y = BasalEnergyBurned)) +
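    # alpha is a constant, so it is set outside aes(); 0.5 is an assumed value chosen for legibility
    geom_jitter(aes(colour = Weekday, size = Month), alpha = 0.5) + 
    labs(title = "Basal Energy Change w.r.t Hours Standing", x = "Number of Hours", y = "Basal Energy Burned (kcal)")+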
    theme_economist_white()
ggplotly(p5)
```

***

Objective:
Describe the relationship between Stand Hours and Basal Energy spent

Visualization Method: 
Bubble-Chart

Summary:
This graph uses multiple variables to show the relationship between Basal Energy spent and the hours I stood during the day. It helps map out the relationship and adds further detail through the color and size of the bubbles, which represent the Weekday and Month in this plot. This makes it possible to visually distinguish and group observations by category and to find the outliers. The graph also has an interactive interface with tools to zoom and to click on data points.