INTRODUCTION:
The Quantified Self is a movement that seeks to leverage the synergy of wearables, analytics, and ‘Big Data’. Dynamic visualization refers to representations that go beyond traditional static forms. A visualization with any one of three features (animation, interaction, or real-time updating) meets the definition. As data sets grow, their complexity outpaces our cognitive ability to make sense of them. Dynamic visualization offers new means of dealing with complexity, accessibility, and validity.
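DATA COLLECTION: For this project, I collected data from my Apple Watch over the past month. For the Quantified Self analysis, I used the Apple Watch data on my daily activities for the following parameters: exercise time (min), walking/running distance (mi), steps taken (count), active energy burned (kcal), and heart rate (count/min).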
METHODS: I exported the data from the Apple Watch app on my iPhone in XML format and converted the file to CSV for cleaning and organizing. I selected the date range 09/01/2017 to 10/01/2017. I used R to analyze the data and plot the dashboard. I chose ggplot2 because it provides a flexible, mature, and complete graphics system, and I used its theme system to polish the appearance of the plots. Because ggplot2 is built on a consistent underlying grammar of graphics, I used aesthetic mappings, faceting, themes, and geometric objects to analyze my daily activities over the past month.
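The loading and date-range step can be sketched as follows. This is a minimal illustration in base R, not the code used for the report: the temporary file, the sample rows, and the object names `csv_path` and `activity` are assumptions, and only the column names and the date window come from this document.

```r
# Hypothetical stand-in for the CSV converted from the XML export;
# the real file name and its exact contents are assumptions.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Calibration,Day,Time,Output,Month,Time_of_day,Dates",
  "count/min,Friday,10:07,82,September,Morning,9/1/17",
  "kcal,Saturday,13:15,0.322,September,Noon,9/30/17",
  "count,Sunday,9:30,120,October,Morning,10/8/17"
), csv_path)

activity <- read.csv(csv_path, stringsAsFactors = FALSE)

# Parse dates written as month/day/two-digit year, e.g. "9/1/17".
activity$Dates <- as.Date(activity$Dates, format = "%m/%d/%y")

# Keep only the study window, 09/01/2017 through 10/01/2017.
in_window <- activity$Dates >= as.Date("2017-09-01") &
  activity$Dates <= as.Date("2017-10-01")
activity <- activity[in_window, ]

print(activity)
```

Parsing `Dates` into a real `Date` column also makes the dates sort chronologically, whereas the text form sorts "9/10/17" before "9/2/17".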
While analyzing the data, I came up with the following questions and used a storyboard to explain them.
Link to Storyboard: http://rpubs.com/ArtiH/317629
DATA EXPLANATION:
As shown below, my data has seven columns: Calibration, Day, Time, Output, Month, Time_of_day, and Dates. The Calibration column holds one of five units: mi, min, count, kcal, or count/min. The Output column holds the measured value in that unit, recorded by Time_of_day (Morning, Noon, Evening, Night), Time (24 hours), Day (Monday to Sunday, 7 days), and Dates (09/01/17 to 09/30/17).
## [1] 35100 7
##   Calibration    Day  Time Output     Month Time_of_day  Dates
## 1   count/min Friday 12:09     70 September       Night 9/1/17
## 2   count/min Friday 10:07     82 September     Morning 9/1/17
## 3   count/min Friday 10:27     88 September     Morning 9/1/17
## 4   count/min Friday 10:38     83 September     Morning 9/1/17
## 5   count/min Friday 10:44     74 September     Morning 9/1/17
## 6   count/min Friday 10:46     80 September     Morning 9/1/17
## [1] "Calibration" "Day" "Time" "Output" "Month"
## [6] "Time_of_day" "Dates"
## 'data.frame': 35100 obs. of 7 variables:
## $ Calibration: Factor w/ 6 levels "","count","count/min",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ Day : Factor w/ 8 levels "","Friday","Monday",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ Time : Factor w/ 721 levels "","1:00","1:01",..: 191 69 89 100 106 108 115 120 124 129 ...
## $ Output : num 70 82 88 83 74 80 83 82 75 83 ...
## $ Month : Factor w/ 2 levels "","September": 2 2 2 2 2 2 2 2 2 2 ...
## $ Time_of_day: Factor w/ 5 levels "","Evening","Morning",..: 4 3 3 3 3 3 3 3 5 5 ...
## $ Dates : Factor w/ 31 levels "","9/1/17","9/10/17",..: 2 2 2 2 2 2 2 2 2 2 ...
##     Calibration          Day            Time           Output
##           : 1239   Saturday :7077          : 1239   Min.   :   0.000
##  count    : 4886   Friday   :6896   6:42   :  132   1st Qu.:   0.098
##  count/min: 2925   Tuesday  :4993   6:33   :  129   Median :   0.322
##  kcal     :20763   Thursday :4323   6:34   :  125   Mean   :  17.367
##  mi       : 4680   Monday   :4185   6:35   :  124   3rd Qu.:  16.769
##  min      :  607   Wednesday:3628   6:47   :  124   Max.   :1521.000
##                    (Other)  :3998   (Other):33227   NA's   :1239
##        Month        Time_of_day        Dates
##           : 1239          : 1239   9/30/17: 2782
##  September:33861   Evening:14737   9/29/17: 1867
##                    Morning: 4184   9/23/17: 1796
##                    Night  : 3506   9/15/17: 1625
##                    Noon   :11434   9/28/17: 1475
##                                    9/11/17: 1472
##                                    (Other):24083
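The output above has the shape produced by R's standard inspection calls (`dim()`, `head()`, `names()`, `str()`, `summary()`). The blank factor level with 1239 rows, matched by 1239 `NA`s in Output, suggests that the CSV contains 1239 empty rows. That reading is an inference from the summary, not something the report states. The sketch below shows one way to drop such rows on load; the toy CSV text and the names `raw` and `clean` are assumptions.

```r
# Toy CSV text with one fully blank row, mimicking the empty rows that
# show up above as the "" factor level (an assumption about their origin).
csv_text <- "Calibration,Day,Output
kcal,Friday,0.098
,,
count,Saturday,132"

# Treat empty strings as NA so blank cells do not become a "" level.
raw <- read.csv(text = csv_text, na.strings = c("", "NA"),
                stringsAsFactors = TRUE)

# Drop rows in which every column is missing, then drop unused levels.
all_missing <- rowSums(!is.na(raw)) == 0
clean <- droplevels(raw[!all_missing, ])

dim(clean)      # number of rows and columns
head(clean)     # first rows
names(clean)    # column names
str(clean)      # column types and factor levels
summary(clean)  # per-column counts and quantiles
```

Removing the blank rows before plotting would also avoid the "Removed 1239 rows" warnings that ggplot2 prints further down.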
The graph below shows that kcal has the most rows of the five calibration units (20,763) and min has the fewest (607). Thus, over the past month the Apple Watch recorded the most data on Active Energy Burned (kcal) and the least on Exercise Time (min).
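A bar chart of row counts per unit can be sketched with `geom_bar()`, which counts rows for each value of the x aesthetic. The toy data frame and the names `activity`, `p_counts`, and `row_counts` are assumptions for illustration; the report's own plotting code is not shown in this document.

```r
library(ggplot2)

# Toy data; the real data frame has 35,100 rows.
activity <- data.frame(
  Calibration = c("kcal", "kcal", "kcal", "mi", "count", "min")
)

# geom_bar() counts the rows in each Calibration value.
p_counts <- ggplot(activity, aes(x = Calibration)) +
  geom_bar() +
  labs(x = "Calibration (unit)", y = "Number of rows") +
  theme_minimal()

print(p_counts)

# The same counts as a table, sorted from most to least frequent.
row_counts <- sort(table(activity$Calibration), decreasing = TRUE)
print(row_counts)
```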
The graph below shows that the Apple Watch collected exercise, walking/running, step-count, calories-burned, and heart-rate data on all seven days of the week.
## `geom_smooth()` using method = 'gam'
## Warning: Removed 1239 rows containing non-finite values (stat_smooth).
## Warning: Removed 1239 rows containing missing values (geom_point).
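The `str()` output earlier lists the Day levels alphabetically ("Friday", "Monday", ...), so a plot by Day would place the weekdays in that order unless the factor is releveled. The sketch below, on toy data with assumed names (`week_order`, `plotted`, `p_day`), orders the days Monday to Sunday and drops missing Output values before plotting. It shows points only and leaves out the smoother that the `geom_smooth()` message above refers to.

```r
library(ggplot2)

# Toy data with one missing Output value.
activity <- data.frame(
  Day    = c("Friday", "Monday", "Wednesday", "Sunday"),
  Output = c(82, 70, 88, NA)
)

# Order weekdays Monday to Sunday instead of alphabetically.
week_order <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
activity$Day <- factor(activity$Day, levels = week_order)

# Drop missing values up front rather than relying on ggplot2's warning.
plotted <- activity[!is.na(activity$Output), ]

p_day <- ggplot(plotted, aes(x = Day, y = Output)) +
  geom_point(alpha = 0.5) +
  scale_x_discrete(drop = FALSE) +  # keep days that have no points
  theme_minimal()

print(p_day)
```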
The graph below shows that the highest Output values were recorded around noon. Thus, I am most active during the daytime.
## `geom_smooth()` using method = 'gam'
## Warning: Removed 1239 rows containing non-finite values (stat_smooth).
## Warning: Removed 1239 rows containing missing values (geom_point).
Below is a histogram of the Output values.
## Warning: Removed 1239 rows containing non-finite values (stat_bin).
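A histogram of this kind can be sketched with `geom_histogram()`. Setting `na.rm = TRUE` removes missing values silently instead of printing the warning above. The toy values, the bin count of 30, and the name `p_hist` are assumptions.

```r
library(ggplot2)

# Toy values, including one NA like the blank rows in the real data.
activity <- data.frame(Output = c(0, 0.098, 0.322, 16.769, 70, 82, NA))

p_hist <- ggplot(activity, aes(x = Output)) +
  geom_histogram(bins = 30, na.rm = TRUE) +  # drop NAs without a warning
  labs(x = "Output", y = "Number of readings") +
  theme_minimal()

print(p_hist)
```

Because Output mixes five different units, a separate histogram for each Calibration value may be easier to interpret than a single pooled one.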
In the diagram below, I analyze Output by Time_of_day for the Calibration values “count” and “mi”. It shows that my highest outputs for count (steps taken) occur at Noon.
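The same comparison can be checked numerically. The following base R sketch, on toy data with assumed names (`steps_dist`, `peak`, `peak_time`), keeps only the "count" and "mi" rows and finds the time of day with the largest Output for each unit.

```r
# Toy data; values are illustrative only.
activity <- data.frame(
  Calibration = c("count", "count", "count", "mi", "mi", "kcal"),
  Time_of_day = c("Morning", "Noon", "Noon", "Noon", "Evening", "Noon"),
  Output      = c(40, 132, 90, 0.3, 0.5, 12),
  stringsAsFactors = FALSE
)

# Keep only steps (count) and distance (mi).
steps_dist <- subset(activity, Calibration %in% c("count", "mi"))

# Largest Output for each unit and time of day.
peak <- aggregate(Output ~ Calibration + Time_of_day,
                  data = steps_dist, FUN = max)

# For each unit, the time of day with the largest reading.
peak_time <- sapply(
  split(peak, peak$Calibration),
  function(d) d$Time_of_day[which.max(d$Output)]
)

print(peak)
print(peak_time)
```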
The diagram below illustrates the relationship between Day (Monday, Wednesday, and Friday), Time_of_day, and Output. My highest output values occur on Wednesday evening, compared with the other times and selected days.
## Warning: Removed 1239 rows containing missing values (geom_text).
The graph below uses faceting to show the relationship between Time_of_day and Output for each Calibration. Count has its maximum Output value at Noon, and the highest heart rate is recorded in the Evening.
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database
## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x
## $y, : font family not found in Windows font database
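A faceted plot like the one described above can be sketched with `facet_wrap()`. The toy data and the name `p_facets` are assumptions. `scales = "free_y"` is a design choice of this sketch, not necessarily the setting used in the report; it gives each unit its own y-axis, since steps, heart rate, and kcal are not on a common scale.

```r
library(ggplot2)

# Toy data: three units, each measured at four times of day.
activity <- data.frame(
  Calibration = rep(c("count", "count/min", "kcal"), each = 4),
  Time_of_day = rep(c("Morning", "Noon", "Evening", "Night"), times = 3),
  Output      = c(40, 132, 90, 10, 74, 80, 95, 70, 0.3, 16.8, 12, 0.1)
)

# Order the times of day chronologically rather than alphabetically.
activity$Time_of_day <- factor(
  activity$Time_of_day,
  levels = c("Morning", "Noon", "Evening", "Night")
)

# One panel per Calibration value, each with its own y-axis.
p_facets <- ggplot(activity, aes(x = Time_of_day, y = Output)) +
  geom_point() +
  facet_wrap(~ Calibration, scales = "free_y") +
  theme_minimal()

print(p_facets)
```

The sketch keeps the default font family. The "font family not found" warnings above typically appear when a theme requests a font that is not registered with the Windows graphics device.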