---
title: "ANLY512 Final Project"
author: Yumeng Du
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    source_code: embed
    vertical_layout: fill
---
```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(plotly)
library(readxl)
library(dygraphs)
library(zoo)
library(xts)
```
Column {data-width=650}
-----------------------------------------------------------------------
Summary
=====================================
Data Collection:
I collected two datasets from my iPhone, both exported from the “Health” app. The first is the distance I walked every day in 2018 (in meters). The second is my weight and BMI data, which I measured with the Renpho app and synced to my Apple Health account. The second dataset contains records only from June, July, and December, so it is fairly limited.
Data Manipulation:
After exporting the data, I converted the XML to .csv and .xlsx formats for easier processing in R. The first dataset, Walking Distance, records several distance readings per day, so I first summed the distance walked for each day and then grouped the daily totals by month. The second dataset doesn't contain many records, so I left it as is.
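
The daily summing and month grouping described above could look roughly like the following sketch. It uses a small made-up data frame, not the real export; the column names `Date` and `Meters` are assumptions chosen to match the plotting code later in this document.

```r
# Hypothetical raw records: several distance readings per day, in meters.
raw <- data.frame(
  Date   = as.Date(c("2018-01-01", "2018-01-01", "2018-01-02",
                     "2018-02-01", "2018-02-01")),
  Meters = c(1200, 800, 2500, 300, 1700)
)

# Sum the readings within each day to get one total per date.
daily <- aggregate(Meters ~ Date, data = raw, FUN = sum)

# Label each day with its month name, ordered January to December
# so that plots show the months in calendar order.
month_number <- as.integer(format(daily$Date, "%m"))
daily$Month  <- factor(month.name[month_number], levels = month.name)

daily
```

Storing `Month` as an ordered-by-calendar factor is a design choice: without it, plotting functions sort month names alphabetically.
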
With visualizations of the two datasets, I'm trying to answer the following questions:
1. Over the year, in which month am I most active (walk the most)?
2. In which month does my activity level tend to vary the most?
3. Can we see any abnormal distance in my records?
4. Is there a general pattern I can observe from my walking?
5. Is there any correlation between my weight and BMI value?
6. From the visualization, can I tell the relationship between my weight and BMI value?
Visualization Methods:
1. Walking Distance: The visualization that I created for Walking Distance consists of two parts. The first is a boxplot that shows the distribution of my daily walking distance for each month. The second is a line graph that shows the general pattern of my walking from day to day. It's an interactive chart, so I can zoom in and out to see finer or coarser patterns.
2. Body Data: I created a scatter plot and fitted a line to better visualize the relationship between my weight and BMI.
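
As a rough illustration of how a boxplot decides which points to draw as outliers, the sketch below applies the usual 1.5 × IQR rule (the default convention in `geom_boxplot`) by hand. The values are made up for illustration and are not taken from my records.

```r
# Hypothetical daily totals for one month, in meters (not the real data).
meters <- c(3100, 4200, 3800, 5000, 4500, 3900, 4100, 21000)

# First and third quartiles, and the interquartile range between them.
q   <- quantile(meters, c(0.25, 0.75))
iqr <- q[2] - q[1]

# Points further than 1.5 * IQR beyond the quartiles count as outliers.
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- meters[meters < lower | meters > upper]

outliers
```

With these made-up numbers, only the 21000-meter day falls outside the fences, which is the kind of point the boxplot marks individually.
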
Visualization Summary:
1. From the boxplot, it's clear that my walking distances varied the most in January and the least in November. On average, I walked the longest distance in May, so May is my most active month. In October, there is an abnormal outlier: I walked over 20,000 meters in one day, which is very unlikely. That could be a tracking error, which I might be able to explain if I knew more about how Apple tracks walking distance. Zooming in and out of the line graph, we can see that my walking pattern varies a lot from day to day, but not as much from month to month. The pattern tends to stabilize when observed over a longer period.
2. Looking at the scatter plot of my body data, the linear relationship between my weight and BMI is obvious, even with limited data points. Weight and BMI are positively correlated: as I gain weight, my BMI increases as well.
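
One likely reason the relationship looks so clean: BMI is defined as weight in kilograms divided by height in meters squared, so for a fixed height it is an exact linear function of weight. The sketch below checks this with made-up numbers. The height and weights are hypothetical, and it assumes the weights are recorded in kilograms.

```r
# Hypothetical values for illustration only (not my measurements).
height_m  <- 1.65
weight_kg <- c(55, 56.5, 58, 60)

# Standard BMI definition: weight (kg) / height (m)^2.
bmi <- weight_kg / height_m^2

# With a constant height, the correlation is 1 up to floating-point error.
r <- cor(weight_kg, bmi)

# The fitted slope should be approximately 1 / height_m^2.
fit   <- lm(bmi ~ weight_kg)
slope <- unname(coef(fit)["weight_kg"])

c(correlation = r, slope = slope, expected_slope = 1 / height_m^2)
```

In real data, small deviations from a perfect line would come from rounding in the app or from any change in the recorded height.
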
Walking Distance Visualization
=====================================
### Walking Distance
```{r}
MyData <- read.csv("WalkingDistance.csv")
# One box per month; outliers are drawn as red asterisks (shape 8).
ggplot(MyData, aes(x = Month, y = Meters, fill = Month)) +
  geom_boxplot(outlier.color = "red", outlier.shape = 8, outlier.size = 2)
```
```{r}
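MyData2 <- read_excel("MyData2.xlsx")
Date <- as.Date(MyData2$Date)
# Keep only the non-date columns as series values; the dates become the xts index.
time_series <- xts(MyData2[, names(MyData2) != "Date"], order.by = Date)
# Interactive line chart with a range selector for zooming.
interactive <- dygraph(time_series, main = "Walking Distance") %>% dyRangeSelector()
interactive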
```
Column {data-width=350}
-----------------------------------------------------------------------
Body Data Visualization
=====================================
### Weight vs. BMI
```{r}
SelfData <- read.csv("SelfData.csv")
# Scatter plot of weight against BMI with a fitted linear regression line.
ggplot(SelfData, aes(x = Weight, y = BMI)) + geom_point() + geom_smooth(method = "lm")
```