Introduction and Data Collection

Data now runs through nearly every part of people's lives. Almost every smartphone user has apps that record personal data such as location and activity.

For this personal project, I study the Apple Health data saved on my iPhone. The iPhone's Health app records workouts, steps, and walking distance, and it can also synchronize data from other applications. In this project, an app called Sleeptracker provides my daily sleep hours and heart rate.

I will walk through collecting the data, cleaning it, and analyzing it.


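Data Collection:

- Open the Health app and choose Export Health Data
- The data is exported as a zipped XML file
- Unzip the file with R

Seven variables in total: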

[1] "export_cda.xml" "export.xml"    
$nameCounts

       Record MetadataEntry       Workout    ExportDate    HealthData 
       661539          1858           180             1             1 
           Me 
            1 

$numNodes
[1] 663580
                                   type
1                             HeartRate
2                             StepCount
3                DistanceWalkingRunning
4                    ActiveEnergyBurned
5                        FlightsClimbed
6                              NikeFuel
7 HKCategoryTypeIdentifierSleepAnalysis

Five Questions:

Q1. What are the average heart rate and step count for each hour of the day?


In the chart above, my average daily step count in 2015 is below 5,000, which suggests I was not doing much activity beyond basic movement. From 2016 onward, the average is about 5,000 steps per day.

In 2018 and 2019, days above 10,000 steps are clearly more frequent than in 2015 and 2016, which suggests that I am more willing to move than before.

I set a goal of walking at least 5,000 steps per day. The chart suggests that I have reached that goal.

Q2. What is my average step count?


In the chart above, my average daily step count in 2015 is below 5,000, which suggests I was not doing much activity beyond basic movement. From 2016 onward, the average is about 5,000 steps per day.

In 2018 and 2019, days above 10,000 steps are clearly more frequent than in 2015 and 2016, which suggests that I am more willing to move than before.

I set a goal of walking at least 5,000 steps per day. The chart suggests that I have reached that goal.

Q3. In which season am I more willing to walk?


I usually walk a lot and cover long distances when I travel. The heat map above shows that I walked the most in May, July, and December, months that include long vacations when I usually travel with friends. That is why those months show greater distances.

Q4. How often do I go to the gym?


I used Nike Training Club (NTC) to track my exercise during workouts. The chart shows the calories burned each time I went to the gym. Most of the calories were burned last summer. I went to the gym very often in the months before summer 2018, and calories burned peaked in July 2018.

Q5. Does heart rate follow a similar trend to step count?


I want to see whether more activity goes with a higher heart rate. The chart shows that summer 2018 has more steps than the rest of the period, and the average heart rate in summer 2018 is also higher than at other times. This leads me to believe that more steps raise heart rate.

---
title: "Final Project ANLY 512 90"
author: "Feng Ye"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
    vertical_layout: fill
---

```{r setup, include=FALSE}
library(flexdashboard)
```
### Introduction and Data Collection
Data now runs through nearly every part of people's lives. Almost every smartphone user has apps that record personal data such as location and activity.

For this personal project, I study the Apple Health data saved on my iPhone. The iPhone's Health app records workouts, steps, and walking distance, and it can also synchronize data from other applications. In this project, an app called Sleeptracker provides my daily sleep hours and heart rate.

I will walk through collecting the data, cleaning it, and analyzing it.

***

Data Collection:

- Open the Health app and choose Export Health Data
- The data is exported as a zipped XML file
- Unzip the file with R

***
Seven variables in total:

- HeartRate
- StepCount
- DistanceWalkingRunning
- ActiveEnergyBurned
- FlightsClimbed
- NikeFuel
- HKCategoryTypeIdentifierSleepAnalysis

```{r}
library(flexdashboard)
library(XML)
library(tidyverse)
library(lubridate)
library(scales)
library(ggthemes)

# Unzip the Health export (this path is specific to the author's machine)
path <- '/Users/juliaye/Desktop'
zip_file <- file.path(path, 'export.zip')
unzip(zip_file, exdir = path)

# List the extracted files, then parse the main export
list.files(file.path(path, 'apple_health_export'))
xml <- xmlParse(file.path(path, 'apple_health_export', 'export.xml'))
summary(xml)

# One row per <Record> / <Workout> node, one column per XML attribute
df_record <- XML:::xmlAttrsToDataFrame(xml["//Record"])
df_workout <- XML:::xmlAttrsToDataFrame(xml["//Workout"])

# Clean the records: numeric values, local timestamps, and date parts
df <- df_record %>%
  mutate(device = gsub(".*(name:)|,.*", "", device),
         value = as.numeric(as.character(value)),
         endDate = ymd_hms(endDate, tz = "America/New_York"),
         date = date(endDate),
         year = year(endDate),
         month = month(endDate),
         day = day(endDate),
         yday = yday(endDate),
         wday = wday(endDate),
         hour = hour(endDate),
         minute = minute(endDate),
         type = str_remove(type, "HKQuantityTypeIdentifier")
         )

# Show the distinct record types found in the export
df %>% select(type) %>% distinct()
```
### Five Questions:
- Q1. What are the average heart rate and step count for each hour of the day?

- Q2. What is my average step count?

- Q3. In which season am I more willing to walk?

- Q4. How often do I go to the gym?

- Q5. Does heart rate follow a similar trend to step count?

### Q1. What are the average heart rate and step count for each hour of the day?
```{r}
df %>%
  filter(type %in% c('HeartRate', 'StepCount')) %>% 
  group_by(type, hour) %>% 
  summarise(value = mean(value)) %>% 
  ggplot(aes(x = hour, y = value, fill = value)) +
  geom_col() +
  scale_fill_continuous(low = 'white', high = "green") +
  scale_x_continuous(
    breaks = c(0, 6, 12, 18),
    labels = c("Midnight", "6 AM", "Noon", "6 PM")
  ) +
  labs(title = "Apple Health Data: Heart Rate vs Step Count",
       subtitle = "Hourly Data") +
  facet_wrap(~ type)
```

***
In the chart above, my average daily step count in 2015 is below 5,000, which suggests I was not doing much activity beyond basic movement. From 2016 onward, the average is about 5,000 steps per day.

In 2018 and 2019, days above 10,000 steps are clearly more frequent than in 2015 and 2016, which suggests that I am more willing to move than before.

I set a goal of walking at least 5,000 steps per day. The chart suggests that I have reached that goal.
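
One detail matters when reading an hourly step chart: averaging the raw `StepCount` records gives the mean size of a single record, not the mean number of steps taken in that hour. The following is a minimal sketch, using a small made-up data frame (`steps` and its values are hypothetical, not taken from my export), of summing steps per date and hour first and then averaging across days. It uses base R only so that it runs without extra packages.

```r
# Hypothetical toy records standing in for the StepCount rows of `df`
steps <- data.frame(
  date  = as.Date(c("2018-07-01", "2018-07-01", "2018-07-01",
                    "2018-07-02", "2018-07-02")),
  hour  = c(8, 8, 18, 8, 18),
  value = c(100, 200, 400, 500, 600)
)

# Step 1: total steps for each (date, hour) pair
hourly_totals <- aggregate(value ~ date + hour, data = steps, FUN = sum)

# Step 2: average those totals across days to get a typical hour
hourly_mean <- aggregate(value ~ hour, data = hourly_totals, FUN = mean)

# For comparison: the mean of raw records, which depends on how often
# the phone happens to write a record
record_mean <- aggregate(value ~ hour, data = steps, FUN = mean)

print(hourly_mean)
print(record_mean)
```

Hours with no records are simply absent from this summary rather than counted as zero, so quiet hours may look busier than they were.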


### Q2. What is my average step count?
```{r}
df %>%
  filter(type %in% c('StepCount')) %>% 
  # Total steps per calendar day
  group_by(type, date) %>% 
  summarise(value = sum(value)) %>% 
  ggplot(aes(x = date, y = value)) +
    geom_point(alpha = 0.5) +
    labs(title = "Daily Step Count Over Time")
```

***
In the chart above, my average daily step count in 2015 is below 5,000, which suggests I was not doing much activity beyond basic movement. From 2016 onward, the average is about 5,000 steps per day.

In 2018 and 2019, days above 10,000 steps are clearly more frequent than in 2015 and 2016, which suggests that I am more willing to move than before.

I set a goal of walking at least 5,000 steps per day. The chart suggests that I have reached that goal.
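
The yearly averages and the 5,000-step goal can also be checked numerically instead of being read off the scatter plot. This is a rough sketch on made-up daily totals (the `daily` data frame, its values, and the `goal` variable are hypothetical): it computes the mean daily step count per year and the share of days that meet the goal.

```r
# Hypothetical daily step totals; real values would come from summing
# StepCount records per date
daily <- data.frame(
  date  = as.Date(c("2015-03-01", "2015-03-02",
                    "2018-03-01", "2018-03-02", "2018-03-03")),
  steps = c(3000, 4000, 4000, 6000, 11000)
)
daily$year <- as.integer(format(daily$date, "%Y"))

goal <- 5000  # daily step goal stated in the text

# Mean daily steps per year
by_year <- aggregate(steps ~ year, data = daily, FUN = mean)

# Share of days in each year that reach the goal
share <- tapply(daily$steps >= goal, daily$year, mean)
by_year$share_goal_met <- as.numeric(share[as.character(by_year$year)])

print(by_year)
```

A yearly mean above the goal does not imply the goal was met every day, which is why the share of days is reported alongside it.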

### Q3. In which season am I more willing to walk?
```{r}
df %>%
  filter(type == 'DistanceWalkingRunning') %>% 
  # Total walking and running distance per calendar month
  group_by(year, month) %>% 
  summarise(distance = sum(value)) %>% 
  ggplot(aes(x = year, y = month, fill = distance)) + 
    geom_tile(col = 'grey40') + 
    scale_fill_continuous(labels = scales::comma, low = 'white', high = 'red') +
    theme(panel.grid.major = element_blank()) +
    scale_y_continuous(breaks = 1:12, labels = month.abb) +
    labs(title = "Monthly Walking and Running Distance Heatmap") +
    coord_equal()
```

***
I usually walk a lot and cover long distances when I travel. The heat map above shows that I walked the most in May, July, and December, months that include long vacations when I usually travel with friends. That is why those months show greater distances.
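
Since the question is about seasons while the heat map is by month, one way to answer it directly is to group months into seasons before summing. The sketch below assumes Northern Hemisphere meteorological seasons (December to February is winter, and so on). The helper `month_to_season` and the `monthly` values are hypothetical and only illustrate the grouping.

```r
# Map a month number (1-12) to a season name.
# Assumption: Northern Hemisphere meteorological seasons.
month_to_season <- function(month) {
  seasons <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
               "Summer", "Summer", "Fall", "Fall", "Fall", "Winter")
  seasons[month]
}

# Hypothetical monthly distance totals (units are arbitrary here)
monthly <- data.frame(
  month    = c(1, 5, 7, 10, 12),
  distance = c(20, 60, 80, 30, 70)
)
monthly$season <- month_to_season(monthly$month)

# Total distance per season
by_season <- aggregate(distance ~ season, data = monthly, FUN = sum)
print(by_season)
```

Under this grouping December counts toward winter, so a busy December and a busy July land in different seasons.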

### Q4. How often do I go to the gym?
```{r}
df %>%
  arrange(endDate) %>% 
  filter(type == 'ActiveEnergyBurned') %>% 
  # Had to reduce sourceName to these 2 sources to avoid double-counting
  # by other apps that use BodyMass and then store it back into Health
  filter(sourceName %in% c("Health", "Shortcuts")) %>% 
  
  ggplot(aes(x= date, y = value)) +
    geom_point(alpha = 0.3) +
    geom_smooth(span = 0.2, col = "grey30", se = FALSE) +
    labs(title = "Energy Burnt from NTC") +
    theme(axis.text.y = element_blank()) # hide the y-axis tick labels
```

***
I used Nike Training Club (NTC) to track my exercise during workouts. The chart shows the calories burned each time I went to the gym. 
Most of the calories were burned last summer. I went to the gym very often in the months before summer 2018, and calories burned peaked in July 2018.
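
"How often" can also be answered by counting sessions per month rather than plotting calories. The sketch below uses a made-up `workouts` data frame with one row per gym session (the dates are hypothetical). With the real export, the start dates would presumably come from the parsed `Workout` nodes, but that mapping is an assumption and is not shown here.

```r
# Hypothetical gym sessions, one row per workout
workouts <- data.frame(
  start = as.Date(c("2018-05-03", "2018-06-10", "2018-06-21",
                    "2018-07-02", "2018-07-15", "2018-07-28"))
)

# Label each session with its year and month, e.g. "2018-07"
workouts$month <- format(workouts$start, "%Y-%m")

# Count sessions per month
per_month <- as.data.frame(table(month = workouts$month),
                           responseName = "sessions")
print(per_month)
```

Months without any session do not appear in this table, so a gap in the output means no recorded visits.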

### Q5. Does heart rate follow a similar trend to step count?
```{r}
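df %>%
  filter(type %in% c('HeartRate', 'StepCount')) %>% 
  group_by(type, date) %>% 
  summarise(total = sum(value), average = mean(value)) %>% 
  # Steps add up over a day; heart rate is a rate, so use its daily mean
  mutate(value = if_else(type == 'HeartRate', average, total)) %>% 
  ggplot(aes(x = date, y = value)) +
    geom_smooth(span = 0.7, alpha = 0.2, col = "blue", se = FALSE) +
    geom_point(alpha = 0.5) +
    facet_wrap(~type, scales = "free") +
    labs(title = "Step Count vs Heart Rate")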
```

***
I want to see whether more activity goes with a higher heart rate. The chart shows that summer 2018 has more steps than the rest of the period, and the average heart rate in summer 2018 is also higher than at other times. This leads me to believe that more steps raise heart rate.
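
Comparing two smoothed curves by eye is suggestive, and a correlation on daily values would make the comparison more concrete. The sketch below runs on made-up numbers (the `daily` data frame is hypothetical, with one row per day holding total steps and mean heart rate) and uses base R's `cor` and `cor.test`.

```r
# Hypothetical daily summary: total steps and mean heart rate per day
daily <- data.frame(
  steps      = c(3000, 5000, 8000, 12000, 15000),
  heart_rate = c(68, 70, 74, 73, 80)
)

# Pearson correlation between daily steps and daily mean heart rate
r <- cor(daily$steps, daily$heart_rate)

# Significance test for the same correlation
result <- cor.test(daily$steps, daily$heart_rate)

print(r)
print(result$p.value)
```

A positive correlation would be consistent with the belief stated above, although it cannot show on its own that walking more is what raises heart rate.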