Observations: 459
Variables: 21
$ Date <date> 2017-01-08, 2017-01-09, 2017-01-10, …
$ `Calories (kcal)` <dbl> 426.0002, 1994.0402, 1916.8030, 1867.…
$ `Distance (m)` <dbl> NA, 2789.18842, 1146.94802, 794.62506…
$ `Average heart rate (bpm)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Max heart rate (bpm)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Min heart rate (bpm)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Low latitude (deg)` <dbl> 41.14666, 41.14651, 41.14354, 41.1413…
$ `Low longitude (deg)` <dbl> -81.33350, -81.34815, -81.50716, -81.…
$ `High latitude (deg)` <dbl> 41.14666, 41.14920, 41.16173, 41.1663…
$ `High longitude (deg)` <dbl> -81.33350, -81.33345, -81.33323, -81.…
$ `Average speed (m/s)` <dbl> NA, 1.5028356, 4.7231350, 1.1079743, …
$ `Max speed (m/s)` <dbl> NA, 2.65, 6.59, 2.90, 1.48, 2.28, NA,…
$ `Min speed (m/s)` <dbl> NA, 0.50, 0.50, 0.59, 1.48, 0.50, NA,…
$ `Step count` <dbl> 392, 7732, 5187, 4473, 4980, 2759, 48…
$ `Average weight (kg)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Max weight (kg)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Min weight (kg)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Biking duration (ms)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ `Inactive duration (ms)` <dbl> 20812461, 79883688, 78438091, 8193932…
$ `Walking duration (ms)` <dbl> 51518, 4645626, 3309991, 2297268, 214…
$ `Running duration (ms)` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
---
title: "The Quantified Self!"
subtitle: "ANLY 512 - Data Visualization - Final Project"
author: "Laxman Panthi"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
    vertical_layout: fill
---
```{r global, include=FALSE, echo=FALSE}
library(flexdashboard)
library(geojsonio)
library(sp)
library(tidyverse)
library(RColorBrewer)
library(lubridate)
library(naniar)
library(ggmap)
raw <- read_csv("DailySummary_Google.csv")
```
### Overview
```{r}
glimpse(raw)
```
***
ANLY 512 - What is my fit?

The Quantified Self movement grew out of the rise of the Internet of Things, the mass collection of personal information, and mobile technologies (primarily wearable computing).

The goal of this project is to collect, analyze, and visualize personal data using the tools and methods covered in class. Using a data-driven approach, I summarize the data by working through the following topics:

- Data preprocessing and analysis
- Where have I travelled in the world?
- Where have I travelled in Ohio?
- What is the trend in my step counts?
- How have I been burning calories?
- What is my walking pattern?

For this final visualization project, I used my Google Fit data. The Google Fit app on my phone records the data and synchronizes it to my Google Fit account. Data from Google services can be exported from one central location: https://takeout.google.com/settings/takeout
### Data Pre-Processing
```{r}
vis_miss(raw)
```
***
I downloaded the data as a CSV file from Google Takeout. The data was not very clean, so I used this visualization to find the missing values and see which variables actually contain data. I then performed several manipulations to prepare the data for visualization.
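
As a numeric companion to the missingness plot, the share of missing values per column can also be computed directly. This is a minimal sketch on a made-up data frame (`toy` and its values are hypothetical, not the real export), and dropping fully empty columns is shown only as one possible clean-up step.

```r
# Toy data frame standing in for the Google Fit export (values are made up).
toy <- data.frame(
  step_count = c(392, 7732, NA, 4473),
  distance   = c(NA, 2789.2, 1146.9, NA),
  avg_weight = c(NA, NA, NA, NA)
)

# Percentage of missing values in each column.
pct_missing <- colMeans(is.na(toy)) * 100
print(pct_missing)

# Columns that are entirely missing carry no information and can be dropped.
empty_cols <- names(pct_missing)[pct_missing == 100]
toy_clean <- toy[, setdiff(names(toy), empty_cols), drop = FALSE]
print(names(toy_clean))
```

The same `colMeans(is.na(...))` call should work on the real data frame once it is loaded.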
```{r, include=FALSE, echo=FALSE}
# Replace the verbose export headers with short snake_case names (same column order).
raw_cols <- c(
  "date", "calories", "distance",
  "avg_heart_rate", "max_heart_rate", "min_heart_rate",
  "low_lat", "low_lon", "high_lat", "high_lon",
  "avg_speed", "max_speed", "min_speed", "step_count",
  "avg_weight", "max_weight", "min_weight",
  "biking_duration", "inactive_duration", "walking_duration", "running_duration"
)
colnames(raw) <- raw_cols

# Drop the columns that are not used in the dashboard.
raw <- raw %>%
  select(
    -avg_heart_rate, -max_heart_rate, -min_heart_rate,
    -avg_weight, -max_weight, -min_weight,
    -inactive_duration, -running_duration, -biking_duration
  )

# US state boundaries, used to tag each day with the state it was recorded in.
raw$state <- NA
usa <- geojson_read(
  "http://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_500k.json",
  what = "sp"
)

# Point-in-polygon lookup: find the state containing each day's low lat/lon corner.
for (i in seq_len(nrow(raw))) {
  coords <- c(raw$low_lon[i], raw$low_lat[i])
  if (any(is.na(coords))) next
  point <- sp::SpatialPoints(matrix(coords, nrow = 1))
  sp::proj4string(point) <- sp::proj4string(usa)
  polygon_check <- sp::over(point, usa)
  raw$state[i] <- as.character(polygon_check$NAME)  # NA when outside every state polygon
}

# Add a row for every calendar day in the range, then derive date parts.
raw <- raw %>% complete(date = seq.Date(min(raw$date), max(raw$date), by = "day"))
raw$year <- as.factor(year(raw$date))
raw$month <- month(raw$date)
raw$dayofweek <- wday(raw$date, label = TRUE, abbr = FALSE)
```
### The World Traveller
```{r}
mapDF <- raw %>%
  select(low_lat, low_lon, state) %>%
  filter(!is.na(low_lat), !is.na(low_lon))
qmplot(low_lon, low_lat, data = mapDF)
```
***
Over the past three years, I have travelled to almost five countries, but I have mostly been in the United States, concentrated in various parts of Ohio. We will look at Ohio next.
### The Buckeye State
```{r}
qmplot(low_lon, low_lat, data = mapDF%>%filter(state=="Ohio"))
```
***
Since Cleveland, Ohio has been my hometown for the last four years, most of the points on the map cluster around Cleveland. However, I have visited almost every part of Ohio.
### Counting steps
```{r}
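# Impute missing daily step counts with the overall mean.
mean_steps <- mean(raw$step_count, na.rm = TRUE)
steps <- raw %>%
  select(date, step_count, year) %>%
  mutate(step_count = if_else(is.na(step_count), mean_steps, step_count))

# Make sure the date column is a Date before it is handed to ggplot.
steps$date <- as.Date(steps$date, "%Y-%m-%d")

# One coloured line per year, with month/year labels on the x-axis.
ggplot(steps, aes(x = date, y = step_count, group = year)) +
  geom_line(aes(colour = year)) +
  ggtitle("Daily Step Counts over the Months") +
  scale_x_date(date_labels = "%b/%Y")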
```
***
I imputed the missing step counts with the overall mean, so days without data appear at the average level. The chart gives a reasonable picture of how my step counts trend over time.
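
To make the imputation step concrete, here is a minimal sketch on a made-up vector (the values are hypothetical). It also illustrates a known side effect of mean imputation: the mean is preserved, but the spread of the data shrinks.

```r
# Made-up daily step counts with two missing days.
steps <- c(392, NA, 5187, 4473, NA)

# Mean of the observed values only.
mean_steps <- mean(steps, na.rm = TRUE)

# Replace each missing value with that mean.
imputed <- ifelse(is.na(steps), mean_steps, steps)

# The mean is unchanged, but the standard deviation is smaller after imputation.
print(c(mean_before = mean_steps, mean_after = mean(imputed)))
print(c(sd_before = sd(steps, na.rm = TRUE), sd_after = sd(imputed)))
```

Because the spread shrinks, long runs of imputed days will look flatter in the chart than the real activity probably was.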
### Burning calories
```{r}
# Total calories per day of 2018, indexed by week of the month and weekday.
f <- raw %>%
  filter(year == 2018) %>%
  mutate(week_date = ceiling(day(date) / 7)) %>%
  group_by(week_date, month, dayofweek) %>%
  summarise(total_cal = sum(calories))

# Calendar heatmap: one facet per month, one tile per day.
ggplot(f, aes(dayofweek, week_date, fill = total_cal)) +
  geom_tile(colour = "white") +
  facet_wrap(~month) +
  theme_bw() +
  scale_fill_gradient(name = "Total \nCalories",
                      low = "#56B1F7", high = "#132B43") +
  labs(x = "Day of the Week",
       y = "Week of the Month") +
  scale_y_continuous(trans = "reverse")
```
***
I restricted this analysis to 2018 because the data for the other years were sparse.
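
One way to check how sparse each year is would be to compute the share of days with a recorded value per year. The sketch below uses made-up dates and calorie values (all hypothetical) and is only meant to show the idea.

```r
# Made-up daily records spanning two years; NA marks a day without data.
dates <- as.Date(c("2017-01-08", "2017-01-09",
                   "2018-03-01", "2018-03-02", "2018-03-03"))
calories <- c(426, NA, 1994, 1916, 1867)

# Extract the year of each record as a grouping key.
yr <- format(dates, "%Y")

# Share of days with a recorded calorie value, per year.
coverage <- tapply(!is.na(calories), yr, mean)
print(coverage)
```

A year with low coverage would be a candidate for exclusion, as was done here for the years other than 2018.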
### Walking around
```{r}
# Impute missing walking durations with the overall mean.
mean_walking <- mean(raw$walking_duration, na.rm = TRUE)
walkDF <- raw %>%
  select(date, walking_duration, year) %>%
  mutate(walking_duration = if_else(is.na(walking_duration), mean_walking, walking_duration))

# Daily points with a smoothed trend line.
ggplot(walkDF, aes(date, walking_duration)) +
  geom_point(color = "blue") +
  geom_smooth(color = "red") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Daily Walking Duration over Time", x = "Date", y = "Duration (ms)")
```
***
This time-series chart again shows my walking trends over the last three years.
### Conclusion

***
This project was a great way to visualize my own personal data and see it in charts. The visualizations helped me gain some insight into my activities. Overall, it was a good experience.