---
title: "Final Project - The Quantified Self"
author: "Louis D'Ambrosio"
date: "June 8, 2018"
output:
    flexdashboard::flex_dashboard:
      vertical_layout: fill
      source_code: embed
---

```{r setup, include=FALSE}
library(flexdashboard)
library(ggplot2)
library(plotly)
library(dplyr)
library(lubridate)
library(reshape2)
```

Overview {data-orientation=rows}
====================================================
### Data preparation



The Quantified Self, also known as lifelogging, is a movement started in 2007 by Gary Wolf and Kevin Kelly of Wired magazine that uses technology to acquire data on aspects of a person's daily life. People track food consumed, the quality of the surrounding air, mood, skin conductance as a proxy for arousal, pulse oximetry for blood oxygen level, and mental or physical performance. Wolf has described the Quantified Self as "self-knowledge through self-tracking with technology".[1]



In this project I apply the Quantified Self approach to my personal spending. I used R for data processing, together with packages including ggplot2 and plotly, to produce the visualizations of my spending data.



The data comes from my spending history from May 2017 to May 2018.
The expenses were categorized in order to make my spending patterns visible.



The project uses data visualization to try to answer the following five questions:  

Q1: What is the percentage of each expense?  

Q2: How does my spending vary from month to month?  

Q3: In which category do I spend the most?  

Q4: How much did I spend per month for each category?  

Q5: What are the expenses that I can reduce or stop?  


Expenses % {data-orientation=rows}
====================================================
### Q1: What is the percentage of each expense?

```{r message=FALSE, warning=FALSE, echo=FALSE}

# Total per expense category (USD) over the whole period, entered inline
SpendingsTotal <- read.table(header = TRUE, text =
"Expenses      TotalExp
'Housing'         7800
'Gym'              390
'Food'            1357
'Utilities'        737
'Internet'         234
'Metro'           1573
'Phone'            650
'School'         15100
'Extra'           1782")

# Pie chart showing each category's share of total spending
plot_ly(SpendingsTotal, labels = ~Expenses, values = ~TotalExp, type = "pie",
        textposition = "inside", textinfo = "label+percent") %>%
  layout(title = "What is the percentage of each expense in a year?",
         showlegend = TRUE)
```
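
The percentages shown in the pie chart can also be computed directly as a cross-check. The following is a minimal sketch rather than part of the original analysis: `add_share` is a hypothetical helper name, and it assumes a table with the `Expenses` and `TotalExp` columns used above. With the totals above, which sum to USD 29,623, school comes to about 51% and housing to about 26.3%.

```{r}
library(dplyr)

# Hypothetical helper: add each category's share of the total, in percent,
# and list the largest categories first.
add_share <- function(totals) {
  totals %>%
    mutate(Share = round(100 * TotalExp / sum(TotalExp), 1)) %>%
    arrange(desc(Share))
}

# `SpendingsTotal` is the table defined in the chunk above
if (exists("SpendingsTotal")) add_share(SpendingsTotal)
```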

Spending Variations {data-orientation=rows}
====================================================
### Q2: How does my spending vary from month to month?

```{r message=FALSE, warning=FALSE, echo=FALSE}
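# Total spending per month (USD), entered inline
SpendingsMonth <- read.table(header = TRUE, text =
"Month              TotalMonth
'May2017'                1183
'June2017'               1174
'July2017'               1035
'August2017'             1119
'September2017'          1955
'October2017'            5265
'November2017'           2568
'December2017'           2612
'January2018'            2486
'February2018'           2457
'March2018'              2478
'April2018'              2630
'May2018'                2661")

# Keep the months in chronological order on the x axis (not alphabetical)
SpendingsMonth$Month <- factor(SpendingsMonth$Month,
                               levels = as.character(SpendingsMonth$Month))

# Line chart with markers; an explicit mode avoids plotly's "no scatter mode" message
plot_ly(SpendingsMonth, x = ~Month, y = ~TotalMonth,
        type = "scatter", mode = "lines+markers") %>%
  layout(title = "How does my spending vary from month to month?")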
```
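
To put a number on the variation, the change from one month to the next can be computed from the same table. This is a minimal sketch, not part of the original analysis: `monthly_change` is a hypothetical helper name, and it assumes the rows are already in chronological order, as they are above. On the table above, the largest jump is USD 3,310 between September and October 2017 (5,265 minus 1,955).

```{r}
# Hypothetical helper: difference in total spending versus the previous month.
# The first month has no predecessor, so its change is NA.
monthly_change <- function(monthly) {
  monthly$Change <- c(NA, diff(monthly$TotalMonth))
  monthly
}

# `SpendingsMonth` is the table defined in the chunk above
if (exists("SpendingsMonth")) monthly_change(SpendingsMonth)
```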

Most Spendings Category {data-orientation=rows}
====================================================
### Q3: In which category do I spend the most?

```{r}
# Total per expense category (USD) over the whole period, entered inline
SpendingsTotal <- read.table(header = TRUE, text =
"Expenses      TotalExp
'Housing'         7800
'Gym'              390
'Food'            1357
'Utilities'        737
'Internet'         234
'Metro'           1573
'Phone'            650
'School'         15100
'Extra'           1782")

# Bar chart comparing the total spent in each category
plot_ly(SpendingsTotal, x = ~Expenses, y = ~TotalExp, type = "bar") %>%
  layout(title = "In which category do I spend the most?")
```

Total spendings {data-orientation=rows}
====================================================
### Q4: How much did I spend per month for each category?

```{r}
# Monthly spending per category (USD), entered inline.
# TotalMonth is the sum of the nine category columns.
Spendings <- read.table(header = TRUE, text =
"Month            Housing  Gym  Food  Utilities  Internet  Metro  Phone  School  Extra  TotalMonth
'May2017'            600   30   158         56        18    121     50       0    150        1183
'June2017'           600   30   206         47        18    121     50       0    102        1174
'July2017'           600   30    75         45        18    121     50       0     96        1035
'August2017'         600   30    87         49        18    121     50       0    164        1119
'September2017'      600   30    92         44        18    121     50    1000      0        1955
'October2017'        600   30    96         50        18    121     50    4300      0        5265
'November2017'       600   30   122         52        18    121     50    1400    175        2568
'December2017'       600   30   103         65        18    121     50    1400    225        2612
'January2018'        600   30   100         73        18    121     50    1400     94        2486
'February2018'       600   30    79         72        18    121     50    1400     87        2457
'March2018'          600   30    74         75        18    121     50    1400    110        2478
'April2018'          600   30    80         62        18    121     50    1400    269        2630
'May2018'            600   30    85         47        18    121     50    1400    310        2661")
```
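
The chunk above only defines the table. One way to chart it, sketched here as an assumption about the intended visualization rather than the original design, is a stacked bar chart with one bar per month. `to_long` and `plot_monthly_by_category` are hypothetical helper names; `melt()` comes from reshape2, which the setup chunk already loads.

```{r}
library(reshape2)
library(plotly)

# Hypothetical helper: reshape the wide monthly table into long format,
# one row per (Month, Category) pair. TotalMonth is left out because it
# is the sum of the other columns.
to_long <- function(spendings) {
  categories <- setdiff(names(spendings), c("Month", "TotalMonth"))
  long <- melt(spendings, id.vars = "Month", measure.vars = categories,
               variable.name = "Category", value.name = "Amount")
  # Keep months in the order they appear in the table, not alphabetical order
  long$Month <- factor(long$Month,
                       levels = unique(as.character(spendings$Month)))
  long
}

# Hypothetical helper: stacked bar chart of monthly spending by category
plot_monthly_by_category <- function(spendings) {
  plot_ly(to_long(spendings), x = ~Month, y = ~Amount, color = ~Category,
          type = "bar") %>%
    layout(barmode = "stack",
           title = "How much did I spend per month for each category?")
}

# `Spendings` is the table defined in the chunk above
if (exists("Spendings")) plot_monthly_by_category(Spendings)
```

Stacking keeps the monthly total readable as the bar height while still showing how each category contributes to it.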

Which Expense To Reduce {data-orientation=rows}
====================================================
### Q5: What are the expenses that I can reduce or stop?

```{r}
# Monthly spending per category (USD), entered inline.
# TotalMonth is the sum of the nine category columns.
Spendings <- read.table(header = TRUE, text =
"Month            Housing  Gym  Food  Utilities  Internet  Metro  Phone  School  Extra  TotalMonth
'May2017'            600   30   158         56        18    121     50       0    150        1183
'June2017'           600   30   206         47        18    121     50       0    102        1174
'July2017'           600   30    75         45        18    121     50       0     96        1035
'August2017'         600   30    87         49        18    121     50       0    164        1119
'September2017'      600   30    92         44        18    121     50    1000      0        1955
'October2017'        600   30    96         50        18    121     50    4300      0        5265
'November2017'       600   30   122         52        18    121     50    1400    175        2568
'December2017'       600   30   103         65        18    121     50    1400    225        2612
'January2018'        600   30   100         73        18    121     50    1400     94        2486
'February2018'       600   30    79         72        18    121     50    1400     87        2457
'March2018'          600   30    74         75        18    121     50    1400    110        2478
'April2018'          600   30    80         62        18    121     50    1400    269        2630
'May2018'            600   30    85         47        18    121     50    1400    310        2661")
```
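
A possible starting point for this question, offered as a sketch and not as the original analysis, is to separate categories whose amount never changes from those that vary month to month, on the assumption that variable categories are the easier ones to act on. `summarise_categories` is a hypothetical helper name. On the table above it flags Housing, Gym, Internet, Metro and Phone as fixed, while Food, Utilities, School and Extra vary.

```{r}
# Hypothetical helper: summarise each category across all months and flag
# the categories whose monthly amount is constant.
summarise_categories <- function(spendings) {
  categories <- setdiff(names(spendings), c("Month", "TotalMonth"))
  amounts <- spendings[categories]
  data.frame(
    Category = categories,
    Total    = unname(colSums(amounts)),
    Min      = unname(sapply(amounts, min)),
    Max      = unname(sapply(amounts, max)),
    Fixed    = unname(sapply(amounts, function(x) min(x) == max(x))),
    stringsAsFactors = FALSE
  )
}

# `Spendings` is the table defined in the chunk above
if (exists("Spendings")) summarise_categories(Spendings)
```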

Summary {data-orientation=rows}
====================================================
### Summary  




Q1. School is the biggest expense: spending on school is greater than all the other categories combined.  

Q2. The amount spent varies significantly because of tuition payments: monthly totals stay under USD 1,200 from May to August 2017, peak at USD 5,265 in October 2017, and then settle between about USD 2,450 and USD 2,700.

Q3. School expenses total USD 15,100 over the 13 months, while housing expenses total USD 7,800, even though I live in New York.

Q4.  

Q5.