---
title: "Final project"
author: "Yiheng Hu"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    social: [ "menu" ]
    source: embed
    vertical_layout: fill
---

## Introduction
The Quantified Self (QS) is a movement motivated to leverage the synergy of wearables, analytics, and "Big Data". This movement exploits the ease and convenience of data acquisition through the internet of things (IoT) to feed the growing obsession of personal informatics and quotidian data. The website http://quantifiedself.com/ is a great place to start to understand more about the QS movement.

The value of the QS for our class is that its core mandate is to visualize and generate questions and insights about a topic that is of immense importance to most people - themselves. It also produces a wealth of data in a variety of forms. Therefore, designing this project around the QS movement makes perfect sense because it offers you the opportunity to be both the data and question provider, the data analyst, the vis designer, and the end user. This means you will be in the unique position of being capable of providing feedback and direction at all points along the data visualization/analysis life cycle.

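```{r load data, echo = FALSE}
# tidyverse attaches ggplot2, dplyr, tidyr, and readr, which are all used below.
library(tidyverse)
# Credit card activity exported from the bank as a CSV file.
data <- read_csv("C:/Users/yg/Desktop/HU Master/ANLY 512 - Data Visualization/Final project/Chase0645_Activity20171231_20190725_20190725.CSV")
# Rename the columns so later chunks can refer to them without spaces.
colnames(data) <- c('Transaction_Date','Post_Date','Description','Category','Type','Amount')
```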

## Project Objectives
Develop a visualization dashboard based on a series of data about your own life. The actual data used for this project can range from daily sleep regimes, TV shows watched, and types of food eaten, to spatial positioning information at 1-minute intervals, to blood pressure and nutrient intake. The amount of data you collect and harvest will differ based on your specified objectives.

Ultimately the project must meet certain key objectives:

- You must provide a written summary of your data collection, analysis, and visualization methods, including why you chose your methods and what tools you used.
- Your summary must outline 5 questions that can be evaluated using a data-driven approach. These questions should be more than just "How many miles did I run", although a couple of your questions could be stated that way.
- You must collect, manage, and store the data necessary for this visualization.
- You must design and create an appropriate set of visualizations (try not to use just one type of visualization) within a dashboard/storyboard that provides insight into your specified questions, with a minimum of 1 interactive graphical element.

## Project Deliverables
To complete this assignment you must deliver two items:

A 3-page write-up of the questions, data acquisition, storage, manipulation, and visualization methods, together with a written summary of each visualization. This summary should be part of your final dashboard, not a separate document.

The final visualization.


# Question 1: In which category did I spend the most money?

### **Summary**
The data set is the credit card charge history from my bank. To answer this question, I plotted the charge amounts for each category (a box plot in the code below) to find the category I spent the most money on. The chart shows that I spent the most in the Shopping category, followed by the Home category. To see where I like to spend money within Shopping, I used the `filter()` function to keep only the Shopping transactions and plotted the amounts by charge description in a bar chart. Within Shopping, I spent the most on Apple Store purchases and at Costco.





```{r Q1,echo = FALSE}
# Horizontal box plot of the transaction amounts within each category.
ggplot(data, aes(x = Category, y = Amount, fill = Category)) +
  geom_boxplot() +
  coord_flip() +
  theme(text = element_text(size = 10),
        axis.text.x = element_blank()) +
  labs(title = "Expense by Category", x = "Category",
       y = "Amount Spent $")

# Keep only the Shopping transactions; geom_col() stacks them per description.
shopping_category <- filter(data, Category == 'Shopping')
ggplot(shopping_category, aes(x = Description, y = Amount, fill = Description)) +
  geom_col() +
  theme(text = element_text(size = 10),
        axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = 'none') +
  labs(title = "Expense by Description", x = "Description",
       y = "Amount Spent $")
```
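Ranking categories by total spend can also be checked numerically rather than read off a chart. The following is a minimal sketch, not the code used for the dashboard: the `toy` data frame and its values are made up for illustration, and amounts are assumed to be positive dollars spent.

```r
library(dplyr)

# Hypothetical transactions (not taken from the real export).
toy <- data.frame(
  Category = c("Shopping", "Home", "Shopping", "Gas", "Home"),
  Amount = c(200, 150, 100, 40, 25)
)

# Sum the amounts within each category, then sort from largest to smallest.
category_totals <- toy %>%
  group_by(Category) %>%
  summarise(total = sum(Amount)) %>%
  arrange(desc(total))

print(category_totals)
```

The first row of `category_totals` is the category with the largest total, which is the quantity the question asks about.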

# Question 2: What is my favorite restaurant?

### **Summary**
The data set is again the credit card charge history from my bank. To answer this question, I filtered the data to the Food & Drink category and drew a bar chart of the number of charges per description, filling the bars by description. Chipotle and Raising Cane's have the most charges, which suggests that these are my two favorite restaurants.





```{r Q2, echo=FALSE}
# Keep only the Food & Drink transactions.
food_drink_category <- filter(data, Category == 'Food & Drink')
# geom_bar() counts rows, so each bar is the number of charges per description.
ggplot(food_drink_category, aes(x = Description, fill = Description)) +
  geom_bar() +
  theme(text = element_text(size = 10),
        axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = 'none') +
  labs(title = "Frequency in Food & Drink", x = "Description",
       y = "Frequency")
```
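With many restaurants on the x-axis, a short table of the most frequent descriptions can be easier to read than the chart. A minimal sketch, assuming made-up description strings and amounts (the real export may format merchant names differently):

```r
library(dplyr)

# Hypothetical Food & Drink charges (illustrative values only).
toy <- data.frame(
  Description = c("CHIPOTLE", "RAISING CANES", "CHIPOTLE",
                  "STARBUCKS", "CHIPOTLE", "RAISING CANES"),
  Amount = c(9, 11, 10, 5, 9, 12)
)

# Count charges per description, most frequent first, and keep the top two.
top_restaurants <- toy %>%
  count(Description, sort = TRUE, name = "visits") %>%
  head(2)

print(top_restaurants)
```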


# Question 3: What is the monthly trend in food & drink expenses?

### **Summary**
To find the monthly trend, I first filtered the data to the Food & Drink category and then derived a year/month label from each transaction date so that the daily transactions can be grouped by month. I used a bar chart because it is easy to read. The chart shows that my spending on food and drink has increased over time.





```{r Q3, echo = FALSE}
# Split the month/day/year transaction date into parts (keeping the original
# column) and build a year/month label for grouping.
food_drink_category_bymonth <- food_drink_category %>%
  separate(col = Transaction_Date, into = c('month','date','year'), sep = '/', remove = FALSE) %>%
  mutate(yearmonth = paste(year, month, sep = '/'))
# geom_col() stacks the individual transactions, so each bar is a monthly total.
ggplot(food_drink_category_bymonth, aes(x = yearmonth, y = Amount)) +
  geom_col() +
  labs(title = "Expense Trend in Food & Drink", x = "Month & Year",
       y = "$ Amount Spent")
```
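The monthly grouping can also be done on real dates, which gives labels that sort chronologically even when months are not zero-padded. This is a sketch under the assumption that dates are stored as `MM/DD/YYYY` text; the rows in `toy` are invented for illustration.

```r
library(dplyr)

# Hypothetical daily transactions (illustrative values only).
toy <- data.frame(
  Transaction_Date = c("01/05/2018", "01/20/2018", "02/03/2018",
                       "02/14/2018", "03/01/2018"),
  Amount = c(10, 15, 20, 12, 40)
)

# Parse the text dates, derive a "YYYY-MM" label, and total the amounts per month.
monthly <- toy %>%
  mutate(year_month = format(as.Date(Transaction_Date, format = "%m/%d/%Y"), "%Y-%m")) %>%
  group_by(year_month) %>%
  summarise(total = sum(Amount))

print(monthly)
```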

# Question 4: How does my spending on gas vary by month?

### **Summary**
I filtered the data to the Gas category and grouped the daily transactions by month. I then used a pie chart of the dollar amount spent on gas in each month to look for a trend. Overall, there is no obvious trend. I spent the most in June 2019 because I took a road trip that month.







```{r Q4, echo=FALSE}
# Keep only the Gas transactions and build a year/month label from the date.
gas_category <- filter(data, Category == 'Gas')
gas_category_bymonth <- gas_category %>%
  separate(col = Transaction_Date, into = c('month','date','year'), sep = '/', remove = FALSE) %>%
  mutate(yearmonth = paste(year, month, sep = '/'))

# A single stacked bar (x = '') wrapped with coord_polar() gives a pie chart
# with one slice per month.
ggplot(gas_category_bymonth, aes(x = '', y = Amount, fill = yearmonth)) +
  geom_col() +
  coord_polar('y', start = 0) +
  labs(title = "Expense Trend in Gas", x = NULL,
       y = "$ Amount Spent", fill = "Month & Year")
```
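Slice sizes in a pie chart are hard to compare by eye, so it can help to compute each month's share of the total directly. A minimal sketch with invented monthly totals (the `gas_by_month` values are hypothetical):

```r
library(dplyr)

# Hypothetical monthly gas totals in dollars (illustrative values only).
gas_by_month <- data.frame(
  year_month = c("2019-04", "2019-05", "2019-06"),
  total = c(30, 50, 120)
)

# Each month's share of the overall gas spend, as a percentage.
gas_share <- gas_by_month %>%
  mutate(share_pct = round(100 * total / sum(total), 1))

print(gas_share)
```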

# Question 5: Do I dine out more than I eat at home?

### **Summary**
To answer this question, I filtered the data to the Groceries and Food & Drink categories. I then used a bar chart to compare the number of transactions and a pie chart to compare the dollar amount spent in each category. The results show that I dine out more often than I eat at home, and that I spend more money at restaurants than at grocery stores.






```{r Q5, echo=FALSE}
# Keep only the two categories being compared.
groceries_food_drink_category <- filter(data, Category %in% c('Groceries','Food & Drink'))

# Individual transactions over time, colored by category.
ggplot(groceries_food_drink_category, aes(x = Transaction_Date, y = Amount, color = Category)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Number of transactions in each category.
ggplot(groceries_food_drink_category, aes(x = Category, fill = Category)) +
  geom_bar() +
  labs(title = "Frequency Comparison", x = "Category",
       y = "Frequency")

# Pie chart of the total dollar amount in each category.
ggplot(groceries_food_drink_category, aes(x = '', y = Amount, group = Category, fill = Category)) +
  geom_col() +
  coord_polar('y', start = 0) +
  theme(axis.title.x = element_blank(), axis.title.y = element_blank())
```
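The two comparisons in this question (how often and how much) can be summarized in one small table. A minimal sketch, assuming invented transactions in `toy`; the column names `transactions`, `total_spent`, and `avg_per_transaction` are chosen here for illustration.

```r
library(dplyr)

# Hypothetical transactions in the two categories (illustrative values only).
toy <- data.frame(
  Category = c("Food & Drink", "Groceries", "Food & Drink",
               "Food & Drink", "Groceries", "Food & Drink"),
  Amount = c(10, 30, 12, 8, 15, 20)
)

# Per category: number of transactions, total spent, and average per transaction.
comparison <- toy %>%
  group_by(Category) %>%
  summarise(
    transactions = n(),
    total_spent = sum(Amount),
    avg_per_transaction = mean(Amount)
  )

print(comparison)
```

A higher `transactions` count answers the frequency part of the question, while `total_spent` answers the dollar part.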