Project Overview


The Quantified Self (QS) is a movement that aims to leverage the synergy of wearables, analytics, and “Big Data”. It exploits the ease and convenience of data acquisition through the Internet of Things (IoT) to feed a growing obsession with personal informatics and quotidian data.

The goal of this project is to collect our transaction data, create visualizations on this dashboard, and analyze the graphs.

This project uses a data-driven approach. The dataset inspired the following questions.

  1. In 2019, what are our top-spending categories?

  2. What is the allocation of expenses by month?

  3. What is the allocation of expenses by person?

  4. How often do we go to restaurants?

  5. How many transactions do we make each month?

  6. How much do we spend monthly?

Image source: https://cdn.pixabay.com/photo/2017/09/25/17/34/chart-2785962_960_720.jpg

Dataset Overview

Sample Dataset

| Trans_Date | Post_Date | Amount | Category | Person |
|:-----------|:----------|-------:|:---------|:-------|
| 7/27/19 | 7/27/19 | 1826.01 | Travel/ Entertainment | Wenxue |
| 7/27/19 | 7/27/19 | 439.23 | Services | Wenxue |
| 7/27/19 | 7/27/19 | 404.24 | Travel/ Entertainment | Wenxue |
| 7/30/19 | 7/25/19 | 583.54 | Payments and Credits | Wenxue |
| 12/20/19 | 12/20/19 | 375.99 | Travel/ Entertainment | Wenxue |
| 11/11/19 | 11/11/19 | 345.00 | Merchandise | Wenxue |
| 4/25/19 | 4/25/19 | 340.39 | Merchandise | Wenxue |
| 7/16/19 | 7/16/19 | 310.31 | Department Stores | Wenxue |
| 3/21/19 | 3/21/19 | 301.00 | Services | Wenxue |
| 5/19/19 | 5/19/19 | 249.10 | Merchandise | Wenxue |

Dataset Overview

This dataset includes our credit card transactions in 2019.

We obtained the raw data from our banks. The dataset covers all spending for three people from 01/2019 to 12/2019, broken down by spending category. We want to learn more about monthly spending across all our accounts in 2019.

Q1: In 2019, what are our top-spending categories?


Summary:

In 2019, the top-spending categories are Travel/Entertainment and Merchandise.

The top-spending categories are Travel/Entertainment and Payments for Wenxue, Merchandise and Travel/Entertainment for Arju, and Education and Medical for Koryn.

None of us spent much on Home Improvement, Government Services and Fees.

Q2: What is the allocation of expenses by month?


Summary:

We used plotly and ggplot2 to create an interactive bar plot grouped by month. From this plot, we wanted to see how total expenses changed over the 12 months of 2019. The graph shows clearly that we spent a lot of money in July, especially on Travel/Entertainment. We all spent the least in August.

Q3: What is the allocation of expenses by person?


Summary:

We used a pie chart to show each person's share of total spending. Wenxue spent the most of the three people in 2019, while Koryn accounted for only 18% of the total amount.

Q4: How often do we go to restaurants?


Summary:

We used bar charts to show the frequency of restaurant visits for each person from January to December. According to the graphs, Wenxue goes to restaurants far more often than Koryn. The graphs also show that July is the month in which we go to restaurants the most.

Q5: How many transactions do we make each month?


Summary:

These graphs show the number of transactions we made each month. Wenxue and Arju made the most transactions in June; Koryn made the most transactions in January. We can also see from the graphs that Wenxue made the most transactions in the group.

Q6: How much do we spend monthly?


Summary:

The graphs here show the monthly spending trend over the trailing twelve months for all team members. As the graphs show, monthly expenses were steady except in two months, June and July. This is due to an increase in entertainment expenses, especially in the summer. Both Wenxue and Arju had a large spike in expenses in those two months.

Conclusion


Conclusion:

To sum up, our top spending categories are Travel/Entertainment and Merchandise, and our lowest is Home Improvement. Our hypothesis was that a higher number of restaurant visits would lead to higher monthly spending. Our findings show a positive correlation between the number of restaurant visits and total spending by month, which supports our hypothesis.

Looking at our individual data, Wenxue’s top spending category is Travel/Entertainment, Arju’s is Merchandise, and Koryn’s is Education. We also spent the most in July, which coincides with the period when most people take their vacations.

---
title: "ANLY 512 Final Project"
author: "Wenxue Sun, Arju Begum, Yeuk Sze Koryn Chiu"
date: "`r Sys.Date()`"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
---

```{r setup, include=FALSE}
library(flexdashboard)
library(knitr)
library(ggplot2)
library(dplyr)
library(plotly)

```

### Project Overview

```{r}
imagePath <- "https://cdn.pixabay.com/photo/2017/09/25/17/34/chart-2785962_960_720.jpg"
include_graphics(imagePath)
```

***

The Quantified Self (QS) is a movement that aims to leverage the synergy of wearables, analytics, and “Big Data”. It exploits the ease and convenience of data acquisition through the Internet of Things (IoT) to feed a growing obsession with personal informatics and quotidian data.

The goal of this project is to collect our transaction data, create visualizations on this dashboard, and analyze the graphs.

This project uses a data-driven approach. The dataset inspired the following questions.

1. In 2019, what are our top-spending categories?

2. What is the allocation of expenses by month?

3. What is the allocation of expenses by person?

4. How often do we go to restaurants?

5. How many transactions do we make each month?

6. How much do we spend monthly?



Image source: https://cdn.pixabay.com/photo/2017/09/25/17/34/chart-2785962_960_720.jpg


### Dataset Overview

```{r}

data1 <- read.csv("data1.csv")
data2 <- read.csv("data2.csv")
data3 <- read.csv("data3.csv")

data <- do.call("rbind", list(data1, data2, data3))

colnames(data)[1] <- "Trans_Date"
colnames(data)[2] <- "Post_Date"

kable(data[1:10,], caption="Sample Dataset")

```

***

Dataset Overview

This dataset includes our credit card transactions in 2019.

We obtained the raw data from our banks. The dataset covers all spending for three people from 01/2019 to 12/2019, broken down by spending category. We want to learn more about monthly spending across all our accounts in 2019.


### Q1: In 2019, what are our top-spending categories?

```{r}
p<-ggplot(data=data, aes(x=Category, y=Amount, fill=Person)) + 
            geom_bar(stat="identity", position=position_dodge()) + coord_flip()
ggplotly(p, tooltip = c("Person"))
```

***

Summary:

In 2019, the top-spending categories are Travel/Entertainment and Merchandise.

The top-spending categories are Travel/Entertainment and Payments for Wenxue, Merchandise and Travel/Entertainment for Arju, and Education and Medical for Koryn.

None of us spent much on Home Improvement, Government Services and Fees.
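
A minimal sketch of how the category ranking could be checked numerically rather than read off the chart. It uses base R only. The `toy` data frame and its values are made up for illustration and are not our real transactions; only the column names follow the dataset.

```r
# Toy stand-in for the combined transaction data (values are invented).
toy <- data.frame(
  Category = c("Travel/ Entertainment", "Merchandise", "Services",
               "Travel/ Entertainment", "Merchandise", "Education"),
  Person   = c("Wenxue", "Arju", "Wenxue", "Arju", "Wenxue", "Koryn"),
  Amount   = c(400, 250, 100, 150, 50, 200)
)

# Total spending per category, sorted from largest to smallest.
by_category <- aggregate(Amount ~ Category, data = toy, FUN = sum)
by_category <- by_category[order(-by_category$Amount), ]

# Total per person and category, then keep each person's largest category.
by_person <- aggregate(Amount ~ Person + Category, data = toy, FUN = sum)
top_per_person <- do.call(
  rbind,
  lapply(split(by_person, by_person$Person),
         function(d) d[which.max(d$Amount), ])
)

print(by_category)
print(top_per_person)
```

Summing explicitly before ranking makes the totals unambiguous, since each row of the raw data is a single transaction.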


### Q2: What is the allocation of expenses by month?

```{r}
# Parse transaction dates, then truncate each one to the first day of its month
data$Trans_Date <- as.Date(data$Trans_Date, format = "%m/%d/%y")
data$Trans_Date <- as.Date(cut(data$Trans_Date, breaks = "month"))

# One panel per month, with bars per person colored by category
p2 <- ggplot(data, aes(x = Person, y = Amount)) +
  geom_bar(aes(fill = Category), stat = "identity", color = "white",
           position = position_dodge(0.9)) +
  facet_wrap(~Trans_Date) +
  labs(title = "Allocation of Expenses", x = "Person", y = "Amount") +
  theme(legend.position = "top", axis.text = element_text(size = 6))

ggplotly(p2, tooltip = c("Category"))

```

***

Summary:

We used plotly and ggplot2 to create an interactive bar plot grouped by month. From this plot, we wanted to see how total expenses changed over the 12 months of 2019. The graph shows clearly that we spent a lot of money in July, especially on Travel/Entertainment. We all spent the least in August.
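
The highest and lowest months could also be confirmed with a small monthly total table. The following is a sketch under assumptions: `toy` and its amounts are invented, and the `Month` column is a helper introduced here, not part of the dataset.

```r
# Toy transactions in the same date format as the dataset (values are invented).
toy <- data.frame(
  Trans_Date = c("1/5/19", "1/20/19", "7/4/19", "7/27/19", "8/2/19"),
  Amount     = c(100, 50, 400, 300, 40)
)

# Parse the dates and derive a year-month label for grouping.
toy$Trans_Date <- as.Date(toy$Trans_Date, format = "%m/%d/%y")
toy$Month <- format(toy$Trans_Date, "%Y-%m")

# Total spending per month.
monthly <- aggregate(Amount ~ Month, data = toy, FUN = sum)

# Months with the largest and smallest totals.
highest <- monthly$Month[which.max(monthly$Amount)]
lowest  <- monthly$Month[which.min(monthly$Amount)]

print(monthly)
```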


### Q3: What is the allocation of expenses by person?

```{r}
data_pie <- data %>%
  group_by(Person) %>%
  summarise(volume = sum(Amount)) %>%
  mutate(share=volume/sum(volume)*100.0) %>%
  arrange(desc(volume))

data_pie$Person <- factor(data_pie$Person, levels = rev(as.character(data_pie$Person)))

ggplot(data_pie, aes("", share, fill = Person)) +
    geom_bar(width = 1, size = 1, color = "white", stat = "identity") +
    coord_polar("y") +
    geom_text(aes(label = paste0(round(share), "%")), 
              position = position_stack(vjust = 0.5)) +
    labs(x = NULL, y = NULL, fill = NULL, 
         title = "Total Expenses by Person in 2019") +
    guides(fill = guide_legend(reverse = TRUE)) +
    scale_fill_manual(values = c("#ffd700", "#bcbcbc", "#ffa500")) +
    theme_classic() +
    theme(axis.line = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank(),
          plot.title = element_text(hjust = 0.5))
```

***

Summary:

We used a pie chart to show each person's share of total spending. Wenxue spent the most of the three people in 2019, while Koryn accounted for only 18% of the total amount.

### Q4: How often do we go to restaurants?

```{r}
# Keep only restaurant transactions
Dataq3 <- filter(data, Category == "Restaurants")

# Count restaurant transactions per month, one panel per person
ggplot(Dataq3, aes(Trans_Date)) +
  geom_bar(fill = "#86C29C") +
  theme(legend.position = "top", axis.text = element_text(size = 5)) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(title = "How Often do We Go to a Restaurant?", x = "Month", y = "Count") +
  facet_wrap(~Person)

```

***

Summary:

We used bar charts to show the frequency of restaurant visits for each person from January to December. According to the graphs, Wenxue goes to restaurants far more often than Koryn. The graphs also show that July is the month in which we go to restaurants the most.

### Q5: How many transactions do we make each month?

```{r}
# Count all transactions per month, one panel per person
ggplot(data, aes(Trans_Date)) +
  geom_bar(fill = "#ADD8E6") +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(title = "Transactions by Month per Person", x = "Month", y = "Count") +
  facet_wrap(~Person) +
  theme(legend.position = "top", axis.text = element_text(size = 5))

```

***

Summary:

These graphs show the number of transactions we made each month. Wenxue and Arju made the most transactions in June; Koryn made the most transactions in January. We can also see from the graphs that Wenxue made the most transactions in the group.
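
The same counts can be tabulated directly, which makes each person's busiest month easy to read off. This is a sketch only: the `toy` rows are invented, and `Month` is a hypothetical helper column holding a month label.

```r
# Toy transactions: one row per transaction (invented for illustration).
toy <- data.frame(
  Person = c("Wenxue", "Wenxue", "Wenxue", "Wenxue",
             "Arju", "Arju", "Koryn", "Koryn", "Koryn"),
  Month  = c("Jun", "Jun", "Jul", "Mar",
             "Jun", "Jun", "Jan", "Jan", "Jun")
)

# Cross-tabulate: rows are people, columns are months, cells are counts.
counts <- table(toy$Person, toy$Month)

# Month with the most transactions for each person.
busiest <- colnames(counts)[apply(counts, 1, which.max)]
names(busiest) <- rownames(counts)

# Total number of transactions per person over the year.
totals <- rowSums(counts)

print(counts)
print(busiest)
```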

### Q6: How much do we spend monthly?

```{r}
# Spending over time, one line per person
ggplot(data, aes(y = Amount, x = Trans_Date, col = Person)) +
  geom_line() +
  labs(title = "Monthly Spending", x = "Month", y = "Amount") +
  scale_x_date(date_breaks = "1 month", date_labels = "%b")

# Spending per person, one panel per month
ggplot(data, aes(x = Person, y = Amount)) +
  geom_bar(stat = "identity", color = "white", position = position_dodge(0.9)) +
  facet_wrap(~Trans_Date) +
  labs(title = "Spending per Person", x = "Person", y = "Amount") +
  theme(legend.position = "top", axis.text = element_text(size = 6))
```

***

Summary:

The graphs here show the monthly spending trend over the trailing twelve months for all team members. As the graphs show, monthly expenses were steady except in two months, June and July. This is due to an increase in entertainment expenses, especially in the summer. Both Wenxue and Arju had a large spike in expenses in those two months.

### Conclusion

```{r}
imagePath <- "https://thumbor.forbes.com/thumbor/960x0/https%3A%2F%2Fspecials-images.forbesimg.com%2Fdam%2Fimageserve%2F1049501112%2F960x0.jpg%3Ffit%3Dscale"
include_graphics(imagePath)
```

***
Conclusion:

To sum up, our top spending categories are Travel/Entertainment and Merchandise, and our lowest is Home Improvement. Our hypothesis was that a higher number of restaurant visits would lead to higher monthly spending. Our findings show a positive correlation between the number of restaurant visits and total spending by month, which supports our hypothesis.

Looking at our individual data, Wenxue's top spending category is Travel/Entertainment, Arju's is Merchandise, and Koryn's is Education. We also spent the most in July, which coincides with the period when most people take their vacations.
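
One way the restaurant hypothesis could be checked is by correlating monthly restaurant visit counts with monthly spending totals. The sketch below assumes a Pearson correlation via `cor()`, which may differ from how the relationship was originally assessed. The `toy` data and the numeric `Month` column are invented for illustration.

```r
# Toy transactions across three months (values are invented).
toy <- data.frame(
  Month    = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
  Category = c("Restaurants", "Merchandise", "Services",
               "Restaurants", "Restaurants",
               "Restaurants", "Restaurants", "Restaurants",
               "Merchandise", "Services"),
  Amount   = c(20, 50, 30, 25, 35, 30, 40, 20, 200, 60)
)

# Number of restaurant transactions per month.
visits <- tapply(toy$Category == "Restaurants", toy$Month, sum)

# Total spending per month across all categories.
spend <- tapply(toy$Amount, toy$Month, sum)

# Pearson correlation between the two monthly series.
r <- cor(as.numeric(visits), as.numeric(spend))
print(r)
```

A positive `r` is consistent with the hypothesis but does not by itself show that restaurant visits cause higher spending. Restaurant purchases are also part of the monthly total, so some positive association is expected, and twelve monthly points is a small sample.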