---
title: "The Quantified Self - Final Project ANLY 512"
author: "Runhao Wang"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    social: menu
    storyboard: true
    source: embed
    vertical_layout: fill
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
#install.packages("ggplot2")
library(flexdashboard)
library(ggplot2)
library(tidyverse)
```
### Introduction
For this personal data project, I decided to analyze my bank card transactions. It is a good chance to visualize how I spend my money. My plan was to collect my transaction data, clean it up, build visualizations in this dashboard, and use the graphs and analysis to answer five questions about my spending trends.

To collect the data, I downloaded the last 12 months of transactions from my online bank account. Some months have no recorded spending, so a few of the graphs below are not continuous.

Next, let's take a look at my data and my five questions.
### Data variables
- Date = Exact date the transaction was settled
- Month = Represented by number; e.g. July is 7
- BankName
- Amount
- Category

The file also has some other columns, but they are irrelevant to my analysis, so I ignore them.
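One cleaning step that the Date and Month columns imply is parsing the date text and deriving the month number. Here is a minimal base-R sketch of that step. It assumes the dates are stored as `MM/DD/YYYY` text (the real export may use a different format), and the rows in `txns` are made up for illustration, not my actual transactions:

```r
# Hypothetical sample rows; the MM/DD/YYYY date format is an assumption.
txns <- data.frame(
  Date = c("02/14/2020", "03/02/2020", "07/21/2020"),
  Amount = c(-42.50, 1200.00, -18.25),
  Category = c("Dining", "Credit Card Payment", "Dining"),
  stringsAsFactors = FALSE
)

# Parse the text into Date objects so plots order them chronologically.
txns$Date <- as.Date(txns$Date, format = "%m/%d/%Y")

# Derive the numeric month (July is 7) from the parsed date.
txns$Month <- as.integer(format(txns$Date, "%m"))

txns[, c("Date", "Month", "Amount", "Category")]
```

If the format string does not match the file, `as.Date()` returns `NA`, so it is worth checking for missing values right after parsing.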
### My five questions
1. What are my top-spending categories?
2. What does my month-by-month spend look like?
3. How often do I use my card each month?
4. What do my typical purchases look like in terms of amount and category?
5. Which are my top-spending bank accounts?
### What are my top-spending categories?
```{r}
boadata <- read.csv("/Users/RunhaoWang/Desktop/512 O/MyTransactionHistory copy.csv", header = TRUE)

# Total amount per category; positive and negative amounts stack on opposite sides of zero
category <- ggplot(boadata, aes(x = Category, y = Amount)) +
  geom_col(fill = "LightBlue") +
  labs(title = "Spend by Category", x = "Category", y = "Amount") +
  coord_flip()
category
```
***
As you can see, Transfer, Dining, Online Services, and Credit Card Payment are my four top-spending categories.

The data contains both positive and negative amounts because it covers my credit cards as well as my bank checking/savings accounts. A payment from a bank account to a credit card is recorded as positive, while all regular spending is recorded as negative.

Credit card payments and online services, including Venmo and bank account transfers, are the source of the positive amounts, which is expected. I have a habit of moving money between bank accounts to keep them balanced, which is why those numbers are large.

Food delivery is a major source of spending during this pandemic period; I order delivery almost every day. Transfer refers to Venmo money transfers. I live with a roommate who pays the rent for both of us every month, and I Venmo him my share, which accounts for that amount.
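Because positive and negative amounts mean different things, one way to rank categories by real spending is to keep only the negative rows before summing. This is a minimal base-R sketch of that idea, not the code behind the chart; the `txns` rows and the `Spent` column are hypothetical:

```r
# Hypothetical sample rows: negative = regular spending, positive = payments.
txns <- data.frame(
  Category = c("Dining", "Transfer", "Credit Card Payment", "Dining", "Transfer"),
  Amount = c(-30, -800, 500, -20, -750)
)

# Keep only regular spending and flip the sign so totals read as dollars spent.
spending <- txns[txns$Amount < 0, ]
spending$Spent <- -spending$Amount

# Total spend per category, largest first.
by_category <- aggregate(Spent ~ Category, data = spending, FUN = sum)
by_category <- by_category[order(-by_category$Spent), ]
by_category
```

Summing the positive rows the same way would give the payment side separately, so the two never cancel each other out.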
### What does my month-by-month spend look like?
```{r}
#month <- ggplot(boadata, aes(x=Date, y=Amount)) +
# geom_line(color = "#00AFBB", size=2)+
#labs(title = "Spend")
month <- ggplot(boadata, aes(x = Month, y = Amount)) +
  geom_col(aes(fill = Category)) +
  labs(title = "Spend by Month and Category", x = "Month", y = "Amount") +
  # Label every month; the commentary discusses January through July
  scale_x_continuous(breaks = 1:12) +
  scale_fill_discrete(name = "Category") +
  theme(legend.position = "right")
month
#month <- ggplot(boadata, aes(x=Date, y=Amount)) +
#geom_area(aes(color = "#00AFBB", fill="#00AFBB"),
#alpha = 0.5, position=position_dodge(0.8))
```
***
February is my highest-spending month. Spending in February was much higher than in June and July, mainly because February and March were the start of Covid-19: I spent a lot on food and groceries, and on transportation to different places, to prepare. That explains the pattern of large spending in January and February and much smaller spending in June and July.
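Since some months have no transactions at all, a monthly total can be made continuous by joining it onto a full list of months and filling the gaps with zero. A minimal base-R sketch with made-up numbers (the `txns` rows and the `Spent` column are hypothetical):

```r
# Hypothetical sample rows; months 3-5 and 8-12 have no transactions.
txns <- data.frame(
  Month = c(1L, 2L, 2L, 6L, 7L),
  Amount = c(-300, -900, -450, -120, -80)
)

# Total regular spending (negative amounts) per month, as positive dollars.
spending <- txns[txns$Amount < 0, ]
monthly <- aggregate(Amount ~ Month, data = spending, FUN = function(a) sum(-a))
names(monthly)[2] <- "Spent"

# Left-join onto all 12 months so months without transactions show up as 0.
all_months <- data.frame(Month = 1:12)
monthly <- merge(all_months, monthly, by = "Month", all.x = TRUE)
monthly$Spent[is.na(monthly$Spent)] <- 0
monthly
```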
### How often do I use my card each month?
```{r}
library(plotly)
# Count transactions per month and draw the counts as a line
card <- ggplot(boadata, aes(x = Month)) +
  stat_count(geom = "line", aes(y = after_stat(count))) +
  labs(title = "Transactions", x = "Month", y = "Count")
(ggcard <- ggplotly(card))
```
***
Not surprisingly, I made the most transactions in November and December, with another spike in February and March. The reason is the same as before: I used my cards more often before the pandemic, and I had to prepare for working from home just before the big outbreak. Once WFH took effect, I hardly used my cards except for ordering delivery.
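The counts behind a chart like this can also be tabulated directly. A minimal base-R sketch with made-up months (the `txns` rows are hypothetical); using a factor with all 12 levels keeps months without transactions in the table with a count of 0:

```r
# Hypothetical sample: one row per transaction, with its month number.
txns <- data.frame(Month = c(11, 11, 12, 12, 12, 2, 3, 7))

# Count transactions per month, including months that never appear.
counts <- table(factor(txns$Month, levels = 1:12))
counts_df <- data.frame(
  Month = as.integer(names(counts)),
  Count = as.integer(counts)
)

# The busiest month(s) in the sample.
counts_df[counts_df$Count == max(counts_df$Count), ]
```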
### What do my typical purchases look like in terms of amount and category?
```{r}
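# Point size uses the absolute amount so large negative purchases are not drawn smallest
purchases <- ggplot(boadata, aes(x = Date, y = Amount)) +
  geom_point(aes(col = Category, size = abs(Amount))) +
  labs(title = "My Purchases", x = "Date", y = "Amount", size = "Amount")
(ggpurchases <- ggplotly(purchases))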
```
***
The interesting thing in this graph is that my largest amounts are credit card payments. That is normal, since I pay my credit cards every month, but note that these are positive amounts. The negative side is the real spending: Transfer is large because I use Venmo, and Tax is also large because I file my income tax before April, and that is a big amount of money.
### Which are my top-spending bank accounts?
```{r}
# Total amount per account; the chart is flipped, so accounts run down the vertical axis
account <- ggplot(boadata, aes(x = AccountName, y = Amount)) +
  geom_col(fill = "LightBlue") +
  labs(title = "Spend by Account", x = "Account", y = "Amount") +
  coord_flip()
account
```
***
As you can see, I have two bank accounts and three credit cards. Why? I am not sure; I just ended up with them. Whether you look at positive payments or negative spending, the AMEX card contributes the most. That is easy to understand: it is a Hilton Honors card, so every dollar I spend on it earns rewards points that I can use to book Hilton hotels all over the world. To be honest, every time I went to Harrisburg I stayed at a Hilton, so I am happy to see I earned a lot of points from it.

The second largest is Citibank 9246. This is my checking account, and I usually use it to pay my credit cards and make other kinds of online payments. It is my most frequently used bank account.

I also want to mention the Citi credit card, which I believe will become more and more important going forward because it gives cash back. The rate is 2%: for every 100 dollars I pay with it, I get 2 back. I have been using this card more and more recently, and I strongly believe that if I repeat this analysis next year, the Citi credit card will very likely surpass the Amex Hilton one.
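The 2% cash-back arithmetic can be written as a small helper. This is only an illustration: the function name `cash_back` and the spending amounts other than the 100-dollar example are hypothetical.

```r
# Cash-back rate stated above: 2% of every dollar paid with the card.
cash_back_rate <- 0.02

# Cash back earned on a given amount of spending.
cash_back <- function(spend, rate = cash_back_rate) {
  spend * rate
}

cash_back(100)                 # the 100-dollar example from the text
cash_back(c(250, 1000, 4800))  # hypothetical spending amounts
```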