For this personal data project, I decided to analyze my credit card transactions. In the past 3 years, I have opened 1 new credit card a year and I wanted to understand how my spend has evolved across cards and categories. I also wanted to gain a deeper understanding of how my spending habits in terms of amount and frequency have changed with the increased exposure to credit and this was a chance to visualize that. My plan was to collect my transaction data, clean it up, create visualizations in this dashboard, and use the graphs and data analysis to answer five questions about my spending trends.
To collect my data, I downloaded my transaction data from 3 different credit cards between January 1,2019 to April 20, 2020. The data sources are my online credit card accounts
Next let’s take a look at my data and how it cleaned it for analysis.
Data Cleaning Process
---
title: "The Quantified Self - Final Project ANLY 512"
author: "Abhilasha Vyas"
output:
flexdashboard::flex_dashboard:
orientation: columns
social: menu
storyboard: true
source: embed
vertical_layout: scroll
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, fig.align = "center")
#install.packages("ggplot2")
#install.packages("plotly")
library(flexdashboard)
library(ggplot2)
library(tidyverse)
library(MASS)
library(plotly)
library(reshape2)
library(readr)
library(readxl)
library(dplyr)
```
# Introduction {data-navmenu="Project Sections"}
For this personal data project, I decided to analyze my **credit card transactions**. In the past 3 years, I have opened 1 new credit card a year and I wanted to understand how my spend has evolved across cards and categories. I also wanted to gain a deeper understanding of how my spending habits in terms of amount and frequency have changed with the increased exposure to credit and this was a chance to visualize that. My plan was to collect my transaction data, clean it up, create visualizations in this dashboard, and use the graphs and data analysis to answer five questions about my spending trends.
To collect my data, I downloaded my transaction data from 3 different credit cards between **January 1,2019 to April 20, 2020**. The data sources are my online credit card accounts
Next let's take a look at my data and how it cleaned it for analysis.
# Data variables {data-navmenu="Project Sections"}
- **Card** Name of the credit card
- **Date** Exact date of the transaction
- **Day** Day of the transaction e.g '24'
- **Year** Year of the transaction e.g '2019'
- **Month** Month of the transaction e.g '4'
- **Amount** Transaction Amount in Dollars
- **Description** Description of the transaction
- **Category** Category of spend e.g 'travel'
- **Transaction Size** Transactions have been divided into four buckets : 0-12, 12-100, 100-1000, 1000+ ( $)
- **Store** Store at which the purchase was made
**Data Cleaning Process**
- Transaction categories across differnet credit cards were combines
- Date was split into - day , month and year for deeper analysis
- Transaction size column was created to categorize transactions based on their amount
- Column "store" was added by parsing the spend description to analyze which stores do I frequent the most
# My five questions {data-navmenu="Project Sections"}
1. Which credit card do I use the most?
2. What categories do I spend the most on?
3. What does my month to month spend look like ?
4. What is my typical transaction size (Amount)?
5. Which stores do I frequent the most (Number of transactions)?
# Which credit card do I use the most? {data-navmenu="Project Sections"}
## column {.sidebar}
### **Analysis Insight**
I was using my Discover credit card the most in 2019 but this pattern reversed in 2020 when I signed up for a Amex credit card with better rewards and made that my primary card
Row
-----------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE,fig.width=12, fig.height=8}
# get data
Card_Transactions <- read_excel("C:/Users/vyasa/OneDrive/Documents/Card_Transactions.xlsx")
cc <- Card_Transactions
# plot
transactions <- ggplot(cc, aes(x = Card, y = Amount))+
geom_bar(
aes(fill = Card), stat = "identity" )+
labs(y = "Amount Spent in $",
x = "Credit card",
title = "Amount Spent Per Credit Card",
subtitle = "Graph depicting amount spent in dollars for each credit card")+theme_bw()+ facet_grid(. ~ cc$Year) + coord_flip()
transactions
```
# What categories do I spend the most on ? {data-navmenu="Project Sections"}
## column {.sidebar}
### **Analysis Insight**
To analyze the impact of COVID-19 on spend categories, I analyzed the data by year
2019 - It's obvious why my biggest expensed in 2019 were merchendise ( which includes online shopping) , travel and restaurants. I made two international trips and one of them was to get married!
2020 - It's a stark contrast to 2019 and the there's negligible spend on trvel and restaurants due to stay at hoem orders. Online shopping for everything has made me spend quite a bit on merchendise in just 4 months
Row
-----------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE,fig.width=12, fig.height=8}
# get data
Card_Transactions <- read_excel("C:/Users/vyasa/OneDrive/Documents/Card_Transactions.xlsx")
cc <- Card_Transactions
# plot
Category_spend <- ggplot(cc, aes(x = cc$Category, y =cc$Amount ))+
geom_bar(
aes(fill = Category), stat = "identity" )+
labs(y = "Amount Spent in $",
x = "Spend Categories",
title = "Amount Spent per Category",
subtitle = "Graph depicting amount spent in dollars for each spend category")+theme_bw() + facet_wrap(. ~Year)
Category_spend
```
***
# What does my month to month spend look like? {data-navmenu="Project Sections"}
## column {.sidebar}
### **Analysis Insight**
2019 - My monthly spend spiked in April and MAy when my family came to visit me and we travelled all across the US! It also spiked arroud october when i was shopping for my wedding and booking travel
2020 - There's a steep decline in montly spend as compared to 2019 due to the COVID situation
Row
-----------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE,fig.width=12, fig.height=8}
# get data
library(readxl)
MOM_spend <- read_excel("C:/Users/vyasa/OneDrive/Documents/MOM_spend.xlsx")
mom_spend <-MOM_spend
# Modify data
data1 = mom_spend %>%
group_by(Year,Month) %>%
summarise(count = sum(count), Amount = sum(Amount))
# plot
mom_spend_plot = ggplot() + theme_bw() +
geom_line(aes(y = Amount, x = Month), size=1.5, data = data1,
stat="identity") + facet_grid(. ~ data1$Year) + scale_x_discrete()+labs(y = "Amount Spent in $ per month",
x = "Months from Jan to December",
title = "Amount Spent per Month",
subtitle = "Graph depicting month over month amount spent in dollars for years 2019 and 2020")
mom_spend_plot
```
***
# Which stores do I frequent the most? {data-navmenu="Project Sections"}
## column {.sidebar}
### **Analysis Insight**
I most frequently spend at my company cafetaria, followed by uber/lyft , metro, amazon and the salad chain sweetgreen!
Row
-----------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE,fig.width=12, fig.height=8}
# get data
Card_Transactions <- read_excel("C:/Users/vyasa/OneDrive/Documents/Card_Transactions.xlsx")
cc2 <- Card_Transactions
# plot
store_spend_freq <- ggplot(cc2, aes(x = Store, y = count))+
geom_bar(
aes(fill = Store), stat = "identity" )+
labs(y = "Amount Spent in $",
x = "Stores",
title = "Number of transactions Per Store",
subtitle = "Graph depicting number of transactions done for each Store")+theme_bw()+ coord_flip()
store_spend_freq
```
***
# What is my typical transaction size (Amount)? {data-navmenu="Project Sections"}
## column {.sidebar}
### **Analysis Insight**
My typical daily transactions stay in the 0-$12 and 12-100 buckets. They seldon spike above 500 or 1000 and these instances denote large purchases like furniture, jewelley or international plane tickets
Row
-----------------------------------------------------------------------
```{r, echo = FALSE, message = FALSE,fig.width=12, fig.height=8}
# get data
Card_Transactions <- read_excel("C:/Users/vyasa/OneDrive/Documents/Card_Transactions.xlsx")
cc2 <- Card_Transactions
#plot
fig <- plot_ly(
cc2, x = ~Date, y = ~Amount,
# Hover text:
text = ~paste("Transaction_size: ", Transaction_Size, 'Category:', Category),
color = ~Amount, size = ~Amount
)
fig
```
***