---
title: "ANLY 512 Final Project"
author: "Xingchao Zhou"
date: "November, 2019"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
    vertical_layout: fill
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(flexdashboard)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(lubridate)
```
### Introduction
For this ANLY 512 Quantified Self project, I analyzed my **credit card transactions** from January to October 2019. The plan was to collect the transaction data from my credit card statements, clean it up, pose five analysis questions, build visualizations to answer them, and gain a better understanding of my daily spending.
### Data Collection
The transaction data were collected from my credit card statements for **January 1 to October 31, 2019**, which are available through the online banking system. The data were saved as datasets in CSV format.
### Data variables
- Transaction Date: Exact date the transaction was settled
- Month: Represented by number; e.g. January is 1
- Vendor
- Amount
- Category
```{r}
# Working directory holding the prepared CSV files (machine-specific path).
setwd("C:/Users/xingc/Documents/Harrisburg/Fall 2019/512 Data Visilization - Thursday/Final project data preparation")
# One prepared dataset per analysis question (Q1 to Q5).
bankdata1 <- read.csv("Spending1.csv", header = TRUE)
bankdata2 <- read.csv("Spending2.csv", header = TRUE)
bankdata3 <- read.csv("Spending3.csv", header = TRUE)
bankdata4 <- read.csv("Spending4.csv", header = TRUE)
bankdata5 <- read.csv("Spending5.csv", header = TRUE)
```
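As a rough illustration of the clean-up step, the sketch below shows one way the variables listed above could be derived from a raw statement export. The data frame `raw`, its values, the "Gas" category, and the month/day/year date format are all assumptions made for the example, not my actual data or the exact steps I followed.

```r
# Hypothetical raw export; every value here is invented for illustration.
raw <- data.frame(
  Date     = c("01/15/2019", "03/02/2019", "03/20/2019", "10/31/2019"),
  Vendor   = c("Grocery Store", "Online Retailer", "Cafe", "Gas Station"),
  Amount   = c("$45.10", "$1,250.00", "$6.75", "$30.00"),
  Category = c("Food and Drink", "Shopping", "Food and Drink", "Gas"),
  stringsAsFactors = FALSE
)

# Parse the date text (assumed to be month/day/year) into Date objects.
raw$Date <- as.Date(raw$Date, format = "%m/%d/%Y")

# Month as a number, so January is 1.
raw$Month <- as.integer(format(raw$Date, "%m"))

# Strip currency symbols and thousands separators before converting to numeric.
raw$Amount <- as.numeric(gsub("[$,]", "", raw$Amount))

str(raw)
```

Keeping `Month` numeric matches the variable description above and makes later grouping by month straightforward.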
### Five questions
1. What is my total spending by month in 2019?
2. What are my top-spending categories in 2019?
3. For the Shopping category, what is my spending by month in 2019?
4. How often do I use this credit card each month in 2019?
5. What do my typical purchases look like in terms of amount and category?
### Q1 What is my total spending by month in 2019?
```{r}
# Bar chart of total spending per month; Month is stored as a number (1 = January).
p <- ggplot(bankdata1, aes(x = Month, y = Amount)) +
  geom_col(fill = "blue") +
  scale_x_discrete(limits = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) +
  labs(title = "Spending per Month", x = "Month", y = "Amount") +
  theme_minimal()
p
```
***
Here I chose a bar chart to compare my credit card spending by month from January through October 2019.
The chart shows that I spent the most in March and April, with more than 1,000 dollars in each month, while I saved quite a bit in January and July, when I spent about 750 dollars.
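The monthly totals behind a chart like this can be computed directly from transaction-level data. The following is a minimal sketch using base R; the data frame `tx` and its amounts are invented for illustration and do not reproduce my actual figures.

```r
# Invented sample transactions; Month is numeric, as in the data description.
tx <- data.frame(
  Month  = c(1, 1, 3, 3, 4, 7),
  Amount = c(20, 35, 400, 150, 600, 80)
)

# Sum Amount within each Month.
monthly <- aggregate(Amount ~ Month, data = tx, FUN = sum)

# Add month abbreviations for display, using R's built-in month.abb vector.
monthly$Label <- month.abb[monthly$Month]

# Order from the highest to the lowest monthly total.
monthly <- monthly[order(-monthly$Amount), ]
print(monthly)
```

Sorting the totals makes it easy to read off the highest and lowest spending months without relying on the bar heights alone.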
### Q2 What are my top-spending categories in 2019?
```{r}
# Horizontal bar chart of total spending per category.
p <- ggplot(bankdata2, aes(x = Category, y = Amount)) +
  geom_col(fill = "blue") +
  labs(title = "Spending per Category", x = "Category", y = "Amount") +
  coord_flip()
p
```
***
Another bar chart shows my 2019 spending by category.
Shopping led all categories so far with more than 4,500 dollars, which exceeds four months of total spending as shown in the first chart.
The second-highest category was Food and Drink, at about 2,600 dollars so far.
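To rank categories and see how much of the total each one accounts for, a summary along the following lines could be used. This is only a sketch in base R; `tx`, the "Travel" category, and all amounts are assumptions made for the example.

```r
# Invented sample transactions; categories and amounts are illustrative only.
tx <- data.frame(
  Category = c("Shopping", "Food and Drink", "Shopping", "Travel", "Food and Drink"),
  Amount   = c(300, 40, 200, 100, 60)
)

# Total spending per category.
by_cat <- aggregate(Amount ~ Category, data = tx, FUN = sum)

# Each category's share of overall spending, in percent.
by_cat$Share <- round(100 * by_cat$Amount / sum(by_cat$Amount), 1)

# Rank categories from the largest to the smallest total.
by_cat <- by_cat[order(-by_cat$Amount), ]
print(by_cat)
```

A percentage share puts the category totals on the same footing as the monthly totals from the first chart.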
### Q3 For the Shopping category, what is my spending by month in 2019?
```{r}
# Bar chart of Shopping spending per month; Month is stored as a number (1 = January).
p <- ggplot(bankdata3, aes(x = Month, y = Amount)) +
  geom_col(fill = "blue") +
  scale_x_discrete(limits = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) +
  labs(title = "Shopping Spending per Month", x = "Month", y = "Amount") +
  theme_minimal()
p
```
***
After identifying Shopping as the top spending category, I broke down my shopping spending by month in another bar chart.
The chart follows my total monthly spending closely: I shopped the most in March and April, the same two months in which my total spending was highest.
### Q4 How often do I use my card each month?
```{r}
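library(plotly)
# Line chart of the number of transactions per month.
# after_stat(count) replaces the deprecated ..count.. notation (needs ggplot2 3.3.0 or later).
p <- ggplot(bankdata4, aes(Month)) +
  stat_count(geom = "line", aes(y = after_stat(count))) +
  labs(title = "Transactions", x = "Month", y = "Count") +
  scale_x_discrete(limits = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))
# Convert to an interactive plotly chart and display it.
(ggcard <- ggplotly(p))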
```
***
Instead of a bar chart, here I drew a line chart to show how often I used my credit card each month so far this year.
I made the most transactions from February through April.
Comparing these counts with my spending in each of those three months implies that, even though I made the most purchases in February, I made far more large transactions in April.
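The comparison between transaction counts and spending can be made explicit by computing the average amount per transaction in each month. The sketch below uses base R with an invented data frame `tx`; the numbers are chosen only to mimic the pattern described and are not my actual transactions.

```r
# Invented sample: many small purchases in month 2, few large ones in month 4.
tx <- data.frame(
  Month  = c(2, 2, 2, 2, 4, 4),
  Amount = c(10, 20, 15, 15, 300, 200)
)

# Count, total, and average amount per month.
per_month <- data.frame(
  Month = sort(unique(tx$Month)),
  Count = as.vector(table(tx$Month)),
  Total = as.vector(tapply(tx$Amount, tx$Month, sum))
)
per_month$Average <- per_month$Total / per_month$Count
print(per_month)
```

A month with a high count but a low average indicates many small purchases, while a lower count with a high average points to a few large transactions.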
### Q5 What do my typical purchases look like in terms of amount and category?
```{r}
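library(plotly)
# Scatterplot of individual purchases, colored by category and sized by amount.
purchases <- ggplot(bankdata5, aes(x = Date, y = Amount)) +
  geom_point(aes(colour = Category, size = Amount)) +
  labs(title = "My Purchases", x = "Date", y = "Amount")
# Convert to an interactive plotly chart and display it.
(ggpurchases <- ggplotly(purchases))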
```
***
A scatterplot presents my individual transactions throughout the year.
Consistent with the previous charts, most of my largest transactions in 2019 were in the Shopping category: two were for more than 400 dollars each, and quite a few were around 150 to 200 dollars.
In addition, although Food and Drink was my second-largest category, most of those purchases stayed under 50 dollars.
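The observation about small purchases can be checked by computing, for each category, the fraction of transactions below 50 dollars. This is a minimal sketch in base R; the data frame `tx` and its values are invented, and the 50-dollar threshold is taken from the commentary above.

```r
# Invented sample transactions; amounts are illustrative only.
tx <- data.frame(
  Category = c("Shopping", "Shopping", "Food and Drink", "Food and Drink", "Food and Drink"),
  Amount   = c(420, 30, 12, 48, 75)
)

# Flag purchases under the 50-dollar threshold.
tx$Under50 <- tx$Amount < 50

# The mean of a logical vector is the fraction of TRUE values,
# so this gives the share of small purchases per category.
small_share <- aggregate(Under50 ~ Category, data = tx, FUN = mean)
print(small_share)
```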