quant_self
Final Project for ANLY512-51-B: Data Visualization
| Trans..Date | Post.Date | Month | Year | Description | Amount | Category |
|---|---|---|---|---|---|---|
| 8/2/2016 | 8/3/2016 | 8 | 2016 | DFW AIRPORT PARKING DFW AIRPORT TX00266R | 2.00 | Travel/ Entertainment |
| 8/2/2016 | 8/3/2016 | 8 | 2016 | RACETRAC 99 PLANO TX | 5.01 | Gasoline |
| 8/5/2016 | 8/5/2016 | 8 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -1.28 | Awards and Rebate Credits |
| 8/5/2016 | 8/5/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -33.26 | Payments and Credits |
| 8/6/2016 | 8/6/2016 | 8 | 2016 | PP*SVUCA.EDU 402-935-2244 CA | 1849.50 | Services |
| 8/10/2016 | 8/10/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -1049.50 | Payments and Credits |
| 8/12/2016 | 8/12/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -800.00 | Payments and Credits |
| 8/22/2016 | 8/22/2016 | 8 | 2016 | BAL TRANS FEE | 145.50 | Fees |
| 8/22/2016 | 8/22/2016 | 8 | 2016 | BANK OF AMERICA ******8691TI DEAPR 0.00% EXPIRES 09/2017 | 4850.00 | Balance Transfers |
| 9/6/2016 | 9/6/2016 | 9 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -65.00 | Awards and Rebate Credits |
| 9/13/2016 | 9/13/2016 | 9 | 2016 | INTERNET PAYMENT - THANK YOU | -50.50 | Payments and Credits |
| 9/21/2016 | 9/21/2016 | 9 | 2016 | INTERNET PAYMENT - THANK YOU | -80.00 | Payments and Credits |
| 10/3/2016 | 10/3/2016 | 10 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -103.00 | Awards and Rebate Credits |
| 10/7/2016 | 10/7/2016 | 10 | 2016 | INTERNET PAYMENT - THANK YOU | -97.00 | Payments and Credits |
| 10/27/2016 | 10/27/2016 | 10 | 2016 | INTERNET PAYMENT - THANK YOU | -1300.00 | Payments and Credits |
---
title: "Prof. Alan Hitch ANLY 512 Final Project"
author: "Trinadh Siva Nanupathruni_227319"
date: "August 17, 2018"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
    vertical_layout: fill
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(flexdashboard)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(lubridate)
# Machine-specific working directory; change or remove it when knitting elsewhere
setwd("C:/Users/trina/Documents")
```
### Overview of the Quantified Self movement

***

Final Project for ANLY512-51-B: Data Visualization

The growth of the Internet of Things (IoT), the mass collection of personal information, and mobile technologies (wearable computing above all) have driven the Quantified Self movement. For this final project I used 24 months of spending and payment data from my Discover credit card, which has long been one of my main cards.

The goal of the project is to collect, analyze and visualize my financial data from the Discover card using the tools and methods covered in class. Taking a data-driven approach, I also summarize the data by answering the following questions:

1) What is the total spending by month in 2016, 2017 and 2018?
2) Which category cost the most in 2016, 2017 and 2018?
3) For the category Services, what is the spending by month?
4) In March and September, what is the spending in each category?
5) Compared to 2017, what is the difference in 2016 and 2018?

### Data preparation
```{r}
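# Load the full Discover export (one row per transaction)
data <- read.csv("Discover-AllAvailable-20180816.csv")
# Drop card payments so that data1 holds spending (and rebate credits) only
data1 <- subset(data, Category != "Payments and Credits")
# Show the first 15 raw rows (these are from 2016, where the export starts)
kable(data[1:15, ], caption = "Sample raw data (first 15 rows)")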
```
***
- The dataset has 7 variables and 359 transactions. The variables are:
    - Trans. Date: transaction date
    - Post Date: posting date
    - Month
    - Year: mid-2016 to August 2018
    - Description: detailed description of the purchase
    - Amount: the amount spent
    - Category
- Excluding payments and credits leaves 305 observations, which form the main dataset for analysis and visualization.
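
The export already carries `Month` and `Year` columns, and how they were produced is not shown here. As a minimal sketch, assuming they come from the transaction date, the toy rows below (invented to mimic the export, with the column name `Trans..Date` as `read.csv()` would render it) show one way to derive both columns in base R and then drop the payment rows:

```r
# Toy rows mimicking the export; values are illustrative only.
toy <- data.frame(
  Trans..Date = c("8/2/2016", "9/13/2016", "10/27/2016"),
  Amount      = c(2.00, -50.50, -1300.00),
  Category    = c("Travel/ Entertainment", "Payments and Credits",
                  "Payments and Credits"),
  stringsAsFactors = FALSE
)

# Parse the m/d/Y strings, then pull out numeric month and year.
toy$Date  <- as.Date(toy$Trans..Date, format = "%m/%d/%Y")
toy$Month <- as.integer(format(toy$Date, "%m"))
toy$Year  <- as.integer(format(toy$Date, "%Y"))

# Drop card payments so only spending (and rebates) remain.
spend <- subset(toy, Category != "Payments and Credits")
```

Keeping `Month` numeric matches the plots in this dashboard, which filter on month numbers and label the axis with month abbreviations.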
### Q1: What is the total spending by month in 2016, 2017 and 2018?
```{r}
# Stacked bars of spending per month, one panel per year.
# Each transaction is one segment; segments stack within a month.
library(ggplot2)
library(plotly)
p <- ggplot(data1, aes(x = Month, y = Amount, fill = Category)) +
  geom_bar(stat = "identity") +
  # Month is numeric (1-12); label the positions with month abbreviations
  scale_x_discrete(limits = month.abb) +
  labs(title = "Spending per Month", x = "Month", y = "Amount") +
  theme_bw() +
  facet_wrap(~ Year)
# Rotate the month labels so they stay readable in the interactive plot
ggplotly(p + theme(axis.text.x = element_text(angle = 45, hjust = 1)))
```
***
The variable Month is categorical while Amount is continuous, so I used a bar chart to compare the total amount spent in each month, with each category shown in a different color.

The plot shows that average monthly spending is roughly $1000 to $1250. August 2016, September 2017, November 2017 and June 2018 are the highest-spending months, with Balance Transfers taking the lion's share, followed by general Services, Travel/Entertainment and tuition fees.

On the other hand, the card was idle, with zero transactions, during June, July and August of 2017.
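
The monthly totals and the average quoted above can also be checked numerically rather than read off the chart. The following is a minimal sketch on invented toy values (not the real export), using base R's `aggregate()`:

```r
# Toy transactions; amounts are illustrative only.
toy <- data.frame(
  Year   = c(2016, 2016, 2016, 2017, 2017),
  Month  = c(8, 8, 9, 2, 2),
  Amount = c(100, 50, 25, 40, 60)
)

# One row per (Year, Month) with the summed amount.
monthly <- aggregate(Amount ~ Year + Month, data = toy, FUN = sum)
monthly <- monthly[order(monthly$Year, monthly$Month), ]

# Average over the months that have at least one transaction.
avg_monthly <- mean(monthly$Amount)
```

Months without any transaction never appear in `monthly`, so an average computed this way covers active months only; idle months would have to be added as zero rows to be counted.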
### Q2: Which category cost the most in 2016, 2017 and 2018?
```{r}
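library(ggplot2)
# Horizontal bars per category; transactions stack within each bar.
# A constant fill is used, so no fill aesthetic is mapped.
p <- ggplot(data1, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  labs(title = "Spending per Category", x = "Category", y = "Amount") +
  coord_flip()
p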
```
***
For the same reason as above, a bar chart was used here, with Category on the y-axis and total amount on the x-axis.

The category with the most money is Balance Transfers, at nearly $15000, roughly double the next largest category, Services, at ~$7000. Travel/Entertainment follows at about $3000 and Education/Tuition fees at just over $2500.

A closer look at the transactions shows that I spend far less on Merchandise, Restaurants and Department Stores.

Most importantly, I paid barely any interest to the bank, which I consider an achievement of disciplined finances.
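
The ranking of categories, and how far the top one is ahead of the runner-up, can be computed directly. This is a small sketch on invented toy values, using base R's `tapply()` and `sort()`:

```r
# Toy transactions; amounts are illustrative only.
toy <- data.frame(
  Category = c("Balance Transfers", "Services", "Services", "Gasoline"),
  Amount   = c(4850.00, 1849.50, 100.00, 5.01),
  stringsAsFactors = FALSE
)

# Total per category, largest first.
totals <- sort(tapply(toy$Amount, toy$Category, sum), decreasing = TRUE)

# Name of the top category and its size relative to the second one.
top_category <- names(totals)[1]
ratio <- unname(totals[1] / totals[2])
```

On the real data, `ratio` would be the number behind the "roughly double" comparison between Balance Transfers and Services.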
### Q3: For the category Services, what is the spending by month?
```{r}
# Keep only the Services transactions (all years combined)
data2 <- subset(data1, Category == "Services")
library(plotly)
p1 <- ggplot(data2, aes(x = Month, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  # Month is numeric (1-12); label the positions with month abbreviations
  scale_x_discrete(limits = month.abb) +
  labs(title = "Spending per Month", x = "Month", y = "Amount") +
  theme_minimal()
ggplotly(p1)
```
***
Spending on Services was highest in February and August.

In March and November it was about half of what it was in those peak months.
### Q4: What is the spending in each category for the months of March and September?
```{r}
# March (month 3), all years combined
data2 <- subset(data1, Month == 3)
p1 <- ggplot(data2, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  labs(title = "Spending per Category (March)", x = "Category", y = "Amount") +
  coord_flip()
p1
# September (month 9); the original filter used month 5 (May), which
# did not match the question or the commentary
data3 <- subset(data1, Month == 9)
p2 <- ggplot(data3, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(title = "Spending per Category (September)", x = "Category", y = "Amount") +
  coord_flip()
p2
```
***
In March, the highest-spending category is Travel/Entertainment, followed by Services; Merchandise and Restaurants come next with almost equal amounts.

In September, the highest spending is on Merchandise at just over $300. Travel/Entertainment takes second place at $250, closely followed by Supermarkets.

Also in September, the rewards I accumulated exceeded every category except the top three.
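
The two months can also be compared side by side in a single table instead of two separate plots. The following is a sketch on invented toy values, using `%in%` to select both months at once and base R's `xtabs()` to sum amounts per category and month:

```r
# Toy transactions; amounts are illustrative only.
toy <- data.frame(
  Month    = c(3, 3, 9, 9, 5),
  Category = c("Services", "Merchandise", "Merchandise",
               "Supermarkets", "Gasoline"),
  Amount   = c(120, 30, 310, 240, 20),
  stringsAsFactors = FALSE
)

# Keep March (3) and September (9) only.
two_months <- subset(toy, Month %in% c(3, 9))

# Category-by-month table of summed amounts; empty cells are 0.
by_cat <- xtabs(Amount ~ Category + Month, data = two_months)
```

Each column of `by_cat` then holds one month, so the per-category amounts for March and September can be read across a row.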
### Q5: Compared to 2017, what is the difference in 2016 and 2018?
```{r}
# Treat Year as a category so each year gets its own fill color
data1$Year <- as.factor(data1$Year)
# Grouped bars by year. With stat = "identity" every transaction is drawn
# as its own bar, so bars of the same category and year overlap.
ggplot(data = data1, aes(x = Category, y = Amount)) +
  geom_bar(aes(fill = Year), position = "dodge", stat = "identity") +
  labs(title = "Spending per Category", x = "Category", y = "Amount") +
  scale_fill_discrete(name = "Year") +
  theme(legend.position = "bottom") +
  coord_flip()
```
***
Because the purpose was to compare spending across 2016, 2017 and 2018, a grouped bar chart was plotted.

The plot shows that overall spending in 2016 is much greater than in 2017 and 2018. Balance Transfers stand out in every year of my transactions.

The second-largest category is Services, with 2018 the lowest, followed by 2017 and 2016. One thing to note is that the 2016 transactions cover only August through December.

I spent nothing on education in 2016, while it was as high as $1000 and $1700 in 2017 and 2018 respectively.
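
Because the chart is drawn from individual transactions with `position = "dodge"` and `stat = "identity"`, bars of the same category and year are drawn on top of each other, so the visible length is the largest single transaction, not the yearly total. A minimal sketch, on invented toy values, of summing per category and year first and then taking differences against 2017:

```r
# Toy transactions; amounts are illustrative only.
toy <- data.frame(
  Year     = c(2016, 2016, 2017, 2017, 2018),
  Category = c("Services", "Services", "Services", "Education", "Education"),
  Amount   = c(100, 50, 80, 1000, 1700),
  stringsAsFactors = FALSE
)

# Long form: one row per (Category, Year) total, suitable for a grouped bar chart.
yearly <- aggregate(Amount ~ Category + Year, data = toy, FUN = sum)

# Wide form: categories in rows, years in columns, 0 where nothing was spent.
wide <- tapply(toy$Amount, list(toy$Category, toy$Year), sum, default = 0)

# Difference of 2016 and 2018 relative to 2017, per category.
diff_vs_2017 <- wide[, c("2016", "2018")] - wide[, "2017"]
```

Passing `yearly` to `ggplot()` with `geom_col(position = "dodge")` should then give one bar per category and year whose length is the total.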