Overview of the Quantified Self movement

quant_self


Final Project for ANLY512-51-B: Data Visualization

The growth of the Internet of Things (IoT), the mass collection of personal information, and mobile technologies, wearable computing in particular, have fueled the Quantified Self movement. For my final project in this data visualization class, I used 24 months of my own financial data: the spending and payments recorded on my Discover credit card, which has been one of my main credit cards for a long time.

The goal of this project is to collect, analyze and visualize my financial data from the Discover credit card using the tools and methods covered in our class. Using a data-driven approach, I also summarize my financial data by answering the following questions:

  1. What is the total spending by month in 2016, 2017 and 2018?
  2. Which category cost the most in 2016, 2017 and 2018?
  3. For the category Services, what is the spending by month?
  4. What is the spending in each category in March and September?
  5. Compared to 2017, what is the difference in 2016 and 2018?

Data preparation

Sample raw data (first 15 rows)

| Trans..Date | Post.Date | Month | Year | Description | Amount | Category |
|---|---|---|---|---|---|---|
| 8/2/2016 | 8/3/2016 | 8 | 2016 | DFW AIRPORT PARKING DFW AIRPORT TX00266R | 2.00 | Travel/ Entertainment |
| 8/2/2016 | 8/3/2016 | 8 | 2016 | RACETRAC 99 PLANO TX | 5.01 | Gasoline |
| 8/5/2016 | 8/5/2016 | 8 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -1.28 | Awards and Rebate Credits |
| 8/5/2016 | 8/5/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -33.26 | Payments and Credits |
| 8/6/2016 | 8/6/2016 | 8 | 2016 | PP*SVUCA.EDU 402-935-2244 CA | 1849.50 | Services |
| 8/10/2016 | 8/10/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -1049.50 | Payments and Credits |
| 8/12/2016 | 8/12/2016 | 8 | 2016 | INTERNET PAYMENT - THANK YOU | -800.00 | Payments and Credits |
| 8/22/2016 | 8/22/2016 | 8 | 2016 | BAL TRANS FEE | 145.50 | Fees |
| 8/22/2016 | 8/22/2016 | 8 | 2016 | BANK OF AMERICA ******8691TI DEAPR 0.00% EXPIRES 09/2017 | 4850.00 | Balance Transfers |
| 9/6/2016 | 9/6/2016 | 9 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -65.00 | Awards and Rebate Credits |
| 9/13/2016 | 9/13/2016 | 9 | 2016 | INTERNET PAYMENT - THANK YOU | -50.50 | Payments and Credits |
| 9/21/2016 | 9/21/2016 | 9 | 2016 | INTERNET PAYMENT - THANK YOU | -80.00 | Payments and Credits |
| 10/3/2016 | 10/3/2016 | 10 | 2016 | CASHBACK BONUS REDEMPTION PYMT/STMT CRDT | -103.00 | Awards and Rebate Credits |
| 10/7/2016 | 10/7/2016 | 10 | 2016 | INTERNET PAYMENT - THANK YOU | -97.00 | Payments and Credits |
| 10/27/2016 | 10/27/2016 | 10 | 2016 | INTERNET PAYMENT - THANK YOU | -1300.00 | Payments and Credits |

Q1: What is the total spending by month in 2016, 2017 and 2018?


The variable Month is categorical while Amount is continuous, so a bar chart is a natural way to compare the total amount in each month. I summed the amount spent in each month and drew it as a stacked bar chart, with each category shown in a different color.

From the plot, average monthly spending is roughly $1,000 to $1,250. August 2016, September and November 2017, and June 2018 stand out as the highest-spending months, with Balance Transfers taking the lion's share, followed by Services, Travel/Entertainment and tuition fees.

On the other hand, my credit card was idle, with zero transactions, during June, July and August of 2017.

Q2: Which category cost the most in 2016, 2017 and 2018?


For the same reason as above, a bar chart was used here, with Category on the y-axis and total amount on the x-axis. The category with the most money is Balance Transfers, with a whopping amount of nearly $15,000, roughly double the next major category, Services, at about $7,000. These are followed by Travel/Entertainment at $3,000 and Education/Tuition fees at just over $2,500.

A closer look at my transactions shows that I spend far less on Merchandise, Restaurants and Department Stores.

Most importantly, I paid barely any interest to the bank, which I consider an achievement that reflects how disciplined I have been about my finances.

Q3: For the category Services, what is the spending by month?


Spending on Services was highest in February and August.

In March and November, however, it was about half of what it was in those peak months.

Q4: What is the spending in each category for the months of March and September?


From the plot, the highest-spending category in March is Travel/Entertainment, followed by Services, with Merchandise and Restaurants further behind at almost equal amounts.

In September, the highest spending is on Merchandise, at just over $300; Travel/Entertainment takes second place at $250, closely followed by Supermarkets.

Also in September, the rewards I accumulated exceeded every category except those top three.

Q5: Compared to 2017, what is the difference in 2016 and 2018?


To compare spending across 2016, 2017 and 2018, a grouped bar chart was plotted. From the plot, overall spending in 2016 is much greater than in 2017 and 2018. Balance Transfers in particular stand out in every year of my transactions.

The second-largest category is Services, with 2018 the lowest, followed by 2017 and then 2016. One thing worth noting is that the 2016 transactions cover only August through December.

I spent nothing on education in 2016, whereas it reached $1,000 and $1,700 in 2017 and 2018 respectively.


---
title: "Prof. Alan Hitch ANLY 512 Final Project"
author: "Trinadh Siva Nanupathruni_227319"
date: "August 17, 2018"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
    vertical_layout: fill
---

```{r setup, include=FALSE}
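# Hide the code in the rendered dashboard; only plots and tables are shown.
knitr::opts_chunk$set(echo = FALSE)
library(flexdashboard)
library(knitr)
library(ggplot2)
library(tidyverse)
library(readxl)
library(dplyr)
library(xts)
library(zoo)
library(lubridate)
# Machine-specific working directory that holds the CSV export and the image.
setwd("C:/Users/trina/Documents")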
```

### Overview of the Quantified Self movement
![quant_self](spending.jpg)
 

***
Final Project for ANLY512-51-B: Data Visualization

The growth of the Internet of Things (IoT), the mass collection of personal information, and mobile technologies, wearable computing in particular, have fueled the Quantified Self movement. For my final project in this data visualization class, I used 24 months of my own financial data: the spending and payments recorded on my Discover credit card, which has been one of my main credit cards for a long time.

The goal of this project is to collect, analyze and visualize my financial data from the Discover credit card using the tools and methods covered in our class. Using a data-driven approach, I also summarize my financial data by answering the following questions:

1) What is the total spending by month in 2016, 2017 and 2018?
2) Which category cost the most in 2016, 2017 and 2018?
3) For the category Services, what is the spending by month?
4) What is the spending in each category in March and September?
5) Compared to 2017, what is the difference in 2016 and 2018?

### Data preparation

```{r}
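# Read the statement exported from Discover (all available transactions).
data <- read.csv("Discover-AllAvailable-20180816.csv")
# Exclude payments so that only spending-related rows are analyzed.
data1 <- subset(data, Category != "Payments and Credits")
# Show the first 15 raw rows as a sample (these rows are from 2016).
kable(data[1:15, ], caption = "Sample raw data (first 15 rows)")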
```

***

The data set has 7 variables and 359 transactions. The variables are:

- Trans. Date: transaction date
- Post Date
- Month
- Year: mid-2016 to August 2018
- Description: detailed description of the purchase
- Amount: the total amount spent
- Category

Excluding payments and credits, 305 observations remain; these form the main data set for analysis and visualization.
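
The filtering step above can be checked on a small scale. The following is a minimal sketch, assuming a toy data frame (`toy`, a hypothetical name) built from four of the sample rows rather than the real CSV export; it only illustrates how the row counts before and after the filter are obtained.

```r
# Toy stand-in for the exported statement (the real file has 359 rows).
toy <- data.frame(
  Month    = c(8, 8, 8, 9),
  Year     = c(2016, 2016, 2016, 2016),
  Amount   = c(2.00, 5.01, -33.26, -50.50),
  Category = c("Travel/ Entertainment", "Gasoline",
               "Payments and Credits", "Payments and Credits"),
  stringsAsFactors = FALSE
)

# Drop payments so that only spending-related rows remain.
spending <- subset(toy, Category != "Payments and Credits")

n_total    <- nrow(toy)       # transactions before filtering
n_spending <- nrow(spending)  # transactions kept for the analysis
n_vars     <- ncol(toy)       # number of variables
```

On the real data, the same two `nrow()` calls would be expected to give the 359 and 305 reported above.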


  
### Q1: What is the total spending by month in 2016, 2017 and 2018?

```{r}
# Stacked bar chart of spending per month, one panel per year.
library(ggplot2)
library(plotly)
p <- ggplot(data1, aes(x = Month, y = Amount, fill = Category)) +
  geom_bar(stat = "identity") +
  # Month is numeric in the data; these limits label positions 1 to 12.
  scale_x_discrete(limits = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) +
  labs(title = "Spending per Month", x = "Month", y = "Amount") +
  theme_bw() +
  facet_wrap(~ Year)
# Rotate the month labels and render the plot interactively.
ggplotly(p + theme(axis.text.x = element_text(angle = 45, hjust = 1)))
```

***
The variable Month is categorical while Amount is continuous, so a bar chart is a natural way to compare the total amount in each month. I summed the amount spent in each month and drew it as a stacked bar chart, with each category shown in a different color.

From the plot, average monthly spending is roughly $1,000 to $1,250. August 2016, September and November 2017, and June 2018 stand out as the highest-spending months, with Balance Transfers taking the lion's share, followed by Services, Travel/Entertainment and tuition fees.

On the other hand, my credit card was idle, with zero transactions, during June, July and August of 2017.
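
The monthly totals read off the plot can also be computed directly. This is a minimal sketch using base R's `aggregate()` on made-up numbers (`toy` and its values are assumptions, not my actual transactions); it shows one way to get a total per year and month and an average over the months that have transactions.

```r
# Made-up transactions; the Year, Month and Amount columns mirror the real data.
toy <- data.frame(
  Year   = c(2016, 2016, 2016, 2017),
  Month  = c(8, 8, 9, 3),
  Amount = c(100, 50, 20, 70)
)

# Sum the amounts within each year/month combination.
monthly <- aggregate(Amount ~ Year + Month, data = toy, FUN = sum)
monthly <- monthly[order(monthly$Year, monthly$Month), ]

# Average over the months that appear in the data.
mean_monthly <- mean(monthly$Amount)
```

Months without any transaction do not appear in `monthly`, so this average is taken over active months only; idle months would have to be added as zeros to be included.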

### Q2: Which category cost the most in 2016, 2017 and 2018?

```{r}
library(ggplot2)
# Horizontal bar chart of total spending per category.
# A constant fill is used, so no fill aesthetic is mapped.
p <- ggplot(data1, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  labs(title = "Spending per Category", x = "Category", y = "Amount") +
  coord_flip()
p
```

***
For the same reason as above, a bar chart was used here, with Category on the y-axis and total amount on the x-axis. The category with the most money is Balance Transfers, with a whopping amount of nearly $15,000, roughly double the next major category, Services, at about $7,000. These are followed by Travel/Entertainment at $3,000 and Education/Tuition fees at just over $2,500.

A closer look at my transactions shows that I spend far less on Merchandise, Restaurants and Department Stores.

Most importantly, I paid barely any interest to the bank, which I consider an achievement that reflects how disciplined I have been about my finances.
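
The ranking described above can be reproduced numerically. Below is a minimal sketch with invented amounts (`toy` is a hypothetical data frame, not my statement) that totals spending per category with `tapply()`, sorts the totals, and compares the top two categories.

```r
# Invented amounts; only the Category and Amount columns are needed.
toy <- data.frame(
  Category = c("Balance Transfers", "Services", "Services", "Merchandise"),
  Amount   = c(300, 100, 60, 25),
  stringsAsFactors = FALSE
)

# Total per category, then sort from largest to smallest.
totals <- tapply(toy$Amount, toy$Category, sum)
ranked <- sort(totals, decreasing = TRUE)

# How many times larger the top category is than the runner-up.
ratio_top_two <- unname(ranked[1] / ranked[2])
```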

### Q3: For the category Services, what is the spending by month?

```{r}
# Keep only the Services transactions.
data2 <- subset(data1, Category == "Services")
library(plotly)
p1 <- ggplot(data2, aes(x = Month, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  # Month is numeric in the data; these limits label positions 1 to 12.
  scale_x_discrete(limits = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) +
  labs(title = "Spending per Month", x = "Month", y = "Amount") +
  theme_minimal()
ggplotly(p1)
```

***
Spending on Services was highest in February and August.

In March and November, however, it was about half of what it was in those peak months.
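
The per-month comparison for a single category can be sketched in a few lines. The example below uses invented values (`toy` and `MonthLabel` are hypothetical names) and converts the numeric month to a labeled factor so that months without Services spending show up as zero.

```r
# Invented transactions; Month is numeric, as in the real data.
toy <- data.frame(
  Month    = c(2, 2, 8, 3),
  Category = c("Services", "Services", "Services", "Gasoline"),
  Amount   = c(40, 10, 60, 5),
  stringsAsFactors = FALSE
)

# Keep Services only and label the months Jan..Dec in calendar order.
services <- subset(toy, Category == "Services")
services$MonthLabel <- factor(month.abb[services$Month], levels = month.abb)

# Total per month; months with no Services rows get 0 instead of NA.
services_by_month <- tapply(services$Amount, services$MonthLabel, sum,
                            default = 0)
```

Using a factor with all twelve levels keeps the idle months in the result, which makes gaps easier to spot than when they are silently dropped.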

### Q4: What is the spending in each category for the months of March and September?

```{r}
# March (Month == 3): spending per category.
data2 <- subset(data1, Month == 3)
p1 <- ggplot(data2, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "green") +
  labs(title = "Spending per Category in March", x = "Category", y = "Amount") +
  coord_flip()
p1

# September (Month == 9): the original filter used month 5, which is May,
# while the heading and the discussion refer to September.
data3 <- subset(data1, Month == 9)
p2 <- ggplot(data3, aes(x = Category, y = Amount)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(title = "Spending per Category in September", x = "Category", y = "Amount") +
  coord_flip()
p2
```

***
From the plot, the highest-spending category in March is Travel/Entertainment, followed by Services, with Merchandise and Restaurants further behind at almost equal amounts.

In September, the highest spending is on Merchandise, at just over $300; Travel/Entertainment takes second place at $250, closely followed by Supermarkets.

Also in September, the rewards I accumulated exceeded every category except those top three.
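
The month-by-category breakdown can also be tabulated rather than read from the bars. This is a minimal sketch with invented numbers (`toy` is hypothetical) that uses `xtabs()` to build a category-by-month table and then picks the top category for September.

```r
# Invented transactions for two months.
toy <- data.frame(
  Month    = c(3, 3, 9, 9, 9),
  Category = c("Services", "Merchandise", "Merchandise",
               "Supermarkets", "Merchandise"),
  Amount   = c(40, 10, 30, 20, 5),
  stringsAsFactors = FALSE
)

# Rows are categories, columns are months, cells are summed amounts.
by_month_category <- xtabs(Amount ~ Category + Month, data = toy)

# September column and the category with the largest total in it.
sept     <- by_month_category[, "9"]
top_sept <- names(which.max(sept))
```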

### Q5: Compared to 2017, what is the difference in 2016 and 2018?

```{r}
# Treat Year as a factor so that each year gets its own bar color.
data1$Year <- as.factor(data1$Year)
# Grouped (dodged) horizontal bars: one bar per year within each category.
# Note: with stat = "identity", transactions in the same category and year
# are drawn on top of each other rather than added up.
ggplot(data = data1, aes(x = Category, y = Amount)) +
  geom_bar(aes(fill = Year), position = "dodge", stat = "identity") +
  labs(title = "Spending per Category", x = "Category", y = "Amount") +
  scale_fill_discrete(name = "Year") +
  theme(legend.position = "bottom") +
  coord_flip()
```


***
To compare spending across 2016, 2017 and 2018, a grouped bar chart was plotted. From the plot, overall spending in 2016 is much greater than in 2017 and 2018. Balance Transfers in particular stand out in every year of my transactions.

The second-largest category is Services, with 2018 the lowest, followed by 2017 and then 2016. One thing worth noting is that the 2016 transactions cover only August through December.

I spent nothing on education in 2016, whereas it reached $1,000 and $1,700 in 2017 and 2018 respectively.
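
Because the years cover different numbers of months, one possible way to make them comparable is to divide each year's total by the number of months with transactions. The sketch below uses invented values (`toy` is hypothetical) and is only an illustration of that idea, not part of the dashboard itself.

```r
# Invented transactions: 2016 covers two months, 2017 covers three.
toy <- data.frame(
  Year   = c(2016, 2016, 2017, 2017, 2017),
  Month  = c(8, 9, 1, 2, 3),
  Amount = c(200, 100, 90, 60, 30)
)

# Total spending per year.
year_total <- tapply(toy$Amount, toy$Year, sum)

# Number of distinct months with transactions in each year.
months_seen <- tapply(toy$Month, toy$Year, function(m) length(unique(m)))

# Average spending per active month, per year.
per_month <- year_total / months_seen
```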