Introduction

The Quantified Self (QS) is a movement that seeks to leverage the synergy of wearables, analytics, and “Big Data”. It exploits the ease and convenience of data acquisition through the Internet of Things (IoT) to feed the growing obsession with personal informatics and quotidian data. The website http://quantifiedself.com/ is a great place to start learning more about the QS movement.

The value of the QS for our class is that its core mandate is to visualize and generate questions and insights about a topic that is of immense importance to most people - themselves. It also produces a wealth of data in a variety of forms. Therefore, designing this project around the QS movement makes perfect sense because it offers you the opportunity to be both the data and question provider, the data analyst, the vis designer, and the end user. This means you will be in the unique position of being capable of providing feedback and direction at all points along the data visualization/analysis life cycle.

Motivation

The tagline of the Quantified Self is “self-knowledge through numbers”. For a long time, it was used to track health indicators such as level of physical activity and heart rate. But there are many areas of life that we do not think about as much as we should. For many, tracking spending habits, credit rates and payments are unpleasant tasks that detract directly from quality of life. This has brought the Quantified Self movement to personal finance. Some early applications such as Mint and BillGuard have enjoyed relative success in this area. This piqued my interest in building a dashboard customized to my needs.

Questions

I am a student and also an employee. Taking into account how I manage my finances, my questions are:

1. What are the top categories where I spend the most?
2. What are the top categories where I spend the most using my credit card?
3. What is the trend in my month-wise spending in each category using my credit card?
4. What is the trend of my income and expenses over time?
5. Credit card spending should not exceed 30% of the credit limit each month to maximize credit score. Am I following this?
6. What is the amount that remains after my expenses each month?

I followed the process steps explained below to answer the questions above.

Data Acquisition: The data was acquired by downloading my bank statement and my credit card statement for the period from February 2016 to August 2017, which gives me a good read of my income and expenses.

Storage: The statements were downloaded in CSV format and read into R.

Manipulation: I formatted the dates and added a month column in Excel for easier manipulation. The bank statement had negative signs for expenses and positive signs for income. I changed the sign of the expenses so that they could be used in the donut visualizations. I also used ‘dplyr’ to aggregate the data by expense category and by time, at the month-year level. Finally, I converted the date variables and certain categorical variables to factors so that the graphs could be sorted.
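The manipulation steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names (`date`, `category`, `amount`), the categories and the amounts are made up, and a small in-memory data frame stands in for the CSV that would normally be loaded with `read.csv()`.

```r
library(dplyr)

# Hypothetical stand-in for the downloaded bank statement. In practice this
# would come from read.csv() on the statement file; the real column names
# are not given in this report.
statement <- data.frame(
  date     = as.Date(c("2016-02-03", "2016-02-15", "2016-02-28", "2016-03-05")),
  category = c("Rent", "Salary", "Groceries", "Rent"),
  amount   = c(-800, 2500, -120, -800)   # expenses negative, income positive
)

expenses <- statement %>%
  filter(amount < 0) %>%                  # keep expense rows only
  mutate(
    amount     = -amount,                 # flip the sign so expenses are positive
    month_year = format(date, "%Y-%m")    # month-year key used for aggregation
  ) %>%
  group_by(month_year, category) %>%
  summarise(total = sum(amount), .groups = "drop") %>%
  mutate(category = factor(category))     # factors make plot ordering controllable

print(expenses)
```

Flipping the sign before aggregating keeps the totals positive, which is what a pie or donut trace expects.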

Visualization methods: In ‘Telling stories with Data Visualization’ (Rodriguez, Nunes, Devezas, 2015), the authors define three formats that differ in how strongly the author controls the path through the story: the Martini Glass Structure, the Interactive Slideshow and the Drill-Down Story. The format I have chosen is the Interactive Slideshow. The visualization is presented using the storyboard layout of the flexdashboard package, which is analogous to a regular slideshow. The user can interact with certain aspects of each frame before advancing the story to the next stage. I have added text annotations using the commentary section, which elaborates on each question and its answer.

To answer the questions about my top spending categories, overall and on my credit card, I need to show portions of a whole. I used donut charts instead of pie charts because arc lengths are perceived more accurately than areas and angles (Cleveland and McGill, 1984). Donut charts de-emphasize area, so the viewer focuses on the changes in overall values. It is also easier to compare two donut charts than two pie charts. The chart has percentage labels. I used plotly, which shows the category label and the value when hovering over a segment. This adds an interactive element to the graph. The second donut chart keeps the same format and changes only the dimension shown, which makes for an easy transition.
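A donut chart of this kind can be sketched in plotly as a pie trace with a hole. The category names follow the ones mentioned in this report, but the amounts are invented for illustration and the exact options used in the dashboard are not shown here.

```r
library(plotly)

# Made-up totals per category; only the category names come from the report.
spend <- data.frame(
  category = c("Tuition fees", "Student loan", "Rent"),
  total    = c(5000, 3000, 2000)
)

donut <- plot_ly(
  spend,
  labels    = ~category,
  values    = ~total,
  type      = "pie",
  hole      = 0.6,             # a hole turns the pie trace into a donut
  textinfo  = "percent",       # percentage labels on the segments
  hoverinfo = "label+value"    # category label and amount on hover
) %>%
  layout(title = "Share of spending by category (made-up data)")

donut
```

Using the same call with a different data frame gives a second chart in an identical format, which is what makes the two donuts easy to compare.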

The question about month-wise credit card spending in each category is answered using faceted bar plots created with ggplot2. The charts compare three variables: month number, category and amount spent. Faceting by category improves perception and gives a clear idea of the amount spent on each category, which a stacked bar graph does not achieve. The two bar plots are arranged side by side using grid.arrange, and they have aspect ratios greater than 1 and sparse, lightened gridlines to enhance perception.
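The faceting and side-by-side arrangement described above might look like the sketch below. The data are invented, and what the two panels contain is an assumption (two labelled periods, "Period A" and "Period B"); the helper `make_plot()` is hypothetical and not part of the project code.

```r
library(ggplot2)
library(gridExtra)

# Made-up credit card spending by month and category.
cc <- data.frame(
  month    = factor(rep(month.abb[1:3], times = 2), levels = month.abb[1:3]),
  category = rep(c("Services", "Travel"), each = 3),
  amount   = c(90, 90, 95, 40, 200, 60)
)

# Hypothetical helper: one faceted bar plot with light, sparse gridlines.
make_plot <- function(df, title) {
  ggplot(df, aes(x = month, y = amount)) +
    geom_col() +
    facet_wrap(~ category, ncol = 1) +          # one panel per category
    labs(title = title, x = "Month", y = "Amount spent") +
    theme_minimal() +
    theme(
      panel.grid.major = element_line(colour = "grey90"),
      panel.grid.minor = element_blank()
    )
}

p_a <- make_plot(cc, "Period A (made-up data)")
p_b <- make_plot(transform(cc, amount = amount * 1.1), "Period B (made-up data)")

# Draw the two plots side by side.
grid.arrange(p_a, p_b, ncol = 2)
```

Stacking the facets in a single column keeps each panel taller than it is wide once the two plots share the page width.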

The trend of my income and expenses over time is shown using a time series chart drawn with dygraphs. I added a slider, which makes the graph interactive. This helps me check whether there is a consistent pattern and notice any sudden jumps or spikes.
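A minimal sketch of such a chart is shown below. It assumes the slider corresponds to the dygraphs range selector and uses invented transactions, with positive values for income and negative values for expenses.

```r
library(xts)
library(dygraphs)

# Made-up signed transactions: positive = salary credited, negative = expenses.
dates  <- as.Date(c("2017-01-15", "2017-01-31", "2017-02-15", "2017-02-28"))
amount <- c(2500, -1800, 2500, -2100)

# dygraphs works on time-indexed objects such as xts.
flows <- xts(amount, order.by = dates)
colnames(flows) <- "amount"

# dyRangeSelector() adds a draggable date range control under the chart.
ts_plot <- dyRangeSelector(
  dygraph(flows, main = "Income and expenses over time (made-up data)")
)

ts_plot
```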

The credit limit question uses a bar plot faceted by year. It uses red to make the non-compliant months pop out, which improves perception (Wu, Jiang, Xu, & Nandi, 2013).
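The compliance check behind this chart is simple arithmetic: monthly spending divided by the credit limit, compared against the 30% threshold. The sketch below uses a hypothetical credit limit and invented spending figures; only the 30% rule and the red highlighting come from this report.

```r
library(ggplot2)

credit_limit <- 5000   # hypothetical limit, not the real one

# Made-up monthly credit card spending.
usage <- data.frame(
  year  = rep(c(2016, 2017), each = 3),
  month = factor(rep(c("Apr", "May", "Jun"), times = 2),
                 levels = c("Apr", "May", "Jun")),
  spent = c(400, 900, 1200, 1400, 1700, 1600)
)

usage$pct_of_limit <- 100 * usage$spent / credit_limit
usage$compliant    <- usage$pct_of_limit <= 30   # TRUE when within the 30% rule

p_limit <- ggplot(usage, aes(x = month, y = pct_of_limit, fill = compliant)) +
  geom_col() +
  geom_hline(yintercept = 30, linetype = "dashed") +   # the 30% threshold
  scale_fill_manual(values = c(`TRUE` = "grey60", `FALSE` = "red"),
                    guide = "none") +                  # red marks non-compliant months
  facet_wrap(~ year) +
  labs(x = "Month", y = "Spending as % of credit limit")

p_limit
```

With these invented numbers, two months exceed the threshold and are drawn in red while the rest stay grey.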

The question about the amount that remains each month uses the dropdown feature of plotly to create a fully interactive graph. Keeping negative values creates a mirrored bar plot, but here that is intuitive because negative values indicate expenses or deficits. The dropdown allows the user to see the bar plot of each variable separately, or a single bar plot with all three variables interspersed.
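One way to build such a dropdown in plotly is with `updatemenus` buttons that toggle trace visibility. The sketch below is an assumption about the approach rather than the dashboard's actual code, and the monthly figures are invented. Expenses are stored as negative values, so the surplus is computed as income plus expenses.

```r
library(plotly)

# Made-up monthly figures; expenses are kept negative as in the bank statement.
monthly <- data.frame(
  month    = c("2017-01", "2017-02", "2017-03"),
  income   = c(4000, 4000, 4500),
  expenses = c(-1500, -6000, -1800)
)
monthly$surplus <- monthly$income + monthly$expenses   # income minus spending

# Each button sets which of the three traces are visible.
show_traces <- function(label, visible) {
  list(method = "restyle", args = list("visible", visible), label = label)
}

fig <- plot_ly(monthly, x = ~month, y = ~income, type = "bar", name = "Income") %>%
  add_trace(y = ~expenses, name = "Expenses") %>%
  add_trace(y = ~surplus, name = "Surplus/deficit") %>%
  layout(
    barmode = "group",
    yaxis = list(title = "Amount"),
    updatemenus = list(list(
      type = "dropdown",
      buttons = list(
        show_traces("All",             list(TRUE, TRUE, TRUE)),
        show_traces("Income",          list(TRUE, FALSE, FALSE)),
        show_traces("Expenses",        list(FALSE, TRUE, FALSE)),
        show_traces("Surplus/deficit", list(FALSE, FALSE, TRUE))
      )
    ))
  )

fig
```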

The storyboard created for the project is below:

library(ggplot2)
# 1. ALL EXPENSES: The donut chart shows that the top categories are tuition fees, student loan and rent. These are followed by credit card payments and living expenses.
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/1a.png")

# 2. CREDIT CARD DEBT: The two categories of spending are Services (internet, phone) and Travel/Entertainment
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/2a.png")

# 3. CATEGORY SPENDING BY MONTH USING CREDIT CARD: The faceted bar plots show the expense by category per month on my credit card. Most of my expenses have been consistent. Services are the most consistent as they include phone and internet. Travel expenses increased towards the end of 2016 as tickets are expensive during the holiday season.
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/3a.png")

# 4. INCOME EXPENSES OVER TIME: I have plotted a time series to understand the trend of my income and expenses over time. The time series pattern is fairly consistent. The positive peaks are when salary is credited and the negative lows are the expenses. The expenses generally peaked at the end of the month.
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/4a.png")

# 5. CREDIT CARD LIMIT: In 2016, I always kept my credit card spending below 30% of the credit limit. However, in May and June of 2017, I exceeded 30% of the credit limit because of vacations. Before September 2016, the spending was below 10%. The optimum range is between 20% and 30%, which is what I should maintain.
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/5a.png")

# 6. INCOME EXPENDITURE: The income is shown as positive bars and the expenses as negative bars. The amount that remains (surplus/deficit) is calculated as income minus expenses. The income is fairly constant. The jumps in income are due to sharing expenses with roommates, who pay me their share. The expenses are very high every 3 months as that is when I pay my tuition fees. This is also when deficits occur (expenses > income). Otherwise, the surplus is generally around $2000-$3000. Please note that the graph shows income minus expenses for each month as shown in the bank statement and does not consider money rolled over from previous months. The deficits are paid for using surpluses from previous months.
knitr::include_graphics("C:/Users/Calmth of Life/Dropbox/Harrisburg Semesters/ANLY 512/Final Project - The Quantified Self/6a.png")

References:

http://www.institutionalinvestor.com/blogarticle/3315313/blog/the-quantified-self-movement-reaches-personal-finance.html#/.WdFrtbVrxdh

Hullman, J., Drucker, S., Riche, N. H., Lee, B., Fisher, D., & Adar, E. (2013). A Deeper Understanding of Sequence in Narrative Visualization. IEEE Transactions on Visualization and Computer Graphics, 19(12), 2406-2415. doi:10.1109/tvcg.2013.119

Wu, E., Jiang, L., Xu, L., & Nandi, A. (2013). Graphical Perception in Animated Data Visualizations. Retrieved from http://arxiv.org/abs/1604.00080

Rodriguez, M. T., Nunes, S., & Devezas, T. (2015). Telling stories with data visualization.

Cleveland, W. S., & McGill, R. (1984). Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods. Journal of the American Statistical Association, 79(387), 531. doi:10.2307/2288400

Zubiaga, A., & MacNamee, B. (2016, June 06). Graphical Perception of Value Distributions: An Evaluation. Retrieved August 4, 2017, from http://hdl.handle.net/10197/8324
