---
title: "The Quantified Self"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    theme:
      version: 4
      bg: "#d7b5d8"
      fg: "white" 
      primary: "grey"
      navbar-bg: "#d7b5d8"
      base_font: 
        google: Prompt
      heading_font:
        google: Sen
      code_font:
        google: 
          # arguments to sass::font_google() 
          family: JetBrains Mono
          local: false
---
```{r setup, include=FALSE}
# Load the packages used throughout the dashboard
library(flexdashboard)
library(knitr)
library(ggplot2)
library(plotly)
library(dplyr)
library(dygraphs)
library(xts)
library(tidyr)
library(zoo)
library(lubridate)
library(tidyverse)
library(corrplot)
library(extrafont)
library(RColorBrewer)

# load the data files
df_caloric <- read.csv("QSCaloricIntake.csv")
df_water <- read.csv("QSWaterIntake.csv")
df_steps <- read.csv("QSAverageDailySteps.csv")
df_mood <- read.csv("QSMoodTracking.csv")
df_screen_time <- read.csv("QSScreenTime.csv")
df_meditation <- read.csv("QSMeditation.csv")
df_financial <- read.csv("QSFinancialTracking.csv")
df_exercise <- read.csv("QSExerciseTracking.csv")

# Combine all data frames into one, joining on the shared date column
df_all <- list(df_mood, df_caloric, df_water, df_steps, df_screen_time,
               df_meditation, df_financial, df_exercise) %>%
  reduce(inner_join, by = "date")
df_all$date <- as.Date(df_all$date, format = "%Y-%m-%d")

```


# Contents {.sidebar}

**Table of Contents:**

Project Overview

Case 1: Screen Time

Case 2: Financial Tracking

Case 3: Caloric Intake

Case 4: Water Intake

Case 5: Average Daily Steps 

Case 6: Mood Tracking

Case 7: Meditation

Conclusion
  

# **Agenda**

Row
-------------------------------------


The Quantified Self Final Project demonstrates how wearables, analytics, and "Big Data" can support personal informatics and the quantified self. The project uses personal data to answer seven questions (Cases) about personal health, wellness, and productivity:

* What is the average daily screen time over the past month?
* How do income and expenses vary over the past month?
* What is the average daily caloric intake over the past month?
* What is the average daily water intake over the past month?
* What are the average daily steps taken over the past month?
* How does mood vary, and how does it relate to other variables, over the past month?
* What are the frequency and duration of meditation over the past month?

To answer these Cases, the project gathers, analyzes, and visualizes personal data. The data was randomly generated for demonstration purposes, stored in CSV files, and imported with the `read.csv()` function.
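
As an illustration of that workflow, the sketch below generates one synthetic tracking file and reads it back. It is not the generator used for this project: the value range, the seed, and the temporary file are assumptions, while the `date` and `screen_time` column names and the `%Y-%m-%d` date format follow the dashboard code.

```r
# Hypothetical generator for one tracking file; range and seed are assumptions
set.seed(42)  # make the random draws reproducible

dates <- seq(as.Date("2023-02-12"), as.Date("2023-04-12"), by = "day")
df_demo <- data.frame(
  date = format(dates, "%Y-%m-%d"),
  screen_time = round(runif(length(dates), min = 2, max = 8), 1)
)

# Round-trip through a temporary CSV file, mirroring the read.csv() import
csv_path <- tempfile(fileext = ".csv")
write.csv(df_demo, csv_path, row.names = FALSE)

df_loaded <- read.csv(csv_path)
df_loaded$date <- as.Date(df_loaded$date, format = "%Y-%m-%d")
str(df_loaded)
```

Writing the dates as text and converting them with `as.Date()` after the import keeps the CSV readable and matches how each chunk in this dashboard parses its `date` column.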

The project presents the data trends in a dashboard. The dashboard combines bar charts, line charts, a pie chart, a box plot, an area plot, scatter plots, and a correlation matrix to reveal patterns in personal health, wellness, and productivity. The goal is to identify which variables best predict mood and happiness, so that I can make informed decisions about my daily activities.

-------------------------------------

# **Screen_Time**


Row {data-width=400}
-------------------------------------


### **How much time do I spend on screens each day?**

The bar chart shows how much time I spend on screens each day. On most days I spend around 5 hours on screens, with a few days noticeably above or below that. This information can be used to assess the impact of screen time on health and productivity and to inform decisions about reducing screen time.
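
A statement such as "around 5 hours on most days" can be checked numerically rather than read off the chart. The sketch below is one possible check on made-up values; the vector is a stand-in for the `screen_time` column, and the one-hour tolerance is an arbitrary assumption.

```r
# Toy values standing in for the screen_time column (hours per day); not real data
screen_time <- c(4.5, 5.0, 5.5, 3.0, 5.0, 7.5, 4.8)

typical <- median(screen_time)  # robust estimate of a "typical" day
spread  <- range(screen_time)   # lowest and highest day

# Share of days within one hour of the median (tolerance chosen arbitrarily)
share_near <- mean(abs(screen_time - typical) <= 1)

typical
spread
share_near
```

The median is used instead of the mean so that a few unusually long days do not shift the "typical" value.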

Row {data-width=600}
-------------------------------------
    
```{r first-plot,fig.width=6,fig.height=4}


## Screen Time
df_q7 <- read.csv("QSScreenTime.csv")
df_q7$date <- as.Date(df_q7$date, format = "%Y-%m-%d")

# Create the bar chart of daily screen time
plot_ly(df_q7, x = ~date, y = ~screen_time, type = "bar", 
        marker = list(color = 'rgba(255, 153, 51, 0.6)', line = list(color = 'rgba(255, 153, 51, 1.0)', width = 1.5))) %>%
  layout(title = "Screen Time over the period",
         xaxis = list(title = "Date", ticklen = 10, tickwidth = 2),  
         yaxis = list(title = "Screen Time (hours per day)"),
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "purple", plot_bgcolor = "purple")

```


# **Financial_Tracking**


Row {data-width=400}
-------------------------------------


### **How do my income and expenses vary over time?**

The line chart shows how my income and expenses varied over the past month. My income has been relatively stable, while my expenses have varied widely from day to day: several days have very high expenses, while most days have lower ones. This information can be used to assess spending habits and identify areas where expenses can be reduced.
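
The contrast between stable income and volatile expenses can also be quantified. The sketch below uses invented numbers with the same `date`, `income`, and `expenses` columns as the financial file; the coefficient of variation and the 1.5 times median threshold are illustrative choices, not part of the dashboard.

```r
# Toy data with the same columns as the financial tracking file; values are invented
df_fin <- data.frame(
  date     = as.Date("2023-02-12") + 0:6,
  income   = c(200, 205, 198, 202, 200, 199, 203),
  expenses = c(40, 35, 180, 50, 45, 210, 38)
)

# Coefficient of variation (sd / mean): a unit-free measure of variability
cv <- sapply(df_fin[c("income", "expenses")], function(x) sd(x) / mean(x))

# Flag days whose expenses exceed 1.5 times the median (arbitrary threshold)
high_days <- df_fin$date[df_fin$expenses > 1.5 * median(df_fin$expenses)]

cv
high_days
```

A larger coefficient of variation for `expenses` than for `income` would support the reading of the line chart, and `high_days` lists the dates worth reviewing.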


Row {data-width=600}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE,fig.width=6,fig.height=4}

## Financial Tracking
df_q8 <- read.csv("QSFinancialTracking.csv")
df_q8$date <- as.Date(df_q8$date, format = "%Y-%m-%d")

# Create a line chart for income and expenses
plot_ly(df_q8, x = ~date) %>%
  add_lines(y = ~income, name = "Income", line = list(color = "cornflowerblue", width = 1.5)) %>%
  add_lines(y = ~expenses, name = "Expenses", line = list(color = "firebrick", width = 1.5)) %>%
  layout(title = "Variation in Income and Expenses Over Time",
         xaxis = list(title = "Date", tickformat = "%Y-%m-%d"),
         yaxis = list(title = "Amount (USD)"),
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         legend = list(x = 1, y = -0.2, 
                       xanchor = "right", yanchor = "top"),
         paper_bgcolor = "purple", plot_bgcolor = "purple")

```



# **Caloric_Intake**


Row {data-width=400}
-------------------------------------

### **What is my weekly distribution of daily caloric intake?**

* The pie chart shows my average daily caloric intake for each day of the week over the past month. Intake varies little by day of the week and stays around 2100 calories.  
* Sunday is the day with the highest caloric intake. This information can be used to assess dietary habits and identify areas where changes can be made for better health.


Row {data-width=600}
-------------------------------------
    
```{r}

## Caloric Intake
df_q1 <- read.csv("QSCaloricIntake.csv")
df_q1$date <- as.Date(df_q1$date, format = "%Y-%m-%d")

# Derive the weekday name and average the calories for each weekday
df_q1$weekday <- weekdays(df_q1$date)
df_weekday <- df_q1 %>%
  group_by(weekday) %>%
  summarize(mean_calories = mean(calories))

# Create the pie chart
fig <- plot_ly(df_weekday, labels = ~weekday, values = ~mean_calories, type = "pie")

# Customize the chart layout
fig <- fig %>% 
  layout(
    title = list(text = "Average Calories by Weekday", font = list(size = 12)),
    margin = list(l = 50, r = 50, t = 100, b = 50),
    font = list(color = "#FFFFFF", size = 12),
    paper_bgcolor = "purple",
    plot_bgcolor = "purple",
    legend = list(font = list(color = "#FFFFFF", size = 16))
  )

# Show the chart
fig

```




# **Water_Intake_Tracking**


Row {data-width=400}
-------------------------------------


### **What is the distribution of my daily water intake?** 

The box plot shows the distribution of my daily water intake over the past month. The plot shows that my daily water intake is mostly between 75 and 130 ounces, with a few days having more or less water intake. This information can be used to assess hydration habits and identify areas where changes can be made for better health.
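
The range quoted from a box plot corresponds to the quartiles of the data. The sketch below shows how those values could be computed directly on made-up intake values; note that plotly may use a slightly different interpolation rule for quartiles than R's default `quantile()`, so the numbers need not match the chart exactly.

```r
# Toy water intake values in ounces; not real data
water <- c(60, 75, 80, 95, 100, 110, 120, 130, 150)

q   <- quantile(water, probs = c(0.25, 0.5, 0.75))  # box edges and median
iqr <- IQR(water)                                   # height of the box

# Conventional rule: points beyond 1.5 * IQR from the box count as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr
outliers <- water[water < lower_fence | water > upper_fence]

q
outliers
```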


Row {data-width=600}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE}


## Water Intake
df_q2 <- read.csv("QSWaterIntake.csv")
df_q2$date <- as.Date(df_q2$date, format = "%Y-%m-%d")

# Create plotly box plot (height is set in plot_ly(); setting it in layout() is deprecated)
fig_box <- plot_ly(df_q2, y = ~water_intake, type = "box", boxpoints = "all", jitter = 0.3,
                   pointpos = -1.8, marker = list(color = "#1f77b4"), height = 250)

# Customize layout
fig_box %>%
  layout(title = "Daily Water Intake (ounces)",
         yaxis = list(title = "Water Intake (ounces)"),
         font = list(family = "Arial", color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "purple", plot_bgcolor = "purple")

# Create a line chart for water intake
plot_ly(df_q2, x = ~date, height = 200) %>%
  add_lines(y = ~water_intake, name = "Water Intake", line = list(color = "cornflowerblue", width = 1.5)) %>% 
  layout(title = "Variation in Water Intake Over Time",
         xaxis = list(title = "Date", tickformat = "%Y-%m-%d", 
                      ticklen = 10, tickwidth = 2),
         yaxis = list(title = "Water Intake (ounces)"), 
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         legend = list(x = 0.1, y = 0.9),
         paper_bgcolor = "grey", plot_bgcolor = "grey")

```

# **Average_Daily_Steps_Tracking**

Row {data-width=400}
-------------------------------------

### **What is the average daily step count over the past month?**

* The area plot shows the average daily step count for each week of the past month. The line along the top of the shaded area traces how the weekly average changes over the period.
* The average stays fairly stable, but it is lower at the end of the period than at the beginning.
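
One way to check whether the weekly average declined over the period is to fit a straight line through the weekly means and inspect its slope. The sketch below repeats the weekly aggregation on invented step counts and adds that fit; `week_index` and the linear model are illustrative additions that the dashboard itself does not compute.

```r
# Toy data: four full weeks starting on a Monday; step counts are invented
df_steps_demo <- data.frame(
  date  = as.Date("2023-02-13") + 0:27,
  steps = rep(c(9000, 8500, 8200, 8000), each = 7)
)

# Same aggregation as the dashboard: week of year, then mean steps per week
df_steps_demo$week <- format(df_steps_demo$date, "%W")
weekly <- aggregate(steps ~ week, data = df_steps_demo, FUN = mean)

# Fit a linear trend over the week sequence; a negative slope means a decline
weekly$week_index <- seq_len(nrow(weekly))
trend <- lm(steps ~ week_index, data = weekly)
slope <- coef(trend)[["week_index"]]

weekly
slope
```

With only a handful of weekly points the slope is a rough indicator rather than a statistical result.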


Row {data-width=600}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE,fig.width=6,fig.height=4}


## Average Daily Steps
df_q3 <- read.csv("QSAverageDailySteps.csv")
df_q3$date <- as.Date(df_q3$date, format = "%Y-%m-%d")

df_q3$week <- format(df_q3$date, "%W")  ## create a week column

df_q3_weekly <- aggregate(steps ~ week, data=df_q3, FUN=mean) ## get the mean steps per week

# Create the area plot of weekly average steps
plot_ly(df_q3_weekly, x = ~week, y = ~steps, type = "scatter", mode = "lines", fill = "tozeroy",
        line = list(color = 'rgba(255, 153, 51, 1.0)', width = 2),
        fillcolor = 'rgba(255, 153, 51, 0.4)') %>%
  layout(title = "Average Daily Steps by Week",
         xaxis = list(title = "Week", ticklen = 10, tickwidth = 2), 
         yaxis = list(title = "Average daily steps"),
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "purple", plot_bgcolor = "purple")


```

# **Mood_Tracking**

Row {data-width=400}
-------------------------------------

### **What is the distribution of my mood scores over the past month?**

* The scatter plots show how my mood scores over the past month relate to income and calories.  
* The plot shows no clear relationship between mood and income, although the two appear slightly positively correlated.   
* The plot likewise shows no clear relationship between mood and calories, although the two appear slightly negatively correlated.
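
Impressions such as "slightly positively correlated" can be backed by a correlation coefficient and a significance test. The sketch below uses invented values with the `mood`, `income`, and `calories` column names from the dashboard; the toy data is deliberately much more strongly correlated than the scatter plots suggest.

```r
# Toy data using the dashboard's column names; values are invented
df_demo <- data.frame(
  mood     = c(6, 7, 5, 8, 6, 7, 4, 8),
  income   = c(190, 205, 185, 220, 200, 210, 180, 215),
  calories = c(2200, 2050, 2300, 1950, 2150, 2000, 2400, 1900)
)

# Pearson correlation: the sign gives the direction, the magnitude the strength
r_income   <- cor(df_demo$mood, df_demo$income)
r_calories <- cor(df_demo$mood, df_demo$calories)

# cor.test() adds a p-value and a confidence interval for the coefficient
test_income <- cor.test(df_demo$mood, df_demo$income)

r_income
r_calories
test_income$p.value
```

A coefficient near zero with a large p-value would match the "no clear relationship" reading of the plots.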


Row {data-width=600}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE,fig.width=6,fig.height=2}

# create the plot
plot_ly(df_all, x = ~income, y = ~mood, type = "scatter", mode = "markers", marker = list(color = "white")) %>%
  layout(title = "Mood vs Income", 
         xaxis = list(title = "Income in $"), yaxis = list(title = "Mood score"), 
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "grey", plot_bgcolor = "grey")

# create the plot
plot_ly(df_all, x = ~calories, y = ~mood, type = "scatter", mode = "markers", marker = list(color = "white")) %>%
  layout(title = "Mood vs Calories", 
         xaxis = list(title = "Calories"), yaxis = list(title = "Mood score"), 
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "purple", plot_bgcolor = "purple")

```

# **Meditation**

Row {data-width=400}
-------------------------------------

### **What is the frequency and duration of my meditation practice?**

* The combined line and bar chart shows the frequency and duration of my meditation practice over the past month. My meditation frequency has been fairly consistent, with most days having at least some meditation.  
* The duration of meditation, on the other hand, varies widely from day to day. There have been several days with very long meditation sessions (over 20 minutes), while most days have shorter sessions (around 10 minutes).  
* This information can be used to assess the consistency and effectiveness of the meditation practice, and can inform future practice decisions.

Row {data-width=600}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE,fig.width=6,fig.height=4}


## Meditation
df_q6 <- read.csv("QSMeditation.csv")
df_q6$date <- as.Date(df_q6$date, format = "%Y-%m-%d")

# create plotly time series plot
fig <- plot_ly()
fig <- fig %>% add_trace(x = df_q6$date, y = df_q6$meditation_duration, name = "Duration", type = "scatter", mode = "lines")
fig <- fig %>% 
  add_bars(x = df_q6$date, y = df_q6$meditation_frequency, name = "Frequency")

# Format the plot
fig <- fig %>%
  layout(title = "Meditation Duration and Frequency",
         xaxis = list(title = "Date", ticklen = 10, tickwidth = 2),
         yaxis = list(title = "Duration (in mins) and Frequency"),
         font = list(family = "Arial", size = 12, color = "white"),
         margin = list(l = 50, r = 50, t = 50, b = 50),
         paper_bgcolor = "purple", plot_bgcolor = "purple")
fig

```

# **Conclusion**

Column {data-width=300}
-------------------------------------

### **What are good predictors of mood?**

* In the correlation matrix, calories, water intake, screen time, income, expenses, and exercise duration have the strongest correlations with mood.   
* Overall, fewer calories, less water, less screen time, lower expenses, and shorter exercise duration are correlated with better mood.    
* A higher income is correlated with a better mood.   
* In my daily habits, I will aim to reduce screen time and calories while exercising more and meditating for shorter periods of time.
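
Reading the strongest predictors off a correlation plot can be error-prone, so the mood row of the matrix can instead be sorted by absolute value. The sketch below does this on invented data; `corr_matrix` is built the same way as in the dashboard, while `mood_corr` and `ranked` are illustrative names.

```r
# Toy data; column names follow the dashboard, values are invented
df_demo <- data.frame(
  mood        = c(6, 7, 5, 8, 6, 7, 4, 8),
  income      = c(190, 205, 185, 220, 200, 210, 180, 215),
  screen_time = c(5, 4, 6, 4, 5, 5, 6, 3),
  steps       = c(8000, 8200, 7900, 8100, 8300, 7800, 8000, 8100)
)

corr_matrix <- cor(df_demo)

# Keep the correlations of every other variable with mood
mood_corr <- corr_matrix["mood", colnames(corr_matrix) != "mood"]

# Rank by strength, regardless of direction
ranked <- mood_corr[order(abs(mood_corr), decreasing = TRUE)]
ranked
```

The sign of each entry shows the direction of the association. A strong correlation alone does not show that changing a habit would change mood.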

Column {data-width=500}
-------------------------------------
    
```{r, echo = FALSE, message = FALSE, fig.width=6}

# Create a correlation matrix of the variables in df_all (the date column is excluded)
corr_matrix <- cor(df_all[, names(df_all) != "date"])

# The color legend is drawn by default; corrplot() has no height argument
corrplot(corr_matrix, method = "pie", type = "upper", bg = "#d7b5d8",
         addCoef.col = "black", cl.cex = 0.8,
         tl.cex = 0.6, tl.col = "black",
         col = brewer.pal(n = 5, name = "RdBu"),
         number.cex = 0.4)

```

-------------------------------------

# **Data_sources**

<font style="font-size: 10px">
**Case 1 - Caloric Intake:**  
The data for this Case was collected by tracking daily caloric intake using an app. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSCaloricIntake.csv"*. 

**Case 2 - Water Intake Tracking:**  
The data for this Case was collected by tracking daily water intake using an app. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSWaterIntake.csv"*.  

**Case 3 - Average Daily Steps Tracking:**  
The data for this Case was collected by tracking daily steps using a wearable device. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSAverageDailySteps.csv"*.  

**Case 4 - Mood Tracking:**  
The data for this Case was collected by tracking daily mood using an app. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSMoodTracking.csv"*.  

**Case 5 - Monthly & Weekday Miles:**  
The data for this Case was collected by manually tracking daily miles walked and run for each month and day of the week over a period of one year (from January 1st, 2022 to December 31st, 2022). The data was entered into spreadsheets for monthly and weekly data and saved in CSV files named *"Q5_monthly_miles.csv"* and *"Q5_weekday_miles.csv"*, respectively.  

**Case 6 - Meditation Tracking:**  
The data for this Case was collected by tracking daily meditation duration and frequency using an app. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSMeditation.csv"*.  

**Case 7 - Financial Tracking:**  
The data for this Case was collected by tracking daily income and expenses using an app. The data was collected over a period of 60 days (from February 12th, 2023 to April 12th, 2023) and saved in a CSV file named *"QSFinancialTracking.csv"*.
</font>