Introduction
The focus of this assignment was to develop data visualizations from the students’ own personal data, in the spirit of the Quantified Self movement, which uses personal data to gain deeper insight into one’s own life. This project analyzed the physical activity of two students, Kelia Olughu and Puteri Sara Lily Fauzi, labelled Person 1 and Person 2. Data was collected over a 61-day period from June 1 to July 31, 2025. The goal was to identify and visualize patterns in their daily routines, exercise habits, and overall activity levels.
This analysis was guided by five key questions, listed here in the order in which the visualizations address them:
1. How do the average daily step counts and minutes of exercise compare between Person 1 and Person 2?
2. How does the total exercise time break down by exercise type for each person?
3. What is the daily trend of step counts for each person over the two-month period?
4. Is there a relationship between the minutes of exercise and the daily step count for each person?
5. What are the activity patterns for each person on different days of the week?
Data Acquisition, Storage, and Manipulation
The data for this project was collected for two individuals over a two-month period and included daily step counts, duration of exercise (in minutes), and type of exercise (e.g., Running, Walking). It was captured automatically by personal fitness trackers integrated with smartphone health applications, then manually recorded into a single CSV (Comma-Separated Values) file. The data was structured in a “long” format, where each row corresponds to a single day for a single person. This format is well suited to analysis and visualization in R and other data analysis tools. The columns were Person, Date, Day of Week, Daily Step Count, Minutes of Exercise, and Exercise Type.
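As a rough illustration of the long format described above, the sketch below writes a few made-up rows to a temporary CSV and reads them back with base R. The row values are invented for the example and are not the students' data; the header names follow the column list above, while the file path and the use of `read.csv` are assumptions.

```r
# Toy rows for illustration only; these are NOT the project's real measurements.
csv_text <- c(
  "Person,Date,Day of Week,Daily Step Count,Minutes of Exercise,Exercise Type",
  "Person 1,2025-06-01,Sunday,8000,45,Running",
  "Person 2,2025-06-01,Sunday,9000,30,Walking",
  "Person 1,2025-06-02,Monday,6500,0,None",
  "Person 2,2025-06-02,Monday,7000,40,Running"
)

# Hypothetical file location: a temporary file stands in for the real CSV.
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)

# check.names = FALSE keeps headers such as "Day of Week" exactly as written.
activity <- read.csv(path, check.names = FALSE, stringsAsFactors = FALSE)

# Convert the Date column from text to a proper Date type.
activity$Date <- as.Date(activity$Date, format = "%Y-%m-%d")

# One row per person per day: 2 people x 2 days = 4 rows.
str(activity)
```

Each person-day pair occupies exactly one row, so grouping by Person or by Day of Week needs no reshaping first.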
The entire project, from data cleaning to final dashboard creation, was performed using the R programming language. The primary packages used were from the tidyverse suite (dplyr for data manipulation and ggplot2 for creating static plots). The plotly package was used to transform the static plots into the interactive graphical elements required by the assignment. The final dashboard was built using the flexdashboard package, which compiles the analysis and text into the required single HTML file deliverable.
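The step from a static plot to an interactive one might look like the following minimal sketch. It assumes the conversion was done with `plotly::ggplotly()`, which is the usual route but is not confirmed by the report; the toy data frame and its column names are hypothetical.

```r
library(ggplot2)
library(plotly)

# Toy data for illustration only.
toy <- data.frame(
  day = 1:5,
  steps = c(7000, 8200, 6400, 9100, 7600)
)

# Build the static ggplot2 chart first.
static_plot <- ggplot(toy, aes(x = day, y = steps)) +
  geom_line() +
  labs(x = "Day", y = "Daily step count")

# Convert it to an interactive plotly widget (hover tooltips, zoom, pan).
interactive_plot <- ggplotly(static_plot)
```

Inside a flexdashboard R Markdown chunk, printing `interactive_plot` would embed the widget in the rendered HTML page.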
Analysis and Visualization Summaries
This section provides a written summary for each visualization presented in the dashboard.
Visualization 1: Comparison of Average Daily Metrics
A grouped bar chart was chosen to provide a direct comparison of the average Daily Step Count and Minutes of Exercise for Person 1 and Person 2. It aggregates the 61 days of data into two summary metrics for each individual and so addresses the first question: how do the average daily step counts and minutes of exercise compare between the two? The visualization reveals a distinction in activity styles. On average, Person 2 achieves a higher daily step count (8,590) than Person 1 (7,921). Conversely, Person 1 averages slightly more time in dedicated exercise sessions (50 minutes) than Person 2 (48 minutes). This suggests that while Person 1 may focus more on structured workouts, Person 2’s lifestyle incorporates more general activity throughout the day, leading to a higher overall step count.
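The aggregation behind a chart like this can be sketched as follows. The data frame is a toy example with invented values, and the column names (`person`, `steps`, `exercise_min`) are hypothetical stand-ins for the CSV columns; the report does not show the actual code.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy data for illustration only; not the project's real measurements.
toy <- data.frame(
  person = rep(c("Person 1", "Person 2"), each = 3),
  steps = c(7000, 8000, 9000, 8000, 9000, 10000),
  exercise_min = c(40, 50, 60, 30, 45, 60)
)

# One row per person, holding the mean of each metric.
averages <- toy %>%
  group_by(person) %>%
  summarise(
    avg_steps = mean(steps),
    avg_exercise_min = mean(exercise_min),
    .groups = "drop"
  )

# Reshape so that each (person, metric) pair is its own row for plotting.
averages_long <- averages %>%
  pivot_longer(
    cols = c(avg_steps, avg_exercise_min),
    names_to = "metric",
    values_to = "average"
  )

# Grouped bars: one cluster per metric, one bar per person.
p <- ggplot(averages_long, aes(x = metric, y = average, fill = person)) +
  geom_col(position = "dodge") +
  labs(x = NULL, y = "Daily average", fill = NULL)
```

Because step counts are in the thousands and exercise minutes in the tens, the minutes bars can be hard to read on a shared axis. Adding `facet_wrap(~ metric, scales = "free")` is one possible way to give each metric its own scale.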
Visualization 2: Breakdown of Total Exercise Minutes by Type
To show how each person allocates their workout time, faceted bar charts were used. This method creates a separate plot for each individual, showing the total minutes accumulated for each Exercise Type (excluding days with “None”). The charts highlight distinct exercise preferences. Person 1 heavily favors Running, which accounts for the vast majority of their exercise time. Person 2, in contrast, displays a more balanced routine, splitting their time more evenly between Running and Walking than Person 1. This visualization effectively illustrates their different approaches to fitness.
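A faceted breakdown of this kind could be produced along the lines below. This is a sketch on invented toy data with hypothetical column names (`person`, `exercise_type`, `exercise_min`), not the code used for the dashboard.

```r
library(dplyr)
library(ggplot2)

# Toy data for illustration only.
toy <- data.frame(
  person = c("Person 1", "Person 1", "Person 1",
             "Person 2", "Person 2", "Person 2"),
  exercise_type = c("Running", "Walking", "None",
                    "Running", "Walking", "Running"),
  exercise_min = c(60, 20, 0, 30, 40, 25)
)

# Drop rest days, then total the minutes per person and exercise type.
totals <- toy %>%
  filter(exercise_type != "None") %>%
  group_by(person, exercise_type) %>%
  summarise(total_min = sum(exercise_min), .groups = "drop")

# One panel per person, one bar per exercise type.
p <- ggplot(totals, aes(x = exercise_type, y = total_min, fill = exercise_type)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ person) +
  labs(x = "Exercise type", y = "Total minutes")
```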
Visualization 3: Daily Step Count Trend
A line chart was used to plot the daily step count for each person over the full two-month period. This type of chart is well suited to showing trends over time, including volatility and recurring patterns. The chart shows considerable day-to-day fluctuation in activity for both individuals. Notably, there are several points where their activity levels peak and dip in unison, which may reflect shared activities on those days. Overall, Person 2’s line sits slightly above Person 1’s for most of the observed period.
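A two-series trend line of this sort might be drawn as in the sketch below. The dates fall inside the study window, but the step values are invented and the column names are hypothetical.

```r
library(ggplot2)

# Toy data for illustration only: four days for each person.
toy <- data.frame(
  person = rep(c("Person 1", "Person 2"), each = 4),
  date = rep(seq(as.Date("2025-06-01"), by = "day", length.out = 4), times = 2),
  steps = c(7000, 9500, 6200, 8800, 7600, 9900, 6900, 9100)
)

# One line per person; mapping colour to person also groups the lines.
p <- ggplot(toy, aes(x = date, y = steps, colour = person)) +
  geom_line() +
  scale_x_date(date_labels = "%b %d") +
  labs(x = "Date", y = "Daily step count", colour = NULL)
```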
Visualization 4: Relationship Between Exercise and Step Count
To explore the correlation between two continuous variables, Minutes of Exercise and Daily Step Count, a scatter plot was created. A linear regression trend line was overlaid for each person to show the strength and direction of the relationship. There is a clear positive correlation for both individuals: more time spent on dedicated exercise is associated with a higher total daily step count. This suggests that formal workouts are a major contributor to their overall physical activity. The data points for Person 1 appear slightly more tightly clustered around the trend line, suggesting a marginally more consistent relationship between their exercise time and step accumulation.
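The visual impression of a positive relationship can be backed by a number. The sketch below, on invented toy data with hypothetical column names, computes a per-person Pearson correlation and overlays per-person linear fits with `geom_smooth(method = "lm")`. Reporting a correlation coefficient is a suggestion here and is not something the report states was done.

```r
library(dplyr)
library(ggplot2)

# Toy data for illustration only.
toy <- data.frame(
  person = rep(c("Person 1", "Person 2"), each = 5),
  exercise_min = c(0, 20, 40, 60, 80, 0, 20, 40, 60, 80),
  steps = c(5000, 6100, 7300, 8000, 9400, 6000, 6500, 8200, 8400, 10100)
)

# Pearson correlation between exercise minutes and steps, per person.
correlations <- toy %>%
  group_by(person) %>%
  summarise(r = cor(exercise_min, steps), .groups = "drop")

# Scatter plot with one least-squares trend line per person.
p <- ggplot(toy, aes(x = exercise_min, y = steps, colour = person)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Minutes of exercise", y = "Daily step count", colour = NULL)
```

A value of `r` closer to 1 corresponds to points lying closer to the trend line, which is one way to quantify a claim about "tighter clustering".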
Visualization 5: Average Daily Steps by Day of the Week
A grouped bar chart was used to display the average step count for each day of the week, with bars grouped by person. This allows for easy comparison of weekly routines. A clear weekly pattern emerges for both individuals. Activity levels are consistently lower on weekdays (Tuesday through Friday) and rise markedly over the weekend. For both Person 1 and Person 2, Saturday is the most active day on average. This pattern suggests that work or other weekday commitments limit their opportunity for physical activity, which they compensate for on the weekends.
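A day-of-week summary like this one could be computed as sketched below, using invented toy data and hypothetical column names. The one detail worth noting is the explicit factor ordering: without it, the days would be plotted alphabetically rather than in calendar order.

```r
library(dplyr)
library(ggplot2)

# Calendar order for the x-axis (a Monday start is an assumption).
day_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")

# Toy data for illustration only.
toy <- data.frame(
  person = rep(c("Person 1", "Person 2"), each = 4),
  day_of_week = rep(c("Saturday", "Tuesday", "Saturday", "Tuesday"), times = 2),
  steps = c(11000, 6000, 12000, 7000, 12000, 7000, 13000, 8000)
)

# Average steps per person and weekday, with days as an ordered factor.
weekday_avg <- toy %>%
  mutate(day_of_week = factor(day_of_week, levels = day_levels)) %>%
  group_by(person, day_of_week) %>%
  summarise(avg_steps = mean(steps), .groups = "drop")

# Grouped bars: one cluster per weekday, one bar per person.
p <- ggplot(weekday_avg, aes(x = day_of_week, y = avg_steps, fill = person)) +
  geom_col(position = "dodge") +
  labs(x = "Day of week", y = "Average daily steps", fill = NULL)
```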