This case study analyzes Divvy’s bike-sharing dataset, covering the last 12 months of trips up to February 2025. The data was obtained from the official Divvy public repository (https://divvy-tripdata.s3.amazonaws.com/index.html). The main objective is to compare usage patterns between casual riders and annual members to better understand rider behavior. Key guiding questions include: How do members and casual riders differ in trip duration? On which days and at what times are bikes used most frequently? What trends emerge when comparing usage across weekdays, weekends, and months?
The raw dataset, which exceeded 1.5 GB in size, was carefully prepared and cleaned to ensure accuracy and consistency. Missing values and duplicates were checked and addressed, and trip duration was calculated in both minutes and hours for meaningful comparison. After confirming the dataset’s quality and trustworthiness, it was aggregated into summaries that highlight key differences between rider groups and usage trends over time. This foundation supports reliable insights and effective visualization for decision-making.
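The cleaning steps described above can be sketched in base R as follows. This is a minimal illustration on a toy data frame, not the code from the original report: the column names (`ride_id`, `started_at`, `ended_at`, `member_casual`) are assumed to match the public Divvy CSV layout, and the rule that drops non-positive durations is an added assumption.

```r
# Toy stand-in for the raw trip data. Column names are assumptions.
trips <- data.frame(
  ride_id       = c("A1", "A2", "A2", "A3", "A4"),
  started_at    = c("2024-06-01 08:00:00", "2024-06-01 09:00:00",
                    "2024-06-01 09:00:00", "2024-06-02 17:30:00", NA),
  ended_at      = c("2024-06-01 08:20:00", "2024-06-01 10:30:00",
                    "2024-06-01 10:30:00", "2024-06-02 17:15:00",
                    "2024-06-03 12:00:00"),
  member_casual = c("member", "casual", "casual", "member", "casual"),
  stringsAsFactors = FALSE
)

# Parse timestamps. UTC is used only to keep the example deterministic.
fmt <- "%Y-%m-%d %H:%M:%S"
trips$started_at <- as.POSIXct(trips$started_at, format = fmt, tz = "UTC")
trips$ended_at   <- as.POSIXct(trips$ended_at, format = fmt, tz = "UTC")

# Drop rows with missing key fields, then drop repeated ride IDs.
key_cols <- c("ride_id", "started_at", "ended_at", "member_casual")
clean <- trips[complete.cases(trips[, key_cols]), ]
clean <- clean[!duplicated(clean$ride_id), ]

# Trip duration in minutes and in hours.
clean$duration_min  <- as.numeric(difftime(clean$ended_at, clean$started_at,
                                           units = "mins"))
clean$duration_hour <- clean$duration_min / 60

# Assumed rule: discard trips whose end time is not after the start time.
clean <- clean[clean$duration_min > 0, ]
```

On the toy data this keeps two rides: the duplicate `A2`, the row with a missing start time, and the ride with a negative duration are all removed.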
The purpose of this case study is to demonstrate the end-to-end data analytics process in R, from importing and cleaning raw data to transforming and visualizing insights using R’s analytical libraries. The dataset represents real-world bike-share trips from the Divvy bike system, covering 12 months of data and totaling more than 1.5 GB of trip records.
This project showcases key data analysis skills such as:
Data wrangling and cleaning: managing large CSV files, removing inconsistencies, and preparing clean, structured data for analysis. (The complete data cleaning process is documented in this HTML RMarkdown file.)
Data transformation and aggregation: summarizing trip data to compare member and casual rider behavior, usage by time of day, and weekly or monthly trends.
Analytical interpretation: identifying usage patterns, seasonality, and behavioral differences that could inform operational or marketing strategies.
Data visualization: using R’s visualization tools (such as ggplot2) to produce clear, data-driven visuals directly within the RMarkdown report.
Overall, this case study aims to demonstrate proficiency in a practical data analytics workflow (cleaning, transforming, aggregating, and visualizing large-scale datasets entirely within R) while providing insights into user behavior in the bike-sharing context.
After cleaning and aggregating the dataset, several visualizations were created in R to better understand ride patterns and differences between casual riders and annual members.
Explanation: Casual riders tend to have significantly longer ride durations compared to members. This may suggest that casual users use bikes more for leisure or sightseeing, while members use them for short, routine commutes.
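A comparison like this one comes down to a grouped summary of trip duration. The sketch below shows one way to compute it in base R. The toy values and the column names `member_casual` and `duration_min` are illustrative assumptions, not figures from the dataset.

```r
# Toy durations in minutes. Values are made up for illustration.
rides <- data.frame(
  member_casual = c("member", "member", "member", "casual", "casual", "casual"),
  duration_min  = c(8, 12, 10, 25, 40, 16)
)

# Mean and median duration per rider type. Groups come back in alphabetical order.
mean_min   <- tapply(rides$duration_min, rides$member_casual, mean)
median_min <- tapply(rides$duration_min, rides$member_casual, median)

duration_summary <- data.frame(
  member_casual = names(mean_min),
  mean_min      = as.numeric(mean_min),
  median_min    = as.numeric(median_min)
)
print(duration_summary)
```

Reporting the median next to the mean is a deliberate choice here: a handful of very long rides can inflate the mean, so the median gives a better sense of a typical trip.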
To further validate our hypothesis about user behavior, it’s important to compare how long each rider type typically uses the bikes. By analyzing the distribution of trip durations, we can better understand the differences between casual riders and members — for instance, whether rides usually last under 30 minutes, between 30–60 minutes, or over an hour.
The following chart displays total rides by duration range for each membership type.
The following plots clearly show that the majority of users tend to use the bikes for less than 30 minutes. For longer durations, the percentage of users decreases noticeably. However, casual riders show slightly higher usage in the longer ride categories compared to members, suggesting that casual users are more likely to take extended trips.
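The duration ranges discussed above (under 30 minutes, 30 to 60 minutes, an hour or more) can be built with `cut()`. The following is a small sketch on made-up data. The column names and the exact bin edges are assumptions, since the report does not state how boundary values were assigned.

```r
# Toy durations in minutes. Values are made up for illustration.
rides <- data.frame(
  member_casual = rep(c("member", "casual"), each = 4),
  duration_min  = c(5, 12, 28, 45, 10, 35, 75, 20)
)

# Left-closed bins: [0, 30), [30, 60), [60, Inf).
rides$duration_range <- cut(
  rides$duration_min,
  breaks = c(0, 30, 60, Inf),
  labels = c("under 30 min", "30 to 60 min", "60 min or more"),
  right  = FALSE
)

# Ride counts per rider type and range, plus row-wise percentages.
counts <- table(rides$member_casual, rides$duration_range)
shares <- round(100 * prop.table(counts, margin = 1), 1)
print(shares)
```

Row-wise percentages make the two groups comparable even when one group records far more rides than the other.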
We can also analyze the total number of rides per day of the week to determine whether users tend to ride more during weekends or weekdays. This helps reveal patterns in how different rider types use the service — for leisure or for daily commuting purposes.
The results show that annual members record a higher number of rides overall compared to casual riders. However, there’s a clear trend where member usage decreases toward the end of the week, while casual rider activity increases during weekends. This suggests that members primarily use the bikes for weekday commuting, whereas casual riders are more likely to use them for leisure or recreational purposes.
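A day-of-week breakdown of this kind can be derived from the trip start timestamp. The sketch below uses `as.POSIXlt()` so that the weekday does not depend on the system locale. The data and column names are illustrative assumptions.

```r
# Toy rides. 2024-06-03 is a Monday, 2024-06-08 a Saturday.
rides <- data.frame(
  started_at = as.POSIXct(
    c("2024-06-03 08:05:00", "2024-06-04 17:10:00", "2024-06-05 08:15:00",
      "2024-06-08 10:00:00", "2024-06-08 14:00:00", "2024-06-08 15:30:00",
      "2024-06-09 11:45:00", "2024-06-03 13:00:00"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  member_casual = c("member", "member", "member", "member",
                    "casual", "casual", "casual", "casual")
)

# POSIXlt stores the weekday as 0 (Sunday) to 6 (Saturday).
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
rides$day_of_week <- factor(
  day_names[as.POSIXlt(rides$started_at)$wday + 1],
  levels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
)

# Rides per day for each rider type.
rides_per_day <- table(rides$member_casual, rides$day_of_week)

# Share of each rider type's rides that fall on a weekend.
rides$is_weekend <- rides$day_of_week %in% c("Sat", "Sun")
weekend_share <- tapply(rides$is_weekend, rides$member_casual, mean)
print(rides_per_day)
print(weekend_share)
```

The weekend share condenses the weekday versus weekend contrast into a single number per rider type, which is convenient when the two groups differ a lot in total volume.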
Next, we can examine whether these differences in usage also vary across the seasons of the year. By analyzing the total number of rides per month, we can identify seasonal trends and understand how weather or time of year affects bike usage among both casual riders and annual members.
The results show that both rider types follow the same seasonal pattern,
with similar rises and drops in ride counts throughout the year. This
indicates that seasonal effects impact all users in the same way, and
that factors such as weather or temperature influence overall bike usage
rather than creating differences between casual riders and annual
members.
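One way to check whether two groups share the same seasonal shape is to compare each group's monthly share of its own annual total rather than raw counts. The sketch below illustrates the idea on made-up data with only two months. The column names are assumptions.

```r
# Toy rides: members ride twice as often, but with the same seasonal split.
rides <- data.frame(
  started_at = as.POSIXct(
    c(rep("2024-01-15 09:00:00", 3), rep("2024-07-15 09:00:00", 9)),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  member_casual = c("member", "member", "casual",
                    rep("member", 6), rep("casual", 3))
)

# Year-month label for each ride.
rides$month <- format(rides$started_at, "%Y-%m")

# Raw counts per rider type and month.
rides_per_month <- table(rides$member_casual, rides$month)

# Each month's share of the rider type's own total.
monthly_share <- prop.table(rides_per_month, margin = 1)
print(rides_per_month)
print(monthly_share)
```

In this toy example the raw counts differ (6 versus 3 rides in July), yet both groups place 75% of their rides in July, which is the kind of agreement the paragraph above describes.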
Finally, to complete the analysis and provide additional evidence, we examine rides per hour to observe how bike usage varies throughout a single day. This allows us to identify peak usage hours and understand differences in daily patterns between casual riders and annual members.
The chart shows that annual members exhibit two clear peak periods,
around 8 AM and 5 PM, reflecting typical commuting times to and from
work. In contrast, casual riders’ activity gradually increases
throughout the day, peaking in the late afternoon, which suggests that
casual users primarily ride for leisure and recreational purposes rather
than commuting.
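Peak hours like these can be read off an hourly count table. The following sketch extracts the start hour from each timestamp and reports the busiest hour or hours per rider type. The data and column names are illustrative assumptions, and UTC is used only to keep the example deterministic.

```r
# Toy rides on a single day. Values are made up for illustration.
rides <- data.frame(
  started_at = as.POSIXct(
    c("2024-06-03 08:05:00", "2024-06-03 08:40:00", "2024-06-03 12:10:00",
      "2024-06-03 17:15:00", "2024-06-03 17:50:00",
      "2024-06-03 10:20:00", "2024-06-03 13:30:00", "2024-06-03 16:05:00",
      "2024-06-03 16:45:00", "2024-06-03 19:10:00"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  member_casual = rep(c("member", "casual"), each = 5)
)

# Start hour (0 to 23). Fixed levels keep hours without rides in the table.
rides$hour <- factor(as.POSIXlt(rides$started_at)$hour, levels = 0:23)

# Rides per hour for each rider type.
rides_per_hour <- table(rides$member_casual, rides$hour)

# All hours that reach the maximum count, so tied peaks are both reported.
peak_hours <- apply(rides_per_hour, 1, function(x) {
  paste(colnames(rides_per_hour)[x == max(x)], collapse = ", ")
})
print(peak_hours)
```

Reporting every hour tied for the maximum, instead of only the first, lets a two-peak commuter pattern show up as two hours.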
This analysis of Divvy bike-share data provides clear insights into user behavior patterns over the last 12 months. The key findings are:
Ride Duration – The majority of trips last less than 30 minutes, with casual riders showing slightly more frequent longer trips than members, suggesting leisure-oriented usage.
Day-of-Week Usage – Annual members ride predominantly on weekdays, while casual riders peak on weekends, reflecting commuting vs. recreational patterns.
Monthly Trends – Both rider types follow similar seasonal trends, indicating that external factors such as weather influence overall bike usage, but do not differentiate between user types.
Hourly Patterns – Members exhibit two peak hours (8 AM and 5 PM) corresponding to work commute times, whereas casual riders’ activity rises gradually throughout the day and peaks in the late afternoon, consistent with recreational use.
Targeted Marketing & Promotions: Focus marketing campaigns for casual riders around weekends and afternoons, while offering member incentives for weekday commuting.
Fleet Management: Ensure more bikes are available during peak commuter hours (8 AM & 5 PM) in high-demand locations for members, and redistribute bikes toward popular leisure areas during weekends for casual riders.
Service Improvements: Consider special weekend packages, recreational route suggestions, or timed discounts for casual users to increase engagement.
Future Analysis: Additional studies could explore trip start/end locations, trip frequency by geographic region, or the effect of special events and weather conditions on usage.
Overall, the cleaned and aggregated dataset provides reliable insights that can inform operational decisions, marketing strategies, and user engagement initiatives for the bike-share system.