Cyclistic Case Study

How does a bike-share navigate speedy success?

Cody LaBrie, Business Analyst

2024-10-07


Scenario

I’m a junior data analyst working on the marketing team at Cyclistic, a fictional bike-share company in Chicago. The director of marketing is focused on increasing the number of annual memberships, which they believe is key to the company’s future success. My goal is to understand how casual riders and annual members use Cyclistic bikes differently. By analyzing these patterns, I hope to design a marketing strategy that will help convert more casual riders into annual members. My recommendations will be backed by data insights and visualizations.

Setup R Environment

library(lubridate)
library(prettydoc)
library(ggplot2)
library(readr)
library(tidyverse)

Import Data

To ensure accuracy and relevance, I utilized the most current data available, spanning from January 2024 to August 2024. Upon importing this data, I organized it into a single dataframe, “year24_data,” to consolidate and manage the dataset efficiently.

jan24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202401-divvy-tripdata.csv") 
feb24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202402-divvy-tripdata.csv")
mar24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202403-divvy-tripdata.csv")
apr24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202404-divvy-tripdata.csv")
may24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202405-divvy-tripdata.csv")
jun24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202406-divvy-tripdata.csv")
jul24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202407-divvy-tripdata.csv")
aug24_data <- read_csv("C:/Users/codyl/OneDrive/Desktop/202408-divvy-tripdata.csv")

#combine the eight monthly files (Jan-Aug 2024) into one dataframe
year24_data <- rbind(jan24_data, feb24_data, mar24_data, apr24_data, may24_data, jun24_data, jul24_data, aug24_data)
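
The eight near-identical read_csv() calls above could also be written as a loop over the files in a folder. The following is only a minimal sketch under stated assumptions: the temporary folder, the two tiny demo files, and the names data_dir, monthly_files, and demo_data are hypothetical stand-ins so the example runs anywhere, not the actual download location or data.

```r
library(readr)
library(dplyr)

# Hypothetical stand-in for the download folder: two tiny monthly files
# written to a temporary directory (made-up rows, for illustration only).
data_dir <- file.path(tempdir(), "divvy_demo")
dir.create(data_dir, showWarnings = FALSE)
write_csv(tibble(ride_id = c("a1", "a2"), member_casual = c("member", "casual")),
          file.path(data_dir, "202401-divvy-tripdata.csv"))
write_csv(tibble(ride_id = "b1", member_casual = "member"),
          file.path(data_dir, "202402-divvy-tripdata.csv"))

# Find every monthly file by its naming pattern, read each one, and stack the rows.
monthly_files <- list.files(data_dir, pattern = "-divvy-tripdata\\.csv$", full.names = TRUE)
demo_data <- lapply(monthly_files, read_csv, show_col_types = FALSE) %>%
  bind_rows()
```

With this pattern, adding a new month only requires dropping its file into the folder.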

is.na(year24_data) %>% sum()
## [1] 2823066
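
A single grand total does not show which columns the missing values come from. As a rough sketch, a per-column count could look like the following; toy_rides and its station names are made-up stand-ins for year24_data, used only so the example is self-contained.

```r
library(dplyr)

# Toy stand-in for year24_data; the values are made up for illustration.
toy_rides <- tibble(
  ride_id = c("a1", "a2", "a3"),
  start_station_name = c("Clark St", NA, NA),
  end_station_name = c(NA, "State St", "Wells St")
)

# Count missing values in each column instead of one grand total.
na_per_column <- colSums(is.na(toy_rides))
```

Running the same colSums(is.na(...)) call on the full dataset would show whether the missing values are concentrated in a few columns before deciding what to drop.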

Data Cleanup & Process

Removed unnecessary columns

year24_data <- year24_data %>% select(-c(start_lat, start_lng, end_lat, end_lng, start_station_id,end_station_id, end_station_name))

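Created a day_of_week column based on Date/Time provided

#wday() with label = TRUE returns an ordered factor (Sun through Sat), so tables and graphs keep calendar order instead of alphabetical order
day_of_week <- wday(year24_data$started_at, label = TRUE)

#Add day of week vector to main dataset
year24_data <- cbind(day_of_week, year24_data)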

Created a month column based on Date/Time provided

month <- month(year24_data$started_at)
year24_data$month_name <- month(year24_data$started_at, label = TRUE, abbr = TRUE)  # For abbreviated names

#Add month number vector to main dataset
year24_data <- cbind(month, year24_data)

Created a ride_length_mins column based on started_at and ended_at columns

ride_length_mins <- as.numeric(difftime(year24_data$ended_at, year24_data$started_at, units = "secs")) / 60

#Add length of ride vector (in minutes) to main dataset
year24_data <- cbind(ride_length_mins, year24_data)
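
The three derived columns can also be created in one mutate() call, which avoids the separate cbind() steps. This is a sketch on a made-up two-row data frame (toy_rides is hypothetical, not the real dataset). The final check for zero or negative durations is an added assumption about data quality, not a step from the original analysis.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for year24_data with made-up timestamps.
toy_rides <- tibble(
  started_at = ymd_hms(c("2024-01-06 09:00:00", "2024-07-10 17:30:00")),
  ended_at   = ymd_hms(c("2024-01-06 09:25:00", "2024-07-10 17:42:30"))
)

# Derive weekday, month label, and ride length (minutes) in one step.
toy_rides <- toy_rides %>%
  mutate(
    day_of_week      = wday(started_at, label = TRUE),
    month_name       = month(started_at, label = TRUE, abbr = TRUE),
    ride_length_mins = as.numeric(difftime(ended_at, started_at, units = "mins"))
  )

# Assumption: rides with zero or negative duration may be bad records worth inspecting.
suspect_rides <- sum(toy_rides$ride_length_mins <= 0)
```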

Calculated how many rides members and casual riders take on each day of the week

year24_data %>% 
  mutate(day_of_week = wday(started_at, label = TRUE)) %>%  #creates weekday field using wday()
  group_by(member_casual, day_of_week ) %>%  #groups by usertype and weekday
  summarise(number_of_rides = n())
## # A tibble: 14 × 3
## # Groups:   member_casual [2]
##    member_casual day_of_week number_of_rides
##    <chr>         <ord>                 <int>
##  1 casual        Sun                  243043
##  2 casual        Mon                  163062
##  3 casual        Tue                  156226
##  4 casual        Wed                  185134
##  5 casual        Thu                  178479
##  6 casual        Fri                  215907
##  7 casual        Sat                  315103
##  8 member        Sun                  273800
##  9 member        Mon                  344997
## 10 member        Tue                  374590
## 11 member        Wed                  405132
## 12 member        Thu                  379644
## 13 member        Fri                  348272
## 14 member        Sat                  326175
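
To put a number on the weekend pattern, the counts in the table above can be turned into a weekend share per rider type. The sketch below re-enters those counts by hand; the names rides_by_day and weekend_share are illustrative only.

```r
library(dplyr)

# Ride counts copied from the summary table above (Sun through Sat).
rides_by_day <- tibble(
  member_casual = rep(c("casual", "member"), each = 7),
  day_of_week = rep(c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"), times = 2),
  number_of_rides = c(243043, 163062, 156226, 185134, 178479, 215907, 315103,
                      273800, 344997, 374590, 405132, 379644, 348272, 326175)
)

# Share of each group's rides that fall on Saturday or Sunday.
weekend_share <- rides_by_day %>%
  group_by(member_casual) %>%
  summarise(
    total_rides = sum(number_of_rides),
    weekend_rides = sum(number_of_rides[day_of_week %in% c("Sat", "Sun")]),
    weekend_share = weekend_rides / total_rides
  )
```

On these counts, roughly 38% of casual rides fall on a weekend compared with roughly 24% of member rides.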

Performed Calculations

mean_ride_length <- year24_data$ride_length_mins %>% mean()       
# Mean ride length is 18.33 minutes                                    

max_ride_length <- year24_data$ride_length_mins %>% max()
# Max is 1559.93 minutes

member_count <- sum(year24_data$member_casual == 'member')
# Total is 2452610

casual_count <- sum(year24_data$member_casual == 'casual')
# Total is 1456954
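
The overall mean above mixes both rider types. A per-group summary, sketched below on made-up numbers, would show ride counts and ride lengths side by side. toy_rides and length_by_group are hypothetical names, and the median is an added suggestion because a few very long rides can pull the mean upward.

```r
library(dplyr)

# Toy stand-in for year24_data; ride lengths are made up for illustration.
toy_rides <- tibble(
  member_casual = c("member", "member", "member", "casual", "casual"),
  ride_length_mins = c(8, 10, 12, 15, 45)
)

# Number of rides plus mean and median ride length for each rider type.
length_by_group <- toy_rides %>%
  group_by(member_casual) %>%
  summarise(
    rides = n(),
    mean_mins = mean(ride_length_mins, na.rm = TRUE),
    median_mins = median(ride_length_mins, na.rm = TRUE)
  )
```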

Visualizations

The graph above shows that members take more rides than casual riders on every day of the week, although the gap narrows on weekends. Casual riders ride most on Saturdays and Sundays, which suggests they use the bikes for leisure rather than as a main form of transportation.

Code used for graph:

ride_per_day_graph <- year24_data %>%                              
  group_by(member_casual, day_of_week) %>% 
  summarise(number_of_rides = n()) %>% 
  arrange(member_casual, day_of_week) %>%
  ggplot(aes(x = day_of_week, y = number_of_rides, fill = member_casual)) + geom_col(position = "dodge") + 
  labs(x='Day of Week', y='Total Number of Rides', title='Rides per Day of Week', fill = 'Type of Membership') + 
  scale_y_continuous(breaks = c(100000, 200000, 300000, 400000, 500000), labels = c("100K", "200K", "300K", "400K", "500K"))



The graph above shows the number of rides increasing significantly from January through August. This may reflect our company gaining traction and riders becoming accustomed to the lifestyle our product has to offer, but the data covers only winter through summer, so warmer weather is a likely contributor as well. We can also see a noticeable gap between member and casual riders.

Code used for graph:

ride_per_month_graph <- year24_data %>%                              
  group_by(member_casual, month_name) %>% 
  summarise(number_of_rides = n()) %>% 
  arrange(member_casual, month_name) %>%
  ggplot(aes(x = month_name, y = number_of_rides, fill = member_casual)) + geom_col(position = "dodge") + 
  labs(x='Month', y='Total Number of Rides', title='Rides per Month', fill = 'Type of Membership') + 
  scale_y_continuous(breaks = c(100000, 200000, 300000, 400000, 500000), labels = c("100K", "200K", "300K", "400K", "500K"))



The graph above shows a fairly equal distribution of bike types across both rider groups. With no clear preference in either group, bike type is left out of the marketing suggestions for now.

Code used for graph:

type_of_bike_graph <- year24_data %>%                              
  group_by(member_casual, rideable_type) %>% 
  summarise(number_of_rides = n()) %>% 
  arrange(member_casual, rideable_type) %>%
  ggplot(aes(x = rideable_type, y = number_of_rides, fill = member_casual)) + geom_col(position = "dodge") + 
  labs(x='Type of Bike', y='Total Number of Rides', title='Bikes vs Membership', fill = 'Type of Membership') +
  scale_y_continuous(breaks = c(200000, 400000, 600000, 800000, 1000000, 1200000), labels = c("200K", "400K", "600K", "800K", "1M", "1.2M"))



The graph above shows that casual riders have a higher monthly average ride length even though members take more rides overall. This was surprising, so further analysis was needed to look at the daily average.

Code used for graph:

ride_length_month_graph <- year24_data %>%
  group_by(member_casual, month_name) %>% 
  summarise(average_ride_length = mean(ride_length_mins, na.rm = TRUE)) %>% 
  arrange(month_name, member_casual) %>%
  ggplot(aes(x = month_name, y = average_ride_length, fill = member_casual)) + 
  geom_col(position = "dodge") + 
  labs(x = 'Month', 
       y = 'Average Ride Length (mins)', 
       title = 'Monthly Rider Average', 
       fill = 'Type of Membership')



The graph above shows that casual riders also have the higher average ride length when broken down by day of the week.

Code used for graph:

ride_length_days_graph <- year24_data %>%
  group_by(member_casual, day_of_week) %>% 
  summarise(average_ride_length = mean(ride_length_mins, na.rm = TRUE)) %>% 
  arrange(day_of_week, member_casual) %>%
  ggplot(aes(x = day_of_week, y = average_ride_length, fill = member_casual)) + 
  geom_col(position = "dodge") + 
  labs(x = 'Day of Week', 
       y = 'Average Ride Length (mins)', 
       title = 'Daily Rider Average', 
       fill = 'Type of Membership')



Key Takeaways

  1. Frequency of rides:
    • Members ride frequently but for shorter durations on individual days.
    • Summed across an entire month, their total ride time can still be high.
    • Casual riders might take fewer rides but for longer periods.
    • This is what led to a higher daily/monthly average ride length for casual riders.
  2. Weekend vs. Weekday patterns:
    • Casual riders tend to use services more on weekends or for leisure purposes.
    • These leisure rides often involve longer durations.
    • Members, on the other hand, might use the service for commuting or daily tasks.
    • This results in more frequent, shorter rides spread across many days.
  3. Ride purpose:
    • Members likely have access to the bikes as part of their routine.
    • This leads to shorter, utilitarian rides (e.g., commuting or errands).
    • These shorter rides add up to a high monthly total but lower daily averages.
    • Casual riders might take longer, recreational rides, especially on weekends, leading to higher average ride times on certain days.

Thus, members tend to have more consistent, frequent rides, while casual riders may have fewer but longer rides.

Recommendations

  1. Targeted marketing on weekends:
    • Since casual riders ride most often on weekends, promote membership plans during these peak periods.
    • Offer limited-time discounts or promotions specifically aimed at weekend riders, emphasizing the long-term savings and convenience of membership.
  2. Highlight the cost benefits of membership for frequent riders:
    • Create clear messaging about the financial advantages of membership for those who ride regularly.
    • Use ride data to show casual riders how much they could save over time by switching to a membership plan.
  3. Incentivize frequent casual riders:
    • Identify casual riders who use the service regularly and provide them with personalized offers to join as members (e.g., offer the first month free or at a discount).
    • Send targeted communications to these frequent users, showing how membership could enhance their experience.
  4. Offer flexible membership options:
    • Introduce flexible, short-term membership options (e.g., 1-month or 3-month plans) to appeal to casual riders who may not want a long-term commitment.
    • Highlight how these options can provide the same benefits without a lengthy commitment.
  5. Provide perks for members that enhance the riding experience:
    • Offer exclusive benefits to members, such as access to premium bike stations, early access to new features, or discounts on accessories.
    • Ensure that casual riders are aware of these perks through app notifications, emails, and in-person promotions at bike stations.
  6. Focus on convenience and accessibility:
    • Emphasize the convenience of being a member for routine or utilitarian rides (e.g., faster access to bikes, no need to book in advance).
  7. Engage casual riders through loyalty programs:
    • Develop a loyalty or rewards program that allows casual riders to earn points towards discounts on membership or free ride time.
    • Encourage casual riders to engage more with the service and experience the benefits of transitioning to a membership model.