Cyclistic, a Chicago-based bike-share company, offers bikes to both casual riders and annual members. The marketing team aims to convert more casual riders into annual members to increase long-term revenue.
As a data analyst, my goal is to identify how annual members and casual riders use Cyclistic differently, and provide insights to support a targeted marketing strategy.
Key Question:
> How do annual members and casual riders differ in their bike usage patterns?
This analysis uses historical ride data provided by Cyclistic, a fictional bike-share program in Chicago. The dataset is part of the Google Data Analytics Capstone Case Study and reflects real-world data from Divvy, Chicago’s actual bike-share service.
The data includes two quarters:
- `Divvy_Trips_2019_Q1.csv`
- `Divvy_Trips_2020_Q1.csv`

Each record represents a single bike trip and contains information such as start and end times, start and end station names and IDs, and rider type.
To ensure consistency and comparability across the two datasets (2019 Q1 and 2020 Q1), several preprocessing steps were performed:
- Column names (e.g., `member_casual` in 2020 vs. `usertype` in 2019) were standardized.
- For the 2020 data, `tripduration` was computed from start and end times (in minutes).
- User types were harmonized into two labels:
  - "Subscriber": includes member (2020) and Subscriber (2019)
  - "Customer": includes casual (2020) and Customer (2019)

Below is the code that performed this processing:
```r
library(dplyr)

# Combine 2019 and 2020 with clean column names and labels

# 2019 already uses the target column names, so only keep the needed columns
data2019_clean <- data2019 %>%
  select(
    usertype,
    start_time,
    end_time,
    from_station_name,
    to_station_name,
    from_station_id,
    to_station_id,
    tripduration
  )

# 2020 uses different column names: derive the trip duration in minutes
# from the timestamps, then rename the columns to match 2019
data2020_clean <- data2020 %>%
  mutate(
    tripduration = as.numeric(difftime(ended_at, started_at, units = "mins"))
  ) %>%
  select(
    usertype = member_casual,
    start_time = started_at,
    end_time = ended_at,
    from_station_name = start_station_name,
    to_station_name = end_station_name,
    from_station_id = start_station_id,
    to_station_id = end_station_id,
    tripduration
  )

# Merge and harmonize user types
data_all <- bind_rows(data2019_clean, data2020_clean) %>%
  mutate(
    usertype = case_when(
      usertype %in% c("member", "Subscriber") ~ "Subscriber",
      usertype %in% c("casual", "Customer") ~ "Customer",
      TRUE ~ usertype  # leave any unexpected label unchanged
    )
  )
```
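One thing worth checking after the merge is that both years express `tripduration` in the same unit and that only two user-type labels remain. The sketch below assumes the 2019 column is recorded in seconds while the 2020 value is derived in minutes; that unit assumption, the toy tables `trips_2019` and `trips_2020`, and their values are hypothetical stand-ins rather than the real data.

```r
library(dplyr)

# Hypothetical toy rows standing in for the cleaned 2019 and 2020 tables
trips_2019 <- tibble(
  usertype = c("Subscriber", "Customer"),
  tripduration = c(600, 1800)  # ASSUMPTION: seconds
)
trips_2020 <- tibble(
  usertype = c("member", "casual"),
  tripduration = c(10, 30)     # minutes, as derived with difftime()
)

# Convert the assumed-seconds column to minutes before stacking the tables
trips_all <- bind_rows(
  trips_2019 %>% mutate(tripduration = tripduration / 60),
  trips_2020
) %>%
  mutate(
    usertype = case_when(
      usertype %in% c("member", "Subscriber") ~ "Subscriber",
      usertype %in% c("casual", "Customer") ~ "Customer",
      TRUE ~ usertype
    )
  )

# Only the two harmonized labels should remain
label_counts <- count(trips_all, usertype)
print(label_counts)
```

If the real 2019 column turns out to be in minutes already, the division step would simply be dropped.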
Insight: Male and female riders show different subscription preferences. Given that female riders have a higher share of Customers, targeted promotions toward female casual riders may be an effective strategy to boost membership.
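A comparison like this can be sketched as a within-group proportion. The example assumes a `gender` column is available on the ride table; the `rides` data frame and its values are hypothetical and only illustrate the calculation.

```r
library(dplyr)

# Hypothetical toy data with an assumed gender column
rides <- tibble(
  gender = c("Male", "Male", "Male", "Male",
             "Female", "Female", "Female", "Female"),
  usertype = c("Subscriber", "Subscriber", "Subscriber", "Customer",
               "Subscriber", "Subscriber", "Customer", "Customer")
)

# Share of each user type within each gender (shares sum to 1 per gender)
gender_share <- rides %>%
  count(gender, usertype) %>%
  group_by(gender) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(gender_share)
```

Normalizing within each gender, rather than over all trips, keeps the comparison fair when one group takes far more trips than the other.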
Insight: Riders born after 2000 show the highest Customer share (~87%), whereas those born in the 1980s are more likely to be Subscribers. This suggests that younger riders are more likely to ride occasionally. To convert them into members, Cyclistic may offer student/young adult discounts or flexible membership tiers.
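Grouping riders by birth decade is one way to arrive at shares like these. The sketch below assumes a numeric `birthyear` column; the toy values are hypothetical and do not reproduce the reported ~87%.

```r
library(dplyr)

# Hypothetical toy data with an assumed birthyear column
rides <- tibble(
  birthyear = c(1982, 1985, 1989, 1994, 2001, 2003, NA),
  usertype = c("Subscriber", "Subscriber", "Customer", "Subscriber",
               "Customer", "Customer", "Customer")
)

decade_share <- rides %>%
  filter(!is.na(birthyear)) %>%                    # drop rows with no recorded birth year
  mutate(decade = floor(birthyear / 10) * 10) %>%  # e.g. 1985 -> 1980
  group_by(decade) %>%
  summarise(
    trips = n(),
    customer_share = mean(usertype == "Customer")  # fraction of trips by Customers
  )

print(decade_share)
```

Reporting `trips` next to the share makes it easier to spot decades where a high percentage rests on very few rides.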
Insight: Customer trips cluster around tourist areas and recreational hotspots, such as parks and waterfronts. In contrast, Subscriber rides are more concentrated near downtown hubs and transit-heavy areas, indicating a likely commuter behavior. This geographic segmentation offers opportunities for targeted campaigns based on location and trip purpose (e.g., leisure vs. daily transit).
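A simple way to surface this kind of geographic pattern is to rank start stations by trip count within each user type. The station names below are hypothetical placeholders, not real Divvy stations.

```r
library(dplyr)

# Hypothetical toy data; station names are placeholders
rides <- tibble(
  usertype = c("Customer", "Customer", "Customer",
               "Subscriber", "Subscriber", "Subscriber"),
  from_station_name = c("Lakefront Stop", "Lakefront Stop", "Park Stop",
                        "Downtown Stop", "Downtown Stop", "Transit Stop")
)

# Busiest start station per user type
top_stations <- rides %>%
  count(usertype, from_station_name, name = "trips") %>%
  group_by(usertype) %>%
  slice_max(trips, n = 1, with_ties = FALSE) %>%
  ungroup()

print(top_stations)
```

Raising `n` in `slice_max()` returns a longer ranking per group, which is usually more informative on the full dataset.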
Insight: Customers take much longer trips on average (~1266 seconds) than Subscribers (~402 seconds), suggesting they may be using bikes more for leisure or exploration, while Subscribers likely use them for commuting or short, routine tasks.
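Averages like these come from a grouped summary. Because a few very long rides can pull the mean upward, the sketch below also reports the median; the durations are hypothetical values in seconds and do not reproduce the figures above.

```r
library(dplyr)

# Hypothetical toy durations, in seconds
rides <- tibble(
  usertype = c("Customer", "Customer", "Customer",
               "Subscriber", "Subscriber", "Subscriber"),
  tripduration = c(900, 1200, 6000, 300, 420, 480)
)

duration_summary <- rides %>%
  group_by(usertype) %>%
  summarise(
    mean_duration = mean(tripduration),
    median_duration = median(tripduration),
    trips = n()
  )

print(duration_summary)
```

In this toy data the single 6000-second ride lifts the Customer mean to 2700 seconds while the median stays at 1200, which is why the two statistics are worth reading together.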
Insight: The distribution shows that most Subscriber trips are short and tightly clustered, peaking sharply below 10 minutes. In contrast, Customer trips have a flatter, wider spread with a longer right tail, indicating more variability and longer rides.
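The shape described here can also be checked numerically with per-group quantiles: a wide gap between the median and the 90th percentile points to a long right tail. The sketch uses a hypothetical `duration_min` column with made-up values in minutes.

```r
library(dplyr)

# Hypothetical toy durations, in minutes
rides <- tibble(
  usertype = c(rep("Subscriber", 5), rep("Customer", 5)),
  duration_min = c(4, 6, 7, 9, 12, 8, 15, 25, 40, 90)
)

duration_spread <- rides %>%
  group_by(usertype) %>%
  summarise(
    p50 = quantile(duration_min, 0.5, names = FALSE),  # median
    p90 = quantile(duration_min, 0.9, names = FALSE),  # 90th percentile
    tail_gap = p90 - p50                               # larger gap = longer right tail
  )

print(duration_spread)
```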
To analyze how user behavior varies across time, we examined patterns by month, weekday, and hour of day. Because the original dataset was too large to efficiently process on a personal device, we used a random sample of 100,000 trips to ensure timely code execution and visualization generation.
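```r
library(dplyr)
library(lubridate)

# Draw a reproducible random sample of 100,000 trips
set.seed(42)
data_sample <- data_all %>% sample_n(100000)

# Derive calendar features from the trip start time
data_sample <- data_sample %>%
  mutate(
    month = month(start_time, label = TRUE, abbr = FALSE),
    weekday = wday(start_time, label = TRUE, abbr = FALSE, week_start = 1),  # weeks start on Monday
    hour = hour(start_time)
  )
```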
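When working from a sample, it can be reassuring to confirm that the user-type mix in the sample is close to the mix in the full table. The sketch below illustrates that check on a hypothetical `population` table with an 80/20 split; the table, its size, and the sample size of 1,000 are assumptions chosen for illustration.

```r
library(dplyr)

set.seed(42)

# Hypothetical population: 80% Subscriber, 20% Customer
population <- tibble(
  usertype = rep(c("Subscriber", "Customer"), times = c(8000, 2000))
)

trip_sample <- population %>% sample_n(1000)

# User-type shares in the full table and in the sample
share_full <- population %>%
  count(usertype) %>%
  mutate(share_full = n / sum(n)) %>%
  select(usertype, share_full)

share_sample <- trip_sample %>%
  count(usertype) %>%
  mutate(share_sample = n / sum(n)) %>%
  select(usertype, share_sample)

# Side-by-side comparison; large differences would suggest an unlucky draw
comparison <- left_join(share_full, share_sample, by = "usertype") %>%
  mutate(difference = share_sample - share_full)

print(comparison)
```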
Insight: Subscribers dominate across all three months shown, with trip volume increasing steadily. Customers also grew in March, suggesting potential seasonality or increased interest later in Q1.
Insight: Subscribers show consistent weekday usage, likely for commuting. Customer usage peaks on weekends, especially Sunday, suggesting more leisure-oriented behavior.
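The weekday versus weekend contrast can be expressed as a single number per user type: the share of trips that start on a Saturday or Sunday. The timestamps below are hypothetical, and the sketch relies on `wday()` with `week_start = 1`, which numbers Monday as 1.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy trips
rides <- tibble(
  usertype = c("Subscriber", "Subscriber", "Subscriber",
               "Customer", "Customer", "Customer"),
  start_time = ymd_hms(c(
    "2020-03-02 08:05:00",  # Monday
    "2020-03-04 17:30:00",  # Wednesday
    "2020-03-07 10:00:00",  # Saturday
    "2020-03-06 13:00:00",  # Friday
    "2020-03-07 14:00:00",  # Saturday
    "2020-03-08 15:00:00"   # Sunday
  ))
)

weekend_share <- rides %>%
  mutate(
    day_number = wday(start_time, week_start = 1),  # Monday = 1, ..., Sunday = 7
    is_weekend = day_number >= 6
  ) %>%
  group_by(usertype) %>%
  summarise(weekend_share = mean(is_weekend))

print(weekend_share)
```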
Insight: Subscriber trips peak at 8am and 5–6pm, indicating strong commuter behavior. Customers tend to ride between 11am–5pm, with a more evenly distributed pattern across the day.
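Peak hours can be read off an hourly trip count per user type. The sketch below finds the busiest start hour for each group on hypothetical timestamps; on real data, inspecting the full hourly counts would also reveal secondary peaks such as the evening commute.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy trips
rides <- tibble(
  usertype = c("Subscriber", "Subscriber", "Subscriber",
               "Customer", "Customer", "Customer"),
  start_time = ymd_hms(c(
    "2020-03-02 08:05:00", "2020-03-03 08:40:00", "2020-03-03 17:15:00",
    "2020-03-07 13:10:00", "2020-03-07 13:45:00", "2020-03-08 11:20:00"
  ))
)

# Trips per user type and start hour (0-23)
hourly_counts <- rides %>%
  mutate(start_hour = hour(start_time)) %>%
  count(usertype, start_hour, name = "trips")

# Busiest start hour per user type
peak_hours <- hourly_counts %>%
  group_by(usertype) %>%
  slice_max(trips, n = 1, with_ties = FALSE) %>%
  ungroup()

print(peak_hours)
```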
Our analysis reveals clear and actionable differences between Customers (casual riders) and Subscribers (annual members), with key distinctions across time, geography, trip duration, and demographics:
Based on the above findings, Cyclistic can implement the following targeted marketing strategies: