The primary objective is to understand the differences in how annual members and casual riders use Cyclistic bikes. By analyzing these usage patterns, we aim to uncover meaningful insights that can guide strategies to convert casual riders into annual members.
This analysis will reveal key behaviors and trends for both user groups, enabling Cyclistic’s marketing team to create targeted, data-driven campaigns. These campaigns will address the needs and motivations of casual riders, using personalized offers, tailored messaging, and engaging content to encourage them to commit to annual memberships.
The primary task is to compare the usage patterns of annual members and casual riders by examining:
Frequency of use: How often rides occur.
Time of day: Preferred riding times.
Trip duration: Average length of rides.
Bike types: Most commonly used bikes.
This understanding will support the development of effective marketing strategies aimed at converting casual riders into annual members. Successfully achieving this goal will drive revenue growth, enhance customer retention, and strengthen Cyclistic’s long-term success.
Lily Moreno (Marketing Director):
Leads the development of marketing campaigns and is focused on
leveraging insights to design strategies that effectively increase the
number of annual members.
Cyclistic Executives:
Committed to ensuring the marketing strategy’s success while aligning it
with the company’s broader objectives of growing annual memberships and
driving long-term business performance.
Cyclistic Marketing Analytics Team:
Responsible for gathering and analyzing data to uncover actionable
insights that will inform and optimize the marketing efforts.
Casual Riders and Annual Members:
While not directly involved in decision-making, their behaviors,
preferences, and usage patterns are central to the analysis.
Understanding these groups is crucial for creating impactful marketing
strategies that drive engagement and conversions.
Cyclistic’s historical trip data was downloaded from the official
website provided by Motivate International Inc. under a public license.
This dataset includes trip details for all 12 months of
2024. The files were downloaded as monthly zip archives, extracted into
12 CSV files, and organized in a folder named casestudy on
the desktop. The data underwent thorough checks to ensure completeness,
consistency, and compliance with privacy standards.
The dataset consists of 13 columns, each contributing to the analysis in different ways:
start_station_id.
Why RStudio?
RStudio, paired with the R programming language, was selected for this
analysis due to its robust capabilities in:
By utilizing RStudio, we aim to ensure data integrity while uncovering actionable insights that drive business growth.
library(tidyverse)
library(skimr)
library(janitor)
library(readr)
library(dplyr)
library(lubridate)
library(data.table)
library(tidyr)
library(ggplot2)
setwd("C:/Users/Soo/Desktop/casestudy")
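# list.files() takes a regular expression, so match names ending in ".csv"
file_names <- list.files(pattern = "\\.csv$", full.names = TRUE)
# Read each file, print a quick overview, and keep the data frame
datasets <- lapply(file_names, function(file) {
  data <- read_csv(file)
  print(glue::glue("\nFile: {file}"))
  glimpse(data)  # glimpse() prints on its own, so no print() wrapper is needed
  print(summary(data))
  data
})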
The result is shown below:
for (i in seq_along(datasets)) {
  cat("File:", basename(file_names[i]), "\n")
  cat("Rows:", nrow(datasets[[i]]), "Columns:", ncol(datasets[[i]]), "\n\n")
}
We need to check the column names in each file to avoid inconsistencies between files.
datasets <- lapply(datasets, clean_names)
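Note that clean_names standardises the names within each file; it does not by itself confirm that the files agree with one another. A minimal sketch of an explicit check follows, using two small made-up data frames in place of the real monthly files. The toy values, the extra_col column, and the names reference_cols, same_cols, and extra_cols are assumptions for illustration only.

```r
# Toy stand-ins for two monthly files (hypothetical values, not real trip data)
datasets <- list(
  jan = data.frame(trip_num = c("A1", "A2"), bike_type = c("classic", "electric")),
  feb = data.frame(trip_num = "B1", bike_type = "classic", extra_col = 1)
)

# Use the first file's column names as the reference
reference_cols <- names(datasets[[1]])

# TRUE where a file has exactly the same column names, in the same order
same_cols <- vapply(
  datasets,
  function(df) identical(names(df), reference_cols),
  logical(1)
)
print(same_cols)

# For each mismatching file, list the columns that are not in the reference
extra_cols <- lapply(datasets[!same_cols], function(df) setdiff(names(df), reference_cols))
print(extra_cols)
```

If every entry of same_cols is TRUE, the files can be stacked without reordering or filling columns.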
missing_summary <- lapply(datasets, function(data) { colSums(is.na(data)) })
for (i in seq_along(missing_summary)) {
  cat("Missing Values in", basename(file_names[i]), ":\n")
  print(missing_summary[[i]])
  cat("\n")
}
Because the dataset files use different date-time formats, we standardize them with the following code:
datetime_formats <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M", "%Y-%m-%d", "%d.%m.%Y")
parse_datetime <- function(datetime_str) { parse_date_time(datetime_str, orders = datetime_formats, tz = "UTC") }
datasets <- lapply(datasets, function(df) {
  df %>%
    mutate(
      started_time = parse_datetime(started_time),
      ended_time = parse_datetime(ended_time)
    )
})
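To illustrate how lubridate's parse_date_time copes with mixed layouts in one column, here is a small sketch on made-up timestamps. The sample strings are assumptions, and the orders are written without the % prefix, which lubridate's documentation also permits.

```r
library(lubridate)

# Made-up timestamps in two of the layouts handled above
raw_times <- c("2024-01-15 08:30:00", "01/15/2024 08:30")

# Values are matched against the listed orders; anything unparseable becomes NA
parsed <- parse_date_time(raw_times, orders = c("Ymd HMS", "mdY HM"), tz = "UTC")
print(parsed)

# Count values that failed to parse so they can be inspected before analysis
n_failed <- sum(is.na(parsed))
print(n_failed)
```

Checking the NA count after parsing is a cheap way to catch a format that the list of orders does not cover.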
all_trips <- rbindlist(datasets)
fwrite(all_trips, "combined_all_trips.csv")
missing_summary_combined <- colSums(is.na(all_trips))
print(missing_summary_combined)
To clean our data, we decided not to delete the rows with missing values, even though certain columns have up to 18% missing data. Instead, we chose to remove these columns from our analysis and focus on the critical columns listed below.
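The share of missing values per column can be computed directly, which makes the decision about which columns to drop easier to reproduce. The sketch below uses a small made-up data frame rather than all_trips; the values and the names toy_trips, missing_pct, and cols_with_gaps are hypothetical.

```r
# Toy data frame standing in for all_trips (hypothetical values)
toy_trips <- data.frame(
  trip_num = c("A1", "A2", "A3", "A4", "A5"),
  start_station_id = c("S1", NA, "S3", "S4", "S5"),
  membership = c("member", "casual", "member", "casual", "member")
)

# Percentage of missing values in each column
missing_pct <- colMeans(is.na(toy_trips)) * 100
print(round(missing_pct, 1))

# Columns with any missing values are candidates for exclusion
cols_with_gaps <- names(missing_pct)[missing_pct > 0]
print(cols_with_gaps)
```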
important_columns <- c("trip_num", "bike_type", "started_time", "ended_time", "membership")
combined_data <- all_trips %>% select(all_of(important_columns))
print(summary(combined_data))
combined_data <- combined_data %>%
  mutate(
    trip_length = as.numeric(difftime(ended_time, started_time, units = "mins")),
    week_day = lubridate::wday(started_time, label = TRUE),
    month = format(started_time, "%b")  # Add month for later use
  )
combined_data <- combined_data %>% filter(trip_length >= 0)
total_trips_day <- combined_data %>% group_by(week_day) %>% summarise(total_trips = n(), .groups = 'drop')
cat("Total Trips by Day of the Week:\n")
print(total_trips_day)
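The weekday totals above do not separate members from casual riders. A minimal sketch of how the same summary could be split by rider type, with average trip duration added, is shown below. It runs on a small made-up data frame rather than combined_data, so the numbers are purely illustrative and the names toy_data and usage_summary are hypothetical.

```r
library(dplyr)

# Toy trips standing in for combined_data (hypothetical values)
toy_data <- data.frame(
  membership  = c("member", "member", "casual", "casual", "member"),
  week_day    = c("Mon", "Mon", "Sat", "Sat", "Sat"),
  trip_length = c(10, 14, 30, 40, 12)
)

# Trip count and mean duration (minutes) for each rider type and weekday
usage_summary <- toy_data %>%
  group_by(membership, week_day) %>%
  summarise(
    total_trips = n(),
    avg_trip_length = mean(trip_length),
    .groups = "drop"
  )
print(usage_summary)
```

Replacing week_day with month or bike_type in group_by would give the corresponding monthly and bike-type comparisons.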
The analysis of Cyclistic’s historical bike trip data reveals several key insights about the differences between casual and member riders and the potential for converting casual riders into members.
Member riders take more trips: The total number of trips taken by members is significantly higher than that of casual riders across all days of the week.
Member riders are more frequent: The average number of trips per month is higher for members than for casual riders throughout the year.
Member riders use bikes more frequently: Member riders take an average of 13 trips per month, while casual riders take an average of 3 trips per month. This suggests a strong level of engagement with bikes.
Member riders use bikes for shorter trips: The average trip duration is shorter for members than for casual riders on both a daily and a monthly basis.
Member riders prefer classic bikes: Members favor classic bikes over electric bikes and electric scooters. Casual riders, however, tend to use electric bikes slightly more than classic bikes and are also the primary users of electric scooters.
Based on these findings, the following recommendations are proposed to convert casual riders into annual members:
1. Promote Membership Benefits:
Focus on convenience and value: Highlight the convenience and cost savings of an annual membership. Emphasize that membership keeps short, frequent rides affordable compared with paying per ride as a casual rider.
Offer exclusive discounts: Provide members with exclusive discounts on merchandise, food, or services at partnering businesses.
2. Target Specific User Groups:
Emphasize the frequency of use: Target casual riders who take multiple trips a week, especially during peak hours or in specific locations.
Focus on the value of electric bikes: Promote the benefits of electric bikes to casual riders who prefer shorter, more frequent rides.
3. Leverage Digital Media:
Personalized email campaigns: Segment casual riders based on their trip history and send tailored emails with promotional offers, incentives, and educational content about the benefits of membership.
Targeted social media advertising: Utilize social media advertising platforms to target casual riders with relevant ads based on their interests, demographics, and location.
Mobile app promotions: Promote the benefits of membership within the Cyclistic mobile app, potentially offering in