The following is a case study analysis of twelve months of Divvy bike share data. Divvy launched in Chicago in 2016 and has quickly grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the network.

Divvy’s pricing plan identifies two types of riders: members and casual riders. Members have purchased an annual membership, while casual riders purchase single rides or full-day passes. With members proving to be more profitable than casual riders, Divvy has set its sights on growth by developing strategies to convert casual riders to members.

The forthcoming analysis identifies distinguishing behaviors between the two types of riders to inform marketing strategies aimed at converting casual riders to members.

The business questions we are tasked with answering are:

“How do annual members and casual riders use Divvy bikes differently?”
“How would these observations influence a marketing campaign aimed at converting casual riders to members?”

Dataset: Publicly available Divvy trip data that includes: start day and time, end day and time, start station, end station, type of bike, and rider type (member or casual). Each row of data is one individual ride.

For this analysis we are using 12 months of data, from January 2021 through December 2021. The compiled dataset contains 5,594,452 rows. The dataset can be downloaded here.

Key behavioral findings:

Member rides make up a majority of morning rides and weekday rides
Member rides are typically shorter in length
Member ride count and duration are relatively consistent across all seven days of the week
Casual rides make up a majority of rides on Saturday and Sunday
Casual rides are typically longer in length
Casual ride count and duration are highest on Saturday and Sunday

Let’s begin by loading required R packages:

library(tidyverse)
library(lubridate)
library(ggplot2)
library(gridExtra)
library(waffle)
library(reshape2)

Next we will load our raw data.

january_2021 <- read_csv("1. Cyclistic_January_2021_tripdata.csv")
february_2021 <- read_csv("2. Cyclistic_February_2021_tripdata.csv")
march_2021 <- read_csv("3. Cyclistic_March_2021_tripdata.csv")
april_2021 <- read_csv("4. Cyclistic_April_2021_tripdata.csv")
may_2021 <- read_csv("5. Cyclistic_May_2021_tripdata.csv")
june_2021 <- read_csv("6. Cyclistic_June_2021_tripdata.csv")
july_2021 <- read_csv("7. Cyclistic_July_2021_tripdata.csv")
august_2021 <- read_csv("8. Cyclistic_August_2021_tripdata.csv")
september_2021 <- read_csv("9. Cyclistic_September_2021_tripdata.csv")
october_2021 <- read_csv("10. Cyclistic_October_2021_tripdata.csv")
november_2021 <- read_csv("11. Cyclistic_November_2021_tripdata.csv")
december_2021 <- read_csv("12. Cyclistic_December_2021_tripdata.csv")

Ensuring data integrity:

It is important that we check that the column names match before we combine the data:

colnames(january_2021)
colnames(february_2021)
colnames(march_2021)
colnames(april_2021)
colnames(may_2021)
colnames(june_2021)
colnames(july_2021)
colnames(august_2021)
colnames(september_2021)
colnames(october_2021)
colnames(november_2021)
colnames(december_2021)
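
Comparing twelve printed lists of column names by eye is easy to get wrong. As a minimal sketch, assuming the monthly data frames are collected in a list, the same check can be done programmatically. The toy frames and the names `monthly`, `reference_cols`, `cols_match`, and `mismatched` are illustrative and not part of the original analysis.

```r
# Toy stand-ins for the monthly data frames (hypothetical data).
monthly <- list(
  january_2021  = data.frame(ride_id = "a", started_at = "2021-01-01 08:00:00", member_casual = "member"),
  february_2021 = data.frame(ride_id = "b", started_at = "2021-02-01 09:00:00", member_casual = "casual"),
  march_2021    = data.frame(ride_id = "c", started_at = "2021-03-01 10:00:00", member_casual = "member")
)

# Use the first frame's column names as the reference.
reference_cols <- colnames(monthly[[1]])

# TRUE for each frame whose column names (and their order) match the reference.
cols_match <- vapply(
  monthly,
  function(df) identical(colnames(df), reference_cols),
  logical(1)
)

# Names of any frames that do not match; empty when all frames are compatible.
mismatched <- names(cols_match)[!cols_match]
```

If `mismatched` is empty, the frames can be combined with bind_rows() without columns being silently split.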

We should also check the structure of our data sets:

str(january_2021)
str(february_2021)
str(march_2021)
str(april_2021)
str(may_2021)
str(june_2021)
str(july_2021)
str(august_2021)
str(september_2021)
str(october_2021)
str(november_2021)
str(december_2021)

After ensuring the data sets are compatible, we are ready to combine them:

q1_2021 <- bind_rows(january_2021, february_2021, march_2021)
q2_2021 <- bind_rows(april_2021, may_2021, june_2021)
q3_2021 <- bind_rows(july_2021, august_2021, september_2021)
q4_2021 <- bind_rows(october_2021, november_2021, december_2021)
data_2021 <- bind_rows(q1_2021, q2_2021, q3_2021, q4_2021)

The next phase is to clean and manipulate our data so it will be ready to analyze:

In order to clean our data effectively, we will create columns for “ride length” and “day of week”:

## Creating "ride_length" column (in seconds)

q1_2021$ride_length <- difftime(q1_2021$ended_at, q1_2021$started_at, units = "secs")
q2_2021$ride_length <- difftime(q2_2021$ended_at, q2_2021$started_at, units = "secs")
q3_2021$ride_length <- difftime(q3_2021$ended_at, q3_2021$started_at, units = "secs")
q4_2021$ride_length <- difftime(q4_2021$ended_at, q4_2021$started_at, units = "secs")
data_2021$ride_length <- difftime(data_2021$ended_at, data_2021$started_at, units = "secs")

## Convert "ride_length" from difftime to numeric

q1_2021$ride_length <- as.numeric(q1_2021$ride_length)
q2_2021$ride_length <- as.numeric(q2_2021$ride_length)
q3_2021$ride_length <- as.numeric(q3_2021$ride_length)
q4_2021$ride_length <- as.numeric(q4_2021$ride_length)
data_2021$ride_length <- as.numeric(data_2021$ride_length)

## Creating day_of_week column

q1_2021$day_of_week <- weekdays(as.Date(q1_2021$started_at))
q2_2021$day_of_week <- weekdays(as.Date(q2_2021$started_at))
q3_2021$day_of_week <- weekdays(as.Date(q3_2021$started_at))
q4_2021$day_of_week <- weekdays(as.Date(q4_2021$started_at))
data_2021$day_of_week <- weekdays(as.Date(data_2021$started_at))
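
The two derived columns above can be sketched on a small example. This is an illustration under stated assumptions, not the code used for the report: the `trips` data frame is made up, the seconds unit is requested explicitly, and the day names come from an English lookup vector (`day_names`, a hypothetical helper) rather than from weekdays(), whose output depends on the system locale.

```r
# Toy trips (hypothetical data); timestamps are in UTC, as read_csv() parses them by default.
trips <- data.frame(
  started_at = as.POSIXct(c("2021-01-04 08:00:00", "2021-01-09 23:30:00"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2021-01-04 08:12:30", "2021-01-10 01:30:00"), tz = "UTC")
)

# Request seconds explicitly; with the default units = "auto", difftime()
# chooses the unit from the data, so the result is not guaranteed to be seconds.
trips$ride_length <- as.numeric(difftime(trips$ended_at, trips$started_at, units = "secs"))

# "%u" gives the ISO weekday number (1 = Monday ... 7 = Sunday).
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
day_index <- as.integer(format(as.Date(trips$started_at), "%u"))

# An ordered factor keeps the Monday-Sunday order in tables and plots.
trips$day_of_week <- factor(day_names[day_index], levels = day_names, ordered = TRUE)
```

Storing the day as an ordered factor means summaries and plots list the days from Monday to Sunday instead of alphabetically.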

Now we will remove bad data. Our data frames currently have rows with a negative ride length or trips that were actually bikes removed from service for testing. We will create a new “v2” dataframe since we are removing data.

q1_2021_v2 <- q1_2021[!(q1_2021$end_station_id == "Hubbard Bike-checking (LBS-WH-TEST)" & !is.na(q1_2021$start_station_id) & !is.na(q1_2021$start_station_name) & !is.na(q1_2021$end_station_name) & !is.na(q1_2021$end_station_id) | q1_2021$ride_length < 0),]
q2_2021_v2 <- q2_2021[!(q2_2021$end_station_id == "Hubbard Bike-checking (LBS-WH-TEST)" & !is.na(q2_2021$start_station_id) & !is.na(q2_2021$start_station_name) & !is.na(q2_2021$end_station_name) & !is.na(q2_2021$end_station_id) | q2_2021$ride_length < 0),]
q3_2021_v2 <- q3_2021[!(q3_2021$end_station_id == "Hubbard Bike-checking (LBS-WH-TEST)" & !is.na(q3_2021$start_station_id) & !is.na(q3_2021$start_station_name) & !is.na(q3_2021$end_station_name) & !is.na(q3_2021$end_station_id) | q3_2021$ride_length < 0),]
q4_2021_v2 <- q4_2021[!(q4_2021$end_station_id == "Hubbard Bike-checking (LBS-WH-TEST)" & !is.na(q4_2021$start_station_id) & !is.na(q4_2021$start_station_name) & !is.na(q4_2021$end_station_name) & !is.na(q4_2021$end_station_id) | q4_2021$ride_length < 0),]
data_2021_v2 <- data_2021[!(data_2021$end_station_id == "Hubbard Bike-checking (LBS-WH-TEST)" & !is.na(data_2021$start_station_id) & !is.na(data_2021$start_station_name) & !is.na(data_2021$end_station_name) & !is.na(data_2021$end_station_id) | data_2021$ride_length < 0),]
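
The same cleaning step can also be written with dplyr's filter(). The sketch below uses a made-up `trips` tibble and makes one simplifying assumption: it drops every ride that ends at the test station, whether or not the start station fields are present, which is slightly broader than the condition above.

```r
library(dplyr)

test_station_id <- "Hubbard Bike-checking (LBS-WH-TEST)"

# Toy trips (hypothetical data): one normal ride, one test-station ride,
# one negative-length ride, and one ride with a missing end station.
trips <- tibble(
  ride_id        = c("a", "b", "c", "d"),
  end_station_id = c("13001", test_station_id, "13002", NA),
  ride_length    = c(600, 300, -45, 900)
)

# %in% returns FALSE for NA, so rides with a missing end station are kept
# instead of producing NA in the filter condition.
trips_v2 <- trips %>%
  filter(!(end_station_id %in% test_station_id), ride_length >= 0)
```

Here rides "a" and "d" remain: the test-station ride and the negative-length ride are removed, while the ride with no recorded end station is kept.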

Now that our data is cleaned, we will further manipulate it to aid in our analysis by creating a “time of day” column. This column will group our trips into four distinct time periods.

## First we need to create two new columns "start_hour" & "start_minute" 

q1_2021_v2$start_hour <- hour(q1_2021_v2$started_at)
q1_2021_v2$start_minute <- minute(q1_2021_v2$started_at)
q2_2021_v2$start_hour <- hour(q2_2021_v2$started_at)
q2_2021_v2$start_minute <- minute(q2_2021_v2$started_at)
q3_2021_v2$start_hour <- hour(q3_2021_v2$started_at)
q3_2021_v2$start_minute <- minute(q3_2021_v2$started_at)
q4_2021_v2$start_hour <- hour(q4_2021_v2$started_at)
q4_2021_v2$start_minute <- minute(q4_2021_v2$started_at)
data_2021_v2$start_hour <- hour(data_2021_v2$started_at)
data_2021_v2$start_minute <- minute(data_2021_v2$started_at)

With our new columns we can now create the “time_of_day” column. Our custom time frames are as follows:

“AM” - 4:00AM-9:59AM
“MID” - 10:00AM-2:59PM
“PM” - 3:00PM-7:59PM
“LATE” - 8:00PM-3:59AM

DISCLAIMER: These time periods were created to correspond to easily categorized periods of the day, strongly influenced by the M-F work schedule. Analysis will be impacted by this choice. In a professional setting, stakeholders would be involved in this decision.

## Time of day column is being created for each data frame to preserve the ability to analyze by quarter.

q1_2021_v2$time_of_day[q1_2021_v2$start_hour>=20 & q1_2021_v2$start_minute>=0]='LATE'
q1_2021_v2$time_of_day[q1_2021_v2$start_hour<=3 & q1_2021_v2$start_minute<=59] = 'LATE'
q1_2021_v2$time_of_day[q1_2021_v2$start_hour>=4 & q1_2021_v2$start_minute>=0 & q1_2021_v2$start_hour<=9 & q1_2021_v2$start_minute<=59] = 'AM'
q1_2021_v2$time_of_day[q1_2021_v2$start_hour>=10 & q1_2021_v2$start_minute>=0 & q1_2021_v2$start_hour<=14 & q1_2021_v2$start_minute<=59] = 'MID'
q1_2021_v2$time_of_day[q1_2021_v2$start_hour>=15 & q1_2021_v2$start_minute>=0 & q1_2021_v2$start_hour<=19 & q1_2021_v2$start_minute<=59] = 'PM'

q2_2021_v2$time_of_day[q2_2021_v2$start_hour>=20 & q2_2021_v2$start_minute>=0]='LATE'
q2_2021_v2$time_of_day[q2_2021_v2$start_hour<=3 & q2_2021_v2$start_minute<=59] = 'LATE'
q2_2021_v2$time_of_day[q2_2021_v2$start_hour>=4 & q2_2021_v2$start_minute>=0 & q2_2021_v2$start_hour<=9 & q2_2021_v2$start_minute<=59] = 'AM'
q2_2021_v2$time_of_day[q2_2021_v2$start_hour>=10 & q2_2021_v2$start_minute>=0 & q2_2021_v2$start_hour<=14 & q2_2021_v2$start_minute<=59] = 'MID'
q2_2021_v2$time_of_day[q2_2021_v2$start_hour>=15 & q2_2021_v2$start_minute>=0 & q2_2021_v2$start_hour<=19 & q2_2021_v2$start_minute<=59] = 'PM'

q3_2021_v2$time_of_day[q3_2021_v2$start_hour>=20 & q3_2021_v2$start_minute>=0]='LATE'
q3_2021_v2$time_of_day[q3_2021_v2$start_hour<=3 & q3_2021_v2$start_minute<=59] = 'LATE'
q3_2021_v2$time_of_day[q3_2021_v2$start_hour>=4 & q3_2021_v2$start_minute>=0 & q3_2021_v2$start_hour<=9 & q3_2021_v2$start_minute<=59] = 'AM'
q3_2021_v2$time_of_day[q3_2021_v2$start_hour>=10 & q3_2021_v2$start_minute>=0 & q3_2021_v2$start_hour<=14 & q3_2021_v2$start_minute<=59] = 'MID'
q3_2021_v2$time_of_day[q3_2021_v2$start_hour>=15 & q3_2021_v2$start_minute>=0 & q3_2021_v2$start_hour<=19 & q3_2021_v2$start_minute<=59] = 'PM'
  
q4_2021_v2$time_of_day[q4_2021_v2$start_hour>=20 & q4_2021_v2$start_minute>=0]='LATE'
q4_2021_v2$time_of_day[q4_2021_v2$start_hour<=3 & q4_2021_v2$start_minute<=59] = 'LATE'
q4_2021_v2$time_of_day[q4_2021_v2$start_hour>=4 & q4_2021_v2$start_minute>=0 & q4_2021_v2$start_hour<=9 & q4_2021_v2$start_minute<=59] = 'AM'
q4_2021_v2$time_of_day[q4_2021_v2$start_hour>=10 & q4_2021_v2$start_minute>=0 & q4_2021_v2$start_hour<=14 & q4_2021_v2$start_minute<=59] = 'MID'
q4_2021_v2$time_of_day[q4_2021_v2$start_hour>=15 & q4_2021_v2$start_minute>=0 & q4_2021_v2$start_hour<=19 & q4_2021_v2$start_minute<=59] = 'PM'

data_2021_v2$time_of_day[data_2021_v2$start_hour>=20 & data_2021_v2$start_minute>=0]='LATE'
data_2021_v2$time_of_day[data_2021_v2$start_hour<=3 & data_2021_v2$start_minute<=59] = 'LATE'
data_2021_v2$time_of_day[data_2021_v2$start_hour>=4 & data_2021_v2$start_minute>=0 & data_2021_v2$start_hour<=9 & data_2021_v2$start_minute<=59] = 'AM'
data_2021_v2$time_of_day[data_2021_v2$start_hour>=10 & data_2021_v2$start_minute>=0 & data_2021_v2$start_hour<=14 & data_2021_v2$start_minute<=59] = 'MID'
data_2021_v2$time_of_day[data_2021_v2$start_hour>=15 & data_2021_v2$start_minute>=0 & data_2021_v2$start_hour<=19 & data_2021_v2$start_minute<=59] = 'PM'
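
Because every period starts and ends on the hour, the start hour alone determines the period. A more compact way to express the same mapping, sketched here with dplyr's case_when() and a hypothetical helper named `time_of_day_from_hour`, is:

```r
library(dplyr)

# Map a start hour (0-23) to one of the four custom periods.
# Hypothetical helper: the hour alone decides the period, so no minute check is needed.
time_of_day_from_hour <- function(start_hour) {
  case_when(
    start_hour >= 4  & start_hour <= 9  ~ "AM",
    start_hour >= 10 & start_hour <= 14 ~ "MID",
    start_hour >= 15 & start_hour <= 19 ~ "PM",
    start_hour >= 20 | start_hour <= 3  ~ "LATE"
  )
}

# Check the boundary hours of each period.
periods <- time_of_day_from_hour(c(0, 3, 4, 9, 10, 14, 15, 19, 20, 23))
```

Applied to the real data, this would be `data_2021_v2$time_of_day <- time_of_day_from_hour(data_2021_v2$start_hour)`, which fills the whole column in a single assignment.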

## Let's rename the values within our rideable_type column

data_2021_v2 <- data_2021_v2 %>% 
  mutate(rideable_type = recode(rideable_type,
                                classic_bike = 'Classic', docked_bike='Docked', electric_bike='Electric'))

We are ready to begin our analysis:

Let’s start with summary statistics for the variables we will be looking at:

Day of week (Monday-Sunday)
Time of day (AM, MID, PM, LATE)
Ride length (in minutes)
Ride type (member or casual)

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##     0.00     6.75    12.00    21.63    21.77 55944.15

##    Ride Type Day of Week Avg Length Min
## 1     member      Monday       13.24712
## 2     casual      Monday       31.51744
## 3     member     Tuesday       12.78499
## 4     casual     Tuesday       27.76331
## 5     member   Wednesday       12.81457
## 6     casual   Wednesday       26.89222
## 7     member    Thursday       12.77541
## 8     casual    Thursday       27.06034
## 9     member      Friday       13.32393
## 10    casual      Friday       29.33489
## 11    member    Saturday       15.26466
## 12    casual    Saturday       34.03548
## 13    member      Sunday       15.65791
## 14    casual      Sunday       36.72005

##    Ride Type Day of Week Median Length Min
## 1     member      Monday          9.200000
## 2     casual      Monday         15.950000
## 3     member     Tuesday          9.133333
## 4     casual     Tuesday         14.283333
## 5     member   Wednesday          9.216667
## 6     casual   Wednesday         13.966667
## 7     member    Thursday          9.133333
## 8     casual    Thursday         13.783333
## 9     member      Friday          9.433333
## 10    casual      Friday         14.966667
## 11    member    Saturday         10.816667
## 12    casual    Saturday         17.816667
## 13    member      Sunday         10.866667
## 14    casual      Sunday         18.716667

##   Ride Type Time of Day Avg Length Min
## 1    member          AM       12.31430
## 2    casual          AM       25.12575
## 3    member         MID       13.70664
## 4    casual         MID       33.53091
## 5    member          PM       14.10659
## 6    casual          PM       30.40294
## 7    member        LATE       13.87497
## 8    casual        LATE       32.69821

##   Ride Type Time of Day Median Length Min
## 1    member          AM           8.95000
## 2    casual          AM          12.26667
## 3    member         MID           9.35000
## 4    casual         MID          18.35000
## 5    member          PM          10.13333
## 6    casual          PM          16.30000
## 7    member        LATE           9.55000
## 8    casual        LATE          14.40000
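
The code that produced these summaries is not shown in the report. One way tables like these could be built, sketched with base R's aggregate() on a made-up `rides` data frame, is:

```r
# Toy rides (hypothetical data); ride_length is in seconds.
rides <- data.frame(
  member_casual = c("member", "member", "member", "casual", "casual", "casual", "casual"),
  day_of_week   = c("Monday", "Monday", "Sunday", "Monday", "Sunday", "Sunday", "Sunday"),
  ride_length   = c(480, 720, 900, 600, 1200, 1500, 6000)
)

# Work in minutes to match the tables.
rides$ride_min <- rides$ride_length / 60

# Mean and median ride length for each rider type and day of week.
mean_by_day   <- aggregate(ride_min ~ member_casual + day_of_week, data = rides, FUN = mean)
median_by_day <- aggregate(ride_min ~ member_casual + day_of_week, data = rides, FUN = median)
```

Swapping `day_of_week` for `time_of_day` in the formula would give the corresponding time-of-day tables.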

From January 2021-December 2021, there were a total of 5,594,452 rides. The distribution between member and casual is as follows:

## Member Rides 
##      3065739

## Casual Rides 
##      2528713

From January 2021 - December 2021, 55% of all rides were by members and 45% were casual rides.

In-depth behavioral analysis

When do member and casual rides occur: Day of week

## # A tibble: 14 x 3
## # Groups:   Ride_Type [2]
##    Ride_Type Day_of_Week Total_Rides
##    <ord>     <ord>             <int>
##  1 member    Monday           416159
##  2 member    Tuesday          465470
##  3 member    Wednesday        477122
##  4 member    Thursday         451483
##  5 member    Friday           446377
##  6 member    Saturday         433025
##  7 member    Sunday           376103
##  8 casual    Monday           286347
##  9 casual    Tuesday          274363
## 10 casual    Wednesday        278920
## 11 casual    Thursday         286045
## 12 casual    Friday           364044
## 13 casual    Saturday         557940
## 14 casual    Sunday           481054

Member rides make up a majority of all rides Monday-Friday. Casual rides make up a majority of all rides on Saturday and Sunday.

Percentage of rides occurring on each day of the week:

Member ride distribution is relatively even across all days of the week. Casual rides, on the other hand, are concentrated on Saturday and Sunday, which together account for 41% of casual rides, with a smaller increase on Friday.
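
The day-of-week shares can be recomputed directly from the ride counts in the table above. This is a small sketch; the variable names are illustrative.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

# Total rides per day, copied from the day-of-week table.
member <- c(416159, 465470, 477122, 451483, 446377, 433025, 376103)
casual <- c(286347, 274363, 278920, 286045, 364044, 557940, 481054)
names(member) <- days
names(casual) <- days

# Share of each rider type's rides that falls on each day, in percent.
member_pct <- round(100 * member / sum(member), 1)
casual_pct <- round(100 * casual / sum(casual), 1)

# Combined weekend share for each rider type, in whole percent.
weekend <- c("Saturday", "Sunday")
member_weekend_pct <- round(100 * sum(member[weekend]) / sum(member))
casual_weekend_pct <- round(100 * sum(casual[weekend]) / sum(casual))
```

With these counts the weekend share comes to 26% for members and 41% for casual riders.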

When do member and casual rides occur: Time of day

We’ve analyzed ride data by day of week. Now let’s shift to time of day. Again, the ranges you will see below are as follows:

AM - 4:00AM-9:59AM
MID - 10:00AM-2:59PM
PM - 3:00PM-7:59PM
LATE - 8:00PM-3:59AM

Let’s look at all rides by time of day:

## # A tibble: 4 x 2
##   Time_of_Day Total_Rides
##   <ord>             <int>
## 1 AM               808261
## 2 MID             1572295
## 3 PM              2255092
## 4 LATE             958804

From the above plot we see several distinct behaviors: the PM time of day (3PM-8PM) has the highest ride count for both members and casual riders; member ride count in the AM is more than double casual ride count; and the LATE time of day is majority casual rides.

Percentage of rides by time of day:

There is a significant difference in the AM and LATE times of day: 19% of all member rides occur in the AM compared to only 9% of casual rides, while only 14% of all member rides occur in the LATE period compared to 21% of casual rides.

At this point we’ve looked separately at behavioral data by day of week and time of day. Will we see anything different if we look at them together?

Total rides by time of day and day of week:

This is a clear example of how behavior patterns shift between weekday and weekend. On Saturday and Sunday, casual ride count is significantly higher in all times of day except AM.

Behavioral differences within ride length: Day of week

After investigating behavioral differences by looking at when rides occur, we will now analyze rides by ride length. This is a different type of behavior, and it is fundamental to understanding the differences between the way members and casual riders use Divvy bikes.

Let’s begin by looking again at an aggregate summary of ride length (minutes):

## Ride length in minutes, by ride type
##   Ride Type     Min.  1st Qu.    Median      Mean   3rd Qu.         Max.
## 1    member 0.000000 5.566667  9.600000 13.632078 16.600000  1559.933333
## 2    casual 0.000000 9.066667 15.966667 31.326875 29.266667 55944.150000

From this summary, we see that casual average ride length is more than double member average ride length, while the median ride lengths are much closer. The reason behind the discrepancy is apparent by looking at the Max. value, where we see a significant outlier. Let’s explore this in further detail.

Average ride length by day of week:

From this plot we see that member ride length is largely consistent Monday - Friday, with a slight increase on Saturday and Sunday. Average member ride length is also significantly lower than casual average ride length across all days. Casual average ride length is highest on Saturday and Sunday and is not consistent throughout the week. Looking at mean by itself, however, can be deceiving.

What will median reveal?

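Let’s put the median and average ride length by day of week side-by-side. We will look at the raw data and then plot it. The first table shows the mean and the second shows the median; ride_length is in seconds and ride_min is in minutes.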

## # A tibble: 14 x 4
## # Groups:   member_casual [2]
##    member_casual day_of_week ride_length ride_min
##    <ord>         <ord>             <dbl>    <dbl>
##  1 member        Monday             795.     13.2
##  2 member        Tuesday            767.     12.8
##  3 member        Wednesday          769.     12.8
##  4 member        Thursday           767.     12.8
##  5 member        Friday             799.     13.3
##  6 member        Saturday           916.     15.3
##  7 member        Sunday             939.     15.7
##  8 casual        Monday            1891.     31.5
##  9 casual        Tuesday           1666.     27.8
## 10 casual        Wednesday         1614.     26.9
## 11 casual        Thursday          1624.     27.1
## 12 casual        Friday            1760.     29.3
## 13 casual        Saturday          2042.     34.0
## 14 casual        Sunday            2203.     36.7

## # A tibble: 14 x 4
## # Groups:   member_casual [2]
##    member_casual day_of_week ride_length ride_min
##    <ord>         <ord>             <dbl>    <dbl>
##  1 member        Monday              552     9.2 
##  2 member        Tuesday             548     9.13
##  3 member        Wednesday           553     9.22
##  4 member        Thursday            548     9.13
##  5 member        Friday              566     9.43
##  6 member        Saturday            649    10.8 
##  7 member        Sunday              652    10.9 
##  8 casual        Monday              957    16.0 
##  9 casual        Tuesday             857    14.3 
## 10 casual        Wednesday           838    14.0 
## 11 casual        Thursday            827    13.8 
## 12 casual        Friday              898    15.0 
## 13 casual        Saturday           1069    17.8 
## 14 casual        Sunday             1123    18.7

Looking at them side-by-side, the difference in ride length between mean and median is apparent and significant. The mean ride length is being heavily skewed by outliers (lengthier trips), particularly with casual rides. Behaviorally, most casual rides are not double the length of member rides, as you would conclude from looking only at the mean. In general, most rides are shorter than the mean ride length.
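
A tiny made-up example shows how a single very long ride pulls the mean away from the median. The numbers below are hypothetical and chosen only to make the effect easy to see.

```r
# Hypothetical ride lengths in minutes: five typical rides and one extreme outlier.
ride_min <- c(8, 10, 12, 14, 16, 1000)

# The mean is pulled far above the typical ride; the median stays near it.
mean_with_outlier   <- mean(ride_min)
median_with_outlier <- median(ride_min)

# Dropping the outlier changes the mean a lot and the median only slightly.
typical <- ride_min[ride_min < 1000]
mean_without_outlier   <- mean(typical)
median_without_outlier <- median(typical)
```

Here the mean falls from about 177 minutes to 12 minutes once the outlier is removed, while the median only moves from 13 to 12.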

Behavioral differences within ride length: Time of day

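Now that we have observed ride length differences by day of week, let’s again introduce time of day to this analysis. We’ll look at the raw data first. The first table shows the mean and the second shows the median; ride_length is in seconds and ride_min is in minutes.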

## # A tibble: 8 x 4
## # Groups:   member_casual [2]
##   member_casual time_of_day ride_length ride_min
##   <ord>         <ord>             <dbl>    <dbl>
## 1 member        AM                 739.     12.3
## 2 member        MID                822.     13.7
## 3 member        PM                 846.     14.1
## 4 member        LATE               832.     13.9
## 5 casual        AM                1508.     25.1
## 6 casual        MID               2012.     33.5
## 7 casual        PM                1824.     30.4
## 8 casual        LATE              1962.     32.7

## # A tibble: 8 x 4
## # Groups:   member_casual [2]
##   member_casual time_of_day ride_length ride_min
##   <ord>         <ord>             <dbl>    <dbl>
## 1 member        AM                  537     8.95
## 2 member        MID                 561     9.35
## 3 member        PM                  608    10.1 
## 4 member        LATE                573     9.55
## 5 casual        AM                  736    12.3 
## 6 casual        MID                1101    18.4 
## 7 casual        PM                  978    16.3 
## 8 casual        LATE                864    14.4

Mean ride length by time of day:

Member ride length is consistent across all periods of the day, with slightly lengthier trips in the afternoon and evenings. Casual ride length is significantly shorter in the AM time period compared to others.

Median ride length by time of day:

Now let’s arrange the two above plots next to each other:

Looking at these side-by-side, they tell different stories. While member rides follow a similar pattern between mean and median, casual rides do not. When looking at mean, the longest casual rides take place in the MID time of day, followed by the LATE period. But when looking at median, the PM time of day replaces LATE as the second longest. LATE casual rides are heavily skewed by outliers.

Behavioral differences within ride length:

Day of week & Time of day

Finally, let’s look at ride length across time of day and day of week. For the below plots we will look at both mean and median again. However, it must be stated that, ultimately, median will be a better approximation of common behavior within each ride group because it discounts outliers.

Mean:

The elevated casual mean ride lengths on Saturday and Sunday suggest that many of the ride length outliers occur on the weekend.

Median:

This shows us that ride length among members is relatively consistent Monday-Friday, with slight elevations in the MID and PM slots on Saturday and Sunday. Casual ride length is consistent Monday-Friday with an exception in the MID slot on Mondays, where length is noticeably longer. Ride length across all times of day is elevated on Saturdays and Sundays.

Creating a duration category:

The below charts will examine the number of rides that fall into four duration categories:

0-15 minutes
15-30 minutes
30-45 minutes
45+ minutes

The number of member rides under 15 minutes is far greater than the number of casual rides under 15 minutes. The number of casual rides is greater for all other categories.

Percentage of rides that fall into each duration category:

A strong majority of member rides are under 15 minutes. Only 29% of member rides are longer than 15 minutes, while 53% of casual rides are longer than 15 minutes. Only 8% of member rides are longer than 30 minutes while 24% of casual rides are over 30 minutes.
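
The four duration categories can be assigned with base R's cut(). This is a sketch on made-up ride lengths. How rides of exactly 15, 30, or 45 minutes were assigned in the report is not stated, so the choice of `right = FALSE` below is an assumption.

```r
# Hypothetical ride lengths in minutes.
ride_min <- c(0, 14.9, 15, 29.9, 30, 45, 500)

# right = FALSE makes each bin include its lower bound and exclude its upper
# bound, so a ride of exactly 15 minutes lands in "15-30" (an assumption).
duration_category <- cut(
  ride_min,
  breaks = c(0, 15, 30, 45, Inf),
  labels = c("0-15", "15-30", "30-45", "45+"),
  right  = FALSE
)

# Share of rides in each category, in percent.
category_pct <- round(100 * prop.table(table(duration_category)), 1)
```

Grouping the real data by rider type before computing the shares would give percentages comparable to the ones quoted above.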

Ride length incentives have been a major factor in sales and marketing strategies at Divvy. The current annual membership includes free rides up to 45 minutes. It will be beneficial to look more closely at casual rides over 45 minutes:

Count of casual rides that last:

## 45-50 minutes 
##         49294

## 45-55 minutes 
##         89338

## 45-60 minutes 
##        121970

Of all casual rides over 45 minutes:

14% are 45-50 minutes
25% are 45-55 minutes
35% are 45-60 minutes

Findings and recommendations:

How do members and casual riders differ?

The morning belongs to members:

19% of member rides start between 4AM-10AM
Only 9% of casual rides start between 4AM-10AM

Late night belongs to casual riders:

21% of casual rides start between 8PM-4AM
Only 14% of member rides start between 8PM-4AM

The weekends belong to casual riders:

Saturday and Sunday account for 41% of casual rides
Only 26% of member rides occur on Saturday and Sunday

Member rides are shorter:

71% of member rides are less than 15 minutes
47% of casual rides are less than 15 minutes

Casual rides are longer:

24% of casual rides last longer than 30 minutes
Only 8% of member rides last longer than 30 minutes

Member ride behavior appears influenced by the workday. They likely use Divvy bikes to travel to and from work, which would account for the greater ride count between 4AM-10AM and consistent ride length across the week. Casual riders use Divvy bikes more frequently after 10AM and on the weekends, likely for leisure activities. This is corroborated by higher ride count and longer ride length on Saturdays and Sundays.

Why would casual riders become members?

In designing a campaign aimed at converting casual riders to members, success may be found in developing membership incentives surrounding weekend trips, evening/late trips, and longer trips.

Current membership plans include the following benefits as of March 2022:

Unlimited 45-min classic bike rides
Speed up with 25% off ebikes
Earn Bike Angels points and rewards
3 free guest passes to share

The following changes to membership benefits may convert casual riders:

Increase ride length limit for the unlimited rides benefit
Earn additional Bike Angels points for rides on the weekends
Earn additional Bike Angels points for rides between 8PM-4AM

Another possibility is a second type of annual membership with benefits aimed at casual riders. For example, a unique “Weekend Warrior” membership - a reduced-price annual membership that receives discounts for late-night rides, rides on Saturdays and Sundays, and extra discounts on rides over “x” minutes in length.

Forthcoming analysis will focus on comparing months and quarters, as the majority of all rides occur in the warmer months. Being able to isolate behavior by season would be very beneficial to understanding Divvy customers.

This concludes our R analysis of behavioral differences between member and casual rides from January 2021 - December 2021.

An Analysis of Divvy Bikeshare Data

Nathan Barash

2/23/2022