How does a bike-share navigate speedy success?

About the company

In 2016, Cyclistic launched a successful bike-share. The bikes can be unlocked from one station and returned to any other station in the system at any time. Riders who have an annual subscription are called members while riders who are single-ride or full-day pass users are considered casual riders.

Analyzing the data

Tools used: R for data cleaning and visualisation

Dataset: https://divvy-tripdata.s3.amazonaws.com/index.html

I used R to combine all the datasets into one data frame, removed the unused columns, and created a new column holding the duration of each ride.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.1
## ✔ readr   2.1.2     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## 
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(ggplot2)

Importing all the CSV files into R:

df2020_4 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_4.csv",header = TRUE)
df2020_5 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_5.csv",header = TRUE)
df2020_6 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_6.csv",header = TRUE)
df2020_7 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_7.csv",header = TRUE)
df2020_8 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_8.csv",header = TRUE)
df2020_9 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_9.csv",header = TRUE)
df2020_10 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_10.csv",header = TRUE)
df2020_11 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_11.csv",header = TRUE)
df2020_12 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2020_12.csv",header = TRUE)
df2021_1 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2021_1.csv",header = TRUE)
df2021_2 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2021_2.csv",header = TRUE)
df2021_3 <- read.csv(file = "C:\\Users\\Dibyajyoti Das\\Desktop\\Processed csv file\\2021_3.csv",header = TRUE)

Combining all the datasets into one dataframe

all_trips <- rbind(df2020_4,df2020_5,df2020_6,df2020_7,df2020_8,df2020_9,df2020_10,
                   df2020_11,df2020_12,df2021_1,df2021_2,df2021_3)
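The twelve near-identical read.csv calls and the rbind could also be written as a loop over the file names. The following is a minimal sketch, assuming the monthly files follow the "year_month.csv" naming seen above and all share the same columns. The names read_monthly_trips, data_dir and demo_trips are hypothetical, and the demo writes two tiny made-up files to a temporary folder instead of touching the real data:

```r
# Hypothetical helper: read every monthly CSV in `data_dir` and stack them.
read_monthly_trips <- function(data_dir, months) {
  files <- file.path(data_dir, paste0(months, ".csv"))
  # Read each file into a data frame, then row-bind all of them
  monthly <- lapply(files, read.csv, header = TRUE)
  do.call(rbind, monthly)
}

# Demo with two tiny made-up files in a temporary folder
data_dir <- tempfile("trips_")
dir.create(data_dir)
write.csv(data.frame(ride_id = c("a", "b"),
                     member_casual = c("member", "casual")),
          file.path(data_dir, "2020_4.csv"), row.names = FALSE)
write.csv(data.frame(ride_id = "c", member_casual = "member"),
          file.path(data_dir, "2020_5.csv"), row.names = FALSE)

demo_trips <- read_monthly_trips(data_dir, c("2020_4", "2020_5"))
nrow(demo_trips)  # 3 rows: two from the first file, one from the second
```

With the real data, months would be the vector c("2020_4", ..., "2021_3") and data_dir the folder holding the processed files.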

Converting the date columns from character to datetime and computing the ride length in seconds:

all_trips$started_at <- as.POSIXct(all_trips$started_at, format="%m/%d/%Y %H:%M")
all_trips$ended_at <- as.POSIXct(all_trips$ended_at, format="%m/%d/%Y %H:%M")
# units = "secs" fixes the unit; the default "auto" picks one based on the data
all_trips$ride_length2 <- difftime(all_trips$ended_at, all_trips$started_at, units = "secs")

Converting the ride_length2 column from difftime to numeric (seconds)

all_trips$ride_length2 <- as.numeric(all_trips$ride_length2, units = "secs")
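To make the parsing and duration steps concrete, here is a small self-contained sketch on two made-up rows. The timestamps are invented for illustration, and tz = "UTC" is an assumption added so that the result does not depend on the machine's time zone; the analysis itself uses the local time zone:

```r
# Made-up timestamps in the same "%m/%d/%Y %H:%M" layout as the CSV files
trips <- data.frame(
  started_at = c("4/26/2020 17:45", "4/17/2020 17:08"),
  ended_at   = c("4/26/2020 18:12", "4/17/2020 17:17")
)

# Parse the character columns into POSIXct datetimes (UTC is an assumption)
trips$started_at <- as.POSIXct(trips$started_at,
                               format = "%m/%d/%Y %H:%M", tz = "UTC")
trips$ended_at   <- as.POSIXct(trips$ended_at,
                               format = "%m/%d/%Y %H:%M", tz = "UTC")

# Ask for seconds explicitly, then drop the difftime class
trips$ride_length2 <- as.numeric(
  difftime(trips$ended_at, trips$started_at, units = "secs")
)
trips$ride_length2  # 1620 and 540 seconds (27 and 9 minutes)
```

Stating the unit explicitly keeps the later aggregate results interpretable as seconds regardless of which rides are in the data.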

Filtering out all the rows where the ride length is zero or negative:

all_trips2 <- all_trips %>% 
  filter(ride_length2>0)

Recoding the column day_of_week from day numbers to day names

all_trips2$day_of_week <- recode(all_trips2$day_of_week,
                                 "1"="Sunday",
                                 "2"="Monday",
                                 "3"="Tuesday",
                                 "4"="Wednesday",
                                 "5"="Thursday",
                                 "6"="Friday",
                                 "7"="Saturday")
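An alternative to recode is to turn the codes into a factor, which also keeps the days in calendar order in tables and plots. This is a sketch in base R on made-up codes, assuming, as in the recoding above, that 1 stands for Sunday; day_names and day_factor are hypothetical names:

```r
# Day names in the order of their numeric codes (1 = Sunday, as above)
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

day_of_week <- c(1, 7, 3, 5)  # made-up codes for illustration

# levels = 1:7 maps each code to the label in the same position
day_factor <- factor(day_of_week, levels = 1:7, labels = day_names)

as.character(day_factor)  # "Sunday" "Saturday" "Tuesday" "Thursday"
levels(day_factor)        # all seven days, Sunday first
```

Because the factor levels are ordered Sunday to Saturday, summaries grouped by this column come out in weekday order rather than alphabetically.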

Observations from the data:

It can be seen that docked bikes are used much more frequently than classic bikes or electric bikes.

From the graph above, it can be seen that on weekends, casual and member users of the bike-sharing company have approximately the same number of rides, although on Saturday casual users outnumber members.

On weekdays though, member users use the service much more frequently than casual users.

The average ride duration of casual users far exceeds that of members on all weekdays.

Both casual and member users prefer docked bikes to electric and classic bikes by a wide margin; classic bike use is negligible among casual users.
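A comparison of bike types by rider group can be tabulated with a two-way count. The sketch below uses made-up rows and assumes the bike type is stored in a column called rideable_type with values such as "docked_bike"; that column name and those values do not appear in the code above and are assumptions:

```r
# Made-up rides; `rideable_type` and its values are assumed names
trips <- data.frame(
  member_casual = c("casual", "member", "member", "casual", "member"),
  rideable_type = c("docked_bike", "docked_bike", "electric_bike",
                    "docked_bike", "classic_bike")
)

# Rides per rider group (rows) and bike type (columns)
type_counts <- table(trips$member_casual, trips$rideable_type)
type_counts

# Share of each bike type within each rider group; every row sums to 1
type_share <- prop.table(type_counts, margin = 1)
type_share
```

Row-wise shares make the two groups comparable even when one group has many more rides than the other.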

Usage peaks on weekends: Saturday has the highest usage and Monday the lowest.

The above graph shows the usage per day by users. For casual users, the distribution is uneven, with Saturday having the highest number of users. After Sunday, there is a significant drop in users.

For members of the service, the distribution is quite even. Saturday has the highest number of users, but the difference between weekdays and weekends is not big and the counts are comparable. Monday and Tuesday see a drop after Sunday, though it is not as significant as for casual users.
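Per-day figures like the ones discussed here can be computed with aggregate, the same function used for the maximum and minimum durations. This is a minimal sketch on made-up rows, not the code behind the original graphs; rides_per_day and mean_duration are hypothetical names:

```r
# Made-up rides with the column names used in this analysis
trips <- data.frame(
  member_casual = c("casual", "casual", "member", "member", "member"),
  day_of_week   = c("Saturday", "Saturday", "Monday", "Saturday", "Monday"),
  ride_length2  = c(1800, 2400, 600, 900, 480)
)

# Number of rides per rider group and day (length counts the rows)
rides_per_day <- aggregate(ride_length2 ~ member_casual + day_of_week,
                           data = trips, FUN = length)
names(rides_per_day)[3] <- "number_of_rides"

# Mean ride length in seconds per rider group and day
mean_duration <- aggregate(ride_length2 ~ member_casual + day_of_week,
                           data = trips, FUN = mean)

rides_per_day
mean_duration
```

Either table can then be passed to ggplot2 to draw a bar chart of rides or average duration by day, split by rider group.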

aggregate(all_trips2$ride_length2~all_trips2$member_casual, FUN=max)
##   all_trips2$member_casual all_trips2$ride_length2
## 1                   casual                 3341040
## 2                   member                 3523200

The maximum duration translates to about 38.7 days for casual users and about 40.8 days for members.
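The conversion from seconds to days is a division by the number of seconds in a day (24 × 60 × 60 = 86,400). A short worked check using the two maxima reported above; max_seconds and seconds_per_day are names chosen for illustration:

```r
# Maximum ride lengths in seconds, taken from the aggregate output above
max_seconds <- c(casual = 3341040, member = 3523200)

seconds_per_day <- 24 * 60 * 60  # 86400

# Convert to days and round to one decimal place
max_days <- round(max_seconds / seconds_per_day, 1)
max_days  # casual 38.7, member 40.8
```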

aggregate(all_trips2$ride_length2~all_trips2$member_casual, FUN=min)
##   all_trips2$member_casual all_trips2$ride_length2
## 1                   casual                      60
## 2                   member                      60

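The minimum duration for both casual users and members is 60 seconds. This is the smallest nonzero value possible here, because the timestamps were parsed at minute resolution and rides of zero length were filtered out.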

Recommendations

To convert casual users into members of the service, the following recommendations can be applied:

  1. Offer discounts on weekends to casual users

  2. Increase the rental price for casual users, especially on weekends, to nudge them toward buying the annual subscription.