Using Cyclistic ride data for October 2023, I show that casual riders are more likely to choose classic bikes over electric bikes and ride them longer. Also, casual riders are more likely to use bikes in the first ten days of the month rather than later. Finally, I identify the top five most frequented stations by casual riders.
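My top three recommendations are to (1) advertise memberships more heavily during the first ten days of the month, (2) market electric bikes to casual riders, and (3) promote memberships at the stations casual riders use most.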
First, load the ggplot2, tidyverse, and plotly libraries:
library(ggplot2)
library(tidyverse)
library(plotly)
Then, read the .csv file into a tibble, inspect the first rows, and convert it to a data frame:
X202310_divvy_tripdata <- read_csv("202310-divvy-tripdata.csv")
head(X202310_divvy_tripdata)
## # A tibble: 6 × 13
## ride_id rideable_type started_at ended_at
## <chr> <chr> <dttm> <dttm>
## 1 4449097279F8BBE7 classic_bike 2023-10-08 10:36:26 2023-10-08 10:49:19
## 2 9CF060543CA7B439 electric_bike 2023-10-11 17:23:59 2023-10-11 17:36:08
## 3 667F21F4D6BDE69C electric_bike 2023-10-12 07:02:33 2023-10-12 07:06:53
## 4 F92714CC6B019B96 classic_bike 2023-10-24 19:13:03 2023-10-24 19:18:29
## 5 5E34BA5DE945A9CC classic_bike 2023-10-09 18:19:26 2023-10-09 18:30:56
## 6 F7D7420AFAC53CD9 electric_bike 2023-10-04 17:10:59 2023-10-04 17:25:21
## # ℹ 9 more variables: start_station_name <chr>, start_station_id <chr>,
## # end_station_name <chr>, end_station_id <chr>, start_lat <dbl>,
## # start_lng <dbl>, end_lat <dbl>, end_lng <dbl>, member_casual <chr>
df <- data.frame(X202310_divvy_tripdata)
The Cyclistic ride data for October 2023 contains useful ride information. For instance, it shows whether a rider took a classic or electric bike, and it records each ride’s start and end time. These times are particularly useful for building visualizations and for determining exactly how long riders use Cyclistic bikes.
First, create a ride duration variable in seconds:
df$rd <- difftime(df$ended_at, df$started_at, units="secs")
df$rd <- as.numeric(df$rd)
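As a quick check of the duration calculation, the sketch below reproduces it for the first ride in the table above (started 10:36:26, ended 10:49:19), which works out to 12 minutes 53 seconds, or 773 seconds. The `UTC` time zone and the variable names are assumptions used only for illustration.

```r
# First ride from the head() output above; tz = "UTC" is an assumption
started <- as.POSIXct("2023-10-08 10:36:26", tz = "UTC")
ended <- as.POSIXct("2023-10-08 10:49:19", tz = "UTC")

# Same two steps as above: difference in seconds, then plain numeric
rd_example <- as.numeric(difftime(ended, started, units = "secs"))

# Ride length in minutes, easier to read than raw seconds
rd_example / 60
```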
Then, create a date variable from the ride start times as a baseline for visualizations:
df$dates <- as.Date(df$started_at)
Then, aggregate ride duration to summarize the ride information by day. With over 530,000 rides in October, summing by day is what makes it possible to show ride duration in a single visualization. I also aggregate by rideable type and member type to better highlight the usage differences between members and casual riders.
rd_day <- aggregate(rd~dates, data=df, sum)
rd_day2 <- aggregate(rd~dates+member_casual, data=df, sum)
rd_day3 <- aggregate(rd~rideable_type+member_casual, data=df, sum)
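To make the formula interface of `aggregate` concrete, here is a minimal sketch on a made-up four-ride data frame. The `toy` data and its values are hypothetical; only the `aggregate` calls mirror the ones above.

```r
# Hypothetical rides: two days, two rider types, durations in seconds
toy <- data.frame(
  dates = as.Date(c("2023-10-01", "2023-10-01", "2023-10-02", "2023-10-02")),
  member_casual = c("casual", "member", "casual", "casual"),
  rd = c(600, 300, 900, 120)
)

# rd ~ dates: one row per day, with rd summed over that day's rides
by_day <- aggregate(rd ~ dates, data = toy, sum)

# rd ~ dates + member_casual: one row per day and rider type that occurs
by_day_type <- aggregate(rd ~ dates + member_casual, data = toy, sum)

by_day
by_day_type
```

Combinations that never occur in the data (here, a member ride on the second day) produce no row, so the grouped result can have fewer rows than days times rider types.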
For my analysis, consider the ride duration of casual riders and members in a pie chart. The two groups have about the same total usage, splitting total ride time roughly 52% to 48%:
rd_day4 <- aggregate(rd~member_casual, data=df, sum)
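# Take labels and values from rd_day4 instead of hard-coding them,
# so each label is always paired with its own total.
# October 2023 totals as reported: Member 262477696 s, Casual 242975985 s
fig <- plot_ly(rd_day4, type='pie', labels=~member_casual, values=~rd,
textinfo='label+percent',
insidetextorientation='radial',
title="Ride duration for casual riders and members")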
fig
While members and casual riders have about the same overall usage, they may use the bikes in varied amounts each day. Therefore, I plot the total ride duration for each day in October to determine whether riders use the bikes more at different times of the month. In the bar chart below we see that total ride duration is higher at the beginning of the month than at the end.
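The plotting code for this chart is not shown, so here is a minimal sketch of how such a daily bar chart could be drawn with ggplot2. The `rd_day_toy` values are made up for illustration; in the analysis, the `rd_day` data frame created earlier would take their place. Converting seconds to hours is my own choice to keep the axis readable.

```r
library(ggplot2)

# Hypothetical daily totals in seconds; substitute rd_day in practice
rd_day_toy <- data.frame(
  dates = as.Date("2023-10-01") + 0:4,
  rd = c(36000, 32400, 28800, 18000, 14400)
)

# Seconds are hard to read on an axis, so plot hours instead
rd_day_toy$hours <- rd_day_toy$rd / 3600

p_day <- ggplot(rd_day_toy, aes(x = dates, y = hours)) +
  geom_col() +
  labs(x = "Date", y = "Total ride duration (hours)",
       title = "Ride duration by day")
p_day
```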
Clearly there is more usage at the beginning of the month, but does it vary by member type? The bar chart below shows that casual riders log more ride time in the first ten days of October than later in the month.
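A chart split by member type could be sketched as follows, again assuming ggplot2. The `rd_day2_toy` values are hypothetical stand-ins for the `rd_day2` data frame created earlier, and placing the bars side by side with `position = "dodge"` is one possible layout, not necessarily the one used for the original figure.

```r
library(ggplot2)

# Hypothetical daily totals per rider type; substitute rd_day2 in practice
rd_day2_toy <- data.frame(
  dates = rep(as.Date("2023-10-01") + 0:2, times = 2),
  member_casual = rep(c("casual", "member"), each = 3),
  rd = c(9000, 7200, 3600, 5400, 5400, 4500)
)

# One bar per rider type per day, side by side for easy comparison
p_type <- ggplot(rd_day2_toy,
                 aes(x = dates, y = rd / 3600, fill = member_casual)) +
  geom_col(position = "dodge") +
  labs(x = "Date", y = "Total ride duration (hours)", fill = "Rider type",
       title = "Ride duration by day and rider type")
p_type
```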
Therefore, if Cyclistic wants to convert casual riders into members, it may wish to advertise more during the first ten days of the month. Let’s consider one more analysis by ride type: electric vs. classic bicycles.
Are classic bikes more popular than electric bikes? Consider the pie chart below of ride duration by bike type. Classic bikes account for about 63% of total ride time (320,412,162 of 505,453,681 seconds), compared with about 37% for electric bikes.
## rideable_type rd
## 1 classic_bike 320412162
## 2 electric_bike 185041519
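The code for this pie chart is not shown. A minimal sketch, assuming plotly and the two totals printed above, could look like the following; the names `rd_type` and `fig_type` are my own, and in the analysis the totals would come from an `aggregate` call on `rideable_type` rather than being typed in.

```r
library(plotly)

# Totals copied from the printed output above (seconds of ride time)
rd_type <- data.frame(
  rideable_type = c("classic_bike", "electric_bike"),
  rd = c(320412162, 185041519)
)

# Percentage of total ride time per bike type
rd_type$share <- round(100 * rd_type$rd / sum(rd_type$rd), 1)

# Labels and values are read from the data frame columns
fig_type <- plot_ly(rd_type, type = "pie",
                    labels = ~rideable_type, values = ~rd,
                    textinfo = "label+percent")
fig_type <- plotly::layout(fig_type, title = "Ride duration by bike type")
fig_type
```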
If classic bikes are used more than electric bikes overall, the split may still differ between members and casual riders. This information can be useful for membership advertising. For instance, if members use electric bikes more often than classic bikes, then Cyclistic can advertise electric bikes to casual riders in case they are unaware of the technology. In the pie charts below, when we break down the ride type by member status, it becomes clear that casual riders ride longer on classic bikes than on electric bikes.
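One way to compute this breakdown is to divide each total by its rider type's overall ride time. The sketch below does that with base R's `ave` and, as a swapped-in alternative to the pie charts, draws stacked bars with ggplot2. The `rd_day3_toy` values are hypothetical; in the analysis, the `rd_day3` data frame created earlier would be used instead.

```r
library(ggplot2)

# Hypothetical totals per bike type and rider type; substitute rd_day3
rd_day3_toy <- data.frame(
  rideable_type = rep(c("classic_bike", "electric_bike"), times = 2),
  member_casual = rep(c("casual", "member"), each = 2),
  rd = c(700, 300, 400, 600)
)

# Share of each rider type's ride time spent on each bike type
rd_day3_toy$share <- rd_day3_toy$rd /
  ave(rd_day3_toy$rd, rd_day3_toy$member_casual, FUN = sum)

# Stacked bars (not pies): each bar adds up to 100% for one rider type
p_share <- ggplot(rd_day3_toy,
                  aes(x = member_casual, y = 100 * share,
                      fill = rideable_type)) +
  geom_col() +
  labs(x = "Rider type", y = "Share of ride time (%)", fill = "Bike type")
p_share
```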
While casual riders ride longer on classic bikes, members ride longer on electric bikes. Therefore, it could be beneficial to market electric bikes to casual riders.
Finally, I’ve counted rides by start station, end station, and rider type to find the five stations casual riders use most, which are natural places to market memberships. The first two rows of the output are rides with no recorded station names and can be ignored. Casual riders typically start and end in these locations:
df2 <- data.frame(X202310_divvy_tripdata) %>% count(start_station_name, end_station_name, member_casual)
df2[order(-df2$n),][1:20,]
## start_station_name end_station_name
## 99877 <NA> <NA>
## 99876 <NA> <NA>
## 87309 University Ave & 57th St Ellis Ave & 60th St
## 34189 Ellis Ave & 60th St University Ave & 57th St
## 10302 Calumet Ave & 33rd St State St & 33rd St
## 83001 State St & 33rd St Calumet Ave & 33rd St
## 34149 Ellis Ave & 60th St Ellis Ave & 55th St
## 33996 Ellis Ave & 55th St Ellis Ave & 60th St
## 86332 Streeter Dr & Grand Ave Streeter Dr & Grand Ave
## 31952 DuSable Lake Shore Dr & Monroe St DuSable Lake Shore Dr & Monroe St
## 56088 Loomis St & Lexington St Morgan St & Polk St
## 64035 Morgan St & Polk St Loomis St & Lexington St
## 83049 State St & 33rd St MLK Jr Dr & 29th St
## 32147 DuSable Lake Shore Dr & Monroe St Streeter Dr & Grand Ave
## 34148 Ellis Ave & 60th St Ellis Ave & 55th St
## 56238 MLK Jr Dr & 29th St State St & 33rd St
## 33995 Ellis Ave & 55th St Ellis Ave & 60th St
## 87322 University Ave & 57th St Kimbark Ave & 53rd St
## 4940 Blackstone Ave & 59th St University Ave & 57th St
## 34188 Ellis Ave & 60th St University Ave & 57th St
## member_casual n
## 99877 member 24092
## 99876 casual 16241
## 87309 member 838
## 34189 member 821
## 10302 member 805
## 83001 member 790
## 34149 member 696
## 33996 member 566
## 86332 casual 545
## 31952 casual 451
## 56088 member 426
## 64035 member 409
## 83049 member 365
## 32147 casual 347
## 34148 casual 338
## 56238 member 334
## 33995 casual 322
## 87322 member 299
## 4940 member 293
## 34188 casual 278
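The table above mixes members and casual riders and includes rides with missing station names. To list casual riders' busiest start stations directly, a minimal sketch with dplyr could filter first and then count. The `toy_rides` data frame is hypothetical and only borrows station names from the output above; in the analysis, `df` would be used in its place.

```r
library(dplyr)

# Hypothetical rides; substitute df in practice
toy_rides <- data.frame(
  start_station_name = c("Streeter Dr & Grand Ave", "Streeter Dr & Grand Ave",
                         NA, "DuSable Lake Shore Dr & Monroe St",
                         "Ellis Ave & 60th St", "Streeter Dr & Grand Ave"),
  member_casual = c("casual", "casual", "casual", "casual",
                    "member", "member")
)

# Keep casual rides with a known start station, count rides per station,
# sort from busiest to quietest, and keep at most the top five
top_casual_starts <- toy_rides %>%
  filter(member_casual == "casual", !is.na(start_station_name)) %>%
  count(start_station_name, sort = TRUE) %>%
  slice_head(n = 5)

top_casual_starts
```

The same pattern with `end_station_name` would give the busiest ending stations.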