Cyclistic, Bike Sharing analysis case

Case Presentation


Cyclistic is a bike sharing company which gathers data of their users and wants to understand the main differences between the ā€œcasualā€ and ā€œmemberā€ users.

The main question to be answered is ā€œHow do annual members and casual riders use Cyclistic bikes differently?ā€.

The following document is a description of the steps and processes undergone throughout the analysis to arrive to the answer of the previous question.

Data Sources

The only data source for available for this case are the Q1 data of the ride trips of the app in the years 2019 and 2020. Thus, the answers are limited to what can be found on this data.

Exploration of the data


First step is to install and activate the packages that will be used.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## āœ” dplyr     1.1.4     āœ” readr     2.1.5
## āœ” forcats   1.0.0     āœ” stringr   1.5.1
## āœ” ggplot2   3.5.1     āœ” tibble    3.2.1
## āœ” lubridate 1.9.4     āœ” tidyr     1.3.1
## āœ” purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## āœ– dplyr::filter() masks stats::filter()
## āœ– dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Secondly, the data should be imported and explored.

# Importing Data
divvy_trips_2019 <- read.csv("Divvy_Trips_2019_Q1.csv")
divvy_trips_2020 <- read.csv("Divvy_Trips_2020_Q1.csv")

# Exploratory Analysis (EDA) - Q1 2019
head(divvy_trips_2019)
##    trip_id          start_time            end_time bikeid tripduration
## 1 21742443 2019-01-01 00:04:37 2019-01-01 00:11:07   2167        390.0
## 2 21742444 2019-01-01 00:08:13 2019-01-01 00:15:34   4386        441.0
## 3 21742445 2019-01-01 00:13:23 2019-01-01 00:27:12   1524        829.0
## 4 21742446 2019-01-01 00:13:45 2019-01-01 00:43:28    252      1,783.0
## 5 21742447 2019-01-01 00:14:52 2019-01-01 00:20:56   1170        364.0
## 6 21742448 2019-01-01 00:15:33 2019-01-01 00:19:09   2437        216.0
##   from_station_id                   from_station_name to_station_id
## 1             199              Wabash Ave & Grand Ave            84
## 2              44              State St & Randolph St           624
## 3              15                Racine Ave & 18th St           644
## 4             123      California Ave & Milwaukee Ave           176
## 5             173 Mies van der Rohe Way & Chicago Ave            35
## 6              98          LaSalle St & Washington St            49
##                  to_station_name   usertype gender birthyear
## 1      Milwaukee Ave & Grand Ave Subscriber   Male      1989
## 2 Dearborn St & Van Buren St (*) Subscriber Female      1990
## 3  Western Ave & Fillmore St (*) Subscriber Female      1994
## 4              Clark St & Elm St Subscriber   Male      1993
## 5        Streeter Dr & Grand Ave Subscriber   Male      1994
## 6        Dearborn St & Monroe St Subscriber Female      1983
colnames(divvy_trips_2019)
##  [1] "trip_id"           "start_time"        "end_time"         
##  [4] "bikeid"            "tripduration"      "from_station_id"  
##  [7] "from_station_name" "to_station_id"     "to_station_name"  
## [10] "usertype"          "gender"            "birthyear"
glimpse(divvy_trips_2019)
## Rows: 365,069
## Columns: 12
## $ trip_id           <int> 21742443, 21742444, 21742445, 21742446, 21742447, 21…
## $ start_time        <chr> "2019-01-01 00:04:37", "2019-01-01 00:08:13", "2019-…
## $ end_time          <chr> "2019-01-01 00:11:07", "2019-01-01 00:15:34", "2019-…
## $ bikeid            <int> 2167, 4386, 1524, 252, 1170, 2437, 2708, 2796, 6205,…
## $ tripduration      <chr> "390.0", "441.0", "829.0", "1,783.0", "364.0", "216.…
## $ from_station_id   <int> 199, 44, 15, 123, 173, 98, 98, 211, 150, 268, 299, 2…
## $ from_station_name <chr> "Wabash Ave & Grand Ave", "State St & Randolph St", …
## $ to_station_id     <int> 84, 624, 644, 176, 35, 49, 49, 142, 148, 141, 295, 4…
## $ to_station_name   <chr> "Milwaukee Ave & Grand Ave", "Dearborn St & Van Bure…
## $ usertype          <chr> "Subscriber", "Subscriber", "Subscriber", "Subscribe…
## $ gender            <chr> "Male", "Female", "Female", "Male", "Male", "Female"…
## $ birthyear         <int> 1989, 1990, 1994, 1993, 1994, 1983, 1984, 1990, 1995…
# Exploratory Analysis (EDA) - Q1 2020
head(divvy_trips_2020)
##            ride_id rideable_type          started_at            ended_at
## 1 EACB19130B0CDA4A   docked_bike 2020-01-21 20:06:59 2020-01-21 20:14:30
## 2 8FED874C809DC021   docked_bike 2020-01-30 14:22:39 2020-01-30 14:26:22
## 3 789F3C21E472CA96   docked_bike 2020-01-09 19:29:26 2020-01-09 19:32:17
## 4 C9A388DAC6ABF313   docked_bike 2020-01-06 16:17:07 2020-01-06 16:25:56
## 5 943BC3CBECCFD662   docked_bike 2020-01-30 08:37:16 2020-01-30 08:42:48
## 6 6D9C8A6938165C11   docked_bike 2020-01-10 12:33:05 2020-01-10 12:37:54
##         start_station_name start_station_id               end_station_name
## 1 Western Ave & Leland Ave              239          Clark St & Leland Ave
## 2  Clark St & Montrose Ave              234 Southport Ave & Irving Park Rd
## 3   Broadway & Belmont Ave              296       Wilton Ave & Belmont Ave
## 4   Clark St & Randolph St               51       Fairbanks Ct & Grand Ave
## 5     Clinton St & Lake St               66          Wells St & Hubbard St
## 6    Wells St & Hubbard St              212    Desplaines St & Randolph St
##   end_station_id start_lat start_lng end_lat  end_lng member_casual
## 1            326   41.9665  -87.6884 41.9671 -87.6674        member
## 2            318   41.9616  -87.6660 41.9542 -87.6644        member
## 3            117   41.9401  -87.6455 41.9402 -87.6530        member
## 4             24   41.8846  -87.6319 41.8918 -87.6206        member
## 5            212   41.8856  -87.6418 41.8899 -87.6343        member
## 6             96   41.8899  -87.6343 41.8846 -87.6446        member
colnames(divvy_trips_2020)
##  [1] "ride_id"            "rideable_type"      "started_at"        
##  [4] "ended_at"           "start_station_name" "start_station_id"  
##  [7] "end_station_name"   "end_station_id"     "start_lat"         
## [10] "start_lng"          "end_lat"            "end_lng"           
## [13] "member_casual"
glimpse(divvy_trips_2020)
## Rows: 426,887
## Columns: 13
## $ ride_id            <chr> "EACB19130B0CDA4A", "8FED874C809DC021", "789F3C21E4…
## $ rideable_type      <chr> "docked_bike", "docked_bike", "docked_bike", "docke…
## $ started_at         <chr> "2020-01-21 20:06:59", "2020-01-30 14:22:39", "2020…
## $ ended_at           <chr> "2020-01-21 20:14:30", "2020-01-30 14:26:22", "2020…
## $ start_station_name <chr> "Western Ave & Leland Ave", "Clark St & Montrose Av…
## $ start_station_id   <int> 239, 234, 296, 51, 66, 212, 96, 96, 212, 38, 117, 1…
## $ end_station_name   <chr> "Clark St & Leland Ave", "Southport Ave & Irving Pa…
## $ end_station_id     <int> 326, 318, 117, 24, 212, 96, 212, 212, 96, 100, 632,…
## $ start_lat          <dbl> 41.9665, 41.9616, 41.9401, 41.8846, 41.8856, 41.889…
## $ start_lng          <dbl> -87.6884, -87.6660, -87.6455, -87.6319, -87.6418, -…
## $ end_lat            <dbl> 41.9671, 41.9542, 41.9402, 41.8918, 41.8899, 41.884…
## $ end_lng            <dbl> -87.6674, -87.6644, -87.6530, -87.6206, -87.6343, -…
## $ member_casual      <chr> "member", "member", "member", "member", "member", "…

Preparing the data


After a quick examination there are a few things that can be noticed:

  • There are different amount of columns in each data frame. (e.g.Ā ā€œrideable_typeā€ and ā€œgenderā€).
  • Columns names are different between each other, even though they are trying to represent the same thing. (e.g.Ā ā€œtrip_idā€ and ā€œride_idā€).
  • The datatypes are wrong in some cases, as they don’t follow the best practices (e.g.Ā ā€œtrip_idā€ as an integer, when it should be a str).

The following code tackles each of the previously mentioned problems.

## Handles Extra column problem 

# Deletes the extra columns
divvy_trips_2019_df <- subset(divvy_trips_2019, select = -c(bikeid,
                                                            tripduration,
                                                            gender, 
                                                            birthyear))

divvy_trips_2020_df <- subset(divvy_trips_2020, select = -c(rideable_type, 
                                                            start_lat, 
                                                            start_lng, 
                                                            end_lat,
                                                            end_lng))

glimpse(divvy_trips_2019_df)
## Rows: 365,069
## Columns: 8
## $ trip_id           <int> 21742443, 21742444, 21742445, 21742446, 21742447, 21…
## $ start_time        <chr> "2019-01-01 00:04:37", "2019-01-01 00:08:13", "2019-…
## $ end_time          <chr> "2019-01-01 00:11:07", "2019-01-01 00:15:34", "2019-…
## $ from_station_id   <int> 199, 44, 15, 123, 173, 98, 98, 211, 150, 268, 299, 2…
## $ from_station_name <chr> "Wabash Ave & Grand Ave", "State St & Randolph St", …
## $ to_station_id     <int> 84, 624, 644, 176, 35, 49, 49, 142, 148, 141, 295, 4…
## $ to_station_name   <chr> "Milwaukee Ave & Grand Ave", "Dearborn St & Van Bure…
## $ usertype          <chr> "Subscriber", "Subscriber", "Subscriber", "Subscribe…
glimpse(divvy_trips_2020_df)
## Rows: 426,887
## Columns: 8
## $ ride_id            <chr> "EACB19130B0CDA4A", "8FED874C809DC021", "789F3C21E4…
## $ started_at         <chr> "2020-01-21 20:06:59", "2020-01-30 14:22:39", "2020…
## $ ended_at           <chr> "2020-01-21 20:14:30", "2020-01-30 14:26:22", "2020…
## $ start_station_name <chr> "Western Ave & Leland Ave", "Clark St & Montrose Av…
## $ start_station_id   <int> 239, 234, 296, 51, 66, 212, 96, 96, 212, 38, 117, 1…
## $ end_station_name   <chr> "Clark St & Leland Ave", "Southport Ave & Irving Pa…
## $ end_station_id     <int> 326, 318, 117, 24, 212, 96, 212, 212, 96, 100, 632,…
## $ member_casual      <chr> "member", "member", "member", "member", "member", "…
## Corrects the naming in the older version of the file. 
colnames(divvy_trips_2019_df)
## [1] "trip_id"           "start_time"        "end_time"         
## [4] "from_station_id"   "from_station_name" "to_station_id"    
## [7] "to_station_name"   "usertype"
colnames(divvy_trips_2020_df)
## [1] "ride_id"            "started_at"         "ended_at"          
## [4] "start_station_name" "start_station_id"   "end_station_name"  
## [7] "end_station_id"     "member_casual"
colnames(divvy_trips_2019_df) <- c("ride_id",
                                   "started_at", 
                                   "ended_at", 
                                   "start_station_id", 
                                   "start_station_name",
                                   "end_station_id", 
                                   "end_station_name", 
                                   "member_casual")

colnames(divvy_trips_2019_df)
## [1] "ride_id"            "started_at"         "ended_at"          
## [4] "start_station_id"   "start_station_name" "end_station_id"    
## [7] "end_station_name"   "member_casual"
colnames(divvy_trips_2020_df)
## [1] "ride_id"            "started_at"         "ended_at"          
## [4] "start_station_name" "start_station_id"   "end_station_name"  
## [7] "end_station_id"     "member_casual"
## Corrects the datatypes in both tables 
glimpse(divvy_trips_2019_df)
## Rows: 365,069
## Columns: 8
## $ ride_id            <int> 21742443, 21742444, 21742445, 21742446, 21742447, 2…
## $ started_at         <chr> "2019-01-01 00:04:37", "2019-01-01 00:08:13", "2019…
## $ ended_at           <chr> "2019-01-01 00:11:07", "2019-01-01 00:15:34", "2019…
## $ start_station_id   <int> 199, 44, 15, 123, 173, 98, 98, 211, 150, 268, 299, …
## $ start_station_name <chr> "Wabash Ave & Grand Ave", "State St & Randolph St",…
## $ end_station_id     <int> 84, 624, 644, 176, 35, 49, 49, 142, 148, 141, 295, …
## $ end_station_name   <chr> "Milwaukee Ave & Grand Ave", "Dearborn St & Van Bur…
## $ member_casual      <chr> "Subscriber", "Subscriber", "Subscriber", "Subscrib…
glimpse(divvy_trips_2020_df)
## Rows: 426,887
## Columns: 8
## $ ride_id            <chr> "EACB19130B0CDA4A", "8FED874C809DC021", "789F3C21E4…
## $ started_at         <chr> "2020-01-21 20:06:59", "2020-01-30 14:22:39", "2020…
## $ ended_at           <chr> "2020-01-21 20:14:30", "2020-01-30 14:26:22", "2020…
## $ start_station_name <chr> "Western Ave & Leland Ave", "Clark St & Montrose Av…
## $ start_station_id   <int> 239, 234, 296, 51, 66, 212, 96, 96, 212, 38, 117, 1…
## $ end_station_name   <chr> "Clark St & Leland Ave", "Southport Ave & Irving Pa…
## $ end_station_id     <int> 326, 318, 117, 24, 212, 96, 212, 212, 96, 100, 632,…
## $ member_casual      <chr> "member", "member", "member", "member", "member", "…
divvy_trips_2019_df <- transform(divvy_trips_2019_df, 
                                 ride_id = as.character(ride_id),
                                 started_at = as.Date(started_at),
                                 ended_at = as.Date(ended_at),
                                 start_station_name = as.character(start_station_name),
                                 end_station_name = as.character(end_station_name)
                                 )

divvy_trips_2020_df <- transform(divvy_trips_2020_df, 
                                 started_at = as.Date(started_at),
                                 ended_at = as.Date(ended_at),
                                 start_station_id = as.character(start_station_id),
                                 end_station_id = as.character(end_station_id)
                                 )
glimpse(divvy_trips_2019_df)
## Rows: 365,069
## Columns: 8
## $ ride_id            <chr> "21742443", "21742444", "21742445", "21742446", "21…
## $ started_at         <date> 2019-01-01, 2019-01-01, 2019-01-01, 2019-01-01, 20…
## $ ended_at           <date> 2019-01-01, 2019-01-01, 2019-01-01, 2019-01-01, 20…
## $ start_station_id   <int> 199, 44, 15, 123, 173, 98, 98, 211, 150, 268, 299, …
## $ start_station_name <chr> "Wabash Ave & Grand Ave", "State St & Randolph St",…
## $ end_station_id     <int> 84, 624, 644, 176, 35, 49, 49, 142, 148, 141, 295, …
## $ end_station_name   <chr> "Milwaukee Ave & Grand Ave", "Dearborn St & Van Bur…
## $ member_casual      <chr> "Subscriber", "Subscriber", "Subscriber", "Subscrib…
glimpse(divvy_trips_2020_df)
## Rows: 426,887
## Columns: 8
## $ ride_id            <chr> "EACB19130B0CDA4A", "8FED874C809DC021", "789F3C21E4…
## $ started_at         <date> 2020-01-21, 2020-01-30, 2020-01-09, 2020-01-06, 20…
## $ ended_at           <date> 2020-01-21, 2020-01-30, 2020-01-09, 2020-01-06, 20…
## $ start_station_name <chr> "Western Ave & Leland Ave", "Clark St & Montrose Av…
## $ start_station_id   <chr> "239", "234", "296", "51", "66", "212", "96", "96",…
## $ end_station_name   <chr> "Clark St & Leland Ave", "Southport Ave & Irving Pa…
## $ end_station_id     <chr> "326", "318", "117", "24", "212", "96", "212", "212…
## $ member_casual      <chr> "member", "member", "member", "member", "member", "…

Once both data frames are ready, having the same number of columns and datatypes I merge them to create a single data frame.

# Having solved all previous problems I merge both data frames into one variable. 
divvy_trips_df <- rbind(divvy_trips_2019_df, divvy_trips_2020_df)

Cleaning and Aggregating the data


Having just one data frame simplifies the analysis, but still there are bound to be mistakes within.

First, a new inspection is needed as there is a new table to explore.

# Inspect the new table that has been created
colnames(divvy_trips_df) # List of column names
## [1] "ride_id"            "started_at"         "ended_at"          
## [4] "start_station_id"   "start_station_name" "end_station_id"    
## [7] "end_station_name"   "member_casual"
nrow(divvy_trips_df) # How many rows are in data frame?
## [1] 791956
dim(divvy_trips_df) # Dimensions of the data frame?
## [1] 791956      8
head(divvy_trips_df) # See the first 6 rows of data frame. 
##    ride_id started_at   ended_at start_station_id
## 1 21742443 2019-01-01 2019-01-01              199
## 2 21742444 2019-01-01 2019-01-01               44
## 3 21742445 2019-01-01 2019-01-01               15
## 4 21742446 2019-01-01 2019-01-01              123
## 5 21742447 2019-01-01 2019-01-01              173
## 6 21742448 2019-01-01 2019-01-01               98
##                    start_station_name end_station_id
## 1              Wabash Ave & Grand Ave             84
## 2              State St & Randolph St            624
## 3                Racine Ave & 18th St            644
## 4      California Ave & Milwaukee Ave            176
## 5 Mies van der Rohe Way & Chicago Ave             35
## 6          LaSalle St & Washington St             49
##                 end_station_name member_casual
## 1      Milwaukee Ave & Grand Ave    Subscriber
## 2 Dearborn St & Van Buren St (*)    Subscriber
## 3  Western Ave & Fillmore St (*)    Subscriber
## 4              Clark St & Elm St    Subscriber
## 5        Streeter Dr & Grand Ave    Subscriber
## 6        Dearborn St & Monroe St    Subscriber
str(divvy_trips_df) # See list of columns and data types (numeric, character, etc)
## 'data.frame':    791956 obs. of  8 variables:
##  $ ride_id           : chr  "21742443" "21742444" "21742445" "21742446" ...
##  $ started_at        : Date, format: "2019-01-01" "2019-01-01" ...
##  $ ended_at          : Date, format: "2019-01-01" "2019-01-01" ...
##  $ start_station_id  : chr  "199" "44" "15" "123" ...
##  $ start_station_name: chr  "Wabash Ave & Grand Ave" "State St & Randolph St" "Racine Ave & 18th St" "California Ave & Milwaukee Ave" ...
##  $ end_station_id    : chr  "84" "624" "644" "176" ...
##  $ end_station_name  : chr  "Milwaukee Ave & Grand Ave" "Dearborn St & Van Buren St (*)" "Western Ave & Fillmore St (*)" "Clark St & Elm St" ...
##  $ member_casual     : chr  "Subscriber" "Subscriber" "Subscriber" "Subscriber" ...
summary(divvy_trips_df) # Statistical summary of data. 
##    ride_id            started_at            ended_at         
##  Length:791956      Min.   :2019-01-01   Min.   :2019-01-01  
##  Class :character   1st Qu.:2019-02-28   1st Qu.:2019-02-28  
##  Mode  :character   Median :2020-01-07   Median :2020-01-07  
##                     Mean   :2019-08-31   Mean   :2019-08-31  
##                     3rd Qu.:2020-02-19   3rd Qu.:2020-02-19  
##                     Max.   :2020-03-31   Max.   :2020-05-19  
##  start_station_id   start_station_name end_station_id     end_station_name  
##  Length:791956      Length:791956      Length:791956      Length:791956     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  member_casual     
##  Length:791956     
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

On further inspection, it is possible to detect that there are 4 distinct ways to name the type of user that the company has.

# member_casual should be composed of two values. member or casual. 

divvy_trips_df %>% 
  distinct(member_casual)
##   member_casual
## 1    Subscriber
## 2      Customer
## 3        member
## 4        casual

To fix this, the ā€œSubscriberā€ and ā€œCustomerā€ classification is changed for the more updated version of the Q1 2020 data frame.

divvy_trips_df$member_casual <- replace(divvy_trips_df$member_casual, 
                                        divvy_trips_df$member_casual == 'Subscriber', 'member')

divvy_trips_df$member_casual <- replace(divvy_trips_df$member_casual, 
                                        divvy_trips_df$member_casual == 'Customer', 'casual')

In order to make further analysis (such as representing the ride length throughout the month), there needs to be a reformat of the date.

divvy_trips_df$date <- as.Date(divvy_trips_df$started_at) # The default format is yyyy-mm-dd
divvy_trips_df$month <- format(as.Date(divvy_trips_df$date), "%m")
divvy_trips_df$day <- format(as.Date(divvy_trips_df$date), "%d")
divvy_trips_df$year <- format(as.Date(divvy_trips_df$date), "%Y")
divvy_trips_df$day_of_week <- format(as.Date(divvy_trips_df$date), "%A")

In the initial data frame of 2019 there was a ā€œtripdurationā€ column, showing the time that the trip took. However, the 2020 data frame didn’t have it, and at the time of merging it had to be deleted.

But having the moment at which the user started its ride, and when the user ended his ride, it can be determine how long was his ride.

divvy_trips_df$ride_length <- difftime(divvy_trips_df$ended_at,divvy_trips_df$started_at)

typeof(divvy_trips_df$ride_length) ## Shows data-type as "Double" when it should be NUmeric.
## [1] "double"
divvy_trips_df <- divvy_trips_df %>% 
  mutate(ride_length = as.numeric(ride_length))

str(divvy_trips_df) # Checks if datatype has changed
## 'data.frame':    791956 obs. of  14 variables:
##  $ ride_id           : chr  "21742443" "21742444" "21742445" "21742446" ...
##  $ started_at        : Date, format: "2019-01-01" "2019-01-01" ...
##  $ ended_at          : Date, format: "2019-01-01" "2019-01-01" ...
##  $ start_station_id  : chr  "199" "44" "15" "123" ...
##  $ start_station_name: chr  "Wabash Ave & Grand Ave" "State St & Randolph St" "Racine Ave & 18th St" "California Ave & Milwaukee Ave" ...
##  $ end_station_id    : chr  "84" "624" "644" "176" ...
##  $ end_station_name  : chr  "Milwaukee Ave & Grand Ave" "Dearborn St & Van Buren St (*)" "Western Ave & Fillmore St (*)" "Clark St & Elm St" ...
##  $ member_casual     : chr  "member" "member" "member" "member" ...
##  $ date              : Date, format: "2019-01-01" "2019-01-01" ...
##  $ month             : chr  "01" "01" "01" "01" ...
##  $ day               : chr  "01" "01" "01" "01" ...
##  $ year              : chr  "2019" "2019" "2019" "2019" ...
##  $ day_of_week       : chr  "martes" "martes" "martes" "martes" ...
##  $ ride_length       : num  0 0 0 0 0 0 0 0 0 0 ...
## Checks if there are negative rides
any(divvy_trips_df$ride_length < 0) # Returns false, all ride lengths make sense 
## [1] FALSE

Having cleaned the data set, I can export it as a .csv file and analyze it in a dataviz tool such as Tableau.

write.csv(divvy_trips_df, file = 'divvy_trips_2019-2020.csv')

Analyzing the data and visualizing the data


To end this analysis I will show an interactive dashboard representing the cleaned data done with the Tableau visualization tool.

Conclusions


We started off this analysis questioning the main differences between both kind of users, and through the analysis done we can conclude that:

  • Trip duration by day of the week:

    • Casual users have a higher average trip duration, especially on weekends (Saturday and Sunday).

    • Members have a more consistent average duration throughout the week, with lower values compared to casual users.

  • Trip duration by month:

    • Casual users show significantly higher durations in February compared to the other months (January and March).

    • Members have a more uniform trip duration across the three analyzed months.

  • Daily evolution of trip duration:

    • Casual users show higher duration peaks on specific days, such as February 9th and 24th, exceeding 3,000 minutes on average.

    • Members have less variability, with lower peaks and more stable distributions throughout the month.