MiBici Public Bikeshare System use pattern analysis

Intro

MiBici operates as a network of stations where users can pick up and return bicycles any day of the week as a means of transportation for their daily mobility needs. According to official public information, MiBici currently operates 3,972 bicycles and 360 stations, which serve as the origins and destinations of the system’s trips.

In this document we analyse the most frequent travel routes used over almost 10 years of operation. By doing so, we aim to understand travel patterns and offer an initial interpretation of MiBici’s role in meeting daily urban mobility needs.

Library setup

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(sf)
## Warning: package 'sf' was built under R version 4.4.1
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(ggmap)
## Warning: package 'ggmap' was built under R version 4.4.1
## ℹ Google's Terms of Service: <https://mapsplatform.google.com>
##   Stadia Maps' Terms of Service: <https://stadiamaps.com/terms-of-service/>
##   OpenStreetMap's Tile Usage Policy: <https://operations.osmfoundation.org/policies/tiles/>
## ℹ Please cite ggmap if you use it! Use `citation("ggmap")` for details.
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.4.1
library(osrm)
## Warning: package 'osrm' was built under R version 4.4.1
## Data: (c) OpenStreetMap contributors, ODbL 1.0 - http://www.openstreetmap.org/copyright
## Routing: OSRM - http://project-osrm.org/

Data input

MiBici Stations List

We start by downloading the list and location of the system’s stations. Note that this file is updated regularly, so the URL and file name below may need to be changed to point to the latest release.

if(!file.exists("nomenclatura_2024_08.csv")) {
     download.file("https://www.mibici.net/site/assets/files/1118/nomenclatura_2024_08.csv",
                   "nomenclatura_2024_08.csv")
} 
mbpoints <- read.csv("nomenclatura_2024_08.csv", encoding = "latin1")
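Because the station file name changes with each release, it may help to keep the download-if-missing step in one small helper. The sketch below is only an illustration: `download_if_missing()` is a hypothetical name introduced here, and the demonstration uses a temporary file so that nothing is fetched from the network.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` is not
# already on disk, then return the local path.
download_if_missing <- function(url, dest = basename(url)) {
  if (!file.exists(dest)) {
    download.file(url, dest, mode = "wb")
  }
  dest
}

# Offline demonstration: the file already exists, so nothing is downloaded.
demo_file <- tempfile(fileext = ".csv")
writeLines("id,name", demo_file)
demo_path <- download_if_missing("https://example.invalid/stations.csv", demo_file)
```

With this helper, only the URL string would need to change when a new station list is published; the returned path can be passed straight to `read.csv()`.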

MiBici Trip data base

Historical trip data is available on the MiBici open data page. Each available file corresponds to a single month of operational data, where each observation equals a trip.

The collection and cleaning process has already been completed (it can be reviewed here) and resulted in a data frame with over 28 million observations.

This data frame was then summarized in two ways: by origin-destination station pair, and by station pair, year, and gender. We read the two summaries into separate objects below.

dfTrips_Tot <- read_csv("dfTrips_Total.csv")
## Rows: 89107 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Orig_obcn, Dest_obcn
## dbl (7): Origen_Id, Destino_Id, Total_trips, Orig_latitude, Orig_longitude, ...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dfTrips_GenYear <- read_csv("dfTrips_GenYear.csv")
## Rows: 805644 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): Genero, Orig_obcn, Dest_obcn
## dbl (8): Origen_Id, Destino_Id, Year_trip, Total_trips, Orig_latitude, Orig_...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
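The cleaning script itself is not reproduced in this document. As a rough illustration of how a per-pair summary such as `dfTrips_Total.csv` could be derived from trip-level records, the sketch below counts trips for each origin-destination pair. The column names follow the `read_csv()` output above; the toy values are invented, and the real pipeline may differ.

```r
library(dplyr)

# Toy trip-level records: one row per trip (values are invented).
trips <- tibble(
  Origen_Id  = c(2, 2, 2, 5, 5, 9),
  Destino_Id = c(5, 5, 9, 2, 2, 9)
)

# Count trips for each origin-destination pair, most frequent first.
od_totals <- trips %>%
  count(Origen_Id, Destino_Id, name = "Total_trips") %>%
  arrange(desc(Total_trips))
```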

Data Exploration

Stations location basic plot

The following plot gives a quick overview of where the stations are located. It uses geographic coordinates (longitude, latitude).

bbox_amg <- make_bbox(mbpoints$longitude, mbpoints$latitude)

mapa_amg <- get_stadiamap(bbox_amg, zoom = 13)
## ℹ © Stadia Maps © Stamen Design © OpenMapTiles © OpenStreetMap contributors.
ggmap(mapa_amg)+ 
  geom_point(data = mbpoints, aes(x=longitude, y=latitude), inherit.aes = FALSE)+
  labs(title = "MiBici Public Bikeshare System", subtitle = "Stations location", 
       caption = "Source: https://www.mibici.net/es/datos-abiertos/",
       x = "Longitude", y = "Latitude")

Station map of total trips generated at each station, 2014 to 2024

ggmap(mapa_amg)+
  geom_point(data = dfTrips_Tot, aes(x=Orig_longitude, y=Orig_latitude, size=Total_trips),
             colour = "brown4", alpha = .2) + 
  labs(title = "Trips originating at MiBici stations")
## Warning: Removed 15 rows containing missing values or values outside the scale range
## (`geom_point()`).

Station map of total trips attracted to each station, 2014 to 2024

ggmap(mapa_amg)+
  geom_point(data = dfTrips_Tot, aes(x=Dest_longitude, y=Dest_latitude, size=Total_trips),
             colour = "purple3", alpha = .2) + 
  labs(title = "Trips attracted to MiBici stations")
## Warning: Removed 18 rows containing missing values or values outside the scale range
## (`geom_point()`).
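The two maps above draw one point per origin-destination pair, so every station is overplotted once for each pair it takes part in. If a single point per station is preferred, one option is to sum the trips per station first. The sketch below shows that aggregation on invented data; `Station_Id`, `Generated`, and `Attracted` are names introduced here for illustration.

```r
library(dplyr)

# Toy origin-destination table with the same column names as dfTrips_Tot.
od <- tibble(
  Origen_Id   = c(2, 2, 5, 9),
  Destino_Id  = c(5, 9, 2, 2),
  Total_trips = c(120, 30, 80, 10)
)

# Trips generated by each origin station.
generated <- od %>%
  group_by(Station_Id = Origen_Id) %>%
  summarise(Generated = sum(Total_trips), .groups = "drop")

# Trips attracted by each destination station.
attracted <- od %>%
  group_by(Station_Id = Destino_Id) %>%
  summarise(Attracted = sum(Total_trips), .groups = "drop")

# One row per station with both totals (0 when a station has no trips).
station_totals <- full_join(generated, attracted, by = "Station_Id") %>%
  mutate(across(c(Generated, Attracted), ~ coalesce(.x, 0)))
```

Joining the station coordinates to `station_totals` would then allow mapping `size` to `Generated` or `Attracted` with one point per station.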

Origin - Destination pair trip histogram

ggplot(dfTrips_Tot, aes(Total_trips)) + 
  geom_histogram(color = "darkgrey",fill="darkred", alpha = 0.6, binwidth = 200)+
  labs(title = "Trips by single origin-destination pair Histogram", x= "Total trips", y="Count")

Histogram of trips that start and end at the same station

ggplot(filter(dfTrips_Tot, Origen_Id == Destino_Id), aes(Total_trips)) + 
  geom_histogram(color = "darkgrey",fill="darkgreen", alpha = 0.6, binwidth = 200)+
  labs(title = "Trips with the same origin-destination pair Histogram", x= "Total trips", y="Count")
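To put the same-station trips in context, their share of all trips can be computed directly from the origin-destination table. This is a minimal sketch on invented values, using the same column names as `dfTrips_Tot`.

```r
# Toy origin-destination table (invented values).
od <- data.frame(
  Origen_Id   = c(2, 2, 5, 9),
  Destino_Id  = c(2, 5, 5, 2),
  Total_trips = c(40, 120, 60, 10)
)

# Flag pairs that start and end at the same station.
same_station <- od$Origen_Id == od$Destino_Id

# Share of all trips that return to their origin station.
round_trip_share <- sum(od$Total_trips[same_station]) / sum(od$Total_trips)
```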

Origin destination matrix

ggplot() + 
    geom_tile(data = dfTrips_Tot, aes(x = Origen_Id, y = Destino_Id, fill = Total_trips))+
    scale_fill_distiller(palette = "RdYlGn")+
  labs(title = "Origin - Destination Matrix Plot", x="Origin Id", y="Destination Id")
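The tile plot works on the long table, with one row per pair. When an actual matrix is needed, for example to compute row or column totals, base R's `xtabs()` can reshape the long table. The sketch below uses invented values and the column names of `dfTrips_Tot`.

```r
# Toy origin-destination table (invented values).
od <- data.frame(
  Origen_Id   = c(2, 2, 5, 9),
  Destino_Id  = c(5, 9, 2, 2),
  Total_trips = c(120, 30, 80, 10)
)

# Rows are origins, columns are destinations; pairs with no trips become 0.
od_matrix <- xtabs(Total_trips ~ Origen_Id + Destino_Id, data = od)

# Trips generated per origin (row sums) and attracted per destination (column sums).
generated <- rowSums(od_matrix)
attracted <- colSums(od_matrix)
```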

Most frequent travel paths matrix

top_trips <- dfTrips_Tot %>%
          filter(Origen_Id != Destino_Id) %>% 
          arrange(desc(Total_trips)) %>%
          head(50)

ggplot() + 
    geom_tile(data = top_trips,
              aes(x = as.factor(Origen_Id),
                  y = as.factor(Destino_Id),
                  fill = Total_trips)) +
    scale_fill_distiller(palette = "RdYlGn") +
    labs(title="Origin-Destination Matrix",
         subtitle="Top 50 travel paths MiBici",
         x="Origin Id",
         y="Destination Id",
         fill="Total trips")

Route analysis

# Route between two stations, labelled with the station names.
# Only the coordinates are passed to osrmRoute(); the names are attached afterwards.
# The deprecated `returnclass` argument is omitted: the route comes back as an sf object.
paths_MB0 <- function(Orig_obcn, Orig_longitude, Orig_latitude,
                      Dest_obcn, Dest_longitude, Dest_latitude) {
  path <- osrmRoute(src = c(Orig_longitude, Orig_latitude),
                    dst = c(Dest_longitude, Dest_latitude),
                    overview = "full",
                    osrm.profile = "bike")

  cbind(Origin = Orig_obcn, Destin = Dest_obcn, path)
}

# Route between two stations from coordinates only.
paths_MB <- function(Orig_longitude, Orig_latitude, Dest_longitude, Dest_latitude) {
  osrmRoute(src = c(Orig_longitude, Orig_latitude),
            dst = c(Dest_longitude, Dest_latitude),
            overview = "full",
            osrm.profile = "bike")
}

# Coordinates of the top 50 pairs, in the argument order expected by paths_MB
route_coords <- list(top_trips$Orig_longitude,
                     top_trips$Orig_latitude,
                     top_trips$Dest_longitude,
                     top_trips$Dest_latitude)

# Request one route per pair and stack the results into a single sf object
top_routes <- pmap(route_coords, paths_MB) %>%
  reduce(rbind)
ggmap(mapa_amg)+
  geom_sf(data=top_routes, color="red3", linewidth = 1, inherit.aes = FALSE)+
  labs(title="50 most frequent bike routes using MiBici System",
       subtitle="MiBici - Guadalajara",
       caption="Data source: https://www.mibici.net/es/datos-abiertos/")+
  theme_void()
## Coordinate system already present. Adding new coordinate system, which will
## replace the existing one.
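Each route in this section is a separate request to a routing server, so a single failed request would stop the whole loop. One possible safeguard, sketched below, is to wrap the routing function with `purrr::possibly()` so that failures return `NULL` and can be dropped. `fake_router()` is a stand-in invented for this offline demonstration; in the analysis it would be replaced by `paths_MB`.

```r
library(purrr)

# Stand-in for a routing function: fails when a coordinate is missing.
fake_router <- function(o_lon, o_lat, d_lon, d_lat) {
  if (anyNA(c(o_lon, o_lat, d_lon, d_lat))) stop("missing coordinate")
  data.frame(o_lon, o_lat, d_lon, d_lat)
}

# Three toy origin-destination pairs; the third has a missing longitude.
coords <- list(
  c(-103.35, -103.37, NA),       # origin longitude
  c(20.67, 20.68, 20.69),        # origin latitude
  c(-103.38, -103.40, -103.36),  # destination longitude
  c(20.67, 20.66, 20.70)         # destination latitude
)

# Route every pair, return NULL on error, drop the failures, and stack the rest.
routes <- pmap(coords, possibly(fake_router, otherwise = NULL)) %>%
  compact() %>%
  reduce(rbind)
```

Comparing the number of rows in the result with the number of requested pairs shows how many routes were lost.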

Final preliminary remarks

In this simple analysis we have used the aggregated information of more than 28 million trips made by users of the public bicycle system “MiBici”, between December 2014 and June 2024.

By combining information from different open data platforms we have geographically identified the most frequent theoretical routes used. We can highlight some initial findings:

The system is most frequently used to make trips on the east-west axis of the city. This is interesting, since it coincides with the historical process of urbanization of the city, which over the years resulted in a similar pattern of distribution between employment and population.

It may also be related to the fact that the main modes of mass transport (Lines 1 and 3 of the Light Rail and Line 1 of BRT) have a mainly north-south layout, so we can put forward an initial hypothesis: MiBici plays a complementary role to mass transit in meeting daily trip needs.

Another interesting finding is that additional frequent routes were identified in the northwest polygon, which link the location of an important university campus of the University of Guadalajara with the site of a Light Rail station on Line 3 in the municipality of Zapopan.
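The east-west observation comes from reading the route map. As a rough way to quantify it, the sketch below classifies each origin-destination pair by whether its straight-line displacement is mostly horizontal or mostly vertical, and then weights the result by trip counts. The values are invented, the `axis` column is introduced here for illustration, and straight-line displacement is only an approximation of the routed path.

```r
# Toy origin-destination pairs with coordinates (invented values).
od <- data.frame(
  Orig_longitude = c(-103.35, -103.35, -103.40),
  Orig_latitude  = c(20.67, 20.67, 20.70),
  Dest_longitude = c(-103.39, -103.35, -103.38),
  Dest_latitude  = c(20.67, 20.71, 20.70),
  Total_trips    = c(500, 100, 300)
)

# A degree of longitude is shorter than a degree of latitude by a factor of
# cos(latitude), so longitude differences are scaled before comparing.
mean_lat <- (od$Orig_latitude + od$Dest_latitude) / 2
dx <- abs(od$Dest_longitude - od$Orig_longitude) * cos(mean_lat * pi / 180)
dy <- abs(od$Dest_latitude - od$Orig_latitude)

# A pair counts as east-west when its horizontal displacement dominates.
od$axis <- ifelse(dx > dy, "east-west", "north-south")

# Share of trips on each axis, weighted by trip counts.
axis_share <- tapply(od$Total_trips, od$axis, sum) / sum(od$Total_trips)
```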

These conclusions are preliminary, but they raise questions for future, more detailed analyses that could give a fuller picture of the population’s travel patterns and needs.