Goals of this notebook

Setup

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.1.0
## ✔ tidyr   1.2.0     ✔ stringr 1.4.1
## ✔ readr   2.1.2     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(lubridate)
## 
## Attaching package: 'lubridate'
## 
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

Import clean data

bike <- read_rds("data-processed/01-bike.rds") 

bike |> glimpse()
## Rows: 1,694,087
## Columns: 12
## $ trip_id               <chr> "9900285854", "9900285855", "9900285856", "99002…
## $ membership_type       <chr> "Annual (San Antonio B-cycle)", "24-Hour Kiosk (…
## $ bicycle_id            <chr> "207", "969", "214", "745", "164", "37", "517", …
## $ checkout_time         <time> 13:12:00, 13:12:00, 13:12:00, 13:12:00, 13:12:0…
## $ checkout_kiosk_id     <chr> "2537", "2498", "2537", NA, "2538", NA, "2496", …
## $ checkout_kiosk        <chr> "West & 6th St.", "Convention Center / 4th St. @…
## $ return_kiosk_id       <chr> "2707", "2566", "2496", NA, NA, "2545", "2561", …
## $ return_kiosk          <chr> "Rainey St @ Cummings", "Pfluger Bridge @ W 2nd …
## $ trip_duration_minutes <dbl> 76, 58, 8, 28, 15, 26, 35, 11, 0, 25, 10, 29, 34…
## $ month                 <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, …
## $ year                  <dbl> 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, …
## $ checkout_date         <date> 2014-10-26, 2014-10-26, 2014-10-26, 2014-10-26,…

Create month and year columns

The original data already has month and year columns, but some values are NA, so I rebuilt both from checkout_date. I also added a month_name column that holds the month labels (an ordered factor, not a character column) so plots can show month names in calendar order.

month_year <- bike |> 
  select(-month, -year) |> # drop the original columns, which contain NAs
  mutate(
    month = month(checkout_date),
    year = year(checkout_date),
    month_name = month(checkout_date, label = TRUE) # month name instead of number, for charts
  )

month_year
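
To check that the rebuilt columns behave as described, here is a minimal sketch on a three-row stand-in for the bike data. The `toy` and `toy_fixed` names and the sample dates are made up for illustration; only `checkout_date`, `month`, `year` and `month_name` come from the notebook.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the bike data, with NA gaps like the original columns
toy <- tibble(
  checkout_date = as.Date(c("2014-10-26", "2018-03-15", "2022-06-30")),
  month = c(10, NA, NA),
  year = c(2014, NA, NA)
)

# Overwrite month and year from the date, then add the labelled month
toy_fixed <- toy |>
  mutate(
    month = month(checkout_date),
    year = year(checkout_date),
    month_name = month(checkout_date, label = TRUE)
  )

toy_fixed
```

Because `month_name` is an ordered factor, sorting or plotting by it should follow calendar order rather than alphabetical order.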

See all walk up memberships

Walk Up memberships used to be called 24-Hour Kiosk, so this counts rides under the older name by membership type and year.

month_year |> 
  filter(str_detect(membership_type, "24-Hour Kiosk")) |> 
  group_by(membership_type, year) |> 
  summarise(use = n()) |> 
  arrange(desc(use))
## `summarise()` has grouped output by 'membership_type'. You can override using
## the `.groups` argument.

When were Walk Up memberships most frequently used?

month_year |> 
  filter(membership_type == "Walk Up" | membership_type == "24-Hour Kiosk (Austin B-cycle)") |> 
  select(membership_type, year) |> 
  group_by(year) |> 
  summarize(walk_up_uses = n()) |> 
  arrange(desc(walk_up_uses))

Walk Up memberships were most frequently used in 2014.

When were UT student memberships most frequently used?

month_year |> 
  filter(membership_type == "U.T. Student Membership") |> 
  select(membership_type, year) |> 
  group_by(year) |> 
  summarize(ut_uses = n()) |> 
  arrange(desc(ut_uses))

UT student memberships were most frequently used in 2018, mostly because student memberships were free that year. The next highest year was 2019.

How did the usage of UT student memberships change when Lime scooters came to Austin?

Lime scooters arrived in Austin on April 16, 2018, according to the Texas Tribune.

Usage of UT student memberships before Lime scooters arrived

month_year |> 
  filter(checkout_date < as.Date("2018-04-16"), membership_type == "U.T. Student Membership") |> 
  summarize(uses_before_lime = n())

Usage of UT student memberships after Lime scooters arrived

month_year |> 
  filter(checkout_date >= as.Date("2018-04-16"), membership_type == "U.T. Student Membership") |> 
  summarise(uses_since_lime = n())
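
The before and after counts can also be produced in one pass by labelling each ride with its period. This is only a sketch on made-up rows: `toy_rides`, `lime_launch`, `period` and `lime_counts` are hypothetical names, not part of the notebook.

```r
library(dplyr)

# Launch date used in the notebook, stored once as a Date
lime_launch <- as.Date("2018-04-16")

# Hypothetical rides: two before the launch date, two on or after it
toy_rides <- tibble(
  checkout_date = as.Date(c("2018-03-01", "2018-04-15", "2018-04-16", "2019-01-10")),
  membership_type = "U.T. Student Membership"
)

# Label each ride by period, then count rides per label
lime_counts <- toy_rides |>
  filter(membership_type == "U.T. Student Membership") |>
  mutate(period = if_else(checkout_date < lime_launch, "before_lime", "since_lime")) |>
  count(period, name = "uses")

lime_counts
```

Keeping the date in one variable means the cutoff only has to be changed in one place if it is ever revised.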

Riders with the UT student membership took more rides after Lime scooters arrived in Austin than before.

How did usage for all memberships change when Lime scooters came to Austin?

Usage before Lime

month_year |> 
  filter(checkout_date < as.Date("2018-04-16")) |> 
  summarise(uses_before_lime = n())

Usage after Lime

month_year |> 
  filter(checkout_date >= as.Date("2018-04-16")) |> 
  summarise(uses_after_lime = n())

Overall, the number of rides decreased by about 43,000 after Lime scooters came to Austin.

Compare the number of rides in 2017 and 2019

2017 was the last full year before Lime scooters arrived, and 2019 was the first full year after.

Rides in 2017

month_year |> 
  filter(year == 2017) |> # year is numeric, so compare against a number
  summarize(total_rides = n())

Rides in 2019

month_year |> 
  filter(year == 2019) |> 
  summarise(total_rides = n())

The total number of rides was lower in 2019 than in 2017.
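
The year column is numeric, so it is safer to compare it against a number than against a quoted string. When one side is a string, R converts the number to text and compares character by character, which happens to work for four-digit years but can go wrong for other values. A small sketch, using made-up values and the hypothetical names `text_cmp` and `num_cmp`:

```r
# Number vs. string: 999 is converted to "999" and compared as text,
# so "9" sorts after "2" and the comparison comes out TRUE
text_cmp <- 999 >= "2017"

# Number vs. number: an ordinary numeric comparison, which is FALSE
num_cmp <- 999 >= 2017

text_cmp
num_cmp
```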

Rides per month each year since 2017

rides_since_2017 <- month_year |> 
  filter(year >= 2017) |> 
  group_by(year, month_name) |> 
  summarise(rides = n()) |> 
  arrange(desc(rides))
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
rides_since_2017
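
The grouping message printed above is informational. One way to silence it, and to get an ungrouped result, is to set the `.groups` argument explicitly. A minimal sketch on made-up rows (`toy` and `toy_counts` are hypothetical names):

```r
library(dplyr)

# Hypothetical rides: two in January 2017, three in March 2018
toy <- tibble(
  year = c(2017, 2017, 2018, 2018, 2018),
  month_name = c("Jan", "Jan", "Mar", "Mar", "Mar")
)

# .groups = "drop" removes all grouping from the result and suppresses the message
toy_counts <- toy |>
  group_by(year, month_name) |>
  summarise(rides = n(), .groups = "drop") |>
  arrange(desc(rides))

toy_counts
```

Dropping the groups also means `arrange()` sorts the whole table, which is usually what a "highest month overall" question needs.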

March 2018 had the highest number of rides.

Rides per year since 2017

month_year |> 
  filter(year >= 2017) |> 
  group_by(year) |> 
  summarise(rides = n())

The number of rides more than doubled from 2017 to 2018. Rides dropped by more than half from 2018 to 2019 but rose again in 2021.

Rides from January to June each year

The data for 2022 ends in June, so I want to see how the number of rides in the first six months of each year compares.

month_year |> 
  filter(month <= 6) |> 
  group_by(year) |> 
  summarise(rides = n()) |> 
  arrange(desc(rides))

2020 had the fewest rides during this time frame, and 2018 and 2022 had the most.

Old Lede

MetroBike rides from UT Austin student membership holders more than doubled after Lime scooters appeared in the city in 2018, according to an analysis of data from the City of Austin open data portal.

Export data

month_year |> write_rds("data-processed/02-clean-bike.rds")
rides_since_2017 |> write_rds("data-processed/02-rides-since.rds")