The Seattle Light Rail system plays a major role in transportation by connecting commuters to neighborhoods, attractions, and the airport. This project uses network analysis to explore which stations are most central to the systemās connectivity. By identifying key stations and visualizing their connections, I can better understand the strengths of the system and areas where it could be improved.
library(tidyverse)
## āā Attaching core tidyverse packages āāāāāāāāāāāāāāāāāāāāāāāā tidyverse 2.0.0 āā
## ā dplyr 1.1.4 ā readr 2.1.5
## ā forcats 1.0.0 ā stringr 1.5.1
## ā ggplot2 3.5.1 ā tibble 3.2.1
## ā lubridate 1.9.4 ā tidyr 1.3.1
## ā purrr 1.0.4
## āā Conflicts āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā tidyverse_conflicts() āā
## ā dplyr::filter() masks stats::filter()
## ā dplyr::lag() masks stats::lag()
## ā¹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(igraph)
##
## Attaching package: 'igraph'
##
## The following objects are masked from 'package:lubridate':
##
## %--%, union
##
## The following objects are masked from 'package:dplyr':
##
## as_data_frame, groups, union
##
## The following objects are masked from 'package:purrr':
##
## compose, simplify
##
## The following object is masked from 'package:tidyr':
##
## crossing
##
## The following object is masked from 'package:tibble':
##
## as_data_frame
##
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
##
## The following object is masked from 'package:base':
##
## union
library(scales)
##
## Attaching package: 'scales'
##
## The following object is masked from 'package:purrr':
##
## discard
##
## The following object is masked from 'package:readr':
##
## col_factor
# Read in the data
stations <- read_csv("Seattle_Transportation_Plan_Transit_Element.csv")
## Rows: 58 Columns: 16
## āā Column specification āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
## Delimiter: ","
## chr (10): STATUS, NAME, STATION, SEGMENT, BOARD, STATION_DE, REGIONAL_T, EDI...
## dbl (6): OBJECTID_1, LINK_TYPE, CONTRACT, New, x, y
##
## ā¹ Use `spec()` to retrieve the full column specification for this data.
## ā¹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Heres the link for my file i downloaded it as a csv #https://data.seattle.gov/dataset/Light-Rail-Stations/nng6-bzbm/about_data
nodes <- stations %>%
distinct(NAME, x, y) %>%
rename(name = NAME)
# Create edge list
ordered_stations <- nodes$name[order(nodes$y, decreasing = TRUE)]
edges <- data.frame(
from = head(ordered_stations, -1),
to = tail(ordered_stations, -1)
)
# Build the graph
g <- graph_from_data_frame(edges, vertices = nodes, directed = FALSE)
layout_coords <- layout_with_kk(g)
deg <- degree(g)
sizes <- rescale(deg, to = c(6, 18))
top_stations <- names(sort(deg, decreasing = TRUE))[1:5]
colors <- ifelse(V(g)$name %in% top_stations, "red", "skyblue")
labels <- ifelse(V(g)$name %in% top_stations, V(g)$name, "")
stations <- read_csv("Seattle_Transportation_Plan_Transit_Element.csv")
## Rows: 58 Columns: 16
## āā Column specification āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
## Delimiter: ","
## chr (10): STATUS, NAME, STATION, SEGMENT, BOARD, STATION_DE, REGIONAL_T, EDI...
## dbl (6): OBJECTID_1, LINK_TYPE, CONTRACT, New, x, y
##
## ā¹ Use `spec()` to retrieve the full column specification for this data.
## ā¹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(stations)
## # A tibble: 6 Ć 16
## OBJECTID_1 STATUS NAME STATION SEGMENT BOARD STATION_DE LINK_TYPE REGIONAL_T
## <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
## 1 1 Existi⦠NE 1⦠NE 145⦠Lynnwo⦠Appr⦠Board App⦠0 <NA>
## 2 2 Future NE 1⦠NE 130⦠Lynnwo⦠Prop⦠Proposed ⦠0 <NA>
## 3 3 Future Shor⦠Shorel⦠Lynnwo⦠Appr⦠Board App⦠0 <NA>
## 4 4 Future Moun⦠Mountl⦠Lynnwo⦠Appr⦠Board App⦠0 <NA>
## 5 5 Future Lynn⦠Lynnwo⦠Lynnwo⦠Appr⦠Board App⦠0 <NA>
## 6 6 Future 220t⦠220th ⦠Lynnwo⦠Prop⦠Proposed ⦠0 <NA>
## # ā¹ 7 more variables: CONTRACT <dbl>, EDITDATE <chr>, WEBSITE <chr>,
## # EDIT_FLAG <chr>, New <dbl>, x <dbl>, y <dbl>
library(scales)
layout_coords <- layout_with_kk(g)
deg <- degree(g)
deg[is.na(deg)] <- 0
sizes <- rescale(deg, to = c(6, 18))
top_stations <- names(sort(deg, decreasing = TRUE))[1:5]
colors <- ifelse(V(g)$name %in% top_stations, "red", "skyblue")
labels <- ifelse(V(g)$name %in% top_stations, V(g)$name, "")
# Plot
plot(g,
layout = layout_coords,
vertex.size = sizes,
vertex.color = colors,
vertex.label = labels,
vertex.label.cex = 0.85,
vertex.label.color = "black",
vertex.frame.color = "gray50",
edge.color = "gray80",
edge.width = 0.6,
main = "Seattle Light Rail Network: Key Stations Highlighted")
# Layout 1
plot(g,
layout = layout_with_graphopt(g),
vertex.size = sizes,
vertex.color = colors,
vertex.label = labels,
vertex.label.cex = 0.75,
vertex.label.color = "black",
edge.color = "gray85",
main = "Seattle Light Rail: Centrality Structure")
plot(g,
layout = layout_in_circle(g),
vertex.color = "blue",
vertex.frame.color = "gray",
vertex.label = V(g)$name,
vertex.label.cex = 0.7,
vertex.label.color = "black",
vertex.size = 8,
edge.color = "pink",
edge.width = 1.2,
main = "Seattle Light Rail: Station-to-Station Connections")
# Layout 2
plot(g,
layout = layout_with_fr(g),
vertex.size = sizes,
vertex.color = colors,
vertex.label = labels,
vertex.label.cex = 0.75,
vertex.label.color = "black",
edge.color = "white",
main = "Seattle Light Rail: Force-Directed Layout")
# Layout 3
top10 <- names(sort(deg, decreasing = TRUE))[1:10]
V(g)$color <- ifelse(V(g)$name %in% top10, "pink", "blue")
V(g)$size <- ifelse(V(g)$name %in% top10, 15, 5)
V(g)$label <- ifelse(V(g)$name %in% top10, V(g)$name, NA)
plot(g,
layout = layout_with_fr(g),
vertex.color = V(g)$color,
vertex.size = V(g)$size,
vertex.label = V(g)$label,
vertex.label.cex = 0.8,
vertex.label.color = "black",
main = "Top 10 Most Central Stations")
This project used network analysis to explore the structure and connectivity of Seattleās Light Rail system. By visualizing the network and applying centrality measures, I was able to identify the most influential stations in terms of their direct connections to others. Stations like Westlake, University Street, and SeaTac/Airport emerged as key hubs that serve as the backbone of the transit system.
The multiple network layouts helped me view the system from different perspectives, geographic, structural, and centrality-based. The subgraph focusing on the top 10 stations provided a focused visualization of the busiest nodes and their immediate neighbors.