Introduction

This tutorial explains how to unzip .kmz files and import the .kml files inside them with the readOGR function from the rgdal package, in order to add CTA ‘L’ train lines and bus routes to a map of the City of Chicago.

CTA Open Data

The City of Chicago Open Data Portal hosts a list of publicly available datasets from the Chicago Transit Authority (CTA).

We’ll import the following five datasets:

  • CSV file of CTA System Information - List of ‘L’ Stops: This list of ‘L’ stops provides location and basic service availability information for each place on the CTA system where a train stops, along with formal station names and stop descriptions.

  • KML file of CTA ‘L’ (Rail) Lines: Lines representing approximately where the CTA rail lines are.

  • KML file of CTA ‘L’ (Rail) Stations: Point data representing approximate location of Station head house. (Not necessarily where an entrance to station would be.)

  • KML file of CTA Bus Stops: Point data representing over 11,000 CTA bus stops. The Stop ID is used to get Bus Tracker information.

  • KML file of CTA Bus Routes: Line data representing CTA Bus Routes. Source data are NAVTEQ street centerlines.

Import CSV into R

To import a CSV file from the City of Chicago Open Data Portal into R, do the following:

  1. Click the Download Button on the top-right of the webpage.

  2. Hover your mouse on CSV, right-click, and click Copy Link Address.

  3. Pass the link as a character string to the file argument of the read.csv function, setting header to TRUE and stringsAsFactors to FALSE.
# import CSV
ctaLinfo <- read.csv( file = "https://data.cityofchicago.org/api/views/8pix-ypme/rows.csv?accessType=DOWNLOAD"
                      , header = TRUE
                      , stringsAsFactors = FALSE
                      )
# peek inside
str( ctaLinfo )
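
To see what the header and stringsAsFactors arguments change, here is a minimal sketch that reads a tiny inline CSV instead of the portal file. The column names and station values are made up for illustration and are not taken from the CTA dataset.

```r
# Hypothetical two-row sample; column names and values are illustrative only
sampleCsv <- paste( c( "STOP_ID,STATION_NAME"
                       , "1,Example Station A"
                       , "2,Example Station B"
                       )
                    , collapse = "\n"
                    )

# header = TRUE treats the first row as column names;
# stringsAsFactors = FALSE keeps text columns as character vectors
asText <- read.csv( text = sampleCsv
                    , header = TRUE
                    , stringsAsFactors = FALSE
                    )

# same data, but text columns are converted to factors
asFactor <- read.csv( text = sampleCsv
                      , header = TRUE
                      , stringsAsFactors = TRUE
                      )

# compare how the text column is stored in each case
class( asText$STATION_NAME )
class( asFactor$STATION_NAME )
```

Keeping text columns as character vectors is usually more convenient when you later filter or match on station names.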

Welcome to KMZ

KMZ files are zipped KML (Keyhole Markup Language) files with a .kmz extension. A KMZ file contains a single root KML document, commonly named “doc.kml”. The line and point data for CTA ‘L’ trains and buses are stored as KMZ files, so we must unzip each file before using the KML file inside.

Import KMZ into R

To import a KMZ file from the City of Chicago Open Data Portal into R, do the following:

  1. Hover your mouse on Download, right-click, and click Copy Link Address.
  2. Paste the link into the url argument of the download.file function. Be sure to name the zip file in the destfile argument; you’ll reuse that exact character string when unzipping the file. Whatever name you give the zip file, the file you extract from it is always “doc.kml”, so each unzip call overwrites the previous “doc.kml” in the working directory. Read each one with readOGR before unzipping the next archive.
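# load the package that provides readOGR
library( rgdal )

###############################
# download CTA 'L' Rail Lines #
###############################
download.file( url = "https://data.cityofchicago.org/download/sgbp-qafc/application%2Fzip"
               , destfile = "CTA_RailLines.zip"
               , mode = "wb" # binary mode keeps the zip intact on Windows
               )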
# unzip file
unzip( "CTA_RailLines.zip")

# read data
ctaLines <- readOGR( dsn = paste( getwd()
                                  , "doc.kml"
                                  , sep = "/"
                                  )
                , stringsAsFactors = FALSE
                )
##################################
# download CTA 'L' Rail Stations #
##################################
download.file( url = "https://data.cityofchicago.org/download/4qtv-9w43/application%2Fzip"
               , destfile = "CTA_RailStations.zip"
               , mode = "wb" # binary mode keeps the zip intact on Windows
               )
# unzip file
unzip( "CTA_RailStations.zip")

# read data
ctaLineStations <- readOGR( dsn = paste( getwd()
                                  , "doc.kml"
                                  , sep = "/"
                                  )
                , stringsAsFactors = FALSE
                )
###############################
# download CTA Bus Routes #####
###############################
download.file( url = "https://data.cityofchicago.org/download/rytz-fq6y/application%2Fzip"
               , destfile = "CTA_ROUTES.zip"
               , mode = "wb" # binary mode keeps the zip intact on Windows
               )

# unzip file
unzip( "CTA_ROUTES.zip")

# read data
ctaBusRoutes <- readOGR( dsn = paste( getwd()
                                  , "doc.kml"
                                  , sep = "/"
                                  )
                , stringsAsFactors = FALSE
                )
###############################
# download CTA Bus Stops ######
###############################
download.file( url = "https://data.cityofchicago.org/download/84eu-buny/application%2Fzip"
               , destfile = "CTA_BusStops.zip"
               , mode = "wb" # binary mode keeps the zip intact on Windows
               )

# unzip file
unzip( "CTA_BusStops.zip" )

# read data
ctaBusStations <- readOGR( dsn = paste( getwd()
                                  , "doc.kml"
                                  , sep = "/"
                                  )
                , stringsAsFactors = FALSE
                )
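
The four download blocks above repeat the same three steps, and every archive unpacks a file called “doc.kml” into the working directory. One way to avoid the repetition and the overwriting is sketched below. The helper name fetchKml and its arguments are hypothetical and not part of the original script; it uses only base R functions (download.file, unzip, file.path) and extracts each archive into its own folder through the exdir argument of unzip.

```r
# Hypothetical helper (not part of the original script): download a zipped
# KML, extract it into its own folder, and return the path to doc.kml.
fetchKml <- function( url, name, dir = getwd() ) {
  zipPath <- file.path( dir, paste0( name, ".zip" ) )
  outDir  <- file.path( dir, name )
  # mode = "wb" keeps the binary archive intact on Windows
  download.file( url = url, destfile = zipPath, mode = "wb" )
  # exdir gives each archive its own folder, so doc.kml files never collide
  unzip( zipPath, exdir = outDir )
  kmlPath <- file.path( outDir, "doc.kml" )
  if ( !file.exists( kmlPath ) ) {
    stop( "doc.kml not found after unzipping ", zipPath )
  }
  kmlPath
}

# URLs copied from the download steps above, named after their zip files
ctaUrls <- c( CTA_RailLines    = "https://data.cityofchicago.org/download/sgbp-qafc/application%2Fzip"
              , CTA_RailStations = "https://data.cityofchicago.org/download/4qtv-9w43/application%2Fzip"
              , CTA_ROUTES       = "https://data.cityofchicago.org/download/rytz-fq6y/application%2Fzip"
              , CTA_BusStops     = "https://data.cityofchicago.org/download/84eu-buny/application%2Fzip"
              )

# Usage (needs network access and the rgdal package, so it is left commented):
# kmlPaths <- mapply( fetchKml, ctaUrls, names( ctaUrls ) )
# ctaLines <- rgdal::readOGR( dsn = kmlPaths[[ "CTA_RailLines" ]]
#                             , stringsAsFactors = FALSE
#                             )
```

Because each “doc.kml” lives in a separate folder, the order in which you read the files no longer matters.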

Plotting CTA ‘L’ Rail Lines and Stations

Plotting the CTA ‘L’ rail lines and stations requires the use of four functions:

  • par: used to minimize the white space around the plot and to give it a black background color.

  • plot: used to display the spatial polygons data frame of the City of Chicago Community Areas (comarea606, which this script assumes is already loaded in your R session). The script “hides” the borders of the community areas by drawing them in the same color as the polygon fill.

  • lines: used to display the spatial lines data frame of the CTA ‘L’ rail lines.

  • points: used to display the spatial points data frame of the CTA ‘L’ rail line stations.
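
The layering order matters: each call draws on top of what is already on the device. The toy sketch below shows the same par, plot, lines, points sequence with made-up coordinates, so it runs without any of the CTA data. The coordinates, the output file name, and the use of polygon() to stand in for the community areas are all assumptions made for illustration.

```r
# Toy coordinates, not real Chicago geometry
outFile <- file.path( tempdir(), "layering_sketch.pdf" )

pdf( file = outFile, width = 8, height = 11 )

# background color and margins must be set before the first plot call
par( mar = c( 0, 0, 4, 0 ), bg = "#000000" )

# 1. base layer: an empty canvas, then a polygon whose border matches its fill
plot( x = c( 0, 10 ), y = c( 0, 10 )
      , type = "n", axes = FALSE, xlab = "", ylab = "", asp = 1
      )
polygon( x = c( 1, 9, 9, 1 ), y = c( 1, 1, 9, 9 )
         , col = "#B3DDF2", border = "#B3DDF2"
         )

# 2. line layer drawn on top of the polygon
lines( x = c( 2, 5, 8 ), y = c( 2, 6, 8 ), col = "#00A1DE", lwd = 10 )

# 3. point layer drawn last so the markers sit on top of the line
points( x = c( 2, 5, 8 ), y = c( 2, 6, 8 ), col = "#FFFFFF", pch = 20, cex = 1 )

# close the device so the PDF is written to disk
invisible( dev.off() )
```

If the points were drawn before the lines, the thick line would cover them, which is why the script that follows adds the stations last.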

# save as pdf
pdf( file = "CTA_L_RailLines_Stations_2017-08-19.pdf"
     , width = 8
     , height = 11
     )
# clear margin white space
par( mar = c(0, 0, 4, 0 )
     , bg = "#000000"
     )
# plot community areas
plot( comarea606
      #, main = "City of Chicago 77 Community Areas"
      , col = "#B3DDF2"
      , border = "#B3DDF2"
      )

# add Blue Line
lines( ctaLines[ ctaLines$Name == "Blue Line (Forest Park)" |
                   ctaLines$Name == "Blue Line (O'Hare)"
                 , ]
       , col = "#00A1DE"
       , lwd = 10 # make line thicker
       )

# plot Red Line
lines( ctaLines[ ctaLines$Name == "Red, Purple Line" |
                   ctaLines$Name == "Brown, Purple (Express), Red" |
                   ctaLines$Name == "Red Line"
                 , ]
       , col = "#C60C30"
       , lwd = 10
       )
# plot Green Line
lines( ctaLines[ ctaLines$Name == "Green, Pink" |
                   ctaLines$Name == "Green Line" |
                   ctaLines$Name == "Green, Orange" |
                   ctaLines$Name == "Brown, Green, Orange, Pink, Purple (Exp)"
                 , ]
       , col = "#009B3A"
       , lwd = 8
       )
# plot Yellow Line
lines( ctaLines[ ctaLines$Name == "Yellow Line"
                 , ]
       , col = "#F9E300"
       , lwd = 10
       )
# plot Purple Line
lines( ctaLines[ ctaLines$Name == "Red, Purple Line" |
                   ctaLines$Name == "Brown, Purple (Express), Red" |
                   ctaLines$Name == "Brown, Green, Orange, Pink, Purple (Exp)" |
                   ctaLines$Name == "Purple Line" |
                   ctaLines$Name == "Brown, Purple" |
                   ctaLines$Name == "Brown, Orange, Pink, Purple (Express)"
                 , ]
       , col = "#522398"
       , lwd = 10
       )
# plot Orange Line
lines( ctaLines[ ctaLines$Name == "Brown, Green, Orange, Pink, Purple (Exp)" |
                   ctaLines$Name == "Brown, Orange, Pink, Purple (Express)" |
                   ctaLines$Name == "Green, Orange" |
                   ctaLines$Name == "Orange Line"
                   , ]
       , col = "#F9461C"
       , lwd = 10
       )
# plot Brown Line
lines( ctaLines[ ctaLines$Name == "Brown, Purple (Express), Red" |
                   ctaLines$Name == "Brown, Green, Orange, Pink, Purple (Exp)" |
                   ctaLines$Name == "Brown, Purple" |
                   ctaLines$Name == "Brown, Orange, Pink, Purple (Express)" |
                   ctaLines$Name == "Brown Line"
                 , ]
       , col = "#62361B"
       , lwd = 10
       )
# plot Pink Line
lines( ctaLines[ ctaLines$Name == "Brown, Green, Orange, Pink, Purple (Exp)" |
                   ctaLines$Name == "Brown, Orange, Pink, Purple (Express)" |
                   ctaLines$Name == "Green, Pink" |
                   ctaLines$Name == "Pink Line"
                   , ]
       , col = "#E27EA6"
       , lwd = 10
       )
# add CTA 'L' Station points
points( ctaLineStations
        , col = "#FFFFFF"
        , pch = 20
        , cex = 1
        )
# shut down graphing device
dev.off()
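
Each subset above lists every segment name that mentions a given line. The same selection can be sketched with grepl, which matches any segment whose Name contains the line's color. The vector lineNames below holds only the names that appear in the script above; the real dataset may contain others, so treat this as an illustration rather than a drop-in replacement. The names segmentsFor and lineColors are hypothetical.

```r
# Segment names as they appear in the plotting script above
lineNames <- c( "Blue Line (Forest Park)", "Blue Line (O'Hare)"
                , "Red, Purple Line", "Brown, Purple (Express), Red", "Red Line"
                , "Green, Pink", "Green Line", "Green, Orange"
                , "Brown, Green, Orange, Pink, Purple (Exp)"
                , "Yellow Line", "Purple Line", "Brown, Purple"
                , "Brown, Orange, Pink, Purple (Express)"
                , "Orange Line", "Brown Line", "Pink Line"
                )

# Colors copied from the script above, in the same drawing order
lineColors <- c( Blue = "#00A1DE", Red = "#C60C30", Green = "#009B3A"
                 , Yellow = "#F9E300", Purple = "#522398", Orange = "#F9461C"
                 , Brown = "#62361B", Pink = "#E27EA6"
                 )

# Hypothetical helper: TRUE for every segment whose name mentions the line
segmentsFor <- function( lineName, segmentNames ) {
  grepl( lineName, segmentNames, fixed = TRUE )
}

# segments shared by or exclusive to the Red Line
lineNames[ segmentsFor( "Red", lineNames ) ]

# Usage with the imported data (assumes ctaLines and an open plot):
# for ( nm in names( lineColors ) ) {
#   lines( ctaLines[ segmentsFor( nm, ctaLines$Name ), ]
#          , col = lineColors[[ nm ]]
#          , lwd = 10
#          )
# }
```

As in the original script, shared segments are painted several times, so the color of the line drawn last is the one that shows. Note that the loop uses one width for every line, whereas the script above draws the Green Line with lwd = 8.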

Plotting CTA Bus Routes and Stations

Plotting CTA bus routes and stations requires the same four functions that were used to plot CTA ‘L’ lines and stations.

# save as PDF
pdf( file = "CTA_Bus_Routes_Stops_2017-08-19.pdf"
     , width = 9
     , height = 12
     )
# clear margin white space
par( mar = c(0, 0, 4, 0 )
     , bg = "#000000"
     )
# plot community areas
plot( comarea606
      #, main = "City of Chicago 77 Community Areas"
      , col = "#B3DDF2"
      , border = "#B3DDF2"
      )
# add CTA Bus Routes
lines( ctaBusRoutes
       , col = "#FFFFFF"
       , lwd = 4
       )
# add CTA Bus Stops
points( ctaBusStations
        , col = "#FF0000"
        , pch = 20
        , cex = 0.6
        )
# turn graphic device off
dev.off()