OVERVIEW

R is an incredibly powerful tool for data analysis. As an analytical platform, it has benefited from the decision to make the software free, open source, and easily accessible. Consequently, it has been embraced by the academic and data science communities as a useful way to develop and share new software, and people have written thousands of programs for R that support specialized or advanced analysis. The core R program is best understood as an operating system designed to run programs, called ‘packages’, written by members of the R community. This has allowed the platform to evolve rapidly through libraries of user-created tools that extend far beyond the original intent of the software.

Although R was created primarily for statistical analysis, it has become a powerful engine for mapping and geographic analysis. This vignette is designed to introduce the reader to a few of the hundreds of spatial packages in R to give a sense of the scope and power of the available tools.

The data and markdown files for this vignette can be found on GitHub.



CHOROPLETH MAPS IN R

Maps are most powerful when they are combined with data to reveal geographic patterns. Choropleth maps are one of the most common conventions for the visual display of data: they summarize data across geographic units like states or counties, then use color to represent discrete levels of a continuous variable or groups of a categorical one.



THE MAPS PACKAGE

Since choropleth maps group data by geographic units, they require maps that are divided into polygons representing things like regions, states, or municipal borders. The US Census Bureau provides a wide array of these maps, called TIGER “shapefiles”. They can be downloaded from the Census website and manipulated in GIS programs to create the desired choropleth maps.
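These shapefiles can also be pulled directly from within R. Here is a minimal sketch, assuming the tigris package (not used elsewhere in this vignette), which downloads TIGER files straight from the Census servers:

library( tigris )  # install.packages( "tigris" ) if needed

# download the cartographic boundary file for New York counties
ny <- counties( state="NY", cb=TRUE )

plot( ny )  # recent tigris versions return an sf object; plot( sf::st_geometry(ny) ) draws just the outlines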

R’s maps package provides current shapefiles for US states and counties, and the mapproj package allows us to project those maps in a variety of styles. Any R package can be installed and loaded with the following commands:

install.packages( "mapproj" )
library( "mapproj" )

The maps package may already be included with your R distribution; if not, it is installed the same way. The mapproj package adds the map projections used throughout this vignette.
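As a quick sketch of what mapproj enables (the same projection arguments are used in Step 5 below):

library( maps )
library( mapproj )

# an Albers equal-area projection of the lower 48 states
map( 'state', proj='albers', par=c(30,40), col="gray50" )
title( main="Albers Projection" )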

Step 1 - Select a Basemap

Creating basic maps in R takes only a couple of lines of code. Without any data, however, these maps will look fairly plain.

library( maps )
library( mapproj )


map( 'county', fill=F, col="gray50" )
title( main="US Counties" )



Step 2 - Grab Some Data

Choropleth maps in the social sciences often represent demographic data, which frequently comes from the Census. It can be quite tedious to use the Census website to search for variables, download a dataset as a CSV file, read it into our program, clean the headers, and then merge the data.

Fortunately R makes this process a little more palatable through the use of APIs. A data API is an easy way to query a database through a website address. Your query parameters are sent through the web address, and your data is printed directly in your browser or returned as a file download. When we call the API through R, we can read the dataset directly into the program.

The API call will look odd at first, but once you get used to the convention you will realize that it is basically:

API Web Address + Variable Name + Unit of Analysis

For example, we can use the DataUSA API, which defines a query as follows:

  • API Web Address: http://api.datausa.io/api/csv/?show=geo
  • Variable Name: required=adult_smoking
  • Unit of Analysis: sumlevel=county

The database query would then be: http://api.datausa.io/api/csv/?show=geo&sumlevel=county&required=adult_smoking

We can read the data directly into R as follows:

smoking <- read.csv( "http://api.datausa.io/api/csv/?show=geo&sumlevel=county&required=adult_smoking" )

For this map, let’s grab a few variables related to public health: smoking, obesity, drunk driving deaths, and Medicare reimbursement rates. Documentation for the API is available here.

### COUNTY HEALTH DATA

library( dplyr )   # for the %>% pipe and select()
library( pander )  # for formatted tables

vars <- "adult_smoking,violent_crime,adult_obesity,alcoholimpaired_driving_deaths,health_care_costs&year=latest"
api <- paste( "http://api.datausa.io/api/csv/?show=geo&sumlevel=county&required=", vars, sep="" )
api %>% read.csv() %>% select(1:7) %>% head() %>% pander
year   geo_name             geo            adult_smoking   violent_crime   adult_obesity   alcoholimpaired_driving_deaths
2016   Autauga County       05000US01001   0.187           253.64          0.309           0.318
2016   Baldwin County, AL   05000US01003   0.186           220.67          0.267           0.394
2016   Barbour County       05000US01005   0.214           146.89          0.408           0.391
2016   Bibb County          05000US01007   0.21            235.95          0.401           0.344
2016   Blount County        05000US01009   0.194           219.03          0.324           0.192
2016   Bullock County       05000US01011   0.223           283.35          0.445           0.44
dat <- read.csv( api )
dat$fips <- as.numeric( substr( dat$geo, 8, 12 ) )

data( county.fips )
county.fips <- merge( county.fips, dat, by="fips", sort=FALSE )



Step 3 - Define a Color Scheme

We select colors to communicate information about our data. If we are using a continuous variable the most basic decision is whether we want to represent it as positive and negative deviations from the average (a divergent scale), or as a continuum of low to high values (a sequential scale). If we have categorical data, it is generally visualized through different colors representing each group (a qualitative scale).



R has a variety of functions that allow the user to create color scales easily. These functions generally require you to specify a color on each end of the spectrum and they will interpolate the values between based upon how many levels you desire.

plot( 1:7, rep(1,7), ylim=c(-0.5,3.5), xlim=c(0,12), yaxt="n", xaxt="n", bty="n", xlab="", ylab=""  )

color.function <- colorRampPalette( c("gray80","darkred") )
col.ramp <- color.function( 7 ) # number of groups you desire
points( 1:7, rep(3,7), pch=15, cex=8, col=col.ramp )

color.function <- colorRampPalette( c("darkred","gray80","steelblue") )
col.ramp <- color.function( 7 ) # number of groups you desire
points( 1:7, rep(2,7), pch=15, cex=8, col=col.ramp )

color.function <- colorRampPalette( c("gray80","black") )
col.ramp <- color.function( 7 ) # number of groups you desire
points( 1:7, rep(1,7), pch=15, cex=8, col=col.ramp )

text( 8, 3, "Sequential", pos=4 )
text( 8, 2, "Divergent", pos=4 )
text( 8, 1, "Grayscale", pos=4 )



For this analysis let’s use a sequential scale with five levels:

color.function <- colorRampPalette( c("gray80","navyblue") )

col.ramp <- color.function( 5 ) # number of groups you desire

plot( 1:5, rep(1,5), ylim=c(-5,5), xlim=c(0,6), yaxt="n", 
      xaxt="n", bty="n", xlab="", ylab=""  )

points( 1:5, rep(1,5), pch=15, cex=10, col=col.ramp )



Step 4 - Split the Data into Levels

Choropleth maps represent levels of data as discrete colors. Deciding how to split a continuous variable into discrete categories is not trivial, and the choice can have a big impact on how we visually interpret the results. For example, if we create bins of equal width, each bin will capture a different amount of data: bins near the center of the distribution will gather a large proportion of the observations, while bins near the extremes will capture only a few. Conversely, we can define the bins so that each contains a similar proportion of the data. The implications of these two choices for group size look something like this:

histgm <- hist( rnorm(10000000), breaks=1000, plot=FALSE )
d <- histgm$breaks
equal.dist <- ifelse( d < -1.8 | d > -0.6 & d < 0.6 | d > 1.8, "red", "gray")
equal.data <- ifelse( d < -0.84 | d > -0.25 & d < 0.25 | d > 0.84, "red", "gray" )

par( mfrow=c(1,2) )
plot( histgm, col=equal.dist, main="Equally Sized Bins", border=NA, xlab="", yaxt="n", ylab="" )
plot( histgm, col=equal.data, main="Equally Sized Groups", border=NA, xlab="", yaxt="n", ylab="" )

To see how this impacts our understanding of the data, consider a choropleth map representing the average age of each county. A helpful blog post describes how three different rules for splitting the data lead to widely different interpretations of the distribution of age across the country. The three maps present exactly the same data, split into groups using different rules.



Because the resulting map is so sensitive to the split rules, we need to be careful not to mislead our audience. A variety of functions are available to split the data judiciously according to the underlying distribution and the patterns we wish to communicate. In this case we will split the data into quantiles - groups with an equal number of observations. With five groups, each level will represent approximately 20% of the data.

smoking <- as.character( cut( rank(county.fips$adult_smoking), breaks=5, labels=col.ramp ) )

obesity <- as.character( cut( rank(county.fips$adult_obesity), breaks=5, labels=col.ramp ) )

alcohol <- as.character( cut( rank(county.fips$alcoholimpaired_driving_deaths), breaks=5, labels=col.ramp ) )

costs   <- as.character( cut( rank(county.fips$health_care_costs), breaks=5, labels=col.ramp ) )
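The rank() trick above produces roughly equal-sized groups because ranks are uniformly distributed. An equivalent sketch, using explicit quantile break points on the same data and color ramp:

# break points at the 0th, 20th, ..., 100th percentiles
breaks <- quantile( county.fips$adult_smoking, probs=seq(0,1,0.2), na.rm=TRUE )

# assign each county to a quintile, then look up the matching color
groups <- cut( county.fips$adult_smoking, breaks=breaks, include.lowest=TRUE )
smoking2 <- col.ramp[ as.numeric( groups ) ]

table( groups )  # each group holds roughly 20% of the counties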



Step 5 - Draw the Map

We have some data. We have a color scale. We have split the data into groups. We are now ready to create the map. Let’s use an Albers projection instead of the default rectangular projection.

map( database="county", col=smoking, fill=T, border="gray80", lwd=0.001, proj='albers', par=c(30,40)  )
map( database="state", fill=F, col="white", lwd=1.5, add=TRUE, proj='albers', par=c(30,40)  )

title( main="Smoking Prevalence in 2015")

map.scale( metric=F, ratio=T, relwidth=0.15, cex=0.5 )

legend.text <- c( " 7-15 %", "15-17 %", "17-19 %", "19-21 %", "21-42 %" )

legend( "bottomright", bg="white",
        pch=19, pt.cex=1.5, cex=0.7,
        legend=legend.text, 
        col=col.ramp, 
        box.col="white",
        title="Smoking Rate" 
       )

If we want to examine some trends in health patterns across the country we can plot multiple maps together. In this case, we can see that there is a greater prevalence of smoking and obesity in the South, but drunk driving seems to occur at equal rates across the country. Medicare reimbursement rates are also lower in the South than in other parts of the country.

par( mfrow=c(2,2), mar=c(0,0,2,0), oma=c(0,0,0,0) )

map( database="county", col=smoking, fill=T, lty=0, proj='albers', par=c(30,40) )
map( database="state", fill=F, col="white", lwd=1.5, add=TRUE, proj='albers', par=c(30,40)  )
title( main="Smoking")


map( database="county", col=obesity, fill=T, lty=0, proj='albers', par=c(30,40) )
map( database="state", fill=F, col="white", lwd=1.5, add=TRUE, proj='albers', par=c(30,40)  )
title( main="Obesity")


map( database="county", col=alcohol, fill=T, lty=0, proj='albers', par=c(30,40) )
map( database="state", fill=F, col="white", lwd=1.5, add=TRUE, proj='albers', par=c(30,40)  )
title( main="Drunk Driving")


map( database="county", col=costs, fill=T, lty=0, proj='albers', par=c(30,40) )
map( database="state", fill=F, col="white", lwd=1.5, add=TRUE, proj='albers', par=c(30,40)  )
title( main="Medicare Rates")



THE PLOTLY PACKAGE

The maps package in R produces high-quality static graphics suitable for publications or reports. If our analysis will be shared on websites, however, we have additional options. A growing number of packages use JavaScript to create interactive graphics with features like zooming and hover-over fields. The plotly package offers many of these features.

library(plotly)

df <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/2011_us_ag_exports.csv")

df$hover <- with(df, paste(state, '<br>', "Beef", beef, "Dairy", dairy, "<br>",
                           "Fruits", total.fruits, "Veggies", total.veggies,
                           "<br>", "Wheat", wheat, "Corn", corn))

# give state boundaries a white border
l <- list(color = toRGB("white"), width = 2)

# specify some map projection/options
g <- list(
  scope = 'usa',
  projection = list(type = 'albers usa'),
  showlakes = TRUE,
  lakecolor = toRGB('white')
)

# render the plot
plot_ly(df, z = df$total.exports, text = df$hover, locations = df$code, type = 'choropleth',
        locationmode = 'USA-states', color = df$total.exports, colors = 'Purples',
        marker = list(line = l), colorbar = list(title = "Millions USD")) %>%
  layout(title = '2011 US Agriculture Exports by State<br>(Hover for breakdown)', geo = g,
         width = 800, height = 500 )



LEAFLET MAPS

For a long time the best base maps in the world were proprietary intellectual property, created by GIS companies that made them available only if you licensed their software for thousands of dollars. Base maps have since been liberated through collaborative open-source efforts like OpenStreetMap, a platform “built by a community of mappers that contribute and maintain data about roads, trails, railway stations, and much more, all over the world.” Other companies like Google have created APIs that allow people to build new tools on top of their map platforms.

The availability of these resources has created a thriving open-source GIS community that has contributed a number of new tools to R. The Leaflet package allows users to select from a wide variety of base map styles, and use them to create interactive web applications.

Let’s look at an example of how we can use the Leaflet package to communicate patterns in municipal code violations to citizens in Syracuse, or to build maps internally within the Code Enforcement Department to better manage resources. Let’s start with a map of the city. Note that you can zoom and scroll through the map.

library( leaflet )
library( dplyr )

# conjure a map of syracuse

leaflet( ) %>% addTiles() %>% setView( lng=-76.14, lat=43.05, zoom=12 )

Map Styles

If we are not satisfied with the default style of the base map, we have a huge array of map styles to choose from. Changing a single argument in the code will change the overall feel of the map.

leaflet( ) %>% addProviderTiles( "Stamen.Watercolor" )  %>% setView( lng=-76.14, lat=43.05, zoom=12 ) 
leaflet( ) %>% addProviderTiles( "OpenStreetMap.HOT" )  %>% setView( lng=-76.14, lat=43.05, zoom=12 ) 
leaflet( ) %>% addProviderTiles( "Esri.WorldImagery" )  %>% setView( lng=-76.14, lat=43.05, zoom=12 ) 

Add Data

Leaflet maps are pretty, but to be informative we need data. Let’s look at the spatial distribution of residential code violations throughout Syracuse. These data are useful for identifying clusters of problematic dwellings within the city, often the result of unscrupulous landlords. To examine this question we will load the code violations data generated by city inspectors and add one blue dot to the map for each property that was issued one or more code violations.

# read data on code violations in syracuse
violations <- read.csv( "https://raw.githubusercontent.com/lecy/maps-in-R/master/Data/OpenCodeViolations.csv" )

syr.map <- leaflet( ) %>% 
            addProviderTiles( "CartoDB.Positron" )  %>%
            setView( lng=-76.17, lat=43.037, zoom=15 )

addCircles( syr.map, lng=violations$lon, lat=violations$lat )



Patterns in where violations occur are helpful, but if we are trying to effectively enforce violations we need some context to make the data more actionable. Let’s size each violation according to severity, operationalized as the number of infractions identified by a code inspector on a single visit. Let’s also highlight the instances where inspectors find bed bugs since they are easily spread if not remediated quickly. Let’s differentiate those cases by coloring them orange.

We can also attach additional information that is displayed when you click on a specific violation. This might include things like the address of the property, the date of the violation, or the owner’s name. In this case we will list the types of infractions recorded.

# highlight bed bugs in orange

col.vec <- ifelse( grepl("Bed Bugs", violations$Complaint.Type, ignore.case=TRUE), "orange", "black" )
opac <- ifelse( grepl("Bed Bugs", violations$Complaint.Type, ignore.case=TRUE), 1, 0.4 )
  
# scale points by severity of code violations

addCircles( syr.map, lng = violations$lon, lat = violations$lat, popup = violations$Complaint.Type, 
            radius=violations$Severity, fillColor=col.vec, fillOpacity=opac, stroke=F )



The information is now actionable. We can see where code violations are clustering. We can identify which infractions are serious by the number of violations on each property. We can see which infractions were cited at each address. And we can quickly identify cases like bed bugs that need immediate attention. The maps can be useful for presenting patterns in lots of municipal data such as crime, potholes, or locations of tax subsidies. Leaflet provides a great tool for making this data accessible to managers and citizens.

GEO-CODING IN R

In the code violation example above, inspectors visited addresses throughout the city and created a database of infractions. These data had already been geocoded by the code enforcement department - the latitude and longitude coordinates of each infraction were added, typically using a GPS unit or the location of the tax parcel.

In many instances we have data with interesting spatial applications but no coordinates for the events. If we have addresses, however, we can apply a method called ‘geocoding’ to match each street address to a location on the map and generate the latitude and longitude fields.
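As a quick illustration, a single address can be geocoded in one line. This sketch uses the geocode() function from the ggmap package, which is demonstrated more fully below (the address is arbitrary):

library( ggmap )

# returns a small data frame with lon and lat columns
geocode( "175 5th Ave, New York, NY" )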

Let’s consider an example using a database of farmers’ markets in New York City. The city requires all farmers’ markets to register in a public health database and report information including when the market is open, whether it accepts food stamps, and the street address. This data is posted publicly on the city’s Open Data website.

farmers.markets <- read.csv( "http://data.cityofnewyork.us/api/views/b7kx-qikm/rows.csv?accessType=DOWNLOAD", stringsAsFactors=F )

head( farmers.markets ) %>% pander
Borough   Market.Name                                  Street.Address
Bronx     Bissel Gardens Farmers’ Market               Baychester Ave & E 241st St
Bronx     Bronx Borough Hall Greenmarket               Grand Concourse bet 161st & 162nd Sts
Bronx     Crotona Park Greenmarket                     Crotona Park South & Clinton Ave, in Crotona Park
Bronx     Harvest Home Co-op City Farmers’ Market      140 Bellamy Loop
Bronx     Harvest Home Echo Park Market                Ryer & Burnside Aves
Bronx     Harvest Home Forest Avenue Farmers’ Market   Forest Ave bet Westchester Ave & 156th St

Day.s.                 Hours     Distribute.Health.Bucks   Accepts.Health.Bucks   EBT   Stellar
Wednesday & Saturday   9am-5pm   1                         1                      1     0
Tuesday                8am-4pm   1                         1                      1     1
Saturday               8am-3pm   1                         1                      1     1
Saturday               8am-4pm   1                         1                      1     0
Wednesday              8am-4pm   1                         1                      1     0
Wednesday              8am-4pm   1                         1                      1     0



The ggmap package implements a convenient geocoding function that makes use of the Google Maps API. Street addresses are sent to Google, which returns latitude and longitude coordinates. One nice feature of using the Google API is that results are fairly predictable - the first result you see when you type the address into Google Maps is the result that will be returned through the API.

You can likely anticipate the main problem with geocoding the addresses listed above: they provide only the street address, not the city. If the street name is generic, like First Avenue or MLK Boulevard, the address could match locations in many cities. We need a more complete address for the geocoding to work. Not a problem - we can simply append the borough and state to the address field in the database.

basic <- head( farmers.markets$Street.Address )

addresses <- paste( farmers.markets$Street.Address, farmers.markets$Borough, "NY", sep=", " )

data.frame( BASIC=basic, AUGMENTED=head( addresses ) ) %>% pander
BASIC                                               AUGMENTED
Baychester Ave & E 241st St                         Baychester Ave & E 241st St, Bronx, NY
Grand Concourse bet 161st & 162nd Sts               Grand Concourse bet 161st & 162nd Sts, Bronx, NY
Crotona Park South & Clinton Ave, in Crotona Park   Crotona Park South & Clinton Ave, in Crotona Park, Bronx, NY
140 Bellamy Loop                                    140 Bellamy Loop, Bronx, NY
Ryer & Burnside Aves                                Ryer & Burnside Aves, Bronx, NY
Forest Ave bet Westchester Ave & 156th St           Forest Ave bet Westchester Ave & 156th St, Bronx, NY



We are now good to go. We geocode the augmented addresses and use the new coordinates to map all farmers’ markets in NYC.

library( ggmap )

market.coords <- geocode( addresses, messaging=F )

# To add the location data back to your original dataset use cbind()

farmers.markets <- cbind(farmers.markets, market.coords)


ggmap( get_map( "Jackson Heights, NY", col="bw", zoom=11 ), extent="device" ) +
  geom_point( data=farmers.markets, aes(x=lon, y=lat), size=5, col="red", alpha=0.5 )



Note that Google limits you to 2,500 geocoding queries a day. For more intensive efforts you can set up a commercial account with a limit of 100,000 queries a day and the option of paying for additional queries.
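If you are running a newer release of ggmap (version 2.7 or later), you must also register a Google API key before geocode() or get_map() will work. The setup, once per session, looks like this (the key shown is a placeholder):

library( ggmap )

# obtain a key from the Google Cloud console, then register it for the session
register_google( key="YOUR_API_KEY" )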



SHAPEFILES IN R

R has a robust set of packages for working with traditional shapefiles that are used in popular programs like ESRI’s ArcGIS platform. R can read point, line, or polygon shapefiles, project them, create maps, and manipulate components of the map through subset operations similar to ArcGIS.

Shapefiles are a notoriously cranky format to work with because each map consists of five separate files; if you lose one of them you are out of luck. Furthermore, if you want to create a map with multiple layers - say a US map with states (polygons), roads (lines), and cities (points) - you will need three separate shapefiles, up to 15 individual files to manage for one map!

A new open standard has been developed to make shapefiles easier to use. GeoJSON replaces all five files of a traditional shapefile with a single file that can also include attribute data, and one GeoJSON file can contain multiple layers. If you need smaller files, the TopoJSON format can significantly reduce the overall file size. There are convenient R libraries for converting a typical shapefile to GeoJSON.

# LOAD THE SHAPEFILES

setwd( "C:/Users/jdlecy/Dropbox/02 - CLASSES/02 - MASTERS/09 - DDM II/SYR Parcels" )

library( rgdal )
library( maptools )

syr <- readOGR( dsn=".", layer="01-05-2015" )



# GeoJSON files need to be projected in WGS longlat system

proj4string( syr )

syr <- spTransform( syr, CRS("+proj=longlat +datum=WGS84") )

proj4string( syr )




# ISOLATE THE DOWNTOWN

these <- syr$Nhood == "Downtown"
these[ is.na(these) ] <- FALSE

downtown <- syr[ these , ]

plot( downtown )

# to write to a file:
# geojson_write( downtown, geometry="polygon", file="downtown_syr.geojson" )




# CREATE A MAP GIST ( map_gist() comes from the geojsonio package )

library( geojsonio )

map_gist( downtown, geometry="polygon",
          file="Downtown_Syracuse.geojson",
          description="Tax parcel data for downtown Syracuse, NY" )

Another convenient feature of GeoJSON is that files posted on GitHub (think Dropbox for programmers) automatically render as Leaflet maps. Let’s look at the land parcels in downtown Syracuse. They have already been converted to GeoJSON, and the file currently lives on GitHub.





We can read the file into R for analysis using the geojsonio package.

library( geojsonio )
library( maptools )
library( sp )

url <- "https://gist.githubusercontent.com/lecy/8bf3a15ad894ca98b0722b35978a8115/raw/1a57a074fcfe088907225a188c0220709a5e4b2c/Downtown_Syracuse.geojson"

downtown <- geojson_read( url, method="local", what="sp" )

par( mar=c(0,0,2,0) )
plot( downtown, main="Land Parcels in Downtown Syracuse", border="gray30" )



Note that the file automatically contains all of the attributes associated with each tax parcel.

head(   as.data.frame( downtown )[ , c(2:7,11,21)]   ) %>% pander
        PRINTKEY       FRONTFEET   DEPTH   SqFt       Acres     Sec_Block          Nhood      LandUse
15584   101.-10-08.1   143.7       273.2   57934.93   1.33      3115001010000010   Downtown   Parking
15604   103.-15-05.0   79.09       78.71   6209.29    0.1425    3115001030000015   Downtown   Commercial
15607   103.-16-01.0   41          60      2474.55    0.05681   3115001030000016   Downtown   Parks
15611   103.-16-02.0   40.89       60      2466.50    0.05662   3115001030000016   Downtown   Commercial
15613   103.-16-03.0   41.63       60      2499.42    0.05738   3115001030000016   Downtown   Commercial
15615   103.-16-04.0   20.81       60      1243.84    0.02856   3115001030000016   Downtown   Commercial



We can use these attributes to understand land use patterns within the downtown neighborhood.

# > as.character( unique( downtown$LandUse ) )
#  [1] "Vacant Land"        "Single Family"      "Commercial"         "Parking"           
#  [5] "Two Family"         "Three Family"       "Apartment"          "Schools"           
#  [9] "Parks"              "Multiple Residence" "Cemetery"           "Religious"         
# [13] "Recreation"         "Community Services" "Utilities"          "Industrial"


# HIGHLIGHT LAND USE FEATURES

par( mar=c(0,0,2,0), mfrow=c(2,3) )

residential <- ifelse( downtown$LandUse %in% c("Apartment","Multiple Residence","Single Family") , "red", "gray90" )
plot( downtown, col=residential, border=NA, main="Residential" )

commercial <- ifelse( downtown$LandUse == "Commercial", "red", "gray90" )
plot( downtown, col=commercial, border=NA, main="Commercial Properties" )

recreational <- ifelse( downtown$LandUse == "Parks", "red", "gray90" )
plot( downtown, col=recreational, border=NA, main="Parks" )


vacant <- ifelse( downtown$LandUse == "Vacant Land", "red", "gray90" )
plot( downtown, col=vacant, border=NA, main="Vacant Land" )

parking <- ifelse( downtown$LandUse == "Parking", "red", "gray90" )
plot( downtown, col=parking, border=NA, main="Parking Lots" )

religious <- ifelse( downtown$LandUse == "Religious", "red", "gray90" )
plot( downtown, col=religious, border=NA, main="Churches" )



From this analysis we can see that although many cities have developed mixed-use properties in their urban cores, the downtown neighborhood in Syracuse is still dominated by commercial space. The large number of parking lots serves the commuter population, since without residential offerings it is not possible to live close to where you work. These parking lots also erode some of the most lucrative tax base for the city, since land is worth far less when there is no building on top of it. And there are few parks to make the downtown area livable.

Syracuse has embraced an urban planning model where the business district is separated from the residential space within the community. As a consequence the downtown is surrounded by large highways that carry workers from their homes to their offices, and it requires a significant number of parking lots that are used primarily during business hours. Streets are wide and concrete is the dominant feature. Compare this to Cambridge, MA, which was built before the highway boom. Although it is five times as dense as Syracuse, it has small streets and fewer parking lots. Residential and commercial space is interspersed, meaning people can walk to work if they live nearby. These two downtowns can be compared side by side by grabbing map tiles at the same scale with the ggmap package:

library( ggmap )

syracuse <- ggmap( get_map( c(-76.152237, 43.048708), zoom=16, maptype="satellite" ), extent="device" )
cambridge <-  ggmap( get_map( "cambridge, ma", zoom=16, maptype="satellite" ), extent="device" )

# multiplot() is not part of ggmap - it is the multi-panel helper from the Cookbook for R;
# gridExtra::grid.arrange( syracuse, cambridge, ncol=2 ) is an equivalent alternative
multiplot( syracuse, cambridge, cols = 2 )



CONCLUSION

The examples presented here only scratch the surface of the geospatial capabilities of R, but hopefully they demonstrate some of the useful tools emerging from the open-source community. Note that most of these maps were created with only a few lines of code, meaning it does not take a tremendous amount of training to begin building mapping applications in R.