Now that we have the (very) basics of leaflet down, let’s try to load some GeoJSON and plot it. GeoJSON is one of the easiest geospatial formats to work with: once it is read into R with read_sf(), it behaves, to oversimplify a bit, as both a data frame and a spatial object. So we can use dplyr’s data-cleaning tools on it: filter, summarize, mutate, and so on. We can also merge it with other data sets, be they spatial or non-spatial.
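As a minimal sketch of that data-frame behavior, the snippet below writes a tiny made-up GeoJSON file, reads it with read_sf(), and runs ordinary dplyr verbs on it. The two points and the `visitors` property are invented for illustration and are not part of the California data. It assumes the sf and dplyr packages are installed.

```r
library(sf)
library(dplyr)

# Two invented point features; `name` and `visitors` are hypothetical properties.
geojson_text <- '{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Park A", "visitors": 120},
     "geometry": {"type": "Point", "coordinates": [-122.42, 37.77]}},
    {"type": "Feature",
     "properties": {"name": "Park B", "visitors": 45},
     "geometry": {"type": "Point", "coordinates": [-121.89, 37.34]}}
  ]
}'

# Write the text to a temporary .geojson file and read it back as an sf object
tmp <- tempfile(fileext = ".geojson")
writeLines(geojson_text, tmp)
places <- read_sf(tmp)

# Regular dplyr verbs work, and the geometry column comes along for the ride
busy <- places %>%
  filter(visitors > 100) %>%
  mutate(visitors_k = visitors / 1000)

busy
```

The result is still an sf object, so it can be handed straight to leaflet.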

Let’s get GeoJSON that defines the county boundaries of California:

https://gis.data.ca.gov/datasets/CALFIRE-Forestry::california-county-boundaries/explore

library(sf)  # provides read_sf()
ca <- read_sf("~/Downloads/California_County_Boundaries.geojson")

Define a color palette

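# colorNumeric() takes a palette and a domain; its third positional argument is
# na.color, so a bare 58 there would not set a number of colors.
pal <- colorNumeric("viridis", domain = ca$OBJECTID)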

(I still need to say much more here about picking a palette, and link to the documentation.)
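In the meantime, here is a rough sketch of what a palette function does. colorNumeric() and colorBin() each return a function that turns values into hex colors. The `ids` vector below is a stand-in that assumes the county IDs simply run from 1 to 58, which may not match the real OBJECTID column.

```r
library(leaflet)

# Stand-in for ca$OBJECTID (assumed here to be 1 through 58)
ids <- 1:58

# Continuous palette: each value gets its own interpolated color
pal_numeric <- colorNumeric("viridis", domain = ids)
cols_numeric <- pal_numeric(c(1, 29, 58))

# Binned palette: values are grouped into ranges that share a color
pal_bin <- colorBin("viridis", domain = ids, bins = 5)
cols_bin <- pal_bin(c(1, 29, 58))

cols_numeric
cols_bin
```

A continuous palette suits a smoothly varying measurement, while a binned one is easier to read when only a few categories matter.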

Let’s plot it!

leaflet(ca) %>%
  addTiles() %>%
  addPolygons(stroke = FALSE,
              smoothFactor = 0.9,
              fillOpacity = 0.7,
              fillColor = ~pal(OBJECTID),  # the formula looks up columns in ca
              label = ~COUNTY_NAME)

Okay! Not every website provides GeoJSON, of course. Let’s take things further, and try to map data from a .csv file that includes latitude and longitude pairs. I love to play with restaurant inspection data from San Francisco, available from DataSF.

If we click the ‘export’ button on this website, we have the option of downloading the data in multiple formats, including all of the ones mentioned above (GeoJSON, CSV, Shapefile). Let’s try the CSV. Mine downloaded to my ‘Downloads’ folder, which can be accessed on a Mac with the address ‘~/Downloads/’; the ‘~’ means ‘start in your home folder.’

san_fran <- read_csv("~/Downloads/Restaurant_Scores_-_LIVES_Standard.csv")
## Rows: 53973 Columns: 22
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (12): business_name, business_address, business_city, business_state, bu...
## dbl (10): business_id, business_latitude, business_longitude, business_phone...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

We get some messages, but they are not errors: read_csv() is reporting the column types it guessed, since we didn’t tell R how to interpret every column. It should still work fine. That said, leaflet does not like NAs, or messy data in general.

That sensitivity, combined with the frequent messiness of data available on city and state open data portals, means you’ll need to be able to clean data and solve problems.
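If you would rather declare the column types yourself than rely on read_csv()’s guesses, a sketch along these lines should work. The temporary file and its two rows are invented; only the column names mimic the real download, and the real file has many more columns.

```r
library(readr)

# A tiny stand-in CSV with made-up rows (the second row has missing coordinates)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "business_name,business_latitude,business_longitude,inspection_score",
  "Cafe A,37.76,-122.42,92",
  "Cafe B,,,68"
), tmp)

# Spelling out the types avoids the column specification message
inspections <- read_csv(
  tmp,
  col_types = cols(
    business_name = col_character(),
    business_latitude = col_double(),
    business_longitude = col_double(),
    inspection_score = col_double()
  )
)

# Count the missing latitudes before trying to map anything
missing_lat <- sum(is.na(inspections$business_latitude))
missing_lat
```

Checking the NA count up front is a quick way to see how much cleaning a data set needs.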

Let’s be safe and clear out any NAs on the latitude and longitude columns. We will overwrite ‘san_fran’ with itself, minus the NAs:

san_fran <- san_fran %>%
  drop_na(business_latitude, business_longitude)
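To see what drop_na() does on something small, here is a toy example with three invented rows. It assumes the tidyr and tibble packages are installed. A row is removed if either of the named columns is missing.

```r
library(tidyr)
library(tibble)

# Invented rows: Cafe B lacks a latitude, Cafe C lacks a longitude
toy <- tibble(
  business_name = c("Cafe A", "Cafe B", "Cafe C"),
  business_latitude = c(37.76, NA, 37.80),
  business_longitude = c(-122.42, -122.41, NA)
)

cleaned <- drop_na(toy, business_latitude, business_longitude)

# How many rows were dropped?
rows_removed <- nrow(toy) - nrow(cleaned)
rows_removed
```

Comparing row counts before and after is a cheap sanity check that the cleaning step did not throw away more than expected.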

This time, let’s use a few different Leaflet features: setView() centers our map, ‘zoom’ controls the zoom, and ‘addCircleMarkers’ changes the marker from the default ‘teardrop’ appearance. I’ve gone ahead and grabbed a latitude and longitude point from Google Maps to act as our center point for the map.

map1 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 12) %>% 
  addCircleMarkers(
    data = san_fran
    )
## Error in guessLatLongCols(names(obj)): Couldn't infer longitude/latitude columns

That didn’t work! Why not? If we look at the dataframe ‘san_fran,’ we’ll be reminded that the latitude and longitude columns are named ‘business_latitude’ and ‘business_longitude.’ Computers are both smart and dumb: any human could figure this out, but we need to explicitly tell Leaflet that these are the columns where our lat/long data resides. And that, in a nutshell, is computer programming.

map1 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 13) %>% 
  addCircleMarkers(
    data = san_fran,
    lat = san_fran$business_latitude,
    lng = san_fran$business_longitude
    )

map1
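Another way to point leaflet at those columns is its formula interface: a `~` in front of a column name means ‘look this up in the data.’ The sketch below uses a two-row invented data frame rather than the real inspection file.

```r
library(leaflet)

# Made-up rows with the same column names as the inspection data
toy <- data.frame(
  business_name = c("Cafe A", "Cafe B"),
  business_latitude = c(37.76, 37.78),
  business_longitude = c(-122.42, -122.41)
)

# ~business_longitude is evaluated inside `toy`, so no toy$ prefix is needed
map_formula <- leaflet(toy) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~business_longitude, lat = ~business_latitude)

map_formula
```

Both styles should give the same map; the formula version just saves retyping the data frame name when you switch data sets.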

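Ok, well, there are good and bad things! The points are on the map, but the default markers are large and crowd each other, and clicking one tells us nothing. Let’s adjust some of the aesthetics of our data layer, and also tell leaflet to load the content of the ‘business_name’ column for each popup: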

map1 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 13) %>% 
  addCircleMarkers(
    data = san_fran,
    lat = san_fran$business_latitude,
    lng = san_fran$business_longitude,
    radius = 3,
    stroke = FALSE,
    popup = paste0(san_fran$business_name)
    )

map1

Better. The ‘paste0’ command we used for the popup means “paste the values from the ‘business_name’ column into the popup for each marker.”

We want to see the inspection score, as well as descriptions for all of the popup data to provide some context of what we’re looking at. This will require adding some HTML formatting to the popup, as well as some labels.

Anything entered into R that is not an R command or variable name, like manually entered text or HTML code, needs to be written in quotes. We’ll string together the pieces of the popup inside paste0(), separated by commas:

map1 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 13) %>% 
  addCircleMarkers(
    data = san_fran,
    lat = san_fran$business_latitude,
    lng = san_fran$business_longitude,
    radius = 3,
    stroke = FALSE,
    # A space after each colon keeps the label from running into the value
    popup = paste0("Name: ", san_fran$business_name,
                   "<br/>",
                   "Inspection Date: ", san_fran$inspection_date,
                   "<br/>",
                   "Inspection Score: ", san_fran$inspection_score,
                   "<br/>",
                   "Violation: ", san_fran$violation_description)
    )

map1
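It can help to run paste0() on its own to see what the popup text looks like. paste0() works element by element, so each marker gets its own string. The names and scores here are invented.

```r
# Invented values standing in for two rows of the inspection data
name <- c("Cafe A", "Cafe B")
score <- c(92, 68)

# Fixed text is recycled; the vectors are matched up position by position
popup_text <- paste0("Name: ", name, "<br/>", "Inspection Score: ", score)

popup_text
# → "Name: Cafe A<br/>Inspection Score: 92" "Name: Cafe B<br/>Inspection Score: 68"
```

The `<br/>` is plain HTML for a line break, which the popup renders when it is displayed.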

Let’s filter the data so it only includes scores below 75.

san_fran %>% 
  filter(inspection_score < 75) -> san_fran_low
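One thing worth knowing about filter(): rows where the condition evaluates to NA are dropped along with the rows where it is FALSE. A toy example with invented scores:

```r
library(dplyr)

# Four made-up restaurants, one with no score recorded
toy <- tibble(
  business_name = c("A", "B", "C", "D"),
  inspection_score = c(90, 72, NA, 60)
)

# Keeps B and D; A fails the test and C has no score to test
low <- toy %>%
  filter(inspection_score < 75)

low$business_name
# → "B" "D"
```

So a filtered data set will not contain uninspected restaurants, which is usually what we want for this map.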

Then, a new map. Since our last map was perfectly viable, let’s keep it and store this one in a new map variable (map2), built from our new dataframe variable (san_fran_low).

Other than that, the code is the same - but be careful using cut and paste when it comes to code. Remember that the punctuation is key to the syntax, and an errant parenthesis can throw a confusing error.

map2 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 13) %>% 
  addCircleMarkers(
    data = san_fran_low,
    lat = san_fran_low$business_latitude,
    lng = san_fran_low$business_longitude,
    radius = 3,
    stroke = FALSE,
    popup = paste0("Name: ", san_fran_low$business_name,
                   "<br/>",
                   "Inspection Date: ", san_fran_low$inspection_date,
                   "<br/>",
                   "Inspection Score: ", san_fran_low$inspection_score,
                   "<br/>",
                   "Violation: ", san_fran_low$violation_description)
    )
map2

That’s much easier to read and interact with. The only issue now is that the colors of the markers do not indicate anything related to the inspection score - could we make them do so?

Yes. First, we define a palette. In this case, we’ll use an existing palette, ‘Reds,’ and we’ll tell leaflet to use 3 different shades of red, based on the inspection score value. We can do this because a) our inspection_score column is numeric and b) it is a continuous variable, so c) the ‘colorBin’ function can ‘bin’ its range of values into our 3 colors. If R throws an error, we should look back and make sure these conditions hold for our data.

By the way, ‘pretty = TRUE’ tells colorBin to pick round-number breakpoints for the bins using R’s pretty() function, which makes the ranges easier to read; as a side effect, the actual number of bins may not match the number you asked for. With ‘pretty = FALSE’, the range is instead cut into exactly that many equal-width bins, whose breakpoints may be less tidy.
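To get a feel for the two settings, the sketch below compares the breakpoints that base R’s pretty() and seq() produce for a handful of invented scores, then builds a palette each way. According to the leaflet documentation, these are the functions colorBin relies on when ‘bins’ is a single number.

```r
library(leaflet)

# Invented inspection scores, all below 75
scores <- c(46, 52, 58, 63, 67, 70, 74)

# Round-number breakpoints; the count may differ from what was requested
pretty_breaks <- pretty(scores, n = 3)

# Exactly 3 equal-width bins need 4 breakpoints
even_breaks <- seq(min(scores), max(scores), length.out = 4)

pretty_breaks
even_breaks

# The same choice expressed through colorBin
binpal_pretty <- colorBin("Reds", scores, bins = 3, pretty = TRUE)
binpal_even <- colorBin("Reds", scores, bins = 3, pretty = FALSE)

cols_pretty <- binpal_pretty(scores)
cols_even <- binpal_even(scores)
```

Printing the two sets of breakpoints side by side makes the trade-off visible: tidy numbers versus an exact bin count.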

#install.packages('htmltools')
library(htmltools)

binpal <- colorBin("Reds", san_fran_low$inspection_score, bins = 3, pretty = TRUE)

map2 <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(lng=-122.443334, lat=37.749882 , zoom = 13) %>% 
  addCircleMarkers(
    data = san_fran_low,
    lat = san_fran_low$business_latitude,
    lng = san_fran_low$business_longitude,
    # label = ~htmlEscape(san_fran_low$inspection_score),
    # labelOptions = labelOptions(noHide = T, textsize = 9, opacity = .3),
    radius = 3,
    stroke = FALSE,  # stroke takes TRUE/FALSE (draw an outline or not), not a color name
    fillOpacity = 0.7,
    color = ~binpal(inspection_score),  # the fill color defaults to this value
    popup = paste0("Name: ", san_fran_low$business_name,
                   "<br/>",
                   "Inspection Date: ", san_fran_low$inspection_date,
                   "<br/>",
                   "Inspection Score: ", san_fran_low$inspection_score,
                   "<br/>",
                   "Violation: ", san_fran_low$violation_description)
    )
map2
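Colored markers are easier to interpret with a legend. As a possible next step, leaflet’s addLegend() can reuse the same palette function. This sketch uses three invented points rather than the real data, and the legend title and position are just one choice.

```r
library(leaflet)

# Three made-up low-scoring restaurants
toy <- data.frame(
  business_latitude = c(37.76, 37.78, 37.75),
  business_longitude = c(-122.42, -122.41, -122.44),
  inspection_score = c(48, 61, 72)
)

binpal_toy <- colorBin("Reds", toy$inspection_score, bins = 3, pretty = TRUE)

map_legend <- leaflet(toy) %>%
  addTiles() %>%
  addCircleMarkers(
    lng = ~business_longitude,
    lat = ~business_latitude,
    radius = 3,
    stroke = FALSE,
    fillOpacity = 0.7,
    color = ~binpal_toy(inspection_score)
  ) %>%
  # The legend reads its bins and colors from the palette function
  addLegend(
    position = "bottomright",
    pal = binpal_toy,
    values = ~inspection_score,
    title = "Inspection score"
  )

map_legend
```

Because the legend is driven by the palette, changing the number of bins updates the markers and the legend together.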

Wow, that became a lot of code! Imagine mapping as creating two plots: the map underneath and the data on top (and there may be multiple columns of data to overlay on the map, as with our example). Thus we have to write twice as much code to make a map show up the way we want it to.

Continue to part III.