In this project I use Anjali Doney’s great tutorial on mapping and apply it to NYC bike data. https://medium.com/fastah-project/a-quick-start-to-maps-in-r-b9f221f44ff3

We will map NYC bike accidents from 2012 through 2018. There were 27,000 accidents involving bikes, and more than 100 resulted in the biker dying. Those two numbers, 27,000 and 100, forced us into a second project: layering maps. The plotted points on the map are a scatter plot, and how to show 100 points of one kind in a sea of 27,000 of another became an issue. I searched for some combination of point size, color, and transparency that would not obscure the smaller number. Layering was the solution.

This project requires a Google API key, which entails setting up a Google Cloud account and giving them a credit card. But I have never gotten a bill for more than 80 cents, so don't hesitate on that account.

This next part is critical. While there are many tutorials on mapping in R with ggmap, almost none mention how to insert your Google key. Here is how:

register_google(key = "your key here")
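Pasting the key straight into a script makes it easy to publish by accident. Here is a minimal sketch of one alternative, reading the key from an environment variable. The variable name `GOOGLE_MAPS_API_KEY` is my own choice for illustration, not something ggmap requires.

```r
library(ggmap)

# GOOGLE_MAPS_API_KEY is a hypothetical variable name; set it in ~/.Renviron
# or in the shell before starting R.
api_key <- Sys.getenv("GOOGLE_MAPS_API_KEY")

if (nzchar(api_key)) {
  # Register the key for this session only.
  register_google(key = api_key)
} else {
  message("No key found in GOOGLE_MAPS_API_KEY; Google map requests will fail.")
}

# TRUE once a key has been registered, FALSE otherwise.
key_registered <- has_google_key()
key_registered
```

With this approach the script itself never contains the key, so it can be shared as-is.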

The Packages

library(tidyverse)
## -- Attaching packages ------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0     v purrr   0.2.5
## v tibble  1.4.2     v dplyr   0.7.8
## v tidyr   0.8.1     v stringr 1.3.1
## v readr   1.1.1     v forcats 0.3.0
## -- Conflicts ---------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggmap)
## Google Maps API Terms of Service: http://developers.google.com/maps/terms.
## Please cite ggmap if you use it: see citation("ggmap") for details.
library(ggplot2)
register_google(key = "your key here")

How easy it is to get maps

map.NY<- get_map("New York" , zoom=10)
## Source : https://maps.googleapis.com/maps/api/staticmap?center=New+York&zoom=10&size=640x640&scale=2&maptype=terrain&language=en-EN&key=xxx
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=New%20York&key=xxx
ggmap(map.NY)

nyc<-get_map("New York City" , zoom = 10 , maptype = "toner-lite", source = "google")
## maptype = "toner-lite" is only available with source = "stamen".
## resetting to source = "stamen"...
## Source : https://maps.googleapis.com/maps/api/staticmap?center=New+York+City&zoom=10&size=640x640&scale=2&maptype=terrain&key=xxx
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=New%20York%20City&key=xxx
## Source : http://tile.stamen.com/toner-lite/10/300/383.png
## Source : http://tile.stamen.com/toner-lite/10/301/383.png
## Source : http://tile.stamen.com/toner-lite/10/302/383.png
## Source : http://tile.stamen.com/toner-lite/10/300/384.png
## Source : http://tile.stamen.com/toner-lite/10/301/384.png
## Source : http://tile.stamen.com/toner-lite/10/302/384.png
## Source : http://tile.stamen.com/toner-lite/10/300/385.png
## Source : http://tile.stamen.com/toner-lite/10/301/385.png
## Source : http://tile.stamen.com/toner-lite/10/302/385.png
## Source : http://tile.stamen.com/toner-lite/10/300/386.png
## Source : http://tile.stamen.com/toner-lite/10/301/386.png
## Source : http://tile.stamen.com/toner-lite/10/302/386.png
ggmap(nyc)

Good To Know One’s Boundaries.

NYC boundaries:

West -74.257159
East -73.699215
North 40.915568
South 40.495992

(Latitude is the distance from the equator.)

Making a Boundary-Specific Map

One can also just call up a map by name, but in our case the center point of that map is too far west, so to include the entire city we would have had to zoom out too far and bring in non-NYC areas.

lat<- c(40.915568 , 40.495992)
long<- c(-74.257159 ,-73.699215)
bbox<- make_bbox(long, lat,f=0.05)
c<- get_map(bbox, maptype = "toner-lite" , source = "stamen")
## Source : http://tile.stamen.com/toner-lite/11/601/768.png
## Source : http://tile.stamen.com/toner-lite/11/602/768.png
## Source : http://tile.stamen.com/toner-lite/11/603/768.png
## Source : http://tile.stamen.com/toner-lite/11/604/768.png
## Source : http://tile.stamen.com/toner-lite/11/601/769.png
## Source : http://tile.stamen.com/toner-lite/11/602/769.png
## Source : http://tile.stamen.com/toner-lite/11/603/769.png
## Source : http://tile.stamen.com/toner-lite/11/604/769.png
## Source : http://tile.stamen.com/toner-lite/11/601/770.png
## Source : http://tile.stamen.com/toner-lite/11/602/770.png
## Source : http://tile.stamen.com/toner-lite/11/603/770.png
## Source : http://tile.stamen.com/toner-lite/11/604/770.png
## Source : http://tile.stamen.com/toner-lite/11/601/771.png
## Source : http://tile.stamen.com/toner-lite/11/602/771.png
## Source : http://tile.stamen.com/toner-lite/11/603/771.png
## Source : http://tile.stamen.com/toner-lite/11/604/771.png
ggmap(c)
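To see what the `f=0.05` argument contributes, here is a small base R sketch that pads the city limits by hand. It assumes, as I understand the ggmap documentation, that `f` is the fraction by which each coordinate range is extended on both sides. The helper `pad_range` and the name `bbox_by_hand` are my own, not part of ggmap.

```r
# City limits from the section above.
lat  <- c(40.915568, 40.495992)
long <- c(-74.257159, -73.699215)
f    <- 0.05

# Hypothetical helper: widen a range by the fraction f on each side.
pad_range <- function(x, f) {
  r <- range(x)
  r + c(-1, 1) * f * diff(r)
}

lon_padded <- pad_range(long, f)
lat_padded <- pad_range(lat, f)

# Same left/bottom/right/top layout that a bounding box uses.
bbox_by_hand <- c(left   = lon_padded[1],
                  bottom = lat_padded[1],
                  right  = lon_padded[2],
                  top    = lat_padded[2])
bbox_by_hand
```

The padding leaves a small margin around the five boroughs so that points on the city's edge are not drawn right at the border of the map.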

The data: https://docs.google.com/spreadsheets/d/e/2PACX-1vRAMnk_n7a7VUVFrlXWJgW4wOn1p-jwlqS0mOhpvHJakeXi2nAc1r6K_ScVMufh37flpCjMdpZSxwKC/pubhtml

bike_accidents<-read_csv("Bike_Accidents.csv")
## Warning: Missing column names filled in: 'X1' [1]
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   X1 = col_integer(),
##   timestamp = col_double(),
##   latitude = col_double(),
##   longitude = col_double(),
##   number_of_cyclist_injured = col_integer(),
##   number_of_cyclist_killed = col_integer(),
##   survived = col_integer(),
##   number_of_motorist_injured = col_integer(),
##   number_of_motorist_killed = col_integer(),
##   number_of_pedestrians_injured = col_integer(),
##   number_of_pedestrians_killed = col_integer(),
##   number_of_persons_injured = col_integer(),
##   number_of_persons_killed = col_integer(),
##   unique_key = col_integer(),
##   zip_code = col_integer()
## )
## See spec(...) for full column specifications.
dim(bike_accidents)
## [1] 27527    31

Let's drop any rows without complete latitude/longitude.

bike_accidents<-bike_accidents%>%drop_na(latitude)
dim(bike_accidents)
## [1] 23455    31
bike_accidents<- bike_accidents%>%drop_na(longitude)
dim(bike_accidents)
## [1] 23455    31
write.csv(bike_accidents,"Bike_Accidents_Cleaned.csv")
Bike_Accidents<- read_csv("Bike_Accidents_Cleaned.csv")
## Warning: Missing column names filled in: 'X1' [1]
## Warning: Duplicated column names deduplicated: 'X1' => 'X1_1' [2]
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   X1 = col_integer(),
##   X1_1 = col_integer(),
##   timestamp = col_double(),
##   latitude = col_double(),
##   longitude = col_double(),
##   number_of_cyclist_injured = col_integer(),
##   number_of_cyclist_killed = col_integer(),
##   survived = col_integer(),
##   number_of_motorist_injured = col_integer(),
##   number_of_motorist_killed = col_integer(),
##   number_of_pedestrians_injured = col_integer(),
##   number_of_pedestrians_killed = col_integer(),
##   number_of_persons_injured = col_integer(),
##   number_of_persons_killed = col_integer(),
##   unique_key = col_integer(),
##   zip_code = col_integer()
## )
## See spec(...) for full column specifications.
#Bike_Accidents<- read_csv("Bike_Accidents.csv")
head(Bike_Accidents)
## # A tibble: 6 x 32
##      X1  X1_1 borough contributing_fa~ contributing_fa~ contributing_fa~
##   <int> <int> <chr>   <chr>            <chr>            <chr>           
## 1     1    95 MANHAT~ Other Vehicular  Unspecified      <NA>            
## 2     2 27426 MANHAT~ Other Vehicular  Unspecified      <NA>            
## 3     3 27427 BRONX   Driver Inattent~ Unspecified      Unspecified     
## 4     4 27428 <NA>    Unsafe Speed     Unspecified      <NA>            
## 5     5 27429 <NA>    Pedestrian/Bicy~ Unspecified      <NA>            
## 6     6 27430 <NA>    Driver Inattent~ Unspecified      <NA>            
## # ... with 26 more variables: contributing_factor_vehicle_4 <chr>,
## #   contributing_factor_vehicle_5 <chr>, cross_street_name <chr>,
## #   timestamp <dbl>, latitude <dbl>, longitude <dbl>, location <chr>,
## #   number_of_cyclist_injured <int>, number_of_cyclist_killed <int>,
## #   survived <int>, lived <chr>, number_of_motorist_injured <int>,
## #   number_of_motorist_killed <int>, number_of_pedestrians_injured <int>,
## #   number_of_pedestrians_killed <int>, number_of_persons_injured <int>,
## #   number_of_persons_killed <int>, off_street_name <chr>,
## #   on_street_name <chr>, unique_key <int>, vehicle_type_code1 <chr>,
## #   vehicle_type_code2 <chr>, vehicle_type_code_3 <chr>,
## #   vehicle_type_code_4 <chr>, vehicle_type_code_5 <chr>, zip_code <int>
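The extra `X1` and `X1_1` columns in the output above appear because `write.csv` writes row names as an unnamed first column by default, and each read picks that column up again. A minimal sketch of a tighter round trip follows, using a toy data frame in place of the real file. It drops rows missing either coordinate in one `drop_na` call and writes without row names.

```r
library(tidyr)

# Toy stand-in for the accident data; rows 2 and 3 each lack one coordinate.
toy <- data.frame(
  unique_key = 1:4,
  latitude   = c(40.7, NA, 40.8, 40.6),
  longitude  = c(-73.9, -74.0, NA, -73.8)
)

# Drop rows missing either coordinate in a single step.
cleaned <- drop_na(toy, latitude, longitude)

# row.names = FALSE keeps the index column out of the file.
path <- tempfile(fileext = ".csv")
write.csv(cleaned, path, row.names = FALSE)

reloaded <- read.csv(path)
names(reloaded)
```

Reading the file back now returns only the original columns, with no added index column.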

First plot. The problem: those killed are obscured.

ggmap(c)+
  geom_point(data=Bike_Accidents,
             aes(longitude,latitude,color=lived),  size=5,alpha =0.9)+
               labs(x= "Long", y= "Lat" ,
                title="P2",
                  color= "Survived")
## Warning: Removed 7 rows containing missing values (geom_point).

No matter what size we make the points or what we set alpha to, the accidents obscure the deaths.

ggmap(c)+
  geom_point(data=Bike_Accidents,
             aes(longitude,latitude,color=lived),  size=1,alpha =0.1)+
               labs(x= "Long", y= "Lat" ,
                title="P2",
                  color= "Survived")
## Warning: Removed 7 rows containing missing values (geom_point).

ggmap(c)+
  geom_point(data=Bike_Accidents,
             aes(longitude,latitude,color=lived),  size=0.01,alpha =0.5)+
               labs(x= "Long", y= "Lat" ,
                title="P2",
                  color= "Survived")
## Warning: Removed 7 rows containing missing values (geom_point).

The legend tells us there are deaths there but those victims are lost in the slaughter.
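A side note on the `Removed 7 rows` warning in the plots above: since rows with missing coordinates were already dropped, the most likely cause is a handful of points whose coordinates fall outside the map's extent. Here is a sketch, on toy data, of filtering to the city limits before plotting. Whether the seven real rows are bad coordinates or just outside the box is an assumption I have not checked.

```r
library(dplyr)

# Toy stand-in for Bike_Accidents; the last row mimics a bad (0, 0) coordinate.
toy_accidents <- tibble(
  latitude  = c(40.70, 40.85, 0),
  longitude = c(-73.95, -73.90, 0)
)

# Keep only points inside the NYC limits listed earlier.
in_nyc <- toy_accidents %>%
  filter(between(latitude, 40.495992, 40.915568),
         between(longitude, -74.257159, -73.699215))

nrow(in_nyc)
```

Plotting the filtered data should leave nothing for `geom_point` to remove, provided the map covers the same limits.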

We will separate out those who were killed biking.

biker_deaths<- filter(Bike_Accidents, number_of_cyclist_killed > 0)
dim(biker_deaths)
## [1] 84 32

Next we will make separate plots of Injured and Killed and then plot them one on top of the other as layers.

write.csv(biker_deaths, "biker_deaths.csv")
biker_deaths<- read_csv("biker_deaths.csv")
## Warning: Missing column names filled in: 'X1' [1]
## Warning: Duplicated column names deduplicated: 'X1' => 'X1_2' [2]
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   X1 = col_integer(),
##   X1_2 = col_integer(),
##   X1_1 = col_integer(),
##   timestamp = col_double(),
##   latitude = col_double(),
##   longitude = col_double(),
##   number_of_cyclist_injured = col_integer(),
##   number_of_cyclist_killed = col_integer(),
##   survived = col_integer(),
##   number_of_motorist_injured = col_integer(),
##   number_of_motorist_killed = col_integer(),
##   number_of_pedestrians_injured = col_integer(),
##   number_of_pedestrians_killed = col_integer(),
##   number_of_persons_injured = col_integer(),
##   number_of_persons_killed = col_integer(),
##   unique_key = col_integer(),
##   zip_code = col_integer()
## )
## See spec(...) for full column specifications.
dim(biker_deaths)
## [1] 84 33
A<-ggmap(c)+
  geom_point(data=biker_deaths,
             aes(longitude,latitude),  size=3,alpha =0.5, color="red")+
               labs(x= "Long", y= "Lat" ,
                title="Died")
                  
A

biker_injuries<- filter(Bike_Accidents, number_of_cyclist_injured > 0)
dim(biker_injuries)
## [1] 23379    32
B<-ggmap(c)+
  geom_point(data=biker_injuries,
             aes(longitude,latitude),  size=0.01,alpha =0.5, color="yellow")+
               labs(x= "Long", y= "Lat" ,
                title="Injured")
B
## Warning: Removed 7 rows containing missing values (geom_point).

Here the two layers are joined on one map, with more appropriate colors: red for injured, black for killed.

ggmap(map.NY)+
  # injuries first, so they end up underneath
  geom_point(data=biker_injuries,
             aes(longitude,latitude),  size=0.01,alpha =0.5, color="red")+
  # deaths last, so they are drawn on top
  geom_point(data=biker_deaths,
             aes(longitude,latitude),  size=3,alpha =0.5, color="black")+
  labs(x= "Long", y= "Lat" ,
       title="P2")
## Warning: Removed 7 rows containing missing values (geom_point).
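The layering idea does not depend on the basemap. ggplot2 draws layers in the order they are added, so whatever is added last sits on top. Here is a minimal sketch with made-up points and no map tiles; the data frames `toy_injuries` and `toy_deaths` are invented for illustration.

```r
library(ggplot2)

set.seed(1)
# Many small points standing in for injuries.
toy_injuries <- data.frame(
  longitude = runif(500, -74.05, -73.85),
  latitude  = runif(500, 40.60, 40.85)
)
# A few points standing in for deaths.
toy_deaths <- data.frame(
  longitude = c(-74.00, -73.95, -73.90),
  latitude  = c(40.65, 40.75, 40.80)
)

layered <- ggplot() +
  # First layer: drawn underneath.
  geom_point(data = toy_injuries, aes(longitude, latitude),
             size = 0.5, alpha = 0.5, color = "red") +
  # Second layer: drawn on top, so the few points stay visible.
  geom_point(data = toy_deaths, aes(longitude, latitude),
             size = 3, alpha = 0.9, color = "black") +
  labs(x = "Long", y = "Lat", title = "Layering sketch")

layered
```

Because each layer has its own data, size, and color, the small group no longer has to compete with the large one for a single set of point settings.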

Next we will clean up the map some, removing needless labels and adjusting point size.

Injuries_and_Deaths<-ggmap(map.NY)+
  # injuries first, as very small points
  geom_point(data=biker_injuries,
             aes(longitude,latitude),  size=0.001,alpha =0.5, color="red")+
  # deaths on top, larger and nearly opaque
  geom_point(data=biker_deaths,
             aes(longitude,latitude),  size=2,alpha =0.9, color="black")+
  labs(x= "Long", y= "Lat")+
  ggtitle("A Sea of Red")
Injuries_and_Deaths
## Warning: Removed 7 rows containing missing values (geom_point).

Injuries_and_Deaths+
  theme(axis.line=element_blank(),axis.text.x=element_blank(),
          axis.text.y=element_blank(),axis.ticks=element_blank(),
          axis.title.x=element_blank(),
          axis.title.y=element_blank(),legend.position="none",
          panel.background=element_blank(),panel.border=element_blank(),panel.grid.major=element_blank(),
          panel.grid.minor=element_blank(),plot.background=element_blank())
## Warning: Removed 7 rows containing missing values (geom_point).
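The long `theme()` call above can be shortened. Here is a sketch using ggplot2's built-in `theme_void()`, which strips axes, grid lines, and backgrounds in one step while keeping the title. The toy plot stands in for `Injuries_and_Deaths`. Note that a theme only sticks if the result is printed directly or assigned to a name.

```r
library(ggplot2)

# Toy plot standing in for the map object.
toy_points <- data.frame(
  longitude = c(-74.00, -73.95),
  latitude  = c(40.65, 40.75)
)
base_plot <- ggplot(toy_points, aes(longitude, latitude)) +
  geom_point() +
  ggtitle("A Sea of Red")

# Assign the themed version so later prints keep the clean look.
clean_plot <- base_plot +
  theme_void() +
  theme(legend.position = "none")

clean_plot
```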

Injuries_and_Deaths
## Warning: Removed 7 rows containing missing values (geom_point).