My capstone project for Data 205 is an interactive map showing affordable housing and transportation options in Montgomery County. I knew coming into the semester that I wanted to do this GIS project. My sister-in-law, who works for a non-profit promoting alternative energy policies and programs, suggested a map of electric vehicle charging availability for low-to-middle-income households. We had several conversations about the data to include on the map: affordable housing, public transit, bike sharing, car sharing, etc.
The first thing I did was take several online courses on working with geospatial data and using Leaflet in R to create maps. I also began gathering data before the semester started.
I was able to find data for all but one of the layers I had planned. I couldn’t find substantial Car Sharing data without joining multiple hourly car rental websites, which isn’t something I was willing to do. The data that Montgomery County has about Car Sharing locations is minimal.
I coded eight separate .Rmd files for the project, including the file that creates this report. Five of the files create layers for the map.
The first six files are imported into this report as child .Rmd files.
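Each child file is pulled in with an empty knitr chunk that names it in the child option; a minimal example (the exact chunk options in the project may differ) looks like this:
```{r child = 'pubtrans.rmd'}
```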
I created separate datasets from each of my source input files as I developed each layer in turn. I chose to use longitude and latitude as separate fields in tibbles for my marker data. There were a few instances where SpatialPoints were needed for a specific function, but the results of those functions were added to the tibble. I became more confident using the spatial data types as my work on the project progressed, but I chose to keep using tibbles rather than rework code that was performing well.
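As a minimal sketch of that pattern (toy data, not the project's), a lon/lat tibble is wrapped in SpatialPoints only for the call that needs it, and the result is bound back onto the tibble:
pacman::p_load(sp, tibble)
stops <- tibble(name = c("A", "B"),                # toy example data
                lon  = c(-77.13, -77.10),
                lat  = c(39.03, 39.05))
pts <- SpatialPoints(cbind(stops$lon, stops$lat),  # wrap lon/lat only where a spatial function needs it
                     proj4string = CRS("+proj=longlat +datum=WGS84"))
# ... call the spatial function on pts, then add its output back onto the stops tibble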
The Ride-On Bus stops data, Data Montgomery - Montgomery_County_Ride_On_Bus_Stops, contains the bus stop name and the longitude/latitude point data. This data is used to create a marker layer in the map and needed minimal cleaning.
bus <- read_csv('./sourcedData/Montgomery_County_Ride_On_Bus_Stops.csv')
names(bus) <- str_to_lower(str_replace_all(names(bus), ' ','_')) # clean the names
bus$stopname <- str_to_title(bus$stopname) # clean the stop names
bus$stopname <- str_replace(bus$stopname, "& @", "@") # replace some odd looking punctuation
glimpse(bus) # look at the bus data
## Rows: 1,000
## Columns: 2
## $ stopname <chr> "Rockledge Dr @6600", "Cheshire Dr & Old Georgetown Rd", "Che~
## $ point <chr> "POINT (-77.13205700033012 39.028854000012245)", "POINT (-77.~
The data contains stops that are across the street from each other, or otherwise close to each other. I filtered the data so that the bus stops are at least 300 feet apart. I parsed out the longitude and latitude fields and created a matrix to use in the filterByProximity function from the rangeBuilder package.
This is an example of code that I could have refactored using SpatialPoints, but it gets the job done as it is; a possible refactor is sketched after the code block below.
p1 <- bus
p1$xy <- substr(p1$point, 8, str_locate(p1$point, "\\)") - 1) # parse the point data into separate columns
options(digits=17) # set the decimal digits for float fields
options(pillar.sigfig = 17) # set significant digits for float fields
bus$lon <- substr(p1$xy, 1, str_locate(p1$xy, " ")) %>% as.numeric() # parse the point data into lon
bus$lat <- substr(p1$xy, str_locate(p1$xy, " "), nchar(p1$xy)) %>% as.numeric() # and lat (to the end of the string)
bus$category <- "Bus_Stop" # add category column to the data
bus_all <- bus %>% dplyr::select(category, name = stopname, lon, lat) # select fields for marker layer
bmat <- cbind(bus$lon, bus$lat) # create a matrix of lon & lat
b2 <- filterByProximity(bmat, 0.09, returnIndex = F) %>% # remove duplicates that are less than
as_tibble(.name_repair = "unique") # 300 ft apart
names(b2) <- c("lon", "lat") # name the columns
bus <- left_join(b2, bus_all) # join to get the desired bus stops
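For comparison, a minimal sketch of that refactor using the sf package (an alternative to SpatialPoints, not part of the project code) would parse the WKT "POINT (lon lat)" strings directly instead of with substr():
pacman::p_load(sf)
bus_raw <- read_csv('./sourcedData/Montgomery_County_Ride_On_Bus_Stops.csv')
names(bus_raw) <- str_to_lower(str_replace_all(names(bus_raw), ' ','_'))   # clean the names as before
bus_sf <- st_as_sf(bus_raw, wkt = "point", crs = 4326)  # parse the WKT point column into a geometry column
coords <- st_coordinates(bus_sf)                        # extract a lon/lat matrix from the geometry
bus_sf$lon <- coords[, "X"]                             # add plain lon / lat columns back
bus_sf$lat <- coords[, "Y"]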
The Metro Bus stop data comes from Maryland Open Data, Metro Bus Stops, Maryland_Transit_-_WMATA_Metro_Bus_Stops.
The Metro Bus stops are in a shape file that contains all of the stops in the system. The Coordinate Reference System (CRS) of the shape data is Mercator, which I transformed to WGS84 to be compatible with the rest of the data.
To get the stops that are in Montgomery County I use the Zipcodes shape file from Data Montgomery at the mcatlas.org website, Montgomery Planning - MC Atlas Zipcodes-Montgomery. The stops inside the county are selected using the point.in.poly function from the spatialEco package. Then the stop name, longitude, and latitude are selected, and the stops are filtered so that they are at least 300 feet apart.
# metro bus data
metro_bus <- raster::shapefile("./sourcedData/Maryland_Transit_-_WMATA_Metro_Bus_Stops/TRAN_WMATAMetroBusStops_WMATA.shp")
# zipcode polygons for Montgomery County
zipcode <- raster::shapefile("./shapefiles/zipcode.shp", GDAL1_integer64_policy = TRUE)
zipcode@data <- zipcode@data %>% dplyr::select(zipcode, mail_city, shape_area, shape_leng)
metro_bus <- sp::spTransform(metro_bus, proj4string(zipcode)) # transform CRS from mercator to WGS84
require(spatialEco)
mbus <- point.in.poly(metro_bus, zipcode, duplicate = F) # filter for bus stops in MC
mbus <- mbus %>% as_tibble(.name_repair = "unique") %>% # create a df
dplyr::filter(pid1 != "NA") %>% # remove missing data
dplyr::arrange(AT_STR, ON_STR) # sort by cross street names
mbus$category <- "Metro_Bus_Stop" # add category column
mbus$name <- base::paste(mbus$ON_STR, "@", mbus$AT_STR) # create the stop name from the cross streets
mbus_all <- mbus %>% dplyr::select(category, # select the fields for the marker data
name,
lon = coords.x1,
lat = coords.x2)
bmat <- cbind(mbus_all$lon, mbus_all$lat) # matrix of lon & lat
b2 <- filterByProximity(bmat, 0.09, returnIndex = F) %>% # filter out stops less than 300 ft apart
as_tibble(.name_repair = "unique" )
names(b2) <- c("lon", "lat") # rename the lon / lat columns
mbus <- dplyr::left_join(b2, mbus_all) # join to get the desired bus stops
There are 13 Red Line Metro Rail stations in Montgomery County. The data from Data Montgomery Metro-Rail-Stations is already tidy and includes WGS84 longitude and latitude attributes.
metro <- read_csv('./sourcedData/Metro_stations.csv') # read metro station data
names(metro) <- str_to_lower(str_replace_all(names(metro), ' ','_')) # clean up the names
glimpse(metro) # look at the file
## Rows: 13
## Columns: 14
## $ objectid <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
## $ category <chr> "METRO STATIONS", "METRO STATIONS", "METRO STATIONS", "METRO~
## $ name <chr> "Wheaton", "Forest Glen", "Silver Spring", "Friendship Heigh~
## $ address <chr> "11171 Georgia Ave", "9730 Georgia Ave", "8400 Colesville Rd~
## $ city <chr> "Wheaton", "Wheaton", "Silver Spring", "Friendship Heights",~
## $ zipcode <dbl> 20902, 20902, 20910, 20815, 20814, 20892, 20852, 20852, 2085~
## $ phone <chr> "202-637-7000", "202-637-7000", "202-637-7000", "202-637-700~
## $ url <chr> "http://www.wmata.com", "http://www.wmata.com", "http://www.~
## $ data <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ point_x <dbl> 1298033.3787258500, 1300013.8594956901, 1303492.1235629299, ~
## $ point_y <dbl> 499621.71499564999, 491147.35620375001, 483379.39024451998, ~
## $ shape <chr> "(39.038541001000056, -77.05034726599996)", "(39.01527587900~
## $ latitude <dbl> 39.038533139999998, 39.015268030000001, 38.993943440000002, ~
## $ longitude <dbl> -77.050344989999999, -77.043358260000005, -77.03110723999999~
metro <- metro %>% dplyr::select(category, # the file already contains category
name, # the station name
address, # the station address
lon = longitude, lat = latitude) # geo point data
metro$category <- "Metro_station" # standardize the category label for the marker data
I assign a category field to each of the three datasets and combine them into a single file with all of the public transportation points. This file will be used to calculate the distance from affordable housing to public transportation.
The raster package has its own select() function, which conflicts with dplyr::select() when both packages are loaded, so I specify the package explicitly when selecting from a tibble. This is an instance where using spatial data types would have been a bit cleaner.
b1 <- rbind(bus_all, mbus_all) # combine ride-on and metro bus stops
bmat <- cbind(b1$lon, b1$lat) # matrix of lon & lat
b2 <- filterByProximity(bmat, 0.09, returnIndex = F) %>% # filter out stops closer than 300 feet
as_tibble(.name_repair = "unique")
names(b2) <- c("lon", "lat") # rename the lon & lat fields
b1 <- left_join(b2, b1) # join to get desired bus stops
p1 <- metro %>% dplyr::select(!address) # get the metro station geo points
public_transport <- rbind(p1, b1) # bind into single file
pacman::p_unload(rgdal, GISTools, rangeBuilder)
The three marker files are written to the /markers folder and will be used to build the map. The combined public_transport file is written to the /working directory.
# write_csv(bus, './markers/bus_stops.csv')
# write.csv(mbus, "./markers/metro_bus.csv")
# write.csv(metro, "./markers/metro.csv")
# write_csv(public_transport, './working/public_transport.csv')
I found electric vehicle charging location data on the Alternative Fuels Data Center (AFDC) website - Alternative Fuel Stations. The data is downloadable by state, so I selected “ELEC” as the fuel type and MD as the state.
The alternative fuel data contains 65 fields, most of which I don’t need. I select the fuel type (as the category), address fields, access_code (public/private), the charging station counts, and the geo point data.
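The read itself isn’t shown in this chunk; a minimal sketch of that step (the file name below is an assumption — the AFDC export names the CSV at download time) would be:
evc <- read_csv('./sourcedData/alt_fuel_stations.csv') %>% tibble()  # assumed file name for the AFDC download
names(evc) <- str_to_lower(str_replace_all(names(evc), ' ','_'))     # clean the column names to match the select below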
evc <- evc %>% dplyr::select(category = fuel_type_code, # select desired columns
address = street_address,
city,
zip,
access_code,
l1_num = ev_level1_evse_num,
l2_num = ev_level2_evse_num,
dc_fast_num = ev_dc_fast_count,
lat = latitude,
lon = longitude
)
evc$l1_num <- replace_na(evc$l1_num, 0 ) # replace n/a with zeros for counts
evc$l2_num <- replace_na(evc$l2_num, 0 ) # in the charging station columns
evc$dc_fast_num <- replace_na(evc$dc_fast_num, 0 )
evc$evse_tot <- evc$l1_num + evc$l2_num + evc$dc_fast_num # calculate the total station counts
The EV charging data was the first dataset I worked with, and I hadn’t yet found the zip code shape file, so I used the zip code file from USPS and filtered it for Montgomery County zip codes.
zipcodes <- read_csv('./sourcedData/usazipcode-1512j.csv') %>% tibble() # get the USPS zipcode data
glimpse(zipcodes)
## Rows: 38,805
## Columns: 5
## $ zip <chr> "501", "1001", "1002", "1003", "1004", "1005", "1007", ~
## $ `Zipcode name` <chr> "HOLTSVILLE, NY", "AGAWAM, MA", "AMHERST, MA", "AMHERST~
## $ City <chr> "HOLTSVILLE", "AGAWAM", "AMHERST", "AMHERST", "AMHERST"~
## $ State <chr> "NY", "MA", "MA", "MA", "MA", "MA", "MA", "MA", "MA", "~
## $ `County Name` <chr> "SUFFOLK", "HAMPDEN", "HAMPSHIRE", "HAMPSHIRE", "HAMPSH~
names(zipcodes) <- str_to_lower(str_replace_all(names(zipcodes), ' ','_')) # clean the column names
moco_zip <- zipcodes %>%
filter( state == "MD" & county_name == "MONTGOMERY") %>% # filter for Montgomery County
dplyr::select(zip) # select the zipcode column
I use a semi-join to get the rows in ev charging that have Montgomery County zip codes.
evc <- semi_join(evc, moco_zip, by = "zip") # get ev records for Montgomery County
Then I sum up the total number of charging stations per location.
evc <- evc %>%
group_by(address, city, zip, access_code, lon, lat) %>%
summarise(evse_tot = sum(evse_tot)) # sum the total stations
evc$address <- paste0(evc$address, " ", evc$city, ", MD ", evc$zip) # create marker description
And write the EV Charging data file to the markers folder.
# write_csv(evc, './markers/ev_charging.csv')
The 2021 Annual Rental Facility Occupancy Survey includes the average reported rent by bedroom count for Montgomery County facilities. This data is available from Data Montgomery.
The average rents for 2016 through 2021 are in separate columns in the data.
rents <- read_csv("./sourcedData/2021_Rental_Facility_Occupancy_Survey_Results.csv",
locale=locale(encoding = "windows-1252"))
names(rents) <- str_to_lower(str_replace_all(names(rents), ' ','_'))
glimpse(rents)
## Rows: 1,550
## Columns: 11
## $ community_name <chr> "8513 Flower", "Central"~
## $ community_address <chr> "8513 FLOWER AVE TAKOMA ~
## $ bedroom_types <chr> "1 bedroom", "2 bedroom"~
## $ average_rent_2016 <chr> NA, NA, NA, NA, "$ 875~
## $ average_rent_2017 <chr> NA, NA, "$ 650", NA, "~
## $ average_rent_2018 <chr> NA, "$ 2,624", "$ 65~
## $ average_rent_2019 <chr> "$ 800", "$ 2,637", ~
## $ average_rent_2020 <chr> "$ 895", "$ 2,644", ~
## $ average_rent_2021 <chr> "$ 895", "$ 2,653", ~
## $ `percent_change_from_previous_year_2020-2021` <dbl> 0.0000, 0.0032, 0.0000, ~
## $ geo_location <chr> NA, NA, NA, NA, NA, NA, ~
cat('\nunique addresses = ', length(unique(rents$community_address)),
'\nunique community names', length(unique(rents$community_name)))
##
## unique addresses = 706
## unique community names 703
I’m interested in the most recent information, so I select the 2021 rent data and look at the summary information.
#Select the name, address, bedroom type, 2021 average rent and the geo-location fields
rents_2021 <- rents %>% dplyr::select(starts_with(c('comm', 'bed')), average_rent_2021, geo_location)
#Create an integer bedroom count field from the first character of bedroom_type, assign 0 to efficiency
rents_2021$bedroom_count <- str_sub(rents_2021$bedroom_types, 1,1) %>% str_replace('e', '0') %>% as.integer()
#Convert the average rent field to an integer and sort the data by bedroom count and average rent
rents_2021$average_rent_2021 <- as.integer(gsub("[$,]", "", rents_2021$average_rent_2021))
rents_2021 <- arrange(rents_2021, bedroom_count, average_rent_2021)
#Look at the summary data for 2021 rents
summary(rents_2021)
## community_name community_address bedroom_types average_rent_2021
## Length:1550 Length:1550 Length:1550 Min. : 155.00
## Class :character Class :character Class :character 1st Qu.: 1127.00
## Mode :character Mode :character Mode :character Median : 1422.00
## Mean : 1600.68
## 3rd Qu.: 1778.25
## Max. :13798.00
## geo_location bedroom_count
## Length:1550 Min. :0.00000
## Class :character 1st Qu.:1.00000
## Mode :character Median :2.00000
## Mean :1.56516
## 3rd Qu.:2.00000
## Max. :4.00000
rents_2021 %>% group_by(bedroom_count) %>% summarise(min(average_rent_2021), max(average_rent_2021))
## # A tibble: 5 x 3
## bedroom_count `min(average_rent_2021)` `max(average_rent_2021)`
## <int> <int> <int>
## 1 0 500 8211
## 2 1 166 11442
## 3 2 155 13798
## 4 3 600 6391
## 5 4 1138 6325
I create a boxplot to look at the spread of rents by bedroom counts.
rp <- ggplot(rents_2021,aes(x=bedroom_count, y=average_rent_2021, fill = factor(bedroom_count))) +
geom_boxplot() +
guides(fill = "none") +
scale_x_continuous(n.breaks = 6)
rp
There are a large number of high outliers. The highest rents are for apartments at a senior living facility in Kensington that provides meal, assisted living, and medical services. The lowest rents are less than $200 a month for 1- or 2-bedroom apartments in Gaithersburg and Takoma Park; these may be subsidized housing or data entry errors.
According to the Montgomery County Department of Housing and Community Affairs annual report, Montgomery County currently has more than 20,000 households that are severely housing cost burdened; they earn less than $31,000/year and spend more than half their income on rent. More than 80 percent of renters earning up to $70,000/year are housing cost burdened. https://www.montgomerycountymd.gov/DHCA/Resources/Files/director/publications/dhca_annual_report-fy20-21.pdf
HUD defines housing cost burden as spending more than 30% of income on housing. https://www.huduser.gov/portal/pdredge/pdr-edge-featd-article-081417.html
#Filter for rents that are <= to 30% of a 70k income
affordable <- (70000 * .3) / 12
affordable_rents <- rents_2021 %>%
dplyr::filter(average_rent_2021 <= affordable)
#Look at the summary data for affordable rents
summary(affordable_rents)
## community_name community_address bedroom_types average_rent_2021
## Length:1150 Length:1150 Length:1150 Min. : 155.00
## Class :character Class :character Class :character 1st Qu.:1047.00
## Mode :character Mode :character Mode :character Median :1273.00
## Mean :1256.63
## 3rd Qu.:1494.00
## Max. :1750.00
## geo_location bedroom_count
## Length:1150 Min. :0.00000
## Class :character 1st Qu.:1.00000
## Mode :character Median :1.00000
## Mean :1.37043
## 3rd Qu.:2.00000
## Max. :4.00000
affordable_rents %>% group_by(bedroom_count) %>% summarise(min(average_rent_2021), max(average_rent_2021))
## # A tibble: 5 x 3
## bedroom_count `min(average_rent_2021)` `max(average_rent_2021)`
## <int> <int> <int>
## 1 0 500 1743
## 2 1 166 1749
## 3 2 155 1746
## 4 3 600 1750
## 5 4 1138 1719
affordable_pct <- round(nrow(affordable_rents) / nrow(rents_2021), 2) * 100
The affordable threshold for an income of $70,000/year is $1,750 a month; 74 percent of the rents in the dataset are affordable by that standard.
geo_loc <- affordable_rents %>%
dplyr::filter(is.na(geo_location)) %>% # find the records w/o geoloc data
dplyr::distinct(community_name, community_address, geo_location)
dplyr::filter(geo_loc, is.na(community_address)) # find the records w/o address
## # A tibble: 2 x 3
## community_name community_address geo_location
## <chr> <chr> <chr>
## 1 Boulevard Of Chevy Chase <NA> <NA>
## 2 Charter House <NA> <NA>
There are 424 facilities missing geo-location information; the majority can be geo-coded using the address field. Two facilities have missing addresses. The address for Charter House is 1316 Fenwick Lane, Silver Spring, MD 20910. An obvious address doesn’t show up in a search for Boulevard Of Chevy Chase.
# assign the address for Charter House
affordable_rents[affordable_rents$community_name == 'Charter House', "community_address"] <-
'1316 Fenwick Lane Silver Spring, MD 20910'
affordable_rents <- affordable_rents %>%
dplyr::filter(!is.na(community_address)) # filter the remaining blank address
There are multiple rows in the file, one for each apartment type in each building. I only need one record for each facility, so I’ll filter out the duplicates.
dupes <- duplicated(affordable_rents$community_address) # find duplicate rows
affordable_rents <- affordable_rents %>%
dplyr::filter(!dupes) %>% # filter the duplicates
rename(name = community_name, address = community_address) # rename the columns to match mpdu file
The Moderately Priced Dwelling Units (MPDU) data is available from https://montgomeryplanning.org/tools/gis-and-mapping/data-downloads/. There are 71 apartment buildings listed in the file. The file does not contain building addresses or geo-location data.
mpdu <- read_csv("./sourcedData/RentalsGrid.csv") %>% tibble()
names(mpdu)
## [1] "Name" "City"
## [3] "Unit Type" "Efficiency"
## [5] "One Bedroom" "One Bedroom and Den"
## [7] "2 Bedrooms, 1 Bathroom" "2 Bedrooms, 2 Bathrooms"
## [9] "3 Bedrooms" "Phone"
## [11] "Email" "Web"
I manually updated the file with the addresses by copying them from the websites listed for each record and saved a new file in the working directory.
mpdu <- read_csv("./working/MPDU_address.csv") %>% tibble() %>% # read the updated rental grid file
dplyr::select(name, address)
glimpse(mpdu)
## Rows: 71
## Columns: 2
## $ name <chr> "7001 Arlington at Bethesda", "7770 Norfolk Avenue", "Bainbrid~
## $ address <chr> "7001 ARLINGTON RD BETHESDA, MD 20814", "7770 Norfolk Ave Beth~
Add a description column to each dwelling dataset and bind the two datasets together.
mpdu$description <- "Moderately Priced Dwelling Unit (MPDU)"
affordable_rents$description <- "Affordable Rent"
dwellings <- bind_rows(mpdu, affordable_rents)
Make sure there are no duplicated addresses between the two files.
dwellings$address <- str_to_upper(dwellings$address) # change all the addresses to upper case
dupes <- duplicated(dwellings) # find duplicate rows
dwellings <- dwellings %>%
filter(!dupes) # filter duplicate rows
Geo-code the addresses.
pacman::p_load(ggmap)
#dwellings <- mutate_geocode(dwellings,address)
# Write and read to avoid running geocode for repeat executions
# write_csv(dwellings, './working/dwellings_working.csv')
dwellings <- read_csv('./working/dwellings_working.csv',
locale=locale(encoding = "windows-1252")) %>% tibble()
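Running the commented-out geocoding step requires a Google Maps API key registered with ggmap; a minimal sketch (the key value is a placeholder) looks like:
register_google(key = "YOUR_API_KEY")            # placeholder -- a Google Maps API key is required
dwellings <- mutate_geocode(dwellings, address)  # appends lon and lat columns geocoded from the address field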
Read the transportation files.
ptrans <- read_csv("./working/Public_Transport.csv") # combined public transportation data from pubtrans.rmd
bikeshare <- read_csv("./markers/bike.csv") # bike sharing marker layer file from bikeshare.rmd
ev_charging <- read_csv("./markers/ev_charging.csv") %>% # ev charging marker layer file from ev_charging.rmd
tibble() %>%
dplyr::filter(access_code == "public") # only use the public charging stations
spDists from the sp package computes the distance between two geo-location points and returns the value in kilometers; the kilometer-to-mile conversion factor of 0.6213712 is used to convert the distances to miles.
km2mi <- 0.6213712 # set the kilometer to mile conversion factor
mmat <- cbind(dwellings$lon, dwellings$lat) # create matrix of the dwelling lon/lat
pmat <- cbind(ptrans$lon, ptrans$lat) # create matrix public trans lat & lon
d1 <- spDists(mmat, pmat, longlat = TRUE) # compute the distance between dwelling and public trans
# output is a large matrix of all distances
dwellings$distance <- apply(d1, 1, min) * km2mi # apply kilometer to mile conversion to the minimum distance
# take a look at the percentage > 1/4 mile from public transportation
d2 <- dwellings %>% dplyr::filter(distance > 0.25)
pct_far <- round(count(d2) / count(dwellings), 2) %>% as.numeric()
pct_far <- paste0(pct_far * 100, '%')
# compute distances for bikeshare and ev_charging
bmat <- cbind(bikeshare$lon, bikeshare$lat) # matrix of bikeshare lon/lat
d1 <- spDists(mmat, bmat, longlat = TRUE) # compute the distance
dwellings$distance_b <- apply(d1, 1, min) * km2mi # convert the minimum distance to miles
emat <- cbind(ev_charging$lon, ev_charging$lat) # matrix of charging station lon/lat
d1 <- spDists(mmat, emat, longlat = TRUE) # compute the distance
dwellings$distance_ev <- apply(d1, 1, min) * km2mi # convert minimum distance to miles
42% of affordable dwellings are more than 1/4 mile (as the crow flies) from public transportation.
Calculate the walking distance from dwellings to public transportation.
d2 <- spDists(cbind(dwellings$lon, dwellings$lat), cbind(ptrans$lon, ptrans$lat), longlat = TRUE)
d2_min <- apply(d2, 1, min) # find the location with min geo-distance
ind_m <- rep(0, nrow(d2))
for (i in 1:nrow(d2)) {
ind_m[i] <- which(d2[i,] == d2_min[i], arr.ind = TRUE) # use which() function to find index for min distance
}
orig <- paste0(dwellings$lat, "+", dwellings$lon)
dest <- paste0(ptrans$lat[ind_m], "+", ptrans$lon[ind_m])
# pacman::p_load(gmapsdistance)
#
# results <- gmapsdistance(origin = orig,
# destination = dest,
# combinations = "pairwise",
# mode = "walking")
#
#
# walking <- tibble(distance_w = results[["Distance"]][["Distance"]])
#
# # Write and read to avoid running gmapsdistance for repeat executions
# write_csv(tibble(walking), './working/walking_distance.csv')
walking <- read_csv('./working/walking_distance.csv') %>% tibble()
# convert meters to miles
dwellings$distance_w <- walking$distance_w / 1000
dwellings$distance_w <- dwellings$distance_w * km2mi
Create the pop-up text for the dwellings markers.
dwellings$address <- str_to_title(dwellings$address)
dwellings$description <- paste0("<b>", dwellings$name, "</b><br/>",
dwellings$address, "<br/>",
dwellings$description,"<br/>",
"Distance to Public Transportation: ", round(dwellings$distance,2)," mi.<br/>",
"Walking Distance to Public Transportation: ", round(dwellings$distance_w,2)," mi.<br/>",
"Distance to Electric Vehicle Charging: ", round(dwellings$distance_ev,2)," mi.<br/>",
"Distance to Bikeshare: ", round(dwellings$distance_b,2), " mi.")
# write_csv(dwellings, "./markers/dwellings.csv")
The income map layer is a heat map polygon layer using the Census Block Group shape data and the American Community Survey Income data.
The ACS income file contains counts of survey responses for each census block group. There are sixteen income ranges, from under $10,000 per year to over $200,000 per year, and each range has a count and a margin of error (ME) field. The ME fields aren’t needed to create the map layer; the income counts, total count, and id columns are selected.
The median income file contains the calculated median income and margin-of-error for each Census Block Group.
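The reads for these two files aren’t shown in this chunk; a minimal sketch (the ACS income path appears later in the presentation code, while the median income file name and the column renaming are assumptions) would be:
income <- read_csv("./sourcedData/ACS_income_2019.csv") %>% tibble()      # ACS income counts per block group
med_income <- read_csv("./sourcedData/ACS_median_income_2019.csv") %>%    # assumed file name for the
  tibble()                                                                 # median income download
names(med_income) <- c("median_income", "margin_of_err", "geo_id", "block_group") # columns as shown in the glimpse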
glimpse(med_income)
## Rows: 614
## Columns: 4
## $ median_income <chr> "156250", "222619", "135313", "113910", "218173", "17298~
## $ margin_of_err <chr> "13398", "69001", "30011", "12407", "79761", "22856", "8~
## $ geo_id <chr> "1500000US240317001011", "1500000US240317001012", "15000~
## $ block_group <chr> "Block Group 1, Census Tract 7001.01, Montgomery County,~
The income data is pivoted longer into two columns and the income ranges are converted to numeric values. Then the median income for each block group is joined to the income data.
p1 <- income %>%
dplyr::select (cols = !starts_with("ME")) # remove the Margin of Error columns
# rename the income range columns
names(p1) <- c('total',
"$10,000", "$14,999", "$19,999", "$24,999", "$29,999", "$34,999", "$39,999", "$44,999",
"$49,999", "$59,999", "$74,999", "$99,999", "$124,999", "$149,999", "$199,999", "$200,000",
"geo_id", "block_group" )
# pivot the income range counts into a longer dataset
p1 <- p1 %>% pivot_longer(starts_with("$"), names_to = "income", values_to = "cnt")
# function to convert the currency character amounts into numeric
convertCurrency <- function(currency) {
currency1 <- sub('$','',as.character(currency),fixed=TRUE)
currency2 <- as.numeric(gsub('\\,','',as.character(currency1)))
currency2
}
p1$income <- convertCurrency(p1$income) # convert the income range amounts to numeric
med_income <- dplyr::select(med_income, -margin_of_err) # remove the ME column from median income
p1 <- left_join(p1, med_income) # join median income data to income data
The percentage of responses that are below 50% of Montgomery County’s median income is computed for each block group.
mc_med_income <- 110389 # the median income for Montgomery County
p2 <- p1 %>% filter(income <= mc_med_income) # get the records below median income
p3 <- p2 %>%
group_by(geo_id) %>%
summarise(sum(cnt),
pct = sum((cnt/total)), # compute the percentage below median income
median_income,
total,
block_group) %>% # per block group
distinct()
p2 <- p1 %>% filter(income <= mc_med_income/2) # get record below 50% median income
acs_income <- p2 %>% # acs_income will be the final dataset
group_by(geo_id) %>%
summarise(cnt_less= sum(cnt),
LMI50_pct = sum((cnt/total)), # compute percentage below 50% median income
median_income,
total,
block_group) %>% # per block group
distinct()
acs_income$pct <- p3$pct # add pct below median income to acs_income
acs_income$bg2010 <- str_sub(acs_income$geo_id, 10) # substring out the block group name
acs_income <- acs_income %>%
dplyr::select(bg2010, cnt_less, pct, LMI50_pct, median_income, total, block_group)
# write_csv(acs_income, './income.csv')
The Census Block Group polygon shape data is read from the shape file and joined to ACS income fields.
pacman::p_load(raster, sp)
census <- shapefile("./sourcedData/CensusBlockGroups2010/geo_export_eb847666-3b06-45be-88d3-a9f779b2211e.shp",
GDAL1_integer64_policy = TRUE)
p4 <- acs_income
p4$bg2010 <- as.character(p4$bg2010) # convert block group ids to character
LMI50 <- p4 %>% dplyr::select(bg2010, LMI50_pct, i_total = total, cnt_less) # select the < 50% median income data
census@data <- left_join(census@data, LMI50) # join to the shape file data
# shapefile(census, filename = 'income.shp', overwrite = TRUE) # write the census income shape file
detach("package:raster", unload = TRUE)
Test the income heatmap layer using leaflet.
pacman::p_load(leaflet)
#colorNumeric
nc_pal <- colorNumeric(palette = "Oranges",
domain = census@data$LMI50_pct)
m_income <- census %>%
leaflet() %>%
addProviderTiles("CartoDB")%>%
addPolygons(weight = 1, fillOpacity = 0.5,
color = ~nc_pal(LMI50_pct),
label = ~paste0("Census Block ID: ", bg2010, "\nPct < 50% of median income: ",
round(LMI50_pct * 100, 2)),
highlight = highlightOptions(weight = 3,color = "red",
bringToFront = TRUE))
m_income
The shape files and data files for the map layers are added to the environment. The affordable dwelling data is split into two data frames, near and far, depending on whether a dwelling is within 1/4 mile of public transportation; a sketch of the split follows.
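A minimal sketch of that split, using the dwellings marker file and the distance column computed earlier:
dwellings <- read_csv("./markers/dwellings.csv") %>% tibble()  # dwellings marker file written earlier
near <- dwellings %>% dplyr::filter(distance <= 0.25)          # within 1/4 mile of public transportation
far  <- dwellings %>% dplyr::filter(distance > 0.25)           # further than 1/4 mile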
Create the palette for the income heatmap and custom icons for the marker layers.
# palette for income block group heat map
income_pal <- colorNumeric(palette = "Oranges", domain = acs_income@data$LMI50_pct)
# Icons for marker layers
near_icon <- makeIcon("./markers/home-purple.png", 18, 18)
far_icon <- makeIcon("./markers/home-red.png", 18, 18)
bike_icon <- makeIcon("./markers/bicycle-brown.png", 19, 19)
#car_icon <- makeIcon("./markers/car.png", 18, 18)
bolt_icon <- makeIcon("./markers/bolt.png", 19, 19)
metro_icon <- makeIcon("./markers/metro.png", 16, 16)
bus_icon <- makeIcon("./markers/rideon_2.png", 10, 10)
image_list <- c("./markers/home-purple.png",
"./markers/home-red.png",
"./markers/bolt.png",
"./markers/bicycle-brown.png",
"./markers/rideon_2.png",
"./markers/metro.png",
"./markers/blue-circle.png")
image_labels <- c("<= 1/4 mi. / pub. trans",
"> 1/4 mi. / pub. trans",
"EV Charging",
"BikeShare",
"Ride-On Bus stop",
"Metro Station",
"Metro Bus stop")
Build the interactive map using leaflet:
evc_map <- leaflet() %>%
addProviderTiles("CartoDB", group = "Carto - gray") %>%
addProviderTiles("OpenStreetMap.Mapnik", group = "OpenStreetMap - color") %>%
setView(-77.1607098, 39.150, zoom = 11) %>%
addPolygons(data = acs_income,
weight = 1, fillOpacity = 0.5,
color = ~income_pal(LMI50_pct),
label = ~paste0("Census Block: ", bgLbls),
popup = ~paste0("<b>American Community Survey (ACS)</b><br/> ",
"<b>Census.gov, Low-to-Moderate Income</b><br/> ",
"Pct LMI: ", round(LMI50_pct * 100, 2),
"<br/>LMI Count: ", cnt_less,
"<br/>Survey total: ", i_total),
highlight = highlightOptions(weight = 3,color = "red",
bringToFront = TRUE),
group = "Census Block (LMI)") %>%
addPolygons(data = mc_plan,
weight = 1, fillOpacity = 0,
dashArray = "5, 5", # SVG dash pattern for the polygon border
color = "grey",
label = ~paste0("Planning Area: ", NAME),
highlight = highlightOptions(weight = 3,color = "black",
bringToFront = FALSE),
group = "Planning Area") %>%
addPolygons(data = zipcode,
weight = 1, fillOpacity = 0,
color = "blue",
label = ~paste0("Zipcode: ", zipcode),
highlight = highlightOptions(weight = 3,color = "red",
bringToFront = FALSE),
group = "Zipcode") %>%
addMarkers(data = ev_charging,
icon = bolt_icon,
label = ~paste0("EV Charging: ", address),
group = "EV Charging") %>%
addMarkers(data = near,
icon = near_icon,
label = ~htmlEscape(address),
popup = ~paste0(description),
group = "Affordable Dwelling",
# clusterOptions = markerClusterOptions()
) %>%
addMarkers(data = far,
icon = far_icon,
label = ~htmlEscape(address),
popup = ~paste0(description),
group = "Affordable Dwelling",
# clusterOptions = markerClusterOptions()
) %>%
addMarkers(data = metro,
icon = metro_icon,
label = ~htmlEscape(paste0(name, " Metro Station", ', ', address)),
group = "Metro") %>%
addMarkers(data = bus,
icon = bus_icon,
label = ~paste0(" Bus Stop: ", name),
group = "Ride-On Bus") %>%
addCircleMarkers(data = metro_bus,
radius = 2,
fill = T,
label = ~paste0(" Bus Stop: ", name),
group = "Metro Bus") %>%
addMarkers(data = bikeshare,
icon = bike_icon,
label = ~paste0(category, ": ", description),
group = "BikeShare") %>%
addLegend(data = acs_income, pal = income_pal, values = ~LMI50_pct,
title = "% LMI",
labFormat = labelFormat(prefix = "(", suffix = ")%", between = ", ",
transform = function(x) 100 * x),
group = "Census Block (LMI)", position = "bottomleft" ) %>%
addLegendImage(images = image_list,
labels = image_labels,
labelStyle = "font-size: 12px; vertical-align: middle;",
width = 16,
height = 16,
orientation = 'vertical',
position = 'bottomleft',
group = "Legend") %>%
addLayersControl(baseGroups = c("Carto - gray","OpenStreetMap - color"),
overlayGroups = c("Zipcode", "Census Block (LMI)", "Planning Area",
"Affordable Dwelling", "EV Charging",
"BikeShare", "Metro", "Ride-On Bus", "Metro Bus", "Legend"),
position = "topleft") %>%
addSearchOSM() %>%
addResetMapButton()
# htmlwidgets::saveWidget(evc_map, "./MoCo_TransportationEquityMap.html", title = "Transportation Equity in Montgomery County, MD")
# hideGroup(evc_map, c("Ride-On Bus", "Metro Bus", "Planning Area"))
# markdown::rpubsUpload("MoCo_TransportationEquityMap.html", "./MoCo_TransportationEquityMap.html", "https://api.rpubs.com/api/v1/document/891486/7c245b76630748f4a260643dc9094fe6")
Because I built the map in several iterations, I researched how to upload the HTML file to RPubs without creating a new link ID each time. Using rpubsUpload from the markdown package allows this once you have captured the ID from an initial upload.
My research was prompted by my first conversations with my sister-in-law; I wanted to have references for each of the ideas we discussed. I found more information and references as I was researching data sources.
I tried several options for creating my presentation in R Markdown. I wanted reusable code since I knew I’d go through several iterations to create the presentation. I finally settled on a Slidy presentation. The built-in themes for Slidy are simple, and I didn’t have the time to explore more interesting ones. To document the coding for the presentation, I’ve pulled the code into the report .Rmd file.
After receiving feedback from the “dry run” of the presentation, I edited and re-ordered the slides so that I could use them while presenting the map layers. I added a few plots to present statistics visually, moved the link to the interactive map to the front of the presentation, and added acknowledgments on the same slide.
I begin the presentation by looking at income information for LMI residents and display the income heatmap layer of the map.
# pacman::p_unload(leaflet, raster, htmltools, leaflet.extras, leaflegend)
# detach("package:spatialEco", unload = TRUE)
# detach("raster", unload = TRUE, force = TRUE)
pacman::p_load(tidyverse, knitr, kableExtra)
# American Communities Survey Income data
income <- read_csv("./sourcedData/ACS_income_2019.csv") %>% tibble()
p1 <- income %>%
dplyr::select (cols = !starts_with("ME")) # deselect the margin of error columns
# rename the columns
names(p1) <- c('total',
"lt_10K", "lt_15K", "lt_20K", "lt_25K", "lt_30K", "lt_35K", "lt_40K", "lt_45K",
"lt_50K", "lt_60K", "lt_75K", "lt_100K", "lt_125K", "lt_150K", "lt_200K", "gt_200K",
"geo_id", "block_group" )
I spent some time adjusting the format of the tables and plots in order for them to render well in the presentation.
| # Less than AMI | % | # Moderate Income | % | # Low Income | % | # Less than $30K | % |
|---|---|---|---|---|---|---|---|
| 170,414 | 46 | 46,858 | 13 | 78,990 | 21 | 42,708 | 12 |
The next slide looks at affordable housing information and shows the affordable housing marker layer of the map on top of the income heatmap, illustrating how affordable housing aligns with the census block groups.
I reused the boxplot from my EDA code, but turned it horizontal. I wanted the table of rent statistics and the plot side-by-side on the slide, so I did some research and found the gridExtra package to lay out the slide the way I wanted. I couldn’t use kable objects in grid.arrange, so I created the table with tableGrob.
pacman::p_load(gridExtra)
rents <- read_csv("./sourcedData/2021_Rental_Facility_Occupancy_Survey_Results.csv")
names(rents) <- str_to_lower(str_replace_all(names(rents), ' ','_'))
#Select the name, address, bedroom type, 2021 average rent
rents_2021 <- rents %>% dplyr::select(starts_with(c('comm', 'bed')), average_rent_2021)
rents_2021$bedroom_count <- str_sub(rents_2021$bedroom_types, 1,1) %>% str_replace('e', '0') %>% as.integer()
#Convert the average rent field to an integer and sort the data by bedroom count and average rent
rents_2021$average_rent_2021 <- as.integer(gsub("[$,]", "", rents_2021$average_rent_2021))
rents_2021 <- arrange(rents_2021, bedroom_count, average_rent_2021)
r1 <- rents_2021 %>% group_by(bedroom_count) %>% # calc max/min rents by bedroom count
summarise(min(average_rent_2021),
max(average_rent_2021))
r1$avg <- rents_2021 %>% # calc the average rent by bedroom count
group_by(bedroom_count) %>%
summarise( avg = mean(average_rent_2021)) %>%
dplyr::select(avg) %>%
round(2)
# modify the table theme
rtt <- gridExtra::ttheme_minimal(core = list(fg_params = list(fontsize=14)),
colhead = list(fg_params = list(fontsize=14, fontface="bold")))
# create the table
rt <- gridExtra::tableGrob(r1, cols = c("# Bedrooms", "Min. Rent", "Max. Rent", "Avg. Rent"),
theme=rtt, rows = NULL) # transform into a tableGrob
# box plot of rents by bedroom count
rp <- ggplot(rents_2021,aes(y=bedroom_count, x=average_rent_2021, fill = factor(bedroom_count))) +
geom_boxplot(show.legend = F) +
scale_x_continuous(n.breaks = 6) +
labs(x = "Average Rent 2021", y = "# Bedrooms") +
theme_minimal() +
theme(legend.text=element_text(size=14), legend.title = element_blank(), legend.position = c(.85, .85),
axis.text=element_text(size=10), axis.title=element_text(size=14))
gridExtra::grid.arrange(rt, rp, nrow=1) # arrange the table and plot for display
I added a small table showing the percentage of rents in the survey that are below $1,750 and below $1,250.
r2 <- rents_2021 %>% filter(average_rent_2021 <= 1750) %>% # filter moderately affordable rents
summarise(le1750 = n(), pct1750 = n() / nrow(rents_2021))
r2$pct1750 <- round(r2$pct1750 * 100) # calc percentage moderately affordable
r3 <- rents_2021 %>% filter(average_rent_2021 <= 1250) %>% # filter for affordable rents
summarise(le1250 = n(), pct1250 = n() / nrow(rents_2021))
r3$pct1250 <- round(r3$pct1250 * 100) # calc percentage affordable
cbind(r2, r3) %>% kable(caption = #format the table
"<caption font-size:28px>Affordable Rents</caption>",
col.names = c("# <= $1750", "%",
"# <= $1250", "%"),
format.args = list(big.mark = ",")) %>%
kable_styling(full_width = F, position = "left", font_size = 18)
| # <= $1750 | % | # <= $1250 | % |
|---|---|---|---|
| 1,150 | 74 | 550 | 35 |
The next slide covers the public transportation marker layers, where I added a frequency polygon plot to illustrate the distances from affordable dwelling locations to the transportation options. With this slide I demonstrate the Ride-On bus, Metro station, Metro bus, and bike sharing marker layers on the map.
mpdu <- read_csv("./markers/dwellings.csv")
# format the statistics for distances to trans options
s2 <- tibble(trans = c("Public Transit (Geom)", "Public Transit (Walking)", "Bike Sharing", "EV Charging"))
s2$avg <- c(paste(round(mean(mpdu$distance), 2), "mi"),
paste(round(mean(mpdu$distance_w), 2), "mi"),
paste(round(mean(mpdu$distance_b), 2), "mi"),
paste(round(mean(mpdu$distance_ev), 2), "mi"))
s2$max <- c(paste(round(max(mpdu$distance), 2), "mi"),
paste(round(max(mpdu$distance_w), 2), "mi"),
paste(round(max(mpdu$distance_b), 2), "mi"),
paste(round(max(mpdu$distance_ev), 2), "mi"))
s2$min <- c(paste(round(min(mpdu$distance), 2), "mi"),
paste(round(min(mpdu$distance_w), 2), "mi"),
paste(round(min(mpdu$distance_b), 2), "mi"),
paste(round(min(mpdu$distance_ev), 2), "mi"))
# s2 %>% kable(caption = "Distance from Transportation Options",
# col.names = c("Transportation",
# "Avg",
# "Max",
# "Min")) %>%
# kable_styling(full_width = F)
hj <- matrix(c(0, 1, 1, 1), ncol=ncol(s2), nrow=nrow(s2), byrow=TRUE) # set up format for the table header
x <- matrix(c(0.05, 1, 1, 1), ncol=ncol(s2), nrow=nrow(s2), byrow=TRUE) # set up format for the table body
tt1 <- ttheme_minimal(core = list(fg_params = list(hjust = as.vector(hj), # modify the table theme
x = as.vector(x), fontsize=14)), # using the setup matrices
colhead = list(fg_params = list(fontsize=14, fontface="bold")))
t1 <- tableGrob(s2, cols = c("Transportation", "Avg", "Max","Min"), # create the table
theme=tt1, rows = NULL) # transform into a tableGrob
# get the distance columns from the dwellings data
distances <- mpdu %>% dplyr::select(Geometric = distance, BikeShare = distance_b,
EV_Charging = distance_ev, Walking = distance_w) %>%
pivot_longer(everything(), names_to = "category", values_to = "distance")
# create a frequency polygon line plot of the distance data
p_dist <- ggplot(distances, aes(distance, color = category)) +
geom_freqpoly(size = 1.5) +
scale_x_log10(breaks = c(0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0)) +
labs(x = "Distance (miles)", y = NULL) +
theme_minimal() +
theme(legend.text=element_text(size=14), legend.title = element_blank(), legend.position = c(.85, .85),
axis.text=element_text(size=10), axis.text.x=element_text(angle = 30), axis.title=element_text(size=14))
# arrange the table and plot in a single row
grid.arrange(t1, p_dist, nrow=1)
# subset the distance data by 1/4 mile, 1/2 mile and 1 mile distances.
distances <- mpdu %>% mutate(ptrans.25 = (distance > 0.25),
ptrans.50 = (distance > 0.5), ptrans1 = (distance > 1),
walk.25 = (distance_w > 0.25),
walk.50 = (distance_w > 0.5), walk1 = (distance_w > 1),
bike.25 = (distance_b > 0.25),
bike.50 = (distance_b > 0.5), bike1 = (distance_b > 1),
evchrg.25 = (distance_ev > 0.25),
evchrg.50 = (distance_ev > 0.5), evchrg1 = (distance_ev > 1))
cnt <- count(distances) # get the number of distance records
ptrans = c(count(distances, wt = ptrans.25), # count the ptrans records by distance category
count(distances, wt = ptrans.50),
count(distances, wt = ptrans1)) %>% unlist() %>% unname()
ptrans.pct <- ptrans # compute the ptrans percentage by category
for (i in 1:3) {ptrans.pct[i] = round(ptrans[i] / cnt * 100)}
walk = c(count(distances, wt = walk.25), # count the walking records by distance category
count(distances, wt = walk.50),
count(distances, wt = walk1)) %>% unlist() %>% unname()
walk.pct <- walk # compute the walking percentage by category
for (i in 1:3) {walk.pct[i] = round(walk[i] / cnt * 100)}
bike = c(count(distances, wt = bike.25), # count the bike share records by distance category
count(distances, wt = bike.50),
count(distances, wt = bike1)) %>% unlist() %>% unname()
bike.pct <- bike # compute the bike share percentage by category
for (i in 1:3) {bike.pct[i] = round(bike[i] / cnt * 100)}
evchrg = c(count(distances, wt = evchrg.25), # count the ev charging records by distance category
count(distances, wt = evchrg.50),
count(distances, wt = evchrg1)) %>% unlist() %>% unname()
evchrg.pct <- evchrg # compute the ev charging pct by category
for (i in 1:3) {evchrg.pct[i] = round(evchrg[i] / cnt * 100)}
all = c(count(distances, wt = (ptrans.25 & bike.25 & evchrg.25)), # combined counts
count(distances, wt = (ptrans.50 & bike.50 & evchrg.50)),
count(distances, wt = (ptrans1 & bike1 & evchrg1))) %>% unlist() %>% unname()
all.pct <- all # combined percentage
for (i in 1:3) {all.pct[i] = round(all[i] / cnt * 100)}
s1 <- tibble(distance = c("1/4 mi", "1/2 mi", "1 mile"),
ptrans,
ptrans.pct,
walk,
walk.pct,
bike,
bike.pct,
evchrg,
evchrg.pct,
all,
all.pct)
# format the table
s1 %>% kable(caption =
"<caption font-size:24px>Distance from Transportation Options (Counts and Percentage)</caption>",
col.names = c("Further Than",
"Public Trans", "%",
"Walking", "%",
"Bike Sharing", "%",
"EV Charging", "%",
"All", "%")) %>%
kable_styling(full_width = F, font_size = 18) %>%
column_spec(c(1:9), width = "6em")
| Further Than | Public Trans | % | Walking | % | Bike Sharing | % | EV Charging | % | All | % |
|---|---|---|---|---|---|---|---|---|---|---|
| 1/4 mi | 290 | 41 | 408 | 58 | 390 | 55 | 485 | 69 | 205 | 29 |
| 1/2 mi | 64 | 9 | 170 | 24 | 188 | 27 | 376 | 53 | 29 | 4 |
| 1 mile | 15 | 2 | 15 | 2 | 137 | 19 | 75 | 11 | 13 | 2 |
The final layer to demonstrate is the EV charging markers layer. I create a small table to show the counts of EV charging locations and stations using kable.
ev_charging <- read_csv("./markers/ev_charging.csv") %>% tibble()
ev_charging %>% group_by(access_code) %>% # summarise private and public ev charging counts
summarise(n = n(), total = sum(evse_tot)) %>% # display in a table
kable(caption =
"Electric Vehicle Charging Locations in Montgomery County",
col.names = c("Access", "# Locations", "# Stations"), font_size = 18) %>%
kable_styling(full_width = F, position = "left") %>%
column_spec(c(1:3), width = "8em")
| Access | # Locations | # Stations |
|---|---|---|
| private | 24 | 104 |
| public | 216 | 509 |
And I add my observations about access to public transportation to this slide.
The next two slides contain counts of dwellings and transportation options within zipcodes and planning areas for Montgomery County. I use them to discuss the availability of public transportation relative to affordable housing. First I convert the longitude and latitude data for the marker layers to spatial points and set their CRS to match the zipcode polygons.
pacman::p_load(rgdal, GISTools)
# zipcode polygons for Montgomery County
zipcode <- raster::shapefile("./shapefiles/zipcode.shp", GDAL1_integer64_policy = TRUE)
# Montgomery County planning area shape files
mc_plan <- raster::shapefile("./shapefiles/planning_areas.shp", GDAL1_integer64_policy = TRUE)
pubtrans <- read_csv("./working/public_transport.csv") # public transportation locations
bikeshare <- read_csv("./markers/bike.csv") # Bike share locations
coordinates(mpdu) <- c("lon", "lat") # grab the dwellings lon/lat data
mpdu <- as(mpdu, "SpatialPoints") # convert to spatial points
proj4string(mpdu) <- CRS("+proj=longlat +datum=WGS84") # set a CRS for the points
proj4string(mpdu) <- proj4string(zipcode) # project the CRS to zipcode polygons
coordinates(bikeshare) <- c("lon", "lat") # repeat for bikeshare
bikeshare <- as(bikeshare,"SpatialPoints")
proj4string(bikeshare) <- CRS("+proj=longlat +datum=WGS84")
proj4string(bikeshare) <- proj4string(zipcode)
coordinates(ev_charging) <- c("lon", "lat") # repeat for ev charging
ev_charging <- as(ev_charging,"SpatialPoints")
proj4string(ev_charging) <- CRS("+proj=longlat +datum=WGS84")
proj4string(ev_charging) <- proj4string(zipcode)
coordinates(pubtrans) <- c("lon", "lat") # repeat for public transportation
pubtrans <- as(pubtrans,"SpatialPoints")
proj4string(pubtrans) <- CRS("+proj=longlat +datum=WGS84")
proj4string(pubtrans) <- proj4string(zipcode)
Then count the number of points in the zipcode polygons and create the table to display them.
mpdu_cnt <- GISTools::poly.counts(mpdu, zipcode) # count dwellings in zipcode
bike_cnt <- GISTools::poly.counts(bikeshare, zipcode) # count bike share in zipcode
evc_cnt <- GISTools::poly.counts(ev_charging, zipcode) # count ev charging in zipcode
pubtrans_cnt <- GISTools::poly.counts(pubtrans, zipcode) # count public trans in zipcode
zip_counts <- tibble(zip = zipcode@data$zipcode, # put zipcode and city name in a tibble
city = zipcode@data$mail_city)
zip_counts$dwelling <- mpdu_cnt # add the counts
zip_counts$ptrans <- pubtrans_cnt
zip_counts$bike <- bike_cnt
zip_counts$evc <- evc_cnt
zip_counts <- zip_counts %>% dplyr::filter(dwelling + bike + evc + ptrans > 0) # filter out zipcode with 0 counts
zip_counts %>% dplyr::arrange(desc(dwelling)) %>% # sort desc by dwelling count
head(6) %>% # select the first few rows
kable(caption = "Top 6 counts of Affordable Dwellings by Zipcode", # format the table for display
col.names = c("Zipcode", "City", "Dwellings",
"Public Trans", "Bike Sharing",
"EV Charging")) %>%
kable_styling(full_width = F)
| Zipcode | City | Dwellings | Public Trans | Bike Sharing | EV Charging |
|---|---|---|---|---|---|
| 20912 | TAKOMA PARK | 251 | 34 | 7 | 7 |
| 20910 | SILVER SPRING | 96 | 65 | 14 | 17 |
| 20814 | BETHESDA | 42 | 26 | 8 | 27 |
| 20877 | GAITHERSBURG | 39 | 9 | 0 | 3 |
| 20901 | SILVER SPRING | 34 | 53 | 4 | 0 |
| 20852 | ROCKVILLE | 27 | 40 | 10 | 35 |
zip_counts %>% dplyr::arrange(desc(ptrans)) %>% # re-sort the table the pub. trans counts
head(6) %>%
kable(caption = "Top 6 counts of Public Transit by Zipcode",
col.names = c("Zipcode", "City", "Dwellings",
"Public Trans", "Bike Sharing",
"EV Charging")) %>%
kable_styling(full_width = F)
| Zipcode | City | Dwellings | Public Trans | Bike Sharing | EV Charging |
|---|---|---|---|---|---|
| 20904 | SILVER SPRING | 15 | 141 | 7 | 15 |
| 20815 | CHEVY CHASE | 22 | 74 | 6 | 10 |
| 20906 | SILVER SPRING | 16 | 71 | 0 | 2 |
| 20902 | SILVER SPRING | 23 | 66 | 8 | 8 |
| 20910 | SILVER SPRING | 96 | 65 | 14 | 17 |
| 20850 | ROCKVILLE | 25 | 57 | 19 | 43 |
The planning area polygon shapes use the same CRS as the zipcodes, so it isn’t necessary to project the points data again. I count the points by planning area and format a table to put on the slide.
mpdu_cnt <- GISTools::poly.counts(mpdu, mc_plan) # count dwelling points in planning area
bike_cnt <- GISTools::poly.counts(bikeshare, mc_plan) # count bike share points in planning area
evc_cnt <- GISTools::poly.counts(ev_charging, mc_plan) # count ev charging in planning area
pubtrans_cnt <- GISTools::poly.counts(pubtrans, mc_plan) # count public trans in planning area
plan_counts <- tibble(plan = mc_plan@data$PLANNING, # create a tibble with planning id and name
name = mc_plan@data$NAME)
plan_counts$dwelling <- mpdu_cnt # add counts to the tibble
plan_counts$ptrans <- pubtrans_cnt
plan_counts$bike <- bike_cnt
plan_counts$evc <- evc_cnt
plan_counts %>% dplyr::arrange(desc(dwelling)) %>% # sort descending by dwelling count
head(10) %>% # select a few rows and format for display
kable(caption =
"Top 10 counts of Affordable Dwellings by Planning Area",
col.names = c("Plan", "Name", "Dwellings",
"Public Trans", "Bike Sharing",
"EV Charging")) %>%
kable_styling(full_width = F)
| Plan | Name | Dwellings | Public Trans | Bike Sharing | EV Charging |
|---|---|---|---|---|---|
| 37 | Takoma Park | 271 | 44 | 7 | 7 |
| 36 | Silver Spring | 100 | 63 | 14 | 17 |
| 35 | Bethesda/Chevy Chase | 67 | 146 | 18 | 47 |
| 21 | Gaithersburg City | 40 | 16 | 0 | 10 |
| 31 | Kensington/Wheaton | 36 | 124 | 8 | 9 |
| 32 | Kemp Mill/4 Corners | 32 | 37 | 1 | 0 |
| 30 | North Bethesda | 29 | 49 | 7 | 30 |
| 20 | Gaithersburg Vicinity | 27 | 22 | 12 | 29 |
| 19 | Germantown | 20 | 68 | 0 | 15 |
| 26 | Rockville | 20 | 74 | 14 | 37 |
Concluding the map demonstration, the next slide outlines my conclusions and recommendations for the Transportation Equity map and project. I include my data sources and references on two additional slides, and the final slide outlines what my next steps would be if I were to continue with the project.
This project was challenging in many ways. Taking a deep dive into working with spatial data, I needed to do a lot of research and experimentation to get the results I wanted. Since I knew my goal early on, I was able to focus my efforts toward it.
Creating this report itself was a challenge, particularly because of overloaded functions, e.g. select(), between dplyr and raster. I wanted to minimize the amount of duplicate code between the report and the separate .Rmd files I had created, and getting the child documents to knit from the parent report document meant tweaking the child .Rmd files.
I did a bit of statistical analysis for the project, but some of the statistics that I state in the presentation came from my research.
I’m very pleased with, and proud of, the result. I enjoyed this challenge.
Affordable dwellings, Data Montgomery - 2021_Rental_Facility_Occupancy_Survey_Results
Moderately Priced Dwelling Units (MPDU) Rentals List, apps.montgomerycountymd.gov/dhca-mpdu/Rental/List
Bike Sharing stations, Data Montgomery - Bikeshare dataset
Electric Vehicle Charging stations, Alternative Fuels Data Center - Alternative Fuel Stations
Ride On Bus Stops, Data Montgomery - Montgomery_County_Ride_On_Bus_Stops
Metro Rail Stations, Data Montgomery Metro-Rail-Stations
Metro Bus Stops, Maryland_Transit_-_WMATA_Metro_Bus_Stops
Low-to-Moderate Income, American Community Survey (ACS) Income data
Zipcodes, Montgomery Planning - MC Atlas Zipcodes-Montgomery
Montgomery County Planning Areas, Montgomery Planning - Planning_Areas
Census Block Groups, Data Montgomery - Census Tracts 2010 (geographic data)