1 Introduction

IKEA is looking to open a new store in Boston, MA. To find suitable locations for the new store, we analyze data sets collected from online GIS resources. Candidate locations must satisfy the following criteria: (1) the store is located in a census tract with an appropriate median income (25,000-80,000) and household value (200,000-500,000); (2) the store will be built in a current “Open Land” area; (3) the store should be within 2,000 ft of a major road; (4) the site should be at least 200,000 sq ft. This markdown file manipulates and analyzes the census, land use, and road map data to determine the most suitable locations for the new IKEA store.

2 Methods

2.1 Uploading the Data

Load the necessary packages and read in the data.

library(sf)
library(tidyverse)


census  <- st_read("~/R_DataScience/assignments/CS_07_spatial_case_study_data/Boston_CensusTracts.shp")

land <- st_read("~/R_DataScience/assignments/CS_07_spatial_case_study_data/Boston_LandUse.shp")

roads <- st_read("~/R_DataScience/assignments/CS_07_spatial_case_study_data/Boston_MajorRoads.shp")

2.2 Preparing the Data

Filter the census data for the range of Med_Income as 25,000-80,000 and the range of Med_HouseV as 200,000-500,000.

census <- census  %>% 
  filter(Med_Income <= 80000 & Med_Income >= 25000 ) %>% 
  filter(Med_HouseV >= 200000 & Med_HouseV <= 500000)

Filter the land use data to keep only the rows where LU05_DESC is ‘Open Land’.

land  <- land  %>% 
  filter(LU05_DESC  == "Open Land")

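Create a buffer zone around the major roads in Boston with a distance of 2,000 ft using st_buffer(). For a projected layer, the dist argument is interpreted in the units of the layer's coordinate reference system, so this step assumes the road data are projected in feet.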

roads <- roads %>% 
  st_buffer(dist = 2000)
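
Because the buffer distance depends on the layer's units, it can help to inspect the coordinate reference system before buffering. The sketch below is only an illustration: demo_road is a hypothetical stand-in for one road segment, and EPSG:2249 (a Massachusetts state plane system in US survey feet) is an assumption, not something confirmed for the Boston shapefiles.

```r
library(sf)

# Hypothetical stand-in for one major road: a straight 10,000 ft segment.
# EPSG:2249 is an assumed CRS, chosen only because its linear unit is feet.
demo_road <- st_sfc(st_linestring(rbind(c(0, 0), c(10000, 0))), crs = 2249)

# Report the linear unit of the CRS before choosing a buffer distance
st_crs(demo_road)$units_gdal

# Buffer by 2000 CRS units (feet under the assumption above)
demo_buffer <- st_buffer(demo_road, dist = 2000)

# Area of the buffer as a plain number, in squared CRS units
buffer_area <- as.numeric(st_area(demo_buffer))
buffer_area
```

If the real layers turn out to use meters, the distance would need converting (2,000 ft is about 610 m) or the layers would need reprojecting with st_transform().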

Check the geometry validity of outputs from previous steps. In case there is any invalid polygon, use st_make_valid() to make the geometry valid.

st_is_valid(land)  # there is a false
land <- st_make_valid(land)
st_is_valid(land)

st_is_valid(roads) # all true
st_is_valid(census)  # all true
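
To illustrate what the validity check catches, here is a minimal sketch using a made-up self-intersecting "bowtie" polygon (not part of the Boston data). Wrapping the check in all() gives a single TRUE or FALSE instead of one value per feature.

```r
library(sf)

# Hypothetical invalid polygon: the ring crosses itself at (1, 1)
bowtie <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(2, 2), c(2, 0), c(0, 2), c(0, 0)
))))

# Summarise validity as a single logical value
bowtie_ok <- all(st_is_valid(bowtie))

# Repair the geometry, then re-check
fixed <- st_make_valid(bowtie)
fixed_ok <- all(st_is_valid(fixed))

c(before = bowtie_ok, after = fixed_ok)
```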

2.3 Finding Available Plots Within the Desired Parameters

Get the intersection of all selected areas from the previous steps using st_intersection() with valid geometries. Union adjacent polygons into one polygon, and keep polygons with no shared boundaries separate.

available_land <-  census  %>% 
  st_intersection(roads) %>% 
  st_intersection(land) %>% 
  st_union()  %>% 
  st_cast('POLYGON')
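
The union-then-cast step can be seen on a toy example. This is a sketch with made-up unit squares (the helper sq() is hypothetical): two squares share an edge and one sits apart, so the result should contain two polygons.

```r
library(sf)

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Two adjacent squares and one isolated square
parts <- st_sfc(sq(0, 0), sq(1, 0), sq(5, 5))

# st_union() dissolves shared boundaries into a single MULTIPOLYGON;
# st_cast() then splits it into one POLYGON per contiguous piece
plots <- st_cast(st_union(parts), "POLYGON")

n_plots <- length(plots)
plot_areas <- sort(as.numeric(st_area(plots)))
```

Here n_plots counts the contiguous pieces, which is the same way the candidate plots are counted in the analysis.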

Calculate area of unioned polygons using st_area() and convert the output to numeric values.

area_available_land <- as.numeric(st_area(available_land)) # convert the output to numeric values
df_area <- data.frame(area_available_land) # turn the vector of values into a data frame
colnames(df_area) <- "land_area" # name the column of the data frame
df_area <- df_area %>% 
  add_column(ID = seq_along(area_available_land)) # add an ID column so the rows in this data frame can be matched with the rows of another data frame

# The unioned layer contains 91 candidate plots, so ID runs from 1 to 91.

Make the geometries of the plots where all requirements are met (the areas of intersection) into a data frame.

geometry <- data.frame(available_land) %>% 
  add_column(ID = seq_along(available_land)) # add an ID column so the rows in this data frame can be matched with the rows of another data frame

Combine geometry and area of each available plot into one data frame.

df_size_shape <- merge(df_area, geometry)

Select the polygons with area larger than 200,000 sq ft.

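# refer to the column by its bare name; a quoted "land_area" is compared as a string and filters nothing
large_area <- df_size_shape %>% 
  filter(land_area > 200000)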
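
As an alternative sketch, the area and ID can be attached directly to the geometries with st_sf(), which avoids building and merging two data frames. The two squares below are made up (sides of 400 and 500 units, so areas of 160,000 and 250,000), and the names demo_plots and demo_sf are hypothetical. Note that the filter uses the bare column name land_area.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a square of a given side length starting at (x0, 0)
sq_side <- function(x0, side) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + side, 0), c(x0 + side, side), c(x0, side), c(x0, 0)
  )))
}

# Two made-up plots: 400 x 400 and 500 x 500
demo_plots <- st_sfc(sq_side(0, 400), sq_side(1000, 500))

# Keep ID, area, and geometry together in one sf object
demo_sf <- st_sf(
  ID = seq_along(demo_plots),
  land_area = as.numeric(st_area(demo_plots)),
  geometry = demo_plots
)

# Keep only plots larger than 200,000 square units
demo_large <- demo_sf %>%
  filter(land_area > 200000)

nrow(demo_large)
```

Because the result is already an sf object, no st_as_sf() conversion would be needed before plotting under this approach.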

2.4 Graphing the Results

Prepare to produce the map with the final candidate polygons, and some other features on the map to provide context (e.g. Boston census tracts with median household value in the census tract).

# convert the data frame to an sf object
graphing_large_area <- st_as_sf(large_area) 


# find the bounds of the map so that it focuses on the candidate plots, i.e. the available land in the target median income range and within the road buffer
bound <- st_bbox(graphing_large_area)

3 Results

Produce the final graph.

ggplot() +
  geom_sf(data = census, aes(fill = Med_Income)) +
  geom_sf(data = graphing_large_area, aes(colour = "red"), fill = "red") +
  coord_sf(xlim = c(bound[1], bound[3]),
           ylim = c(bound[2], bound[4])) +
  theme_bw() +
  # labs() takes x and y (not xlab/ylab); x is longitude and y is latitude
  labs(x = "Longitude",
       y = "Latitude",
       caption = "Figure 1: The red polygons represent contiguous plots \n of available land that are within 2000 feet of major \n roads and in census tracts within the target median \n income range. Median income is shown as a gradient \n to provide further information.", 
       title = "Map of Available Land in Boston",
       fill = "Median Income") +
  scale_color_manual(values = c("red"),
                     name = "Locations for IKEA Stores", 
                     labels = c("Candidate Locations"))

# save the graph (the file name needs a .png extension)
png(filename = "CS07_map_kellyteitel.png")
ggplot() +
  geom_sf(data = census, aes(fill = Med_Income)) +
  geom_sf(data = graphing_large_area, aes(colour = "red"), fill = "red") +
  coord_sf(xlim = c(bound[1], bound[3]),
           ylim = c(bound[2], bound[4])) +
  theme_bw() +
  # labs() takes x and y (not xlab/ylab); x is longitude and y is latitude
  labs(x = "Longitude",
       y = "Latitude",
       caption = "Figure 1: The red polygons represent contiguous plots \n of available land that are within 2000 feet of major \n roads and in census tracts within the target median \n income range. Median income is shown as a gradient \n to provide further information.", 
       title = "Map of Available Land in Boston",
       fill = "Median Income") +
  scale_color_manual(values = c("red"),
                     name = "Locations for IKEA Stores", 
                     labels = c("Candidate Locations"))
dev.off()
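
One way to avoid repeating the plotting code when saving is to store the plot in an object and pass it to ggsave(). The sketch below uses a made-up single-polygon layer; demo_layer, demo_map, and the temporary output path are hypothetical and stand in for the Boston data and the real file name.

```r
library(sf)
library(ggplot2)

# Hypothetical one-polygon layer standing in for the census tracts
demo_layer <- st_sf(
  value = 1,
  geometry = st_sfc(st_polygon(list(rbind(
    c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
  ))))
)

# Build the plot once and keep it in an object
demo_map <- ggplot() +
  geom_sf(data = demo_layer, aes(fill = value)) +
  theme_bw() +
  labs(x = "Longitude", y = "Latitude", fill = "Value")

# Write the stored plot to a PNG file in the session's temporary directory
out_file <- file.path(tempdir(), "demo_map.png")
ggsave(out_file, plot = demo_map, width = 6, height = 6, dpi = 150)
```

The stored object can also be printed to display the map, so the same definition serves both the report and the saved file.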

4 Conclusion

The most desirable locations would be those surrounded by tracts with median incomes close to $50,000, since households with lower incomes may not be able to spend as much at stores like IKEA, whereas households with higher incomes may be able to afford higher-priced stores. Thus, the most desirable plots would be around 71 degrees W and 42.4 degrees N, where most of the plots are surrounded by tracts with median incomes near $50,000. However, all of the plots shaded red comply with the income, land use, and road-distance requirements given in the introduction. The intersection of those criteria yields 91 candidate plots, mostly in the upper right or lower left of Boston.