GIS Research Project Blog

Author

Brian Surratt

Published

April 20, 2023

Blog Post 1

In December 2014, Minneapolis, Minnesota passed a city ordinance allowing for the construction of accessory dwelling units (ADUs, also known as “granny flats”) on existing single-family lots. [1] The ordinance was the first in a series of steps to address the costs of housing in the city by increasing housing supply. In 2018, Minneapolis continued this approach by banning single-family zoning, thus requiring a 3-unit minimum on each lot for new construction.

The 2014 ordinance allows single-family homeowners to construct ADUs on their property. Theoretically, this should add housing supply to the city and apply downward pressure on housing costs. Critics argued ADUs wouldn’t make a significant impact on housing supply since they are relatively expensive to build (around $100,000 each at the time) and the property owner had to continue to reside in either the existing home or the ADU. In other words, they would primarily be used by family members and would not enter the general rental market.

The 2018 ordinance banning single-family zoning was one of the first of its kind. Single-family zoning is highly valued by homeowners but advocates of high-density housing argue it should be phased out. It is highly unusual for existing single-family zoning to be modified, so Minneapolis is entering uncharted territory for housing policy.

This research project will analyze changes in rental affordability in Minneapolis between 2014, when the ADU ordinance was passed, and 2020. Comparing housing costs before and after the ADU ordinance and single-family zoning ban will provide evidence of the effects of these changes in housing policy. The research question is how did rental affordability change in Minneapolis, Minnesota, five years after the city passed an ordinance allowing ADUs on single family lots in December 2014 and two years after the city banned single family zoning in 2018?

I will map 3 variables from the U.S. Census Bureau American Community Survey Public Use Microdata (2014 and 2020) related to housing costs in Minneapolis, Minnesota, in 2014 and 2020, by census tract. The variables are percent of renters in each tract (derived from the TEN variable, housing tenure), the percent of renters who are cost burdened (derived from GRPIP, gross rental as a percentate of household income), and the change in percent of renters who are cost burdened from 2014 to 2020. “Cost burdened” is defined as paying greater than 30% of income towards housing costs.

I will produce the following maps and tables:

Map 1: Percent of all residents who are renters in each census tract in 2014.
Map 2: Precent of all residents who are renters in each census tract in 2020.
Map 3: Percent of renters in each census tract who are cost burdened in 2014.
Map 4: Percent of renters in each census tract who are cost burdened in 2020.
Map 5: Change in percent of renters who are cost burdened from 2014 to 2019.
Table 1: List of census tracts with percent cost burdened in 2014, in 2020, and change in percentage over the 5-year period.

References

ACCESSORY DWELLING UNIT ZONING CODE TEXT AMENDMENT PASSES IN MINNEAPOLIS. (2014, Dec 09). US Fed News Service, Including US State News Retrieved from https://login.libweb.lib.utsa.edu/login?url=https://www.proquest.com/wire-feeds/accessory-dwelling-unit-zoning-code-text/docview/1634265244/se-2

Blog Post 2

Description of data and GIS processes (250-500 words, 1-2 tables, 2-3 figures). Here you will describe your data, including the source and origin, as well as a plan for what GIS operations you will be conducting during the course of your project.

Description of data

The source of data is the U.S. Census Bureau’s American Community Survey Public Use Microdata from 2014 and 2019. I will access the data via the get_pums() function in r. The variables I will use are the following:

PUMA: Public use area microdata area code. The PUMAs for Minneapolis are 1405, 1406, and 1407. In the 2014 data, this is a 4 character variable. In the 2019 data, this is a 5 character variable.
TYPE: Type of unit, filtered for “1” which is a housing unit. This removes group quarters.
TEN: This variable is housing tenure and can have four values, 1 = “owned with mortgage,” 2 = “owned free and clear,” 3 = “rented,” and 4 = “occupied without payment of rent.”
GRPIP: Gross rent as a percentage of household income past 12 months.
RELP: This is the relationship of the respondent to the household reference person. For the 2014 data this must be filtered to “00” to limit to one response per household. For the 2019 data, the variable name is RELSHIPP and the filter must be set to “20”.

Plan for GIS operations

For the GIS operations, I will download the shapefiles using the purrr::map() function and merge the data with the Minneapolis housing data. I will produce the following maps:

Map 1: Percent of all residents who are renters in each census tract in 2014
Map 2: Percent of all residents who are renters in each census tract in 2019
Map 3: Percent of renters in each census tract who are cost burdened in 2014
Map 4: Percent of renters in each census tract who are cost burdened in 2019
Map 5: Change in percent of renters who are cost burdened from 2014 to 2019

Cleaning the data and initial summary statistics

First, let’s download and clean Minnesota PUMS data for 2014.

Code

mnpums2014 <- get_pums(
  variables = c("PUMA", "TYPE", "TEN", "GRPIP", "HHT", "RELP"),
  state = "MN",
  variables_filter = list(SPORDER = 1, TYPE = 1), # SPORDER = 1 gets households, TYPE = 1 gets housing units and eliminates group quarters.
  #puma = c(1405, 1406, 1407), # I can't figure out how to select by puma.  Maybe capitalize "PUMA"?
  survey = "acs1",
  year = 2014
  )

Getting data from the 2014 1-year ACS Public Use Microdata Sample

This is a sample of all respondents in Minnesota for 2014. Let’s check the sample size.

Code

nrow(mnpums2014)

[1] 21524

Let’s create a new dataframe with only the PUMAs that cover Minneapolis (1405, 1406, and 1407) in 2014.

Code

dat2014 <- mnpums2014 %>%
  filter(PUMA %in% c("1405", "1406", "1407"))

Let’s modify the PUMA variable in the 2014 dataframe so the PUMA is 5 digits.

Code

dat2014 <- dat2014 %>%
  mutate(PUMA = case_when(.$PUMA == "1405" ~ "01405",
                          .$PUMA == "1406" ~ "01406",
                          .$PUMA == "1407" ~ "01407",
                          )
         )

Let’s check the sample size.

Code

nrow(dat2014)

[1] 1023

Let’s check the distribution by PUMA.

Code

tabyl(dat2014$PUMA)

 dat2014$PUMA   n   percent
        01405 344 0.3362659
        01406 331 0.3235582
        01407 348 0.3401760

In order to get just one observation per household, we need to ensure RELP == 0 for every record.

Code

tabyl(dat2014$RELP)

 dat2014$RELP    n percent
            0 1023       1

So now we know there is only one observation per household. Let’s see the distribution of renters vs. non-renters. 1 is owned with a mortgage, 2 is owned free and clear, 3 is rented, 4 is occupied without payment of rent.

Code

tabyl(dat2014$TEN)

 dat2014$TEN   n     percent
           1 401 0.391984360
           2 158 0.154447703
           3 461 0.450635386
           4   3 0.002932551

Let’s recode the TEN variable categories to reflect renters and non-renters.

Code

dat2014 <- dat2014 %>%
  mutate(tenure = (ifelse(.$TEN == 3, 'Renter', 'Non-renter')))

tabyl(dat2014$tenure)

 dat2014$tenure   n   percent
     Non-renter 562 0.5493646
         Renter 461 0.4506354

Making a data frame with percent of renters in 2014 and 2019

Code

renters2014 <- dat2014%>%
  group_by(PUMA, tenure) %>%
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N)) %>% 
  filter(tenure == 'Renter')

`summarise()` has grouped output by 'PUMA'. You can override using the
`.groups` argument.

Code

renters2014

# A tibble: 3 × 4
# Groups:   PUMA [3]
  PUMA  tenure     N Proportion
  <chr> <chr>  <dbl>      <dbl>
1 01405 Renter 25280      0.477
2 01406 Renter 21451      0.409
3 01407 Renter 40292      0.608

Let’s filter only renters and check the distribution of rent as a percentage of income.

Code

dat2014 <- dat2014 %>%
  filter(tenure == "Renter")

dat2014$GRPIP <- (as.numeric(dat2014$GRPIP))/100

hist(dat2014$GRPIP)

Let’s create a new variable called “cost_burden” with three levels:

Paying less than 30% of income on rent is “Not cost burdened”.
Paying between 30% of income on rent and 49.4% of income on rent is “Cost burdened”.
Paying greater than 50% of income on rent is “Extremely cost burdened”.

Code

dat2014 <- dat2014 %>%
  mutate(cost_burden = case_when(.$GRPIP <.30 ~ "Not cost burdened",
                                 .$GRPIP >=.30 & .$GRPIP <.50 ~ "Cost burdened",
                                 .$GRPIP >=.50 ~ "Extremely cost burdened",
                                 )
         )

Let’s check the distribution of rent cost burden in Minneapolis in 2014.

Code

tabyl(dat2014$cost_burden)

     dat2014$cost_burden   n   percent
           Cost burdened  95 0.2060738
 Extremely cost burdened 136 0.2950108
       Not cost burdened 230 0.4989154

Now, let’s download and clean Minnesota PUMS data for 2019. (PUMS data was not released for 2020 and PUMS geographies are not available for 2021, so those years are not available.)

Code

mnpums2019 <- get_pums(
  variables = c("PUMA", "TYPE", "TEN", "GRPIP", "HHT", "RELSHIPP"),
  state = "MN",
  variables_filter = list(SPORDER = 1, TYPE = 1), # SPORDER = 1 gets households, TYPE = 1 gets housing units and eliminates group quarters.
  #puma = c(1405, 1406, 1407), # I can't figure out how to select by puma.  Maybe capitalize "PUMA".
  survey = "acs1",
  year = 2019
  )

Getting data from the 2019 1-year ACS Public Use Microdata Sample

This is a sample of PUMS respondents in Minnesota for 2019. Let’s check the sample size.

Code

nrow(mnpums2019)

[1] 22576

First, let’s create a new dataframe with only the PUMAs that cover Minneapolis (1405, 1406, and 1407) in 2019.

Code

dat2019 <- mnpums2019 %>%
  filter(PUMA %in% c("01405", "01406", "01407"))

Let’s check the sample size.

Code

nrow(dat2019)

[1] 1054

Let’s check the distribution by PUMA.

Code

tabyl(dat2019$PUMA)

 dat2019$PUMA   n   percent
        01405 350 0.3320683
        01406 336 0.3187856
        01407 368 0.3491461

In order to get just one observation per household, we need to ensure RELSHIPP == 20 for every record.

Code

tabyl(dat2019$RELSHIPP)

 dat2019$RELSHIPP    n percent
               20 1054       1

Code

tabyl(dat2019$TEN)

 dat2019$TEN   n   percent
           1 413 0.3918406
           2 174 0.1650854
           3 461 0.4373814
           4   6 0.0056926

Let’s recode the TEN variable categories to reflect renters and non-renters.

Code

dat2019 <- dat2019 %>%
  mutate(tenure = (ifelse(.$TEN == 3, 'Renter', 'Non-renter')))

tabyl(dat2019$tenure)

 dat2019$tenure   n   percent
     Non-renter 593 0.5626186
         Renter 461 0.4373814

Making a data frame with percent of renters in 2019.

Code

renters2019 <- dat2019%>%
  group_by(PUMA, tenure) %>%
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N)) %>% 
  filter(tenure == 'Renter')

`summarise()` has grouped output by 'PUMA'. You can override using the
`.groups` argument.

Code

renters2019

# A tibble: 3 × 4
# Groups:   PUMA [3]
  PUMA  tenure     N Proportion
  <chr> <chr>  <dbl>      <dbl>
1 01405 Renter 27542      0.495
2 01406 Renter 23169      0.428
3 01407 Renter 44253      0.596

Let’s filter only renters and check the distribution of rent as a percentage of income.

Code

dat2019 <- dat2019 %>%
  filter(tenure == "Renter")

dat2019$GRPIP <- (as.numeric(dat2019$GRPIP))/100

hist(dat2019$GRPIP)

Let’s create a new variable called “cost_burden” with three levels:

Paying less than 30% of income on rent is “Not cost burdened”.
Paying between 30% of income on rent and 49.4% of income on rent is “Cost burdened.
Paying greater than 50% of income on rent is “Extremely cost burdened”.

Code

dat2019 <- dat2019 %>%
  mutate(cost_burden = case_when(.$GRPIP <.30 ~ "Not cost burdened",
                                 .$GRPIP >=.30 & .$GRPIP <.50 ~ "Cost burdened",
                                 .$GRPIP >=.50 ~ "Extremely cost burdened",
                                 )
         )

Let’s check the distribution of rent cost burden in Minneapolis in 2019.

Code

tabyl(dat2019$cost_burden)

     dat2019$cost_burden   n   percent
           Cost burdened  86 0.1865510
 Extremely cost burdened  94 0.2039046
       Not cost burdened 281 0.6095445

Now I need to merge the 2014 and 2019 dataframes.

Let’s rename RELSHIPP as RELP in the 2019 dataframe.

Code

dat2019 <- dat2019  %>%
  rename_at('RELSHIPP', ~'RELP')

Let’s add a year column to both dataframes.

Code

dat2014 <- dat2014 %>%
  mutate(year = "2014")

dat2019 <- dat2019 %>%
  mutate(year = "2019")

Let’s merge these dataframes into one.

Code

dat <- rbind(dat2014, dat2019)

head(dat)

# A tibble: 6 × 14
  SERIALNO  WGTP PWGTP PUMA  TEN   GRPIP HHT   RELP  SPORDER TYPE  ST    tenure
  <chr>    <dbl> <dbl> <chr> <chr> <dbl> <chr> <chr> <chr>   <chr> <chr> <chr> 
1 37569       92    92 01405 3      0.24 3     0     1       1     27    Renter
2 77963      196   195 01405 3      0.39 4     0     1       1     27    Renter
3 89323      137   137 01406 3      0.24 6     0     1       1     27    Renter
4 103528      88    88 01405 3      0.29 4     0     1       1     27    Renter
5 107132     106   106 01406 3      0.06 1     0     1       1     27    Renter
6 114983     145   145 01407 3      0.19 1     0     1       1     27    Renter
# ℹ 2 more variables: cost_burden <chr>, year <chr>

Let’s make a table showing the change in rent cost burden in the city from 2014 to 2019. This code weighs the data with WGTP, the variable for household weight.

Code

tab1 <- dat %>% 
  group_by(year, cost_burden) %>% 
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N))

`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.

Code

gt(tab1)

cost_burden	N	Proportion
2014
Cost burdened	17160	0.1971892
Extremely cost burdened	28138	0.3233398
Not cost burdened	41725	0.4794709
2019
Cost burdened	16108	0.1696222
Extremely cost burdened	20488	0.2157449
Not cost burdened	58368	0.6146329

This table shows the share of cost burdened renters declined from 19.7% in 2014 to 17.0% in 2019. The proportion of “Extremely cost burdened” renters declined from 32.2% in 2014 to 21.6% in 2019. The share of “Not cost burdened” renters rose from 47.9% in 2014 to 61.5% in 2019.

Let’s create a new variable called “cost_burden2” with two levels:

Paying less than 30% of income on rent is “Not cost burdened”.
Paying 30% or greater of income on rent is “Cost burdened.

Code

dat <- dat %>%
  mutate(cost_burden2 = case_when(.$GRPIP <.30 ~ "Not cost burdened",
                                  .$GRPIP >=.30 ~ "Cost burdened",
                                 )
         )

Let’s make a table showing the change in rent cost burden (cost_burden2) in the city from 2014 to 2019. This code weighs the data with WGTP, the variable for household weight.

Code

tab2 <- dat %>% 
  group_by(year, cost_burden2) %>% 
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N))

`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.

Code

gt(tab2)

cost_burden2	N	Proportion
2014
Cost burdened	45298	0.5205291
Not cost burdened	41725	0.4794709
2019
Cost burdened	36596	0.3853671
Not cost burdened	58368	0.6146329

Blog Post 3

Downloading the shape files for 2014 and 2019.

Code

# What package is the map function from (purrr)?  Why are we calling tigris::pumas?  What is 'cb'?  Can I use this sf file with tmap?

hmap2014 <- map("27053", # Getting PUMA geography for Hennepin County.
                tigris::pumas,
                class = "sf",
                cb = TRUE,
                year = 2014) %>%
  reduce(rbind) # This changes the list to a dataframe.  left_join won't work if you don't do this.

Using first two digits of 27053 - '27' (minnesota) - for FIPS code.FALSE


Downloading: 11 kB     
Downloading: 11 kB     
Downloading: 11 kB     
Downloading: 11 kB     
Downloading: 15 kB     
Downloading: 15 kB     
Downloading: 20 kB     
Downloading: 20 kB     
Downloading: 20 kB     
Downloading: 20 kB     
Downloading: 24 kB     
Downloading: 24 kB     
Downloading: 24 kB     
Downloading: 24 kB     
Downloading: 28 kB     
Downloading: 28 kB     
Downloading: 32 kB     
Downloading: 32 kB     
Downloading: 36 kB     
Downloading: 36 kB     
Downloading: 36 kB     
Downloading: 36 kB     
Downloading: 41 kB     
Downloading: 41 kB     
Downloading: 44 kB     
Downloading: 44 kB     
Downloading: 44 kB     
Downloading: 44 kB     
Downloading: 52 kB     
Downloading: 52 kB     
Downloading: 52 kB     
Downloading: 52 kB     
Downloading: 56 kB     
Downloading: 56 kB     
Downloading: 60 kB     
Downloading: 60 kB     
Downloading: 64 kB     
Downloading: 64 kB     
Downloading: 65 kB     
Downloading: 65 kB     
Downloading: 68 kB     
Downloading: 68 kB     
Downloading: 72 kB     
Downloading: 72 kB     
Downloading: 72 kB     
Downloading: 72 kB     
Downloading: 77 kB     
Downloading: 77 kB     
Downloading: 81 kB     
Downloading: 81 kB     
Downloading: 81 kB     
Downloading: 81 kB     
Downloading: 85 kB     
Downloading: 85 kB     
Downloading: 89 kB     
Downloading: 89 kB     
Downloading: 93 kB     
Downloading: 93 kB     
Downloading: 93 kB     
Downloading: 93 kB     
Downloading: 98 kB     
Downloading: 98 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 170 kB     
Downloading: 170 kB     
Downloading: 170 kB     
Downloading: 170 kB     
Downloading: 180 kB     
Downloading: 180 kB     
Downloading: 180 kB     
Downloading: 180 kB

Code

hcpumas2014 <- as.vector(hmap2014$PUMACE10)

Code

hmap2019 <- map("27053",
                tigris::pumas,
                class = "sf",
                cb = TRUE,
                year = 2019) %>%
  reduce(rbind)

Using first two digits of 27053 - '27' (minnesota) - for FIPS code.FALSE


Downloading: 7.3 kB     
Downloading: 7.3 kB     
Downloading: 20 kB     
Downloading: 20 kB     
Downloading: 24 kB     
Downloading: 24 kB     
Downloading: 36 kB     
Downloading: 36 kB     
Downloading: 41 kB     
Downloading: 41 kB     
Downloading: 49 kB     
Downloading: 49 kB     
Downloading: 52 kB     
Downloading: 52 kB     
Downloading: 56 kB     
Downloading: 56 kB     
Downloading: 85 kB     
Downloading: 85 kB     
Downloading: 93 kB     
Downloading: 93 kB     
Downloading: 100 kB     
Downloading: 100 kB     
Downloading: 110 kB     
Downloading: 110 kB     
Downloading: 120 kB     
Downloading: 120 kB     
Downloading: 130 kB     
Downloading: 130 kB     
Downloading: 140 kB     
Downloading: 140 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 150 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB     
Downloading: 160 kB

Code

hcpumas2019 <- as.vector(hmap2019$PUMACE10)

# Cartographic boundary PUMAs are not yet available for years after 2019. Use the argument `year = 2019` instead to request your data.

Code

mapview(hmap2014)

Warning in CPL_transform(x, crs, aoi, pipeline, reverse, desired_accuracy, :
GDAL Message 1: Sub-geometry 0 has coordinate dimension 2, but container has 3

Warning in CPL_transform(x, crs, aoi, pipeline, reverse, desired_accuracy, :
GDAL Message 1: Sub-geometry 1 has coordinate dimension 2, but container has 3

Warning in CPL_transform(x, crs, aoi, pipeline, reverse, desired_accuracy, :
GDAL Message 1: Sub-geometry 2 has coordinate dimension 2, but container has 3

Warning in CPL_transform(x, crs, aoi, pipeline, reverse, desired_accuracy, :
GDAL Message 1: Sub-geometry 3 has coordinate dimension 2, but container has 3

Warning in CPL_transform(x, crs, aoi, pipeline, reverse, desired_accuracy, :
GDAL Message 1: Sub-geometry 4 has coordinate dimension 2, but container has 3

Filter for the 3 PUMAs I am interested in for 2014.

Code

mlps2014_sf <- hmap2014 %>%
  filter(PUMACE10 %in% c('01405', '01406', '01407'))

mapview(mlps2014_sf)

Make a map showing percent of renters in 2014.

Code

# These PUMAS are missing the 0 in the front.
map1 <- mlps2014_sf %>%
  left_join(renters2014, by = c("PUMACE10" = "PUMA")) %>%
  ggplot(aes(fill = Proportion)) +
  geom_sf() +
  scale_fill_viridis_b(
    name = NULL,
    option = "magma",
    labels = scales::label_percent(1)
    ) +
  # labs(title = "Mean Gross Rent as a Percentage of Houshold Income \nMapped by Public Use Microdata Area") +
  ggtitle("Percent of households who rent, 2014",
          subtitle = "Source: U.S. Census Bureau, 2014 ACS 1 year") +
  theme_void()

map1

Make a map showing percent of renters in 2019.

Code

map2 <- mlps2014_sf %>%
  left_join(renters2019, by = c("PUMACE10" = "PUMA")) %>%
  ggplot(aes(fill = Proportion)) +
  geom_sf() +
  scale_fill_viridis_b(
    name = NULL,
    option = "magma",
    labels = scales::label_percent(1)
    ) +
  # labs(title = "Mean Gross Rent as a Percentage of Houshold Income \nMapped by Public Use Microdata Area") +
  ggtitle("Percent of households who rent, 2019",
          subtitle = "Source: U.S. Census Bureau, 2014 ACS 1 year") +
  theme_void()

map2

Make a dataframe grouped by PUMA showing percent of renters who are cost burdened in 2014.

Code

cb2014 <- dat %>%
  filter(year == 2014) %>%
  group_by(PUMA, cost_burden2) %>%
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N)) %>% 
  filter(cost_burden2 == 'Cost burdened')

`summarise()` has grouped output by 'PUMA'. You can override using the
`.groups` argument.

Code

cb2014

# A tibble: 3 × 4
# Groups:   PUMA [3]
  PUMA  cost_burden2      N Proportion
  <chr> <chr>         <dbl>      <dbl>
1 01405 Cost burdened 14819      0.586
2 01406 Cost burdened 12346      0.576
3 01407 Cost burdened 18133      0.450

Join my 2014 cost burden data with the shapefile and make a map.

Make a dataframe grouped by PUMA showing percent of renters who are cost burdened in 2019.

Code

cb2019 <- dat %>%
  filter(year == 2019) %>%
  group_by(PUMA, cost_burden2) %>%
  summarise(N = sum(WGTP)) %>%
  mutate(Proportion = N/sum(N)) %>% 
  filter(cost_burden2 == 'Cost burdened')

`summarise()` has grouped output by 'PUMA'. You can override using the
`.groups` argument.

Code

cb2019

# A tibble: 3 × 4
# Groups:   PUMA [3]
  PUMA  cost_burden2      N Proportion
  <chr> <chr>         <dbl>      <dbl>
1 01405 Cost burdened 12676      0.460
2 01406 Cost burdened  9089      0.392
3 01407 Cost burdened 14831      0.335

Join my 2019 cost burden data with the shapefile and make a map.

This raises some questions. - How can I modify a shape file to show only certain PUMAs? OK, I did this by filtering the shapefile by PUMA. - Did the PUMAs change between 2014 and 2019? Could I overlay these to compare? Or could I research how often USCB updates PUMAs? - Should I just use the shapefile for one year or for both years? - Should I map the whole county rather than just the 3 PUMAs?

Next steps

Change maps to tmaps and mapview to improve formatting.
Make a map showing cost burden change over time.
Should I make maps for the whole county? How would I do this?
Instead of having three cost burden levels, it would make more sense to dichotomize the variable into “Cost burdened” and “Not cost burdened.” The data would be better formatted for creating choropleth maps of change in cost burden from 2014 to 2019.
The change in # renters and cost burden from 2014 to 2019 can be tested for statistical significance.
Tables showing change in cost burden can be better formatted, such as showing cost burden in rows and the year in columns.
Graphs can be made showing the change over time.
Tables and graphs can be faceted by PUMS.
Create the five MAPS.