Project: Disaster Recovery Committees after Hurricane Sandy

Lead: Timothy Fraser
Current Contributors: Vincent Rago
Past Contributors: Lucy Hewitt, Jessica Chen, & Matthew Cherkerzian
Lab Head: Daniel Aldrich

2022-06-09

This document summarizes steps and code for collaborating on our analysis of disaster recovery committees in New York City neighborhoods after Hurricane Sandy. Please follow along below to get a sense of the dataset and analysis strategies!

0. Setup

0.1. Load Packages

You’ll need the following packages:

library(tidyverse) # data wrangling
library(broom) # tidying model output
library(moderndive) # for familiar functions
library(viridis) # color palettes
library(GGally) # correlation matrices
library(texreg) # making tables
library(lmtest) # for hypothesis testing
library(simulate) # for simulation

0.2. Load Dataset

You’ll be working with this dataset (raw_data/co_dataset.rds). This is a dataset of disaster recovery committees in New York City neighborhoods, where each row represents a committee active after Hurricane Sandy.

# Import data
codat <- read_rds("raw_data/co_dataset.rds") %>%
  select(id:health_care)

# Let's check out its contents
codat %>% glimpse()

## Rows: 47
## Columns: 23
## $ id                      <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
## $ committee               <chr> "Gravesend and Bensonhurst", "Broad Channel", …
## $ region                  <chr> "New York City", "New York City", "New York Ci…
## $ ihard                   <dbl> 0.07531114, 0.05871634, 0.08314851, 0.09293561…
## $ isoft                   <dbl> 0.03486258, 0.03203687, 0.04511576, 0.04141895…
## $ isc                     <dbl> 0.008408815, 0.009475107, 0.009246055, 0.00767…
## $ iv                      <dbl> 0.02481181, 0.02483757, 0.02189349, 0.02353204…
## $ ih                      <dbl> 0.03468467, 0.03828717, 0.04234415, 0.02768607…
## $ ie                      <dbl> 0.02583709, 0.01693471, 0.01791286, 0.02460168…
## $ ilocal                  <dbl> 0.011079070, 0.005563093, 0.007801588, 0.01120…
## $ inonlocal               <dbl> 0.002919036, 0.004145049, 0.004208752, 0.00429…
## $ idis                    <dbl> 0.03134647, 0.04884068, 0.04285741, 0.02614388…
## $ women                   <dbl> 33.33333, 18.18182, 33.33333, 50.00000, 24.137…
## $ business                <dbl> 22.22222, 36.36364, 11.11111, 0.00000, 20.6896…
## $ social_org              <dbl> 66.666667, 36.363636, 55.555556, 100.000000, 6…
## $ religious_org           <dbl> 0.000000, 27.272727, 11.111111, 30.000000, 3.4…
## $ community_participation <dbl> 11.111111, 9.090909, 11.111111, 10.000000, 3.4…
## $ govt                    <dbl> 0.000000, 0.000000, 0.000000, 0.000000, 6.8965…
## $ emergency               <dbl> 0.000000, 18.181818, 11.111111, 10.000000, 3.4…
## $ expert                  <dbl> 11.111111, 0.000000, 11.111111, 10.000000, 3.4…
## $ influential_citizen     <dbl> 22.222222, 18.181818, 0.000000, 10.000000, 3.4…
## $ elected_official        <dbl> 0.000000, 0.000000, 0.000000, 0.000000, 0.0000…
## $ health_care             <dbl> 0.000000, 0.000000, 0.000000, 10.000000, 6.896…

0.3. Codebook

What do these variables mean?

id: unique id number per committee
committee: name of neighborhoods represented
region: region of New York represented

Below is a list of weighted word frequency indices. Each gives the share (as a proportion from 0 to 1) of total words in a committee’s recovery plan that reference a given concept.

ihard: % about ‘hard’ policy tools (infrastructure), like bridges, seawalls, potholes, etc.
isoft: % about ‘soft’ policy tools (community development), like social assistance.
isc: % about social capital concepts, like trust, neighbors, friends, etc.
iv: % about social vulnerability, like poverty, inequity, etc.
ih: % about housing policy, like rent
ie: % about economic policy, like
ilocal: % about local actors, like neighborhoods, city council, etc.
inonlocal: % about nonlocal actors, like state representatives, mayor, etc.
idis: % about academic disaster resilience concepts, signifying expertise (e.g. resilience, recovery, mitigation, adaptation)

Below is a list of committee traits that might affect the frequency of concepts in these recovery plans.

women: % of committee members who are women
business: % of committee members representing businesses
social_org: % of members representing social organizations (nonprofits, community groups, neighborhood associations, etc).
religious_org: % of members representing religious groups.
community_participation: % of members representing social OR religious groups.
govt: % of members representing local, state, or federal government.
expert: % of members on committee for their expertise in an area (e.g. engineering, construction, or academics).
influential_citizen: % of members who are otherwise influential local citizens, like authors, activists, community organizers, etc.
elected_official: % of members on committee who are elected officials, at any level of government.
health_care: % of members on committee representing health care systems, clinics, etc.

0.4. Summary of Tasks

In this project, we want to do 3 things:

1. Describe the types of recovery plans that committees wrote (e.g. how much each was oriented towards soft, hard, social capital, or vulnerability-focused policy).
2. Describe membership patterns on committees (e.g. whose interests were most frequently represented by membership?).
3. Describe the relationship between membership patterns and the types of recovery plans committees developed.

We will use the following resources to do so:

1. RStudio Cloud Project (sandy)
2. Overleaf Manuscript (sandy)
3. Google Sheets (edgelist_sandy_2022)

To get you started, I suggest the following steps:

1. Descriptives

# Import data
codat <- read_rds("raw_data/co_dataset.rds")

# Plot a correlation matrix of the nine word-frequency indices
codat %>%
  select(`Hard` = ihard,
         `Soft` = isoft,
         `Social Capital` = isc,
         `Vulnerability` = iv,
         `Housing` = ih,
         `Economy` = ie,
         `Local` = ilocal,
         `Non-Local` = inonlocal,
         `Disaster Expert` = idis) %>%
  ggcorr(low = "red", mid = "white", high = "blue", label = TRUE)
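
Alongside the correlation matrix, a table of summary statistics can help with the first two tasks (describing plan types and membership patterns). The sketch below is one possible approach, not an established part of the project workflow. It uses a small made-up stand-in called toy (hypothetical values, only a few columns) so that it runs on its own; with the real data, you would start from codat and keep the columns ihard through health_care instead.

```r
library(tidyverse)

# Hypothetical stand-in for codat: made-up values for four committees,
# with two word-frequency indices and two membership traits
toy <- tibble(
  id    = c(1, 2, 3, 4),
  ihard = c(0.08, 0.06, 0.08, 0.09),
  isoft = c(0.03, 0.03, 0.05, 0.04),
  women = c(30, 20, 30, 50),
  govt  = c(0, 0, 0, 8)
)

# Stack the variables into long format, then summarize each one
desc <- toy %>%
  pivot_longer(cols = -id, names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  summarize(mean = mean(value),
            sd = sd(value),
            min = min(value),
            max = max(value))

desc
```

Keeping one row per variable makes it easy to compare the indices (on a 0 to 1 scale) against the membership traits (on a 0 to 100 scale) at a glance.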

2. Models

For example, try out the following models. Then try adding as many relevant variables to the model as you can.

library(tidyverse)
library(broom)
library(moderndive)
library(GGally)
library(lmtest) # for lrtest()

# Import data
codat <- read_rds("raw_data/co_dataset.rds")

# How much MORE do they write about soft/other policy than hard policy?
m1 <- codat %>%
  lm(formula = isoft - ihard ~  
       business + community_participation + women + govt)
m2 <- codat %>%
  lm(formula = isc - ihard ~ 
       business + community_participation + women + govt)
m3 <- codat %>%
  lm(formula = iv - ihard ~ 
       business + community_participation + women + govt)
m4 <- codat %>%
  lm(formula = isoft + isc - ihard ~ 
       business + community_participation + women + govt)
m5 <- codat %>%
  lm(formula = isoft + isc + iv - ihard ~ 
       business + community_participation + women + govt)

texreg::screenreg(list(m1,m2,m3,m4,m5),
                  stars = c(0.001, 0.01, 0.05, 0.10))

## 
## =============================================================================
##                          Model 1    Model 2    Model 3    Model 4    Model 5 
## -----------------------------------------------------------------------------
## (Intercept)              -0.04 ***  -0.06 ***  -0.05 ***  -0.03 ***  -0.01   
##                          (0.01)     (0.01)     (0.01)     (0.01)     (0.01)  
## business                  0.00 .     0.00      -0.00       0.00 .     0.00   
##                          (0.00)     (0.00)     (0.00)     (0.00)     (0.00)  
## community_participation   0.00      -0.00       0.00       0.00       0.00   
##                          (0.00)     (0.00)     (0.00)     (0.00)     (0.00)  
## women                     0.00      -0.00      -0.00       0.00       0.00   
##                          (0.00)     (0.00)     (0.00)     (0.00)     (0.00)  
## govt                     -0.00 **   -0.00 **   -0.00 *    -0.00 **   -0.00 **
##                          (0.00)     (0.00)     (0.00)     (0.00)     (0.00)  
## -----------------------------------------------------------------------------
## R^2                       0.34       0.19       0.14       0.34       0.26   
## Adj. R^2                  0.27       0.12       0.05       0.28       0.19   
## Num. obs.                47         47         47         47         47      
## =============================================================================
## *** p < 0.001; ** p < 0.01; * p < 0.05; . p < 0.1
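
Many coefficients above print as 0.00 because the outcomes are differences in proportions while the predictors are percentages (0 to 100), so a one-point change in membership corresponds to a very small change in the outcome. Two possible ways to make the table easier to read are sketched below: print more decimal places with the digits argument of screenreg(), or rescale the outcome to percentage points. The toy data frame and all of its values are simulated purely for illustration.

```r
library(texreg)

# Simulated stand-in for codat (hypothetical values, 47 rows like the real data)
set.seed(1)
toy <- data.frame(
  business = runif(47, 0, 40),
  govt     = runif(47, 0, 10)
)
toy$isoft <- 0.04 + 0.0002 * toy$business + rnorm(47, sd = 0.01)
toy$ihard <- 0.08 + 0.001 * toy$govt + rnorm(47, sd = 0.01)

m <- lm(isoft - ihard ~ business + govt, data = toy)

# Option 1: print more decimal places
screenreg(m, digits = 4)

# Option 2: express the outcome in percentage points (0 to 100 scale),
# which multiplies every coefficient by 100
m_pct <- lm(I(100 * (isoft - ihard)) ~ business + govt, data = toy)
screenreg(list(m, m_pct), digits = 4)
```

Rescaling changes only the units of the coefficients and standard errors; the significance stars and R^2 stay the same.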

remove(m1,m2,m3,m4,m5)
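
A note on the formulas above: arithmetic on the left-hand side of ~ is evaluated as ordinary arithmetic, so isoft - ihard ~ ... models the difference between the two indices, whereas + on the right-hand side adds predictors. The sketch below, using simulated values in a hypothetical toy data frame, checks that this gives the same coefficients as computing the difference first with mutate().

```r
library(tidyverse)

# Simulated stand-in for codat (hypothetical values)
set.seed(2)
toy <- tibble(
  business = runif(47, 0, 40),
  isoft    = rnorm(47, mean = 0.04, sd = 0.01),
  ihard    = rnorm(47, mean = 0.08, sd = 0.01)
)

# Difference computed inside the formula
m_formula <- toy %>%
  lm(formula = isoft - ihard ~ business)

# Difference computed beforehand as its own column
# (soft_minus_hard is a hypothetical column name)
m_column <- toy %>%
  mutate(soft_minus_hard = isoft - ihard) %>%
  lm(formula = soft_minus_hard ~ business)

# Compare the two sets of coefficients
all.equal(unname(coef(m_formula)), unname(coef(m_column)))
```

Precomputing the outcome as a named column can make regression tables easier to label, at the cost of one extra step.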

Then, try adding more terms. Does adding a predictor, or using a different set of predictors, significantly improve the model's fit (its log-likelihood)?

m1 <- codat %>%
  lm(formula = isoft - ihard ~  
       business + social_org + religious_org + women + govt)

# Simplify to community participation
m2 <- codat %>%
  lm(formula = isoft - ihard ~  
       business + community_participation + women + govt)

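# Compare the two specifications with a likelihood ratio test.
# Here the difference is not significant (p = 0.66), so splitting community
# participation into social and religious organizations adds little.
# The lrtest function may help guide you when choosing what to include or not include!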
lrtest(m1,m2)

## Likelihood ratio test
## 
## Model 1: isoft - ihard ~ business + social_org + religious_org + women + 
##     govt
## Model 2: isoft - ihard ~ business + community_participation + women + 
##     govt
##   #Df LogLik Df  Chisq Pr(>Chisq)
## 1   7 150.96                     
## 2   6 150.86 -1 0.1884     0.6642
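
To see where these numbers come from: the Chisq column is twice the difference in log-likelihood between the two models, and the p-value is the upper tail of a chi-squared distribution whose degrees of freedom equal the difference in the number of parameters. The sketch below reproduces that calculation by hand on simulated data and compares it with lrtest(). The toy data frame and the variable names x1, x2, and y are hypothetical.

```r
library(lmtest)

# Simulated data: y depends on x1 but not on x2
set.seed(42)
toy <- data.frame(x1 = rnorm(47), x2 = rnorm(47))
toy$y <- 0.5 * toy$x1 + rnorm(47)

full    <- lm(y ~ x1 + x2, data = toy)
reduced <- lm(y ~ x1, data = toy)

# Test statistic: twice the gap in log-likelihood
chisq <- as.numeric(2 * (logLik(full) - logLik(reduced)))

# The full model has one extra parameter, so 1 degree of freedom
p_manual <- pchisq(chisq, df = 1, lower.tail = FALSE)

# Same test via lmtest
lr <- lrtest(full, reduced)
c(manual = p_manual, lrtest = lr$`Pr(>Chisq)`[2])
```

Applied to the output above, pchisq(0.1884, df = 1, lower.tail = FALSE) gives roughly 0.66, in line with the reported p-value.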