HW 5 Spatial Analysis

Calculate two indices of segregation for a geography of your choosing.

What are the two groups you used for your index?

Create a map and a descriptive summary of your indices

Data

Download ACS table B03002 (Hispanic or Latino Origin by Race) for California census tracts. With year = 2010, get_acs returns the 2006-2010 5-year estimates by default.

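library(tidycensus) # get_acs(); needs a Census API key set with census_api_key()
library(dplyr)      # %>%, mutate(), group_by(), summarise()
library(ggplot2)    # ggplot(), geom_sf()
library(sf)         # spatial data frames, st_set_geometry()

race_table10 <- get_acs(geography = "tract",
                        year = 2010,
                        geometry = TRUE,
                        output = "wide",
                        table = "B03002", # Hispanic or Latino Origin by Race
                        cache_table = TRUE,
                        state = "CA")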

Reorganize the downloaded data so the segregation indices are easier to calculate.

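df <- race_table10 %>% 
  mutate(nhwhite = B03002_003E,
         nhblack = B03002_004E,
         # _009E (two or more races) already includes its sub-category _010E,
         # so _010E is left out to avoid double counting
         nhother = B03002_005E + B03002_006E + B03002_007E + B03002_008E + B03002_009E,
         hisp = B03002_012E,
         total = B03002_001E) %>% 
  # the first five characters of a tract GEOID identify the state and county
  mutate(countyGEOID = substr(GEOID, 1, 5)) %>% 
  select(countyGEOID, GEOID, NAME, nhwhite, nhblack, nhother, hisp, total)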

Exploratory Plots

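# Bin x into quantile classes; `breaks` is the number of cut points,
# so the result has breaks - 1 classes
quantile_variable <- function(x, breaks = 4){
  cut(x, breaks = quantile(x, probs = seq(0, 1, length.out = breaks), na.rm = TRUE),
      include.lowest = TRUE)
}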
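As a small illustration of how quantile_variable bins values, the sketch below applies it to a made-up vector (x_toy and classes_toy are hypothetical names, not part of the ACS data). Because breaks counts the cut points rather than the classes, breaks = 3 yields two classes, and breaks = 6 yields five.

```r
# Same helper as above, repeated so this sketch runs on its own
quantile_variable <- function(x, breaks = 4){
  cut(x, breaks = quantile(x, probs = seq(0, 1, length.out = breaks), na.rm = TRUE),
      include.lowest = TRUE)
}

# Hypothetical values, not ACS data
x_toy <- 1:10

# breaks = 3 puts cut points at the 0%, 50%, and 100% quantiles,
# so the values fall into two classes of five values each
classes_toy <- quantile_variable(x_toy, breaks = 3)
table(classes_toy)
```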

ggplot(df, mapping = aes(fill = quantile_variable(nhwhite/total, breaks = 6))) +
  geom_sf(colour = NA) +
  scale_fill_brewer(palette = "YlOrRd") +
  labs(title = "Proportion of Non-Hispanic White in CA Tracts") +
  guides(fill=guide_legend(title="Proportion Non-Hispanic White"))

ggplot(df, mapping = aes(fill = quantile_variable(nhblack/total, breaks = 6))) +
  geom_sf(colour = NA) +
  scale_fill_brewer(palette = "YlOrRd") +
  labs(title = "Proportion of Non-Hispanic Black in CA Tracts") +
  guides(fill=guide_legend(title="Proportion Non-Hispanic Black"))

Comparing the two maps, some tracts with a high proportion of non-Hispanic White residents have a low proportion of non-Hispanic Black residents.

Segregation Indices

Interaction Index

\[ \text{Interaction} = \sum_i \frac{b_i}{B} \cdot \frac{a_i}{t_i} \]

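The interaction index measures how much exposure one group has to another. Here b_i and a_i are the non-Hispanic Black and non-Hispanic White populations of tract i, t_i is the total population of tract i, and B is the county's total non-Hispanic Black population. We will use it to calculate the interaction between non-Hispanic Black and non-Hispanic White residents at the county level.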
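To make the formula concrete, here is a minimal sketch on a made-up county with three tracts. The data frame toy and its counts are hypothetical and are not taken from the ACS download.

```r
# Hypothetical toy county with three tracts (made-up counts, not ACS data)
toy <- data.frame(
  tract   = c("A", "B", "C"),
  nhwhite = c(80, 50, 10),
  nhblack = c(10, 30, 60),
  total   = c(100, 100, 100)
)

# b_i / B: share of the county's Black population living in each tract
black_share <- toy$nhblack / sum(toy$nhblack)

# a_i / t_i: proportion of each tract that is non-Hispanic White
white_prop <- toy$nhwhite / toy$total

# Interaction index: Black-population-weighted average of the White proportion
interaction_toy <- sum(black_share * white_prop)
interaction_toy  # 0.29
```

In this toy county most Black residents live in tract C, where the White proportion is lowest, so the index (0.29) sits well below the county-wide White proportion (140/300, about 0.47).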

Looking at the number of tracts by county:

county_tract <- df %>% 
  st_set_geometry(NULL) %>% 
  group_by(countyGEOID) %>% 
  summarise(n = n())

county_tract %>% 
  arrange(n)

## # A tibble: 58 x 2
##    countyGEOID     n
##    <chr>       <int>
##  1 06003           1
##  2 06091           1
##  3 06051           3
##  4 06049           4
##  5 06011           5
##  6 06105           5
##  7 06021           6
##  8 06027           6
##  9 06043           6
## 10 06063           7
## # ... with 48 more rows

There are 58 counties in California, two of which have only one tract. In those counties the interaction index is simply the proportion of non-Hispanic White residents in that tract.
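The single-tract case can be checked with a short sketch. The helper interaction_index and the counts below are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical helper, not part of the original analysis:
# b = group of interest per tract, a = other group per tract, t = tract total
interaction_index <- function(b, a, t) {
  sum(b / sum(b) * a / t, na.rm = TRUE)
}

# Made-up single-tract county: 20 Black and 70 White residents out of 100
single_tract <- interaction_index(b = 20, a = 70, t = 100)

# With one tract, b_i / B equals 1, so the index equals a_i / t_i
single_tract  # 0.7
70 / 100      # 0.7
```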

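df <- df %>% 
  group_by(countyGEOID) %>% 
  # c.nhblack is the county total B; interaction repeats the county value on every tract row
  mutate(c.nhblack = sum(nhblack, na.rm = TRUE),
         interaction = sum(nhblack/c.nhblack * nhwhite/total, na.rm = TRUE)) %>% 
  ungroup()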

ggplot(df, mapping = aes(fill = interaction)) +
  geom_sf(colour = NA) +
  scale_fill_distiller(palette = "YlOrRd") +
  labs(title = "Interaction by County of Non-Hispanic Blacks" ) +
  guides(fill=guide_legend(title="Interaction Probability"))

Some counties have an extremely low interaction probability, and interaction appears to be lower in Southern California than in Northern California.

Isolation Index

\[ \text{Isolation} = \sum_i \frac{b_i}{B} \cdot \frac{b_i}{t_i} \]

The isolation index measures how much members of a group are exposed only to one another. Here we look at the isolation of non-Hispanic Black residents.
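As a minimal sketch of the isolation formula, the example below uses a made-up county with three tracts. The data frame toy and its counts are hypothetical and are not taken from the ACS download.

```r
# Hypothetical toy county with three tracts (made-up counts, not ACS data)
toy <- data.frame(
  tract   = c("A", "B", "C"),
  nhblack = c(10, 30, 60),
  total   = c(100, 100, 100)
)

# b_i / B: share of the county's Black population living in each tract
black_share <- toy$nhblack / sum(toy$nhblack)

# b_i / t_i: proportion of each tract that is non-Hispanic Black
black_prop <- toy$nhblack / toy$total

# Isolation index: Black-population-weighted average of the Black proportion
isolation_toy <- sum(black_share * black_prop)
isolation_toy  # 0.46
```

The county as a whole is one third Black (100/300), but the index is higher (0.46) because Black residents are concentrated in tract C.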

df <- df %>% 
  group_by(countyGEOID) %>% 
  mutate(Isolation = sum(nhblack/c.nhblack * nhblack/total, na.rm = TRUE))

ggplot(df, mapping = aes(fill = Isolation)) +
  geom_sf(colour = NA) +
  scale_fill_distiller(palette = "YlOrRd") +
  labs(title = "Isolation by County of Non-Hispanic Blacks") +
  guides(fill=guide_legend(title="Isolation Probability"))

Isolation shows the opposite trend to interaction: the north has very low isolation. Notably, the maximum isolation probability is only 0.31, which indicates that no county in California has very high isolation of non-Hispanic Black residents.