A story about one of the retail chains (J.C. Penney) releasing the list of stores it is closing in 2017 crossed paths with my Feedly reading list today. It jogged my memory that a number of chains are closing many of their doors this year, and I wanted to see the impact that might have on various states.

library(httr)
library(rvest)
library(knitr)
library(kableExtra)
library(ggalt)
library(statebins)
library(hrbrthemes)
library(epidata)
library(tidyverse)

options(knitr.table.format = "html")
update_geom_font_defaults(font_rc_light, size = 2.75)

“Closing” lists for four major retailers (K-Mart, Sears, Macy’s and J.C. Penney) abound, and HTML-formatting a list seems to be the “easy way out” story-wise for many blogs and newspapers. We can dig a bit deeper than just a plain set of lists, but first we need the data.

The Boston Globe has a nice, predictable, mostly-uniform pattern to their closing-list “stories”, so we’ll use that data. Site content can change quickly, so it makes sense to cache content whenever possible as we scrape it. To that end, we’ll use httr::GET vs xml2::read_html: GET preserves all of the original request and response information, whereas read_html returns an external pointer that has no current support for serialization without extra work.

closings <- list(
  kmart = "https://www.bostonglobe.com/metro/2017/01/05/the-full-list-kmart-stores-closing-around/4kJ0YVofUWHy5QJXuPBAuM/story.html",
  sears = "https://www.bostonglobe.com/metro/2017/01/05/the-full-list-sears-stores-closing-around/yHaP6nV2C4gYw7KLhuWuFN/story.html",
  macys = "https://www.bostonglobe.com/metro/2017/01/05/the-full-list-macy-stores-closing-around/6TY8a3vy7yneKV1nYcwY7K/story.html",
    jcp = "https://www.bostonglobe.com/business/2017/03/17/the-full-list-penney-stores-closing-around/vhoHjI3k75k2pSuQt2mZpO/story.html"
)

saved_pgs <- "saved_store_urls.rds"

if (file.exists(saved_pgs)) {
  pgs <- read_rds(saved_pgs)
} else {
  pgs <- map(closings, GET)
  write_rds(pgs, saved_pgs)
}
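The cache-or-fetch pattern above can be pulled into a small reusable helper. This is only a sketch: `cached()` and `fake_fetch()` are hypothetical names that do not appear in the original code, and the "request" is a stand-in so the example runs offline.

```r
# Hypothetical helper (not from the original post): return the cached object
# if `path` exists, otherwise call `fetch()` once and save its result.
cached <- function(path, fetch) {
  if (file.exists(path)) {
    readRDS(path)
  } else {
    result <- fetch()
    saveRDS(result, path)
    result
  }
}

# Stand-in for a network call so the sketch runs offline; it counts its calls.
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  list(url = "https://example.com/", status = 200L)
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached(cache_file, fake_fetch)  # runs fake_fetch() and writes the file
second <- cached(cache_file, fake_fetch)  # read back from disk, no second "request"
```

With httr loaded, something like `cached(saved_pgs, function() map(closings, GET))` should behave like the if/else block above.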

This is what we get from that scraping effort:

map(pgs, content) %>%
  map(html_table) %>%
  walk(~glimpse(.[[1]]))
## Observations: 108
## Variables: 3
## $ X1 <chr> "300 Highway 78 E", "2003 US Hwy 280 Bypass", "3600 Wilson ...
## $ X2 <chr> "Jasper", "Phenix City", "Bakersfield", "Coalinga", "Kingsb...
## $ X3 <chr> "AL", "AL", "CA", "CA", "CA", "CA", "CO", "CO", "CT", "FL",...
## Observations: 42
## Variables: 4
## $ X1 <chr> "301 Cox Creek Pkwy", "1901 S Caraway Road", "90 Elm St; En...
## $ X2 <chr> "Florence", "Jonesboro", "Enfield", "Lake Wales", "Albany",...
## $ X3 <chr> "AL", "AR", "CT", "FL", "GA", "GA", "IN", "KS", "KY", "LA",...
## $ X4 <chr> "Y", "N", "N", "Y", "Y", "N", "N", "N", "Y", "Y", "Y", "Y",...
## Observations: 68
## Variables: 6
## $ X1 <chr> "Mission Valley Apparel", "Paseo Nuevo", "*Laurel Plaza", "...
## $ X2 <chr> "San Diego", "Santa Barbara", "North Hollywood", "Simi Vall...
## $ X3 <chr> "CA", "CA", "CA", "CA", "FL", "FL", "FL", "FL", "FL", "GA",...
## $ X4 <int> 385000, 141000, 475000, 190000, 101000, 195000, 143000, 140...
## $ X5 <int> 1961, 1990, 1995, 2006, 1995, 2000, 1977, 1974, 2000, 1981,...
## $ X6 <int> 140, 77, 105, 105, 68, 83, 86, 73, 72, 69, 9, 57, 54, 87, 5...
## Observations: 138
## Variables: 3
## $ Mall/Shopping Center <chr> "Auburn Mall", "Tannehill Promenade", "Ga...
## $ City                 <chr> "Auburn", "Bessemer", "Gadsden", "Jasper"...
## $ State                <chr> "AL", "AL", "AL", "AL", "AR", "AR", "AZ",...

We now need to normalize the content of the lists.

map(pgs, content) %>%
  map(html_table) %>%
  map(~.[[1]]) %>%
  map_df(select, abb=3, .id = "store") -> closings
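To see why selecting by position works here, consider a toy version of the problem: the scraped tables have different column names (X1..X3 vs named headers) and different widths, but the state abbreviation is always the third column. The data below is made up for illustration, and the sketch uses `map_dfr()` with an explicit lambda rather than the exact call above.

```r
library(dplyr)
library(purrr)

# Toy stand-ins for the scraped tables (made-up rows, real-looking shapes).
toy <- list(
  kmart = data.frame(
    X1 = c("addr 1", "addr 2"),
    X2 = c("city 1", "city 2"),
    X3 = c("AL", "CA"),
    stringsAsFactors = FALSE
  ),
  jcp = data.frame(
    `Mall/Shopping Center` = "mall 1",
    City = "city 3",
    State = "AL",
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
)

# Keep only column 3 from each table, rename it `abb`, and stack the results;
# `.id = "store"` records which list element each row came from.
toy_closings <- map_dfr(toy, ~ select(.x, abb = 3), .id = "store")
toy_closings
```

Selecting by position is brittle if a site reorders its columns, so it is worth re-running the glimpse() check whenever the cache is refreshed.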

We’re ultimately just looking for city/state for this simple exercise, but one could do more precise geolocation (perhaps with rgeocodio) and combine that with local population data, job loss estimates, current unemployment levels, etc. to make a real story out of the closings vs just do the easy thing and publish a list of stores.

count(closings, abb) %>%
  left_join(tibble(name = state.name, abb = state.abb), by = "abb") %>%
  left_join(usmap::statepop, by = c("abb"="abbr")) %>%
  mutate(per_capita = (n/pop_2015) * 1000000) %>%
  select(name, n, per_capita) -> closings_by_state
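The per-capita figure is simply closings divided by population, scaled to one million residents. A quick sketch with made-up populations (`per_million()` is a hypothetical helper, not part of the pipeline above) shows why states with few residents can float to the top of the ranking with only a handful of closings:

```r
# Closings per one million residents.
per_million <- function(n, pop) (n / pop) * 1e6

# Hypothetical small state: 5 closings, 750,000 residents (made-up figures).
small_state <- per_million(5, 750000)

# The same 5 closings in a hypothetical state with ten times the population
# give a rate ten times smaller.
large_state <- per_million(5, 7500000)

c(small = small_state, large = large_state)
```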

2017 Retail Stores (Combined) Closing

Sears, J.C. Penney, K-Mart and Macy’s are all shutting down retail locations across the U.S.

Sorted by closings per-capita (1MM)

Retail Stores Closing (sorted by per-capita)
State Total Stores Closing Per Capita (1MM)
North Dakota 5 6.6056568
South Dakota 5 5.8243221
Wyoming 2 3.4123462
West Virginia 6 3.2535703
Maine 4 3.0090392
Kansas 8 2.7475915
Kentucky 10 2.2598400
Iowa 7 2.2407895
Michigan 22 2.2171662
Minnesota 12 2.1859540
Oklahoma 8 2.0453359
Montana 2 1.9362040
Pennsylvania 23 1.7965237
Oregon 7 1.7374137
Louisiana 8 1.7127966
Mississippi 5 1.6709370
Utah 5 1.6689370
Ohio 19 1.6360379
Nebraska 3 1.5821199
Alabama 7 1.4406319
Hawaii 2 1.3970354
Wisconsin 8 1.3861606
Missouri 8 1.3149953
North Carolina 13 1.2944595
South Carolina 6 1.2254537
Indiana 8 1.2085176
Idaho 2 1.2085103
Georgia 12 1.1747591
Colorado 6 1.0995911
Virginia 9 1.0736022
Nevada 3 1.0377589
Arkansas 3 1.0073185
New Mexico 2 0.9591825
Rhode Island 1 0.9467025
Illinois 12 0.9331263
New Jersey 8 0.8930552
Massachusetts 6 0.8830773
Florida 17 0.8386252
Connecticut 3 0.8354484
Texas 22 0.8008995
Tennessee 5 0.7575414
Washington 5 0.6973159
New York 12 0.6061895
Maryland 2 0.3329781
California 12 0.3065540
Arizona 1 0.1464544

Sorted by total stores closing

Retail Stores Closing (sorted by total stores)
State Total Stores Closing Per Capita (1MM)
Pennsylvania 23 1.7965237
Michigan 22 2.2171662
Texas 22 0.8008995
Ohio 19 1.6360379
Florida 17 0.8386252
North Carolina 13 1.2944595
California 12 0.3065540
Georgia 12 1.1747591
Illinois 12 0.9331263
Minnesota 12 2.1859540
New York 12 0.6061895
Kentucky 10 2.2598400
Virginia 9 1.0736022
Indiana 8 1.2085176
Kansas 8 2.7475915
Louisiana 8 1.7127966
Missouri 8 1.3149953
New Jersey 8 0.8930552
Oklahoma 8 2.0453359
Wisconsin 8 1.3861606
Alabama 7 1.4406319
Iowa 7 2.2407895
Oregon 7 1.7374137
Colorado 6 1.0995911
Massachusetts 6 0.8830773
South Carolina 6 1.2254537
West Virginia 6 3.2535703
Mississippi 5 1.6709370
North Dakota 5 6.6056568
South Dakota 5 5.8243221
Tennessee 5 0.7575414
Utah 5 1.6689370
Washington 5 0.6973159
Maine 4 3.0090392
Arkansas 3 1.0073185
Connecticut 3 0.8354484
Nebraska 3 1.5821199
Nevada 3 1.0377589
Hawaii 2 1.3970354
Idaho 2 1.2085103
Maryland 2 0.3329781
Montana 2 1.9362040
New Mexico 2 0.9591825
Wyoming 2 3.4123462
Arizona 1 0.1464544
Rhode Island 1 0.9467025

Sorted by state name

Retail Stores Closing (sorted by state name)
State Total Stores Closing Per Capita (1MM)
Alabama 7 1.4406319
Arizona 1 0.1464544
Arkansas 3 1.0073185
California 12 0.3065540
Colorado 6 1.0995911
Connecticut 3 0.8354484
Florida 17 0.8386252
Georgia 12 1.1747591
Hawaii 2 1.3970354
Idaho 2 1.2085103
Illinois 12 0.9331263
Indiana 8 1.2085176
Iowa 7 2.2407895
Kansas 8 2.7475915
Kentucky 10 2.2598400
Louisiana 8 1.7127966
Maine 4 3.0090392
Maryland 2 0.3329781
Massachusetts 6 0.8830773
Michigan 22 2.2171662
Minnesota 12 2.1859540
Mississippi 5 1.6709370
Missouri 8 1.3149953
Montana 2 1.9362040
Nebraska 3 1.5821199
Nevada 3 1.0377589
New Jersey 8 0.8930552
New Mexico 2 0.9591825
New York 12 0.6061895
North Carolina 13 1.2944595
North Dakota 5 6.6056568
Ohio 19 1.6360379
Oklahoma 8 2.0453359
Oregon 7 1.7374137
Pennsylvania 23 1.7965237
Rhode Island 1 0.9467025
South Carolina 6 1.2254537
South Dakota 5 5.8243221
Tennessee 5 0.7575414
Texas 22 0.8008995
Utah 5 1.6689370
Virginia 9 1.0736022
Washington 5 0.6973159
West Virginia 6 3.2535703
Wisconsin 8 1.3861606
Wyoming 2 3.4123462

Compared to unemployment

I’d have used the epidata package to get the current state unemployment data, but it’s not current, so we can either use a package to get data from the Bureau of Labor Statistics or just scrape it. We’ll use the U-6 rate since that is an expanded definition including “total unemployed, plus all marginally attached workers, plus total employed part time for economic reasons, as a percent of the civilian labor force plus all marginally attached workers” and is likely to be more representative for the populations working at retail chains.

pg <- read_html("https://www.bls.gov/lau/stalt16q4.htm")

html_nodes(pg, "table#alternmeas16\\:IV") %>% 
  html_table(header = TRUE, fill = TRUE) %>%
  .[[1]] %>% 
  docxtractr::assign_colnames(1) %>% 
  rename(name=State) %>% 
  as_tibble() %>% 
  slice(2:52) %>% 
  type_convert() %>% 
  left_join(closings_by_state, by="name") %>% 
  filter(!is.na(n)) -> with_unemp
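The header-fixing step in that pipeline is easy to gloss over: the scraped table arrives with its real column names sitting in a data row and every value typed as character. Here is a rough base-R sketch of the same idea on a toy table (made-up names and values). It is meant to approximate what the assign_colnames() and type_convert() steps accomplish, not to reproduce their exact behavior.

```r
# Toy stand-in for a scraped table whose real header sits in the first data row.
raw <- data.frame(
  V1 = c("State", "Alpha", "Beta"),
  V2 = c("U-3", "4.0", "5.5"),
  V3 = c("U-6", "8.0", "10.5"),
  stringsAsFactors = FALSE
)

# Promote row 1 to column names, then drop that row.
names(raw) <- unlist(raw[1, ], use.names = FALSE)
tidy <- raw[-1, ]
rownames(tidy) <- NULL

# Everything is still character; convert the numeric-looking columns.
tidy[] <- lapply(tidy, utils::type.convert, as.is = TRUE)
str(tidy)
```

Column names such as `U-6` are not syntactic R names, which is why the plotting code has to wrap them in backticks.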

ggplot(with_unemp, aes(per_capita, `U-6`)) +
  geom_label(aes(label=name), fill="#8c96c6", color="white", size=3.5, family=font_rc) +
  scale_x_continuous(limits=c(-0.125, 6.75)) +
  labs(x="Closings per-capita (1MM)", 
       y="BLS Labor Underutilization (U-6 rate)",
       title="Per-capita store closings compared to current BLS U-6 Rate") +
  theme_ipsum_rc(grid="XY")

If I were a reporter, I think I’d be digging a bit deeper on the impact of these (and the half-dozen or so other) retailers closing locations in New Mexico, Nevada, West Virginia, Wyoming, (mebbe Maine, though I’m super-biased :-), North Dakota & South Dakota.

