Diversity probably isn’t the first thing that comes to mind when you think of Nashville, Tennessee.
Thirty minutes from Murfreesboro, where I teach journalism at Middle Tennessee State University, Nashville is famous for music - particularly country music. The city is home to the Grand Ole Opry, the Music Row recording studio and publishing district, and the Lower Broad neighborhood, with its honky tonk bars, live music, and wall-to-wall tourists.
Look past the facade of guitars, cowboy hats and beer, though, and you’ll find one of the most diverse cities in the state, progressive and inclusive, yet marked by geographic segregation left over from its painful Jim Crow past.
This interactive map lets you explore racial diversity across the city’s 35 neighborhoods.
Note, especially, the contrast between the rest of the city and heavily minority Northwest Nashville, where the occupying Union Army built fortifications during the American Civil War, where four prominent HBCUs were founded and still operate, where Dr. Martin Luther King Jr. spoke during his visits to the city, and where white supremacists dynamited the home of Black civil rights attorney and Nashville City councilman Z. Alexander Looby in April of 1960.
A lot has changed in Nashville. A lot hasn’t.
This session will introduce you to R, the free, open-source programming language that produced this map, and show you how you can use it, along with data from the U.S. Census Bureau's American Community Survey, to explore diversity - and much more - in your community.
I might as well acknowledge, up front, that writing R code likely will seem harder than pointing-and-clicking your way through data analysis with something like the Microsoft Excel spreadsheet. I’ll have more to say about that later. For now, trust me on two things:
- Mainly, it's harder only at first. Stick with coding, and it quickly will become as easy as using a spreadsheet. Before long, it will become easier, especially when you get into all the things R can do that Excel can't.
- Beginning is easier than you think. The Getting started with R section of this tutorial includes a couple of fun, engaging YouTube videos that will show you how to install R and begin using it. You'll also find links to more such videos.
Let me walk you through the script that produced the map you saw above. I’ll break the script into parts and explain at least a little about what each part does. I’ll put the whole script after this section, so you can easily copy and paste the whole thing later and try it out.
Most R scripts start by loading one or more add-on programs that expand what basic R can do and make R code easier to write and understand. These add-ons, called “packages,” are free and open-source, just like base R. My script uses six such packages. Here’s the code for setting up and activating them:
# Installing and loading required packages
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidycensus")) install.packages("tidycensus")
if (!require("sf")) install.packages("sf")
if (!require("mapview")) install.packages("mapview")
if (!require("leaflet")) install.packages("leaflet")
if (!require("leaflet.extras2")) install.packages("leaflet.extras2")
library(tidyverse)
library(tidycensus)
library(sf)
library(mapview)
library(leaflet)
library(leafpop)
When you actually run this code, you'll see a whole lot of activity in RStudio's output pane. The first time will take a little longer, because R will be installing the packages. After the first time, R needs only to load them into memory.
Look for a small, red stop sign icon in the upper-right corner of the output pane. As long as that stop sign is visible, R is working. When it disappears, R will be done, and you can proceed.
You’ll need to get your own, unique access key from the U.S. Census Bureau, activate it, and plug it into the code below. Otherwise, the code won’t work.
Use this form to request one. The key is free, and you'll receive it via e-mail almost immediately after completing and submitting the form. When the e-mail with your key arrives, open it and click the activation link inside. Here's the e-mail that delivered my API key. I have blurred the key, because you're not supposed to share your key with anyone.
After you have activated your key, carefully copy and paste it into this code in place of PasteYourAPIKeyBetweenTheseQuoteMarks. The key will be a long string of letters and numbers. Be sure you don't accidentally delete the quote marks in the code or include spaces, punctuation, or anything other than the key's characters.
census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks")
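One optional convenience: if you'd rather not paste the key into every script, tidycensus can store it in your .Renviron file so R finds it automatically in future sessions. A minimal sketch, using tidycensus's install argument:

# Optional: store the key in .Renviron so future sessions load it
# automatically (add overwrite = TRUE to replace a previously stored key)
census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks", install = TRUE)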
I'm looking for the percentage of people in each Davidson County division who identify as "white." The American Community Survey estimates that percentage with a variable named DP05_0037P.
The ACS provides thousands of estimates, each with a unique variable name. I'll explain later how to find the variable name for whatever estimate you want to get. For now, just trust that DP05_0037P is the variable I need. This code shares that variable name with R:
VariableList =
c(Estimate_ = "DP05_0037P")
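An aside, in case you ever want more than one estimate at a time: you can name several variables in the same vector, and get_acs() will fetch them all, appending "E" and "M" to each prefix you supply. Here's a sketch; the White_ and Language_ prefixes are names I made up, and the rest of this tutorial sticks with the single-variable version above:

# A sketch: requesting two estimates at once. In wide output, get_acs()
# will create White_E, White_M, Language_E, and Language_M columns.
VariableList <-
  c(White_ = "DP05_0037P",
    Language_ = "DP02_0114P")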
Time to get some data! Loosely translated, this code uses the get_acs() function from the tidycensus package to:

- Connect to the Census Bureau's server (get_acs)
- Say, "I want county subdivision-level data" (geography = "county subdivision")
- Add, "For every county subdivision in Tennessee" (state = "TN")
- Add, "For the variables I specified" (variables = VariableList)
- Add, "As of 2023" (year = 2023)
- Add, "According to the five-year American Community Survey" (survey = "acs5")
- Add, "Laid out with a row for each subdivision and a column for each variable" (output = "wide")
- Add, "With the subdivision borders included" (geometry = TRUE)
- Add, "And put all of it in a data frame called mydata" (mydata <-)

Once that's done, the code takes the data frame's "NAME" variable, which contains the division number, county name, and state, all inconveniently lumped together, and splits it up so that there's a column for each part: one column called "Division," a second called "County," and a third called "State" (separate_wider_delim).
If you've ever imported external data into Excel, you might be getting an inkling of why code is so much better. Doing everything listed above in Excel would take considerable manual effort. R does all of it in seconds, fully automatically. Furthermore, Excel would have no idea what to do with the subdivision border information that comes included and ready to use in the R data.
# ---------- County subdivision analysis ----------
# Fetching data
mydata <- get_acs(
geography = "county subdivision",
state = "TN",
variables = VariableList,
year = 2023,
survey = "acs5",
output = "wide",
geometry = TRUE)
# Reformatting data
mydata <-
separate_wider_delim(mydata,
NAME,
delim = ", ",
names = c("Division", "County", "State"))
Once you run the above code, the mydata data frame will appear in the upper-right pane of RStudio under the "Environment" tab. Clicking on the data frame will open it in RStudio's viewer. It will look something like this:
If turned into a map, this file would show every county division in the entire state of Tennessee - all 844 of them. That might be useful for some purposes, but I’m after only the ones in Davidson County, where Nashville is.
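If you'd rather confirm that count in code than scroll through the viewer, one line will do it:

# Count the rows - one per county subdivision - in mydata
nrow(mydata)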
This code filters mydata for divisions in Davidson County, then puts them in a new data frame called filtereddata.
filtereddata <- mydata %>%
filter(County == "Davidson County")
Open the filtereddata data frame, and you'll see only the 35 divisions in Davidson County:

The first row contains data for Davidson County's "District 13" division. The Estimate_E value, 54.1, indicates that an estimated 54.1 percent of District 13's population is white. The "E" in Estimate_E stands for "estimated."
"Estimated" is an important adjective to keep in mind. ACS figures are not based on actual counts the way figures in the 10-year Census are. Instead, ACS estimates are derived from random-sample surveys. The surveys are high quality, but anything short of an actual count could be at least a little bit wrong. Each district's Estimate_M indicates how wrong.

Let's take District 13 as an example. District 13's Estimate_M of 6.3 means that the actual percentage of whites in District 13 is probably somewhere within 6.3 percentage points of the district's Estimate_E, which is 54.1 percent. Specifically, the actual figure for District 13 is probably between 54.1 - 6.3, or 47.8 percent, and 54.1 + 6.3, or 60.4 percent. This range is called the estimate's "margin of error." The "M" in Estimate_M stands for "margin."
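If you'd like those lower and upper bounds as columns you can sort and scan, a quick mutate() will compute them. A sketch; boundsdata, Lower, and Upper are names I made up:

# A sketch: compute each district's margin-of-error bounds
boundsdata <- filtereddata %>%
  mutate(Lower = Estimate_E - Estimate_M,
         Upper = Estimate_E + Estimate_M)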
One way to understand how helpful this information can be is to graph each district's estimate along with its error margin. Here is some code that will make such a graph for the estimated white percentages in Davidson County districts. The graph appears below the code:
# Plotting data
mygraph <- ggplot(filtereddata, aes(x = Estimate_E, y = reorder(Division, Estimate_E))) +
geom_errorbarh(aes(xmin = Estimate_E - Estimate_M, xmax = Estimate_E + Estimate_M)) +
geom_point(size = 3, color = "darkblue") +
theme_minimal(base_size = 12.5) +
labs(
title = "Pct. White Alone",
subtitle = "Davidson County subdivisions. Brackets show error margins.",
x = "2018-2022 ACS estimate",
y = "")
mygraph
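By the way, if you'd like to keep a copy of the chart, ggplot2's ggsave() function will write it to an image file. A sketch, with a file name and dimensions I chose arbitrarily:

# Save the chart as a .png file in the working directory
ggsave("DivisionGraph.png", mygraph, width = 8, height = 10)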
District 2, in the chart’s bottom-left corner, really stands out, doesn’t it? It has the lowest percentage of whites, just above 20 percent. That means it has the highest percentage of nonwhites - just under 80 percent. Furthermore, its error margin doesn’t overlap with the error margins for any of the other districts. It almost certainly has a lower percentage of whites than any other district in the county.
Districts 3, 21, 32, 9, 30, 1, and 28 all have error margins that overlap one another. That means their white population percentages probably aren't all that different from one another. But none of their error margins reach the 50 percent mark. So, all of these districts probably have nonwhite majorities. On the map, they are shaded purple and blue.
In the upper-right corner of the graph, meanwhile, districts 25, 23, 35, 24, and 34 stand out as having distinctly high proportions of white residents. These are the least-diverse districts in the county. On the map, they are bright yellow.
If Davidson County were more integrated geographically, these extremes would not be evident. Instead, more districts would have similar proportions and overlapping error margins.
You’ve seen the map already. Here is the code that creates and displays it:
# Mapping data
mapdata <- filtereddata %>%
rename(Estimate = Estimate_E,
Range = Estimate_M)
mapdata <- st_as_sf(mapdata)
mapviewOptions(basemaps.color.shuffle = FALSE)
DivisionMap <- mapview(mapdata,
zcol = "Estimate",
layer.name = "Estimate",
popup = popupTable(
mapdata,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c(
"State",
"County",
"Division",
"Estimate",
"Range")))
DivisionMap
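A side note: if you want to share the interactive map outside RStudio, mapview's mapshot() function can save it as a self-contained web page. A sketch; the file name is mine:

# Save the interactive map as an HTML file any browser can open
mapshot(DivisionMap, url = "DivisionMap.html")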
Finally, suppose you wanted the data in a comma-separated value file that you could import into some other application. If you run this code, you’ll find the data in a file called DivisionData.csv stored on your computer in the same directory as your R script.
# Exporting data in .csv format
CSVdata <- st_drop_geometry(mapdata)
write_csv(CSVdata, "DivisionData.csv")
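And if you want to keep the boundary geometry too - say, for use in GIS software - sf's st_write() can export the data as GeoJSON instead. Again, the file name is mine:

# A sketch: export the data, boundaries included, as GeoJSON
st_write(mapdata, "DivisionData.geojson")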
As promised, here’s the whole script, without interruption:
# Installing and loading required packages
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("tidycensus")) install.packages("tidycensus")
if (!require("sf")) install.packages("sf")
if (!require("mapview")) install.packages("mapview")
if (!require("leaflet")) install.packages("leaflet")
if (!require("leaflet.extras2")) install.packages("leaflet.extras2")
library(tidyverse)
library(tidycensus)
library(sf)
library(mapview)
library(leaflet)
library(leafpop)
# Transmitting API key
census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks")
# Fetching ACS codebooks
DetailedTables <- load_variables(2023, "acs5", cache = TRUE)
SubjectTables <- load_variables(2023, "acs5/subject", cache = TRUE)
ProfileTables <- load_variables(2023, "acs5/profile", cache = TRUE)
# Specifying target variables
VariableList =
c(Estimate_ = "DP05_0037P")
# ---------- County subdivision analysis ----------
# Fetching data
mydata <- get_acs(
geography = "county subdivision",
state = "TN",
variables = VariableList,
year = 2023,
survey = "acs5",
output = "wide",
geometry = TRUE)
# Reformatting data
mydata <-
separate_wider_delim(mydata,
NAME,
delim = ", ",
names = c("Division", "County", "State"))
# Filtering data
filtereddata <- mydata %>%
filter(County == "Davidson County")
# Plotting data
mygraph <- ggplot(filtereddata, aes(x = Estimate_E, y = reorder(Division, Estimate_E))) +
geom_errorbarh(aes(xmin = Estimate_E - Estimate_M, xmax = Estimate_E + Estimate_M)) +
geom_point(size = 3, color = "darkblue") +
theme_minimal(base_size = 12.5) +
labs(title = "Pct. White Alone",
subtitle = "Davidson County subdivisions. Brackets show error margins.",
x = "2019-2023 ACS estimate",
y = "")
mygraph
# Mapping data
mapdata <- filtereddata %>%
rename(Estimate = Estimate_E,
Range = Estimate_M)
mapdata <- st_as_sf(mapdata)
mapviewOptions(basemaps.color.shuffle = FALSE)
DivisionMap <- mapview(mapdata,
zcol = "Estimate",
layer.name = "Estimate",
popup = popupTable(
mapdata,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c(
"State",
"County",
"Division",
"Estimate",
"Range")))
DivisionMap
# Exporting data in .csv format
CSVdata <- st_drop_geometry(mapdata)
write_csv(CSVdata, "DivisionData.csv")
But race isn't the only, or even the most substantial, way to think about diversity. Cultural diversity matters, too. The ACS variable DP02_0114P estimates the proportion of people age 5 and older who speak a language other than English while at home.
Yeah. Let's run the entire analysis again, but with the DP02_0114P language estimate instead of the DP05_0037P race estimate. Sound like a slog? It isn't, because all you have to do is change DP05_0037P to DP02_0114P in the VariableList code and tweak the graphic's headline. Here's the script. The <== This has changed notes indicate the (only) two lines of code that had to be edited:
# Specifying target variables
VariableList =
c(Estimate_ = "DP02_0114P") # <== This has changed
# ---------- County subdivision analysis ----------
# Fetching data
mydata <- get_acs(
geography = "county subdivision",
state = "TN",
variables = VariableList,
year = 2023,
survey = "acs5",
output = "wide",
geometry = TRUE)
# Reformatting data
mydata <-
separate_wider_delim(mydata,
NAME,
delim = ", ",
names = c("Division", "County", "State"))
# Filtering data
filtereddata <- mydata %>%
filter(County == "Davidson County")
# Plotting data
mygraph <- ggplot(filtereddata, aes(x = Estimate_E, y = reorder(Division, Estimate_E))) +
geom_errorbarh(aes(xmin = Estimate_E - Estimate_M, xmax = Estimate_E + Estimate_M)) +
geom_point(size = 3, color = "darkblue") +
theme_minimal(base_size = 12.5) +
labs(title = "Pct. non-English at home", # <== This has changed
subtitle = "Davidson County subdivisions. Brackets show error margins.",
x = "2019-2023 ACS estimate",
y = "")
mygraph
# Mapping data
mapdata <- filtereddata %>%
rename(Estimate = Estimate_E,
Range = Estimate_M)
mapdata <- st_as_sf(mapdata)
mapviewOptions(basemaps.color.shuffle = FALSE)
DivisionMap <- mapview(mapdata,
zcol = "Estimate",
layer.name = "Estimate",
popup = popupTable(
mapdata,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c(
"State",
"County",
"Division",
"Estimate",
"Range")))
DivisionMap
# Exporting data in .csv format
CSVdata <- st_drop_geometry(mapdata)
write_csv(CSVdata, "DivisionData.csv")
Now, we can look at Davidson County's diversity from a whole different perspective. Two districts - 28 and 30 - stand out in the chart as having majorities of people who speak a language other than English while at home. In 11 more - 4, 15, 9, 33, 29, 16, 26, 13, 32, 27, and 31 - between 20 and 40 percent speak a language other than English at home. In the rest, fewer than 20 percent do.
The map shows that Davidson County's other-than-English population is concentrated in the southeastern fourth of the county. The two districts with proportions above 50 percent stand out in yellow. In Davidson County, these are culturally Hispanic and Latino areas.
But enough about Nashville/Davidson County. You’re probably more interested in exploring where you live than in exploring where I live. We’re from all over, of course. So let’s pick somewhere we all might have been, or would like to visit: New York City.
I'll stick with DP02_0114P, the variable that estimates the proportion of people age 5 and older who speak a language other than English while at home. Just as easily, though, I could go back to DP05_0037P, the race estimate, or even some other variable from the ACS. To switch to New York City, though, all I have to do is change state = "TN" to state = "NY" in the get_acs section of the code, like this:
mydata <- get_acs(
geography = "county subdivision",
state = "NY", # <== This has changed
variables = VariableList,
year = 2023,
survey = "acs5",
output = "wide",
geometry = TRUE)
… change the filter section of the code from filter(County == "Davidson County") to filter for the counties that make up New York City and its nearby suburbs. The | character can be used to indicate "or" within a filter() function. The code ends up looking like this:
filtereddata <- mydata %>%
filter(County == "Suffolk County"|
County == "Nassau County"|
County == "Queens County"|
County == "Kings County"|
County == "Richmond County"|
County == "New York County"|
County == "Bronx County"|
County == "Westchester County"|
County == "Rockland County")
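As an aside, a slightly tidier way to write the same filter uses the %in% operator with a vector of county names; the result is identical. A sketch; NYCCounties is a name I made up:

# A sketch: the same nine-county filter, written with %in%
NYCCounties <- c("Suffolk County", "Nassau County", "Queens County",
                 "Kings County", "Richmond County", "New York County",
                 "Bronx County", "Westchester County", "Rockland County")
filtereddata <- mydata %>%
  filter(County %in% NYCCounties)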
… and, finally, change the chart's subtitle code from subtitle = "Davidson County subdivisions. Brackets show error margins." to subtitle = "New York City subdivisions. Brackets show error margins."
Because New York has so many divisions, the chart is going to be
unreadably crowded. In practice, I’d either skip it or make a separate
chart for each county. But I’ll leave it in, as is. Here’s what the
revamped code looks like:
mygraph <- ggplot(filtereddata, aes(x = Estimate_E, y = reorder(Division, Estimate_E))) +
geom_errorbarh(aes(xmin = Estimate_E - Estimate_M, xmax = Estimate_E + Estimate_M)) +
geom_point(size = 3, color = "darkblue") +
theme_minimal(base_size = 12.5) +
labs(title = "Pct. non-English at home",
subtitle = "New York City subdivisions. Brackets show error margins.", # <== This has changed
x = "2019-2023 ACS estimate",
y = "")
Everything else can stay the same. Any data frames or .csv files you created with previous versions of the code will simply get overwritten. Here’s the full script, with the two changes detailed above:
# Specifying target variables
VariableList =
c(Estimate_ = "DP02_0114P")
# ---------- County subdivision analysis ----------
# Fetching data
mydata <- get_acs(
geography = "county subdivision",
state = "NY", # <== This has changed
variables = VariableList,
year = 2023,
survey = "acs5",
output = "wide",
geometry = TRUE)
# Reformatting data
mydata <-
separate_wider_delim(mydata,
NAME,
delim = ", ",
names = c("Division", "County", "State"))
# Filtering data
filtereddata <- mydata %>% # <== This section has changed
filter(County == "Suffolk County"|
County == "Nassau County"|
County == "Queens County"|
County == "Kings County"|
County == "Richmond County"|
County == "New York County"|
County == "Bronx County"|
County == "Westchester County"|
County == "Rockland County")
# Plotting data
mygraph <- ggplot(filtereddata, aes(x = Estimate_E, y = reorder(Division, Estimate_E))) +
geom_errorbarh(aes(xmin = Estimate_E - Estimate_M, xmax = Estimate_E + Estimate_M)) +
geom_point(size = 3, color = "darkblue") +
theme_minimal(base_size = 12.5) +
labs(title = "Pct. non-English at home", # <== This has changed
subtitle = "New York City subdivisions. Brackets show error margins.", # <== This has changed
x = "2019-2023 ACS estimate",
y = "")
mygraph
# Mapping data
mapdata <- filtereddata %>%
rename(Estimate = Estimate_E,
Range = Estimate_M)
mapdata <- st_as_sf(mapdata)
mapviewOptions(basemaps.color.shuffle = FALSE)
DivisionMap <- mapview(mapdata,
zcol = "Estimate",
layer.name = "Estimate",
popup = popupTable(
mapdata,
feature.id = FALSE,
row.numbers = FALSE,
zcol = c(
"State",
"County",
"Division",
"Estimate",
"Range")))
DivisionMap
# Exporting data in .csv format
CSVdata <- st_drop_geometry(mapdata)
write_csv(CSVdata, "DivisionData.csv")
I hope that by now you're interested in setting up R on your own computer and trying out the scripts shown above. This section provides some resources that can help you not only get started but also build your R coding skills.
These two videos, both by Dr. Greg Martin, a public health expert in Ireland and host of one of YouTube’s leading “how to” channels about R programming, can get you going.
The first video explains some basic things about R and how it works. The second shows you how to install and use RStudio, a programming environment that will let you easily write, test, and run R code and look at the data frames and output R produces.
“How to” videos, web pages and free e-books are available all over the Internet. If there’s something you need help doing in R, just Google “R” and a description of what you’re trying to figure out.
For a more structured approach, though, I suggest subscribing to these two excellent YouTube channels. The first is where I got the two videos embedded above. The second is from Andrew Gard, a mathematics professor at Lake Forest College. Gard's channel includes more material than Martin's does about advanced statistical analysis - something journalists may not need very often. But both offer excellent material for R beginners and videos with good production values.
RProgramming101, from Dr. Greg Martin.
Equitable Equations, from Dr. Andrew Gard.
Another good source for you is the tidycensus website built by tidycensus developer and super-nice guy Kyle Walker.
You also can reach out to me. You’ll find my contact information at drkblake.com.
I promised to tell you how I knew that DP05_0037P would estimate the percentage of whites and that DP02_0114P would estimate the percentage of people age 5 and older who speak a language other than English while at home.
You can look up these and other variable names in the set of codebooks for whatever American Community Survey dataset you are working with. In data analysis, a codebook is a document that lists each variable in a dataset and explains what the variable measures.
There are a few ways to get access to the American Community Survey's codebooks. They can be big files; the ACS measures nearly 50,000 variables. Fortunately, the tidycensus package has a built-in function, load_variables(), that makes fetching them easy.
Begin by ensuring, if you haven't already, that the tidycensus package is installed and loaded, and that you have supplied R with your Census API key. If you have successfully run any of the above code since opening RStudio, you almost certainly already have done both.
But if you haven't, this code will handle everything. Just be sure to replace PasteYourAPIKeyBetweenTheseQuoteMarks with your actual API key before you run the code.
# Make sure tidycensus is installed and loaded
# Make sure you have provided R with your Census API key
if (!require("tidycensus")) install.packages("tidycensus")
library(tidycensus)
census_api_key("PasteYourAPIKeyBetweenTheseQuoteMarks")
If tidycensus and your API key are good to go, running this code will download each of the 2023 five-year ACS codebooks, put them in three data frames called DetailedTables, SubjectTables, and ProfileTables, and place them in the environment pane in RStudio.
# Fetch ACS codebooks
DetailedTables <- load_variables(2023, "acs5", cache = TRUE)
SubjectTables <- load_variables(2023, "acs5/subject", cache = TRUE)
ProfileTables <- load_variables(2023, "acs5/profile", cache = TRUE)
Click any of the three data frames, and RStudio will open it under a tab in the upper-left pane. You'll be able to scroll through the codebook. The "name" column will contain the variable names. The "label" column will explain what the variable estimates (although not always terribly clearly). The "concept" column will give you the general topic that the variable fits into.
Some tips:
I recommend starting with the ProfileTables codebook. It’s the smallest of the three and contains the most-frequently-used variables. It also is the first place to look for variables that estimate percentages, rather than raw counts. For journalists, percentages often (but not always) are more helpful and informative than raw counts.
Clicking View / Panes / Zoom Source will give you a better view. So will using your mouse to grab the right border of the codebook’s “label” column and move the border farther to the right.
Paying attention to the concept column can be important. For example, the ProfileTables codebook contains variables for the United States and also for Puerto Rico. If you want to estimate something about a place in the U.S., make sure you’re not trying to do so with a variable for Puerto Rico, and vice versa. Another hint is that Puerto Rico variable names always end with “PR.”
Click the Filter icon to open a filter box at the top of each column. Typing a word or phrase into a box will filter the codebook for rows that contain, in that column, the word or phrase you typed. For example, here's the ProfileTables codebook, filtered for "United States" concept variables that estimate something about "language." Note the entry for the DP02_0114P variable, the one I used to estimate language diversity.
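You can do the same kind of filtering in code, too. Here's a sketch using dplyr's filter() and stringr's str_detect() and str_ends(), all loaded with the tidyverse; LanguageVars is a name I made up:

# A sketch: filter the ProfileTables codebook for language variables,
# dropping the Puerto Rico versions, whose names end with "PR"
LanguageVars <- ProfileTables %>%
  filter(str_detect(str_to_lower(label), "language"),
         !str_ends(name, "PR"))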
Changing acs5 to acs1 will get the codebooks for the single-year ACS instead of the five-year ACS. Single-year ACS datasets can be more up-to-date than five-year datasets. But they include data only for places that have 60,000 or more residents. Error margins on some single-year dataset variables also tend to be wider.
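For example, this sketch would fetch the 2023 single-year profile codebook (ProfileTables1yr is a name I made up):

# Fetch the single-year (acs1) profile codebook for 2023
ProfileTables1yr <- load_variables(2023, "acs1/profile", cache = TRUE)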
You also can pick a different year to get the single-year or five-year ACS dataset for that year. Right now, 2023 is the latest-available five-year dataset. It came out on Dec. 12, 2024. The single-year 2023 dataset is already out. See the Census Bureau’s data releases page for details.