Religion in the United States

Religion plays a vital role in many American households. Although it is a controversial topic, it is part of many individuals' and families' lives. With newer generations, however, religion appears to be a less important factor in daily life.

That is only an observation. Can religion data from the United States give us insight into congregational adherence and religiosity? A few related questions follow from this one.

Religion data from SocialExplorer will be analyzed and plotted to answer these questions and understand congregational adherence over time in the United States.

Reading in the Dataset

library(readr)
library(dplyr)

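# Read the county-level religion data exported from SocialExplorer
religion_2010 <- read_csv("relig2010.csv") %>%
  # Build an integer GEOID key from the FIPS code for the later join
  mutate(GEOID = parse_integer(FIPS)) %>%
  rename(Total_Adherents = 'Total Adherents - Major Religions')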

head(religion_2010)

To start, the data is read in from a CSV exported from SocialExplorer. It covers every state in the US at the county level and shows congregational adherence in 2010, that is, the share of each county's population that adheres to a major religion.
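County FIPS codes are usually stored as five-character text, and their leading zeros disappear once they are converted to integers. That is harmless as long as both tables in a join are converted the same way. A minimal sketch of the effect, using a few example codes typed in by hand (the vector `fips_chr` is illustrative and not read from the CSV):

```r
# Example FIPS codes stored as text, as they often appear in CSV files
fips_chr <- c("01001", "06051", "36059")

# Converting to integer drops the leading zero ("01001" becomes 1001)
fips_int <- as.integer(fips_chr)

# Padding back to five digits recovers the original text form
fips_back <- sprintf("%05d", fips_int)

print(fips_int)
print(fips_back)
```

Keeping the key as an integer on both sides, as done here, avoids mismatches between "06051" and "6051".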

Obtaining County-Level Spatial Shapefiles & Data Preparation

library(tigris)
library(tidyverse)
library(sf)
library(spdep)
library(tmaptools)

options(tigris_class = 'sf')
us_counties <- counties(cb = TRUE)

us_counties_plot <- us_counties %>%
    mutate(GEOID = parse_integer(GEOID)) %>%
    left_join(religion_2010, by = "GEOID") %>%
    filter(!(STATEFP %in% c("02","15","60","66","69","72","78")))
           
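# Dissolve counties into state outlines. aggregate_map() is not available in
# recent tmaptools releases, so this uses dplyr and sf instead.
border_lines <- us_counties %>%
  group_by(STATEFP) %>%
  summarise()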

The tigris package provides county-level spatial files for the United States. The county-level religion data can then be joined to these shapes on the GEOID code. The resulting data frame is used to plot the data and answer the research questions for this analysis.
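Because the join is a left join, every county shape is kept, and any county without a matching row in the religion table ends up with NA values. A small sketch of that behavior using toy tables (all names and numbers below are invented for illustration):

```r
library(dplyr)

# Toy stand-ins for the county shapes and the religion table (values invented)
counties_toy <- tibble(GEOID = c(1001L, 1003L, 6051L), NAME = c("A", "B", "C"))
religion_toy <- tibble(GEOID = c(1001L, 6051L), Adher_Rate = c(55.2, 61.0))

# left_join keeps all rows of counties_toy; unmatched rows get NA
joined <- left_join(counties_toy, religion_toy, by = "GEOID")

# anti_join lists the counties that found no match in the religion table
unmatched <- anti_join(counties_toy, religion_toy, by = "GEOID")

print(joined)
print(unmatched)
```

Running the same `anti_join()` check on the real tables would be one way to see which counties will appear as missing on the map.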

Adherence Rate by % of Population - How Faithful is the United States?

library(tmap)

tm_shape(us_counties_plot, projection = 2163) +
  tm_polygons('Major Religions - Adherence Rate (% of Population)',
              palette = '-RdBu', midpoint = 20, border.alpha = .5,
              title = 'Adherence Rate: % of Pop.') +
  tm_shape(border_lines) +
  tm_borders(col = 'black', alpha = 1, lwd = .32) +
  tm_style('cobalt') +
  tm_layout(frame = TRUE) +
  tm_layout(panel.labels = 'Adherence Rates by % of Population')

The 2010 map still shows high levels of congregational adherence. Rates are noticeably higher in the midwestern, southern, and mid-northern states. These areas are known to have larger conservative populations, which could explain why religious faith remains important there today.

A few other observations stand out from this map. New York and California, two states known to be progressive, each have a few counties with noticeably high adherence rates. In general, however, the West and East Coasts appear to have considerably lower adherence rates.

Investigating New York

ny <- us_counties_plot %>%
  filter(STATEFP == '36')

tm_shape(ny) +
  tm_polygons('Major Religions - Adherence Rate (% of Population)',
              palette = '-RdBu', midpoint = 20, border.alpha = .5,
              title = 'Adherence Rate: % of Pop.') +
  tm_style('cobalt') +
  tm_layout(frame = TRUE) +
  tm_facets('STATEFP') +
  tm_layout(panel.labels = 'New York Adherence Rates - 2010') +
  tm_text("NAME", size = "AREA")

The New York map was very surprising. Adherence remains relatively high throughout the state. Even allowing for the data being from 2010, I expected to see much less red. New York City, visible right next to Nassau County, has very low rates, as I would expect. Upstate New York is known to be more conservative and religious, but I did not expect it to this degree. Even Nassau and Suffolk have noticeably high adherence rates!

Investigating California

ca <- us_counties_plot %>%
  filter(STATEFP == '06')

tm_shape(ca) +
  tm_polygons('Major Religions - Adherence Rate (% of Population)',
              palette = '-RdBu', midpoint = 20, border.alpha = .5,
              title = 'Adherence Rate: % of Pop.') +
  tm_style('cobalt') +
  tm_layout(frame = TRUE) +
  tm_facets('STATEFP') +
  tm_layout(panel.labels = 'California Adherence Rates - 2010') +
  tm_text("NAME", size = "AREA")

California's adherence rates surprised me as well. They were much higher than I expected, although still much lower across the state than in New York. One county in particular stands out: Mono County. Northern California sees very low rates, while southern California sees middle to high rates, which was still surprising.

Comparing the Progressive States with a Conservative One, Texas

tx <- us_counties_plot %>%
  filter(STATEFP == '48')

tm_shape(tx) +
  tm_polygons('Major Religions - Adherence Rate (% of Population)',
              palette = '-RdBu', midpoint = 20, border.alpha = .5,
              title = 'Adherence Rate: % of Pop.') +
  tm_style('cobalt') +
  tm_layout(frame = TRUE) +
  tm_facets('STATEFP') +
  tm_layout(panel.labels = 'Texas Adherence Rates - 2010')

Comparing New York and California to Texas gives an interesting perspective on the data. Southern states are known to be highly conservative with stronger religious values, so this plot isn't shocking. Texas is made up of many smaller counties, and the majority of the state sits at an adherence rate of 40 to 60% or more.

Summary

Despite the data being from 2010, there were some interesting things discovered about states and the faith of those living in them. While I expected much higher rates for the south, there was a very high concentration of people still attending in the mid-northern states, as well as the mid-west.

It was also very surprising to see New York with high rates of adherence despite being known as a more progressive state. New York seems to be much more in line with Texas. Seeing New York's rates spread similarly to those of Texas was not something I expected when plotting the data.

After analyzing California more closely, it was surprising to see many counties with congregational adherence rates of 40 to 60% or more. Further north, however, these rates decline. One county in particular, Mono County, had rates much higher than any other county in California.

If more recent data were collected, I believe it would show reduced rates across the country, especially in New York and California, as newer generations have grown older and religion is not as big a staple in American lives as it used to be.

Displaying the Data Differently - Spatial or Non-Spatial?

library(ggplot2)

us_counties_bar <-  us_counties_plot %>%
  rename(State_Abbv = 'State Abbreviation') %>%
  rename(Adher_Rate = 'Major Religions - Adherence Rate (% of Population)')


ggplot(us_counties_bar, aes(State_Abbv, Adher_Rate, color = State_Abbv)) +
  geom_point() +
  ggtitle('Adherence Rate by % of Population - United States 2010') +
  xlab('State') +
  ylab('Adherence Rate')

When trying to plot nearly 50 states non-spatially, the chart becomes very messy and hard to interpret. We must rely heavily on the legend, and the individual values on the graph are hard to decipher.

For data that is read first at the state level and then at the county level, spatial plotting appears to be the better method.
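If a non-spatial view is still wanted, one option is to collapse the counties to a single value per state before plotting. The sketch below uses invented numbers, and the median is my own choice of summary rather than something used elsewhere in this analysis:

```r
library(dplyr)

# Invented county-level rates for three states
rates_toy <- tibble(
  State_Abbv = c("NY", "NY", "CA", "CA", "TX", "TX", "TX"),
  Adher_Rate = c(50, 60, 30, 40, 55, 65, 75)
)

# One row per state: median county rate, sorted from highest to lowest
state_summary <- rates_toy %>%
  group_by(State_Abbv) %>%
  summarise(Median_Rate = median(Adher_Rate), .groups = "drop") %>%
  arrange(desc(Median_Rate))

print(state_summary)
```

With the real spatial data frame, the geometry column would likely need to be dropped first (for example with `sf::st_drop_geometry()`) so that the summary does not also merge the county shapes.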

A comparison: cb = FALSE vs. cb = TRUE

options(tigris_class = 'sf')
us_counties_2 <- counties(cb = FALSE)

us_counties_plot_2 <- us_counties_2 %>%
    mutate(GEOID = parse_integer(GEOID)) %>%
    left_join(religion_2010, by = "GEOID") %>%
    filter(!(STATEFP %in% c("02","15","60","66","69","72","78")))
           
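# Dissolve counties into state outlines. aggregate_map() is not available in
# recent tmaptools releases, so this uses dplyr and sf instead.
border_lines_2 <- us_counties_2 %>%
  group_by(STATEFP) %>%
  summarise()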

tm_shape(us_counties_plot_2, projection = 2163) +
  tm_polygons('Major Religions - Adherence Rate (% of Population)',
              palette = '-RdBu', midpoint = 20, border.alpha = .5,
              title = 'Adherence Rate: % of Pop.') +
  tm_shape(border_lines_2) +
  tm_borders(col = 'black', alpha = 1, lwd = .32) +
  tm_style('cobalt') +
  tm_layout(frame = TRUE) +
  tm_layout(panel.labels = 'Adherence Rates by % of Population')

Setting 'cb = FALSE' on the county-level data makes the map take much longer to load, because tigris retrieves the full-detail TIGER/Line shapefile. With 'cb = TRUE', tigris uses a generalized cartographic boundary file instead, which saves a lot of time when developing plots at the expense of a less detailed map.
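The trade-off comes down to how many vertices each boundary carries. The sketch below illustrates the general idea of generalization on a toy shape with `sf::st_simplify()`. It is not the procedure the Census Bureau uses to produce its cartographic boundary files, and the tolerance value is an arbitrary assumption:

```r
library(sf)

# A detailed toy shape: a circle approximated by many short segments
detailed <- st_buffer(st_sfc(st_point(c(0, 0))), dist = 1, nQuadSegs = 60)

# A generalized version: st_simplify drops vertices within the given tolerance
generalized <- st_simplify(detailed, dTolerance = 0.1)

# Count the vertices in each version
n_detailed <- nrow(st_coordinates(detailed))
n_generalized <- nrow(st_coordinates(generalized))

print(c(detailed = n_detailed, generalized = n_generalized))
```

Fewer vertices mean smaller files and faster drawing, which is why a generalized file is usually sufficient for a national-scale map like the ones above.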