The Data

The data come from Social Explorer, which is subscription based. To access it, go to Ryan Clement’s data site and click “Social Explorer.” Let’s load the data:

library(tidyverse)  # attaches dplyr (arrange) and ggplot2 (ggplot), both used below
SE <- readxl::read_excel("SE_data.xlsx")

The variables we will use are the following:

-adherents = percentage of population that is affiliated with a religious body (includes members, children, and those who regularly attend a worship service)

-evangelical = percentage of population that identifies as Christian evangelical

-crime_violent = total violent crimes (murder, manslaughter, rape, aggravated assault) per 100,000

-crime_property = total property crimes (burglaries, larcenies, motor vehicle thefts) per 100,000

-crime_hate = total hate crimes (anti-race, anti-ethnicity, anti-ancestry, anti-religious, anti-sexual orientation, anti-disability, anti-gender) per 100,000

-income_median = Median Household Income (In 2018 Inflation Adjusted Dollars)

-covid_cases = Cumulative Confirmed Cases Rate per 100,000

-covid_deaths = Cumulative Death Rate per 100,000

Explore the Data

Each observation (or row) is a U.S. state or the District of Columbia.

head(SE, n=10)
## # A tibble: 10 x 10
##    Geo_FIPS Geo_NAME             adherents evangelical crime_violent crime_property
##       <dbl> <chr>                    <dbl>       <dbl>         <dbl>          <dbl>
##  1        1 Alabama                   62.9       42.0           520.          2817.
##  2        2 Alaska                    33.9       14.2           885.          3300.
##  3        4 Arizona                   37.2       11.9           475.          2677.
##  4        5 Arkansas                  55.4       39.0           544.          2913.
##  5        6 California                45.0        9.40          447.          2380.
##  6        8 Colorado                  37.8       12.0           397.          2672.
##  7        9 Connecticut               51.2        4.40          207.          1681.
##  8       10 Delaware                  41.8        7.20          424.          2324.
##  9       11 District Of Columbia      55.2       12.5           996.          4374.
## 10       12 Florida                   39.1       16.2           385.          2282.
## # … with 4 more variables: crime_hate <dbl>, income_median <dbl>,
## #   covid_cases <dbl>, covid_deaths <dbl>

The first state is Alabama, where 63% of the population are religious adherents and 42% identify as Christian evangelical.
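To read off the values for a single state without scanning the printout, you can subset by name. The sketch below is a minimal, self-contained illustration: `SE_head` is a hypothetical stand-in built from three rows of the printed output above (values rounded as displayed), not the full dataset. With the real data you would use `SE` in its place.

```r
# Hypothetical stand-in for SE: three rows copied from the head() printout above
SE_head <- data.frame(
  Geo_NAME    = c("Alabama", "Alaska", "Arizona"),
  adherents   = c(62.9, 33.9, 37.2),
  evangelical = c(42.0, 14.2, 11.9)
)

# Keep only the row for one state (base R; dplyr's filter() does the same job)
alabama <- subset(SE_head, Geo_NAME == "Alabama")
print(alabama)
```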

Q1: What is the adherent rate in California? What percentage identify as evangelical?

Let’s look at summary statistics:

summary(SE)
##     Geo_FIPS       Geo_NAME           adherents      evangelical    
##  Min.   : 1.00   Length:51          Min.   :27.63   Min.   : 2.281  
##  1st Qu.:16.50   Class :character   1st Qu.:40.45   1st Qu.: 9.489  
##  Median :29.00   Mode  :character   Median :50.83   Median :12.922  
##  Mean   :28.96                      Mean   :48.47   Mean   :15.945  
##  3rd Qu.:41.50                      3rd Qu.:55.33   3rd Qu.:19.120  
##  Max.   :56.00                      Max.   :79.11   Max.   :42.041  
##                                                                     
##  crime_violent   crime_property   crime_hate      income_median  
##  Min.   :112.1   Min.   :1248   Min.   : 0.1674   Min.   :43567  
##  1st Qu.:241.5   1st Qu.:1673   1st Qu.: 0.9492   1st Qu.:53178  
##  Median :350.5   Median :2282   Median : 1.5990   Median :59116  
##  Mean   :378.0   Mean   :2246   Mean   : 2.4992   Mean   :60621  
##  3rd Qu.:457.7   3rd Qu.:2674   3rd Qu.: 2.4815   3rd Qu.:68392  
##  Max.   :995.9   Max.   :4374   Max.   :25.4821   Max.   :82604  
##                                 NA's   :2                        
##   covid_cases     covid_deaths   
##  Min.   : 2472   Min.   : 35.02  
##  1st Qu.: 9583   1st Qu.:131.38  
##  Median :10641   Median :175.85  
##  Mean   :10157   Mean   :169.21  
##  3rd Qu.:11441   3rd Qu.:205.55  
##  Max.   :14641   Max.   :295.58  
## 
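Each numeric column of this table can be reproduced with a few base R functions, which may help when reading it. The sketch below uses `adherents_10`, a hypothetical vector holding only the ten adherent rates shown in the `head()` printout earlier (rounded as displayed), so its results will not match the 51-row table above.

```r
# Adherent rates for the ten rows printed by head(SE, n = 10); not the full data
adherents_10 <- c(62.9, 33.9, 37.2, 55.4, 45.0, 37.8, 51.2, 41.8, 55.2, 39.1)

# The pieces that summary() reports for a numeric column
avg <- mean(adherents_10)    # unweighted average across rows
rng <- range(adherents_10)   # smallest and largest values
med <- median(adherents_10)  # middle value

# na.rm = TRUE skips missing values, as summary() does for columns with NA's
avg_no_na <- mean(c(adherents_10, NA), na.rm = TRUE)

print(c(mean = avg, min = rng[1], max = rng[2], median = med))
```

The mean here weights every row equally: it is an average of state values, not a population-weighted national figure.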

How do you interpret the table? Take income_median as an example: across the 50 states plus the District of Columbia, the average of the state median household incomes is $60,621. The lowest median income is $43,567 and the highest is $82,604. We can order the states from highest to lowest income:

arrange(SE, desc(income_median))
## # A tibble: 51 x 10
##    Geo_FIPS Geo_NAME             adherents evangelical crime_violent crime_property
##       <dbl> <chr>                    <dbl>       <dbl>         <dbl>          <dbl>
##  1       11 District Of Columbia      55.2       12.5           996.          4374.
##  2       24 Maryland                  41.8       12.0           469.          2033.
##  3       34 New Jersey                54.7        4.33          208.          1405.
##  4       15 Hawaii                    41.3        9.58          249.          2870.
##  5       25 Massachusetts             57.2        3.43          338.          1263.
##  6        2 Alaska                    33.9       14.2           885.          3300.
##  7        9 Connecticut               51.2        4.40          207.          1681.
##  8       33 New Hampshire             35.2        3.58          173.          1248.
##  9       51 Virginia                  44.8       19.1           200.          1666.
## 10        6 California                45.0        9.40          447.          2380.
## # … with 41 more rows, and 4 more variables: crime_hate <dbl>,
## #   income_median <dbl>, covid_cases <dbl>, covid_deaths <dbl>

The District of Columbia has the highest median income, followed by Maryland and New Jersey. If you go to the bottom of the table (click on “5” or “6”), you will see that Arkansas, West Virginia, and Mississippi have the lowest median incomes.
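Instead of paging to the bottom of the table, you can sort in ascending order and keep the first few rows. The sketch below is an illustration under assumptions: `SE_head` is a hypothetical stand-in holding the ten rows from the `head()` printout, and it sorts on `crime_violent` because the `income_median` values are not visible in that printout. With the full data, the same pattern applied to `SE$income_median` would list the lowest-income states.

```r
# Hypothetical stand-in for SE: the ten rows shown by head(SE, n = 10)
SE_head <- data.frame(
  Geo_NAME = c("Alabama", "Alaska", "Arizona", "Arkansas", "California",
               "Colorado", "Connecticut", "Delaware", "District Of Columbia",
               "Florida"),
  crime_violent = c(520, 885, 475, 544, 447, 397, 207, 424, 996, 385)
)

# order() returns row positions from the smallest to the largest value
ascending <- SE_head[order(SE_head$crime_violent), ]

# The first three rows are the three lowest values in this subset
lowest3 <- head(ascending, n = 3)
print(lowest3)
```

With dplyr attached, `arrange(SE, income_median)` sorts in ascending order in the same way.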

Q2: Which states have the highest rates of adherence? Which states have the highest percentage of evangelicals?

Religion and Crime

Do states with a higher rate of religious adherence have lower or higher rates of crime?
One of the best ways to examine this association is to use a scatterplot.

ggplot(data = SE, aes(x = adherents, y = crime_violent)) + 
  geom_point()

Each dot is a state. The x-axis is the percentage of adherents, while the y-axis is the number of violent crimes per 100,000. With just the scatterplot, it can be difficult to discern whether there is a pattern. Let’s add a linear regression line.

ggplot(data = SE, aes(x = adherents, y = crime_violent)) + 
  geom_point() +  geom_smooth(method ='lm')
## `geom_smooth()` using formula 'y ~ x'

It looks like there is a slightly positive association between religious adherence and violent crime: as adherence increases, violent crime tends to increase slightly. Let’s format the figure.

ggplot(data = SE, aes(x = adherents, y = crime_violent)) + 
  geom_point() +  geom_smooth(method ='lm') +
  ggtitle("Religiosity and Violent Crime") +
  xlab("Pct Religious Adherent") +
  ylab("Violent Crimes per 100,000") 
## `geom_smooth()` using formula 'y ~ x'

We can also estimate the linear regression to see how “strong” the association is between adherents and violent crime.

result.1<-lm(crime_violent ~ adherents, data = SE)
summary(result.1)
## 
## Call:
## lm(formula = crime_violent ~ adherents, data = SE)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -246.88 -142.90  -29.97   77.71  611.82 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 333.9191   123.4762   2.704  0.00938 **
## adherents     0.9089     2.4929   0.365  0.71698   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 181.1 on 49 degrees of freedom
## Multiple R-squared:  0.002706,   Adjusted R-squared:  -0.01765 
## F-statistic: 0.1329 on 1 and 49 DF,  p-value: 0.717

We see that the slope of the regression line is 0.9089: each additional percentage point of adherence is associated with about 0.91 more violent crimes per 100,000. The slope is NOT statistically significant at the 5% level (p-value = 0.717), which is why there are no stars (*) in its row. Given that it is not statistically significant, we infer that there is no evidence that greater adherence is associated with more or less violent crime.
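The numbers quoted above can also be pulled out of a fitted model directly rather than read off the printout. The sketch below is a minimal illustration on `SE_head`, a hypothetical ten-row stand-in built from the `head()` printout (rounded values), so its estimates will differ from those of `result.1`. The object names `fit`, `coefs`, `slope`, `p_value`, `r_squared`, and `r` are introduced here for illustration only.

```r
# Hypothetical stand-in for SE: two columns from the ten rows shown by head()
SE_head <- data.frame(
  adherents     = c(62.9, 33.9, 37.2, 55.4, 45.0, 37.8, 51.2, 41.8, 55.2, 39.1),
  crime_violent = c(520, 885, 475, 544, 447, 397, 207, 424, 996, 385)
)

fit <- lm(crime_violent ~ adherents, data = SE_head)

# Coefficient table: one row per term; columns for estimate, std. error, t, p
coefs   <- coef(summary(fit))
slope   <- coefs["adherents", "Estimate"]
p_value <- coefs["adherents", "Pr(>|t|)"]

# With a single predictor, R-squared equals the squared correlation
r_squared <- summary(fit)$r.squared
r         <- cor(SE_head$adherents, SE_head$crime_violent)

print(c(slope = slope, p_value = p_value, r_squared = r_squared, r = r))
```

For the full data, the reported R-squared of 0.002706 corresponds to a correlation of about 0.05, which is another way of seeing how weak the association is.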

Q3: Using both a scatter plot with a linear regression line AND a linear regression, is there any evidence that evangelical Christianity is associated with more or less property crime?

Q4: Based on your results for Q3, can we infer that Evangelical Christianity is causing a change in property crime? Why or why not?

Q5: Using the methods above, is there evidence that religiosity is associated with higher income?

Q6: Based on the results for crime and income, should society be encouraging religious participation?