The Data

The data comes from the ARDA. To get started, let’s load the data.

library(readxl)  # provides read_excel()
GSS<-read_excel("GSS_Secular.xlsx")

The variables we will use are the following:

-Income_High = 1 if household income is $75,000+

-NONES = 1 if individual is unaffiliated with any organized religion

-Believe_God = 1 if individual is very confident that God exists

-attend_regular = 1 if individual regularly attends a worship service (2 or more times a month)

-college = 1 if individual has a 4-year college degree

Each observation (or row) is an individual person.

head(GSS, n=10)
## # A tibble: 10 x 7
##       ID Income_High NONES Believe_God attend_regular college   AGE
##    <dbl>       <dbl> <dbl>       <dbl>          <dbl>   <dbl> <dbl>
##  1     2           0     0           1              0       0    74
##  2     3           1     1           0              0       1    42
##  3     4           1     0           1              1       1    63
##  4     5           1     0           1              1       1    71
##  5     7           0     0           0              1       0    59
##  6     8           0     0           1              1       0    43
##  7     9           0     1           0              0       0    62
##  8    10           1     0           0              0       0    55
##  9    11           1     0           0              0       1    59
## 10    12           0     0           1              1       0    34

The first person has ID=2, does not have a college degree (college=0), and is very confident that God exists (Believe_God=1).
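To check one person's answers without scanning the printed rows, one option is to filter on ID. The sketch below rebuilds the first three rows of the output above as a small data frame (the name GSS_head is hypothetical, used only so the example runs without the Excel file); the same filter should work on GSS itself.

```r
# Hypothetical stand-in: the first three rows printed by head(GSS, n=10)
GSS_head <- data.frame(
  ID             = c(2, 3, 4),
  Income_High    = c(0, 1, 1),
  NONES          = c(0, 1, 0),
  Believe_God    = c(1, 0, 1),
  attend_regular = c(0, 0, 1),
  college        = c(0, 1, 1),
  AGE            = c(74, 42, 63)
)

# Keep only the row for ID=2 and the two columns of interest
person <- GSS_head[GSS_head$ID == 2, c("college", "Believe_God")]
person
```

Changing the ID value and the column names in the last step looks up any other person or variable.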

Q1: Does the person with ID=8 believe in God? Does this person regularly attend worship services?

Let’s look at summary statistics.

summary(GSS)
##        ID          Income_High         NONES         Believe_God    
##  Min.   :   2.0   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.: 606.8   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :1184.5   Median :0.0000   Median :0.0000   Median :1.0000  
##  Mean   :1184.8   Mean   :0.3549   Mean   :0.2339   Mean   :0.5354  
##  3rd Qu.:1770.2   3rd Qu.:1.0000   3rd Qu.:0.0000   3rd Qu.:1.0000  
##  Max.   :2347.0   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##                                                                     
##  attend_regular      college            AGE       
##  Min.   :0.0000   Min.   :0.0000   Min.   :18.00  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:34.00  
##  Median :0.0000   Median :0.0000   Median :48.00  
##  Mean   :0.3374   Mean   :0.3332   Mean   :48.85  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:63.00  
##  Max.   :1.0000   Max.   :1.0000   Max.   :89.00  
##                                    NA's   :5

How do you interpret the table? Because Income_High only takes the values 0 and 1, its mean is the proportion of the sample with a high household income ($75,000+). In other words, 35.49% of the sample has a high household income.
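The reason the mean of a 0/1 variable is a proportion can be seen with a small made-up example. The vector high_income below is hypothetical (it is not the GSS data); the sketch computes the share of 1s two ways and shows that they agree.

```r
# Hypothetical 0/1 indicator for ten people (not the GSS data)
high_income <- c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0)

# Mean of the 0/1 values
share_by_mean <- mean(high_income)

# Number of 1s divided by the sample size
share_by_count <- sum(high_income == 1) / length(high_income)

share_by_mean
share_by_count
100 * share_by_mean   # the same share expressed as a percentage
```

For a column with missing values, such as AGE (5 NA's in the summary above), mean() returns NA unless na.rm=TRUE is set.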

Q2: What percent have a college degree? What percent believe in God?

Measuring Religiosity

The term religiosity refers to the degree of an individual’s religious beliefs. A challenge is how to measure it. There are three candidate variables in the dataset (Believe_God, NONES, attend_regular).

Q3: Which of these do you think best measures religiosity? Why?

Secularization Hypothesis

One version of the secularization hypothesis posits that as income or education increase, religiosity should decrease.

Are high-income individuals less likely to regularly attend a worship service? We can create a two-way table that shows the number of individuals in each combination of income group and attendance.

table_income <-table(GSS$Income_High,GSS$attend_regular)
colnames(table_income)<-c("Doesnt_Attend", "Attend_Worship")
rownames(table_income)<-c("Low_Med_Income", "High_Income")
table_income
##                 
##                  Doesnt_Attend Attend_Worship
##   Low_Med_Income           913            452
##   High_Income              489            262

The table shows the total number for each cell. For example, there are 452 individuals who have low-medium income and attend worship. There are also a total of 1365 people in the low-medium income group (913+452).
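Group totals do not have to be added by hand. As a minimal sketch on made-up data (the vectors income and attend are hypothetical, not the GSS), rowSums() returns the number of people in each row, and addmargins() appends a "Sum" row and column to the table.

```r
# Hypothetical toy data (not the GSS): 0/1 income and attendance for eight people
income <- c(0, 0, 0, 0, 0, 1, 1, 1)
attend <- c(0, 0, 0, 1, 1, 0, 1, 1)

toy_table <- table(income, attend)
colnames(toy_table) <- c("Doesnt_Attend", "Attend_Worship")
rownames(toy_table) <- c("Low_Med_Income", "High_Income")

# Number of people in each income group
row_totals <- rowSums(toy_table)

# Same table with a "Sum" row and a "Sum" column added
with_totals <- addmargins(toy_table)

row_totals
with_totals
```

The same two calls should work on table_income.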

Q4: How many individuals are classified as High Income? How many high-income individuals attend worship services?

We can also look at the proportion of each income group that attends worship services. These are conditional (row) proportions: each cell is divided by its row total, so every row sums to 1. We use prop.table (or its alias proportions, available in R 4.0.0 and later), which takes table_income and conditions on the rows (hence the 1).

prop.table(table_income,1)
##                 
##                  Doesnt_Attend Attend_Worship
##   Low_Med_Income     0.6688645      0.3311355
##   High_Income        0.6511318      0.3488682

We see that 33.1% of low-medium income individuals attend worship services and about 66.9% do not.
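To see what the 1 does, the row proportions can be reproduced by hand. The sketch below re-enters the counts from the two-way table above as a matrix (so it runs without the Excel file), divides each row by its row total, and compares the result with prop.table. It also shows margin 2, which conditions on the columns and answers a different question (the share of attenders who are high income).

```r
# Counts copied from the two-way table above
table_income <- matrix(c(913, 452,
                         489, 262),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(c("Low_Med_Income", "High_Income"),
                                       c("Doesnt_Attend", "Attend_Worship")))

# Divide each row by its row total
by_hand <- table_income / rowSums(table_income)

# Row proportions: each row sums to 1
by_row <- prop.table(table_income, 1)

# Column proportions: each column sums to 1
by_col <- prop.table(table_income, 2)

all.equal(by_hand, by_row)
rowSums(by_row)
colSums(by_col)
```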

Q5: What percent of high-income individuals attend worship services?

Based on these results, we don’t find any evidence that higher income is associated with less religiosity.

We can also present these results using a two-way bar graph, built with ggplot() from the ggplot2 package:

-data=GSS uses the GSS dataset for the plot

-aes() specifies what is on the y and x axis

-stat_summary() specifies to plot the mean (fun='mean') as a bar graph (geom='bar')

library(ggplot2)  # provides ggplot(), aes(), stat_summary()
ggplot(data=GSS,aes(y=attend_regular,x=as.factor(Income_High)))+stat_summary(fun='mean',geom='bar')

The x-axis divides the sample into either low-med income (X=0) or high income (X=1). The y-axis is the proportion in each group that regularly attends a worship service. The height of each bar is the group mean and corresponds to the Attend_Worship column of the proportion table above. We can format the graph by labeling the axes and adding a title.

ggplot(GSS,aes(y=attend_regular,x=as.factor(Income_High)))+stat_summary(fun='mean', geom='bar') +
  scale_x_discrete(name="Income", labels=c("Low-Med Income", "High Income")) +
  ggtitle("Income and Religiosity") + ylab("Regular Attendance Worship Service")
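The bar heights in these graphs are group means of a 0/1 variable, so they should match the row proportions from prop.table. A minimal sketch on made-up data (the data frame toy is hypothetical, not the GSS) computes both without drawing a plot:

```r
# Hypothetical toy data (not the GSS)
toy <- data.frame(
  Income_High    = c(0, 0, 0, 0, 0, 1, 1, 1),
  attend_regular = c(0, 0, 0, 1, 1, 0, 1, 1)
)

# Mean of attend_regular within each income group: the bar heights
bar_heights <- tapply(toy$attend_regular, toy$Income_High, mean)

# Column "1" of the row-proportion table: share who attend in each income group
row_props <- prop.table(table(toy$Income_High, toy$attend_regular), 1)[, "1"]

bar_heights
row_props
```

Replacing Income_High with another 0/1 grouping variable gives the bar heights for that grouping.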

Q6: Is there evidence that higher levels of education lead to less religiosity? Present a two-way table and bar graph. Interpret the results.

Q7: Based on the income and education results, is there evidence in this analysis of the secularization hypothesis? Do you find these results surprising? Why or why not?