The data come from the Association of Religion Data Archives (ARDA). To get started, let’s load the data.
library(readxl)  # provides read_excel()
GSS <- read_excel("GSS_Secular.xlsx")
The variables we will use are the following:
-Income_High = 1 if household income is $75,000+
-NONES = 1 if individual is unaffiliated with any organized religion
-Believe_God = 1 if individual is very confident that God exists
-attend_regular = 1 if individual regularly attends a worship service (2 or more times a month)
-college = 1 if individual has a 4-year college degree
Each observation (or row) is an individual person.
head(GSS, n=10)
## # A tibble: 10 x 7
## ID Income_High NONES Believe_God attend_regular college AGE
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 0 0 1 0 0 74
## 2 3 1 1 0 0 1 42
## 3 4 1 0 1 1 1 63
## 4 5 1 0 1 1 1 71
## 5 7 0 0 0 1 0 59
## 6 8 0 0 1 1 0 43
## 7 9 0 1 0 0 0 62
## 8 10 1 0 0 0 0 55
## 9 11 1 0 0 0 1 59
## 10 12 0 0 1 1 0 34
The first person has ID=2, does not have a college degree (college=0), and believes in God (Believe_God=1).
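To look up a particular person rather than reading down the printout, one option is to filter rows by ID. The sketch below uses a small stand-in data frame (called toy here, a name chosen only for illustration) that copies the first three rows printed above, so it runs without the Excel file. The same indexing should work on GSS itself.

```r
# Stand-in for GSS: the first three rows of the printed output above
toy <- data.frame(
  ID             = c(2, 3, 4),
  Income_High    = c(0, 1, 1),
  NONES          = c(0, 1, 0),
  Believe_God    = c(1, 0, 1),
  attend_regular = c(0, 0, 1),
  college        = c(0, 1, 1),
  AGE            = c(74, 42, 63)
)

# Keep only the row whose ID equals 3, and only the columns of interest
person <- toy[toy$ID == 3, c("ID", "Believe_God", "college")]
person
```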
Q1: Does the person with ID=8 believe in God? Does this person regularly attend worship services?
Let’s look at summary statistics for each variable.
summary(GSS)
## ID Income_High NONES Believe_God
## Min. : 2.0 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.: 606.8 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :1184.5 Median :0.0000 Median :0.0000 Median :1.0000
## Mean :1184.8 Mean :0.3549 Mean :0.2339 Mean :0.5354
## 3rd Qu.:1770.2 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:1.0000
## Max. :2347.0 Max. :1.0000 Max. :1.0000 Max. :1.0000
##
## attend_regular college AGE
## Min. :0.0000 Min. :0.0000 Min. :18.00
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:34.00
## Median :0.0000 Median :0.0000 Median :48.00
## Mean :0.3374 Mean :0.3332 Mean :48.85
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:63.00
## Max. :1.0000 Max. :1.0000 Max. :89.00
## NA's :5
How do you interpret the table? Because Income_High is coded 0/1, its mean is the proportion of the sample with a high household income ($75,000+). In other words, 35.49% of the sample has a high household income.
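The mean of a 0/1 variable can be read as a proportion because summing the values counts the 1s. A minimal sketch with made-up values (the vector x is hypothetical and not taken from the GSS):

```r
# Hypothetical indicator: 1 = has the trait, 0 = does not
x <- c(1, 0, 0, 1, 0, 0, 0, 1, 0, 0)

share_of_ones <- sum(x == 1) / length(x)  # count of 1s divided by sample size
mean_of_x     <- mean(x)                  # ordinary mean

share_of_ones
mean_of_x
all.equal(share_of_ones, mean_of_x)       # check that the two agree

# If a variable has missing values (AGE has 5 NA's above),
# mean() returns NA unless na.rm = TRUE is set
mean_no_na <- mean(c(1, 0, NA), na.rm = TRUE)
```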
Q2: What percent have a college degree? What percent believe in God?
We use the term religiosity to describe the strength of an individual’s religious beliefs. A challenge is how to measure it. There are three candidate variables in the dataset (Believe_God, NONES, attend_regular).
Q3: Which of these do you think best measures religiosity? Why?
One version of the secularization hypothesis posits that as income or education increase, religiosity should decrease.
Are high-income individuals less likely to regularly attend a worship service? We can start with a two-way table of income group by attendance.
table_income <- table(GSS$Income_High, GSS$attend_regular)
colnames(table_income) <- c("Doesnt_Attend", "Attend_Worship")
rownames(table_income) <- c("Low_Med_Income", "High_Income")
table_income
##
## Doesnt_Attend Attend_Worship
## Low_Med_Income 913 452
## High_Income 489 262
The table shows the number of individuals in each cell. For example, there are 452 individuals who have low-medium income and attend worship. There are a total of 1,365 people in the low-medium income group (913+452).
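The group total quoted above can be checked in code. This sketch rebuilds the counts from the printed table as a matrix (named counts here for illustration) so it runs without the data file. On the real object, rowSums(table_income) should give the same totals.

```r
# Counts copied from the printed two-way table
counts <- matrix(
  c(913, 452,
    489, 262),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Low_Med_Income", "High_Income"),
                  c("Doesnt_Attend", "Attend_Worship"))
)

# Total for one income group: add across that row
low_med_total <- sum(counts["Low_Med_Income", ])
low_med_total  # 1365, matching 913 + 452

# Totals for every row and every column at once
row_totals <- rowSums(counts)
col_totals <- colSums(counts)
```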
Q4: How many individuals are classified as High Income? How many high-income individuals attend worship services?
We can also look at the proportion of each income group that attends worship services. These are conditional (row) proportions rather than marginal ones. We use prop.table() (or, in newer versions of R, its alias proportions()); it takes table_income and conditions on the row (hence the 1), so each row sums to 1.
prop.table(table_income,1)
##
## Doesnt_Attend Attend_Worship
## Low_Med_Income 0.6688645 0.3311355
## High_Income 0.6511318 0.3488682
We see that 33.1% of low-medium income individuals attend worship services and about 66.9% do not.
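Conditioning on the row simply divides each cell by its row total. The sketch below redoes that arithmetic by hand on the printed counts (stored in an illustrative matrix called counts, not the real table object) and compares it with prop.table(). Changing the margin to 2 would instead condition on the column.

```r
# Counts copied from the printed two-way table
counts <- matrix(c(913, 452, 489, 262), nrow = 2, byrow = TRUE,
  dimnames = list(c("Low_Med_Income", "High_Income"),
                  c("Doesnt_Attend", "Attend_Worship")))

# By hand: each cell divided by the total of its row
by_hand <- counts / rowSums(counts)

# Same calculation with prop.table(), margin = 1 (rows)
row_props <- prop.table(counts, 1)

all.equal(by_hand, row_props)  # check that the two agree
rowSums(row_props)             # each row sums to 1

# margin = 2 conditions on the column instead: each column sums to 1
col_props <- prop.table(counts, 2)
colSums(col_props)
```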
Q5: What percent of high income individuals attend worship service?
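Based on these results, we don’t find evidence that higher income is associated with less religiosity: the two income groups attend at nearly the same rate, and the rate is, if anything, slightly higher in the high-income group.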
We can also present these results using a two-way bar graph. ggplot() is the function that builds the plot:
-data=GSS uses the GSS dataset for the plot
-aes() specifies what is on the y and x axes
-stat_summary() specifies to plot the mean (fun='mean') and to use a bar graph (geom='bar')
library(ggplot2)  # provides ggplot(), aes(), stat_summary()
ggplot(data=GSS, aes(y=attend_regular, x=as.factor(Income_High))) + stat_summary(fun='mean', geom='bar')
The x-axis divides the sample into low-med income (x=0) and high income (x=1). The y-axis is the proportion in each group that regularly attends a worship service. The height of each bar is the group mean and corresponds to the Attend_Worship column of the proportions table above. We can format the graph.
ggplot(GSS, aes(y=attend_regular, x=as.factor(Income_High))) +
  stat_summary(fun='mean', geom='bar') +
  scale_x_discrete(name="Income", labels=c("Low-Med Income", "High Income")) +
  ggtitle("Income and Religiosity") + ylab("Regular Attendance Worship Service")
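The bar heights can also be checked numerically. As a sketch, the code below rebuilds person-level data from the four counts in the two-way table (the data frame toy is a stand-in, not the GSS file) and computes the mean of attend_regular within each income group with tapply(). These group means are the values that stat_summary(fun='mean') draws as bars.

```r
# One row per person, reconstructed from the four printed cell counts
toy <- data.frame(
  Income_High    = rep(c(0, 0, 1, 1), times = c(913, 452, 489, 262)),
  attend_regular = rep(c(0, 1, 0, 1), times = c(913, 452, 489, 262))
)

# Mean of attend_regular within each income group (0 = low-med, 1 = high)
group_means <- tapply(toy$attend_regular, toy$Income_High, mean)
group_means
```

On the real data, tapply(GSS$attend_regular, GSS$Income_High, mean) should give the same values, provided neither variable has missing values.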
Q6: Is there evidence that higher levels of education lead to less religiosity? Present a two-way table and bar graph. Interpret the results.
Q7: Based on the income and education results, is there evidence in this analysis of the secularization hypothesis? Do you find these results surprising? Why or why not?