Setup

Load packages

library(ggplot2)
library(printr)

Load data

load("gss.Rdata")

Part 1: Data

General Social Survey (GSS)

Since 1972, the General Social Survey (GSS) has been monitoring societal change and studying the growing complexity of American society. The GSS aims to gather data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes; to examine the structure and functioning of society in general as well as the role played by relevant subgroups; to compare the United States to other societies in order to place American society in comparative perspective and develop cross-national models of human society; and to make high-quality data easily accessible to scholars, students, policy makers, and others, with minimal cost and waiting.

Survey Design

GSS has conducted 26 in-person, cross-sectional surveys of the adult household population of the U.S. Interviews have been conducted with a total of 51,020 respondents. The 1972-74 surveys used modified probability designs and the remaining surveys were completed using a full-probability sample design, producing a high-quality, representative sample of the adult population of the U.S.

Conclusions

  1. As our sample is a good representation of the adult household population of the US, we can use it to make inferences about that population.

  2. Considering this is an observational study (as opposed to an experimental study with random assignment), we will refrain from establishing causal relationships between variables.


Part 2: Research question

There’s been a lot of discussion about the way morals and attitudes about sex are changing in this country. If a man and woman have sex relations before marriage, do you think it is:

Now, how often do you attend religious events:

A lot of research has been done on religion and sex separately. However, it would be interesting to explore any association between the two. So, is how often a person attends religious events associated with their opinion about pre-marital sex? Let’s explore this question!


Part 3: Exploratory data analysis

Opinions of US Adults on Pre-Marital Sex:

The most common response is that pre-marital sex is not at all wrong, given by about 37% of respondents. This is a plurality rather than a majority.

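# Keep the two variables of interest and drop respondents with a missing answer
data <- gss[, c("premarsx", "attend")]
data_clean <- na.omit(data)
# Horizontal bar chart of response counts for premarsx
ggplot(data_clean, aes(x = premarsx)) + geom_bar() +
    coord_flip() +
    theme(axis.title.x = element_blank(), axis.title.y = element_blank())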

How often do US adults attend religious services?

The most common response is attending every week, given by about 25% of respondents. This is a plurality rather than a majority.

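# Rebuild the cleaned data so this chunk runs on its own
data <- gss[, c("premarsx", "attend")]
data_clean <- na.omit(data)
# Horizontal bar chart of response counts for attend
ggplot(data_clean, aes(x = attend)) + geom_bar() +
    coord_flip() +
    theme(axis.title.x = element_blank(), axis.title.y = element_blank())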
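To put numbers on the two bar charts, the share of respondents in each category can be computed directly. The sketch below is self-contained and assumes marginal totals obtained by summing the rows and columns of the two-way table of observed counts reported in this document. The vector names are illustrative and not part of the original analysis.

```r
# Marginal totals, summed from the two-way table of observed counts in this report
premarsx_totals <- c("Always Wrong" = 8489, "Almst Always Wrg" = 2851,
                     "Sometimes Wrong" = 5937, "Not Wrong At All" = 10346)
attend_totals <- c("Lt Once A Year" = 2558, "Once A Year" = 4400,
                   "Sevrl Times A Yr" = 4046, "Once A Month" = 2405,
                   "2-3X A Month" = 3013, "Nrly Every Week" = 1812,
                   "Every Week" = 6842, "More Thn Once Wk" = 2547)

# Share of respondents in each category
premarsx_share <- premarsx_totals / sum(premarsx_totals)
attend_share <- attend_totals / sum(attend_totals)

round(premarsx_share, 3)
round(attend_share, 3)

# A majority would require a share above 0.5
any(premarsx_share > 0.5)
any(attend_share > 0.5)
```

With the data loaded, `prop.table(table(data_clean$premarsx))` should give the same shares without hard-coding the totals.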

Two-Way Table (Observed Counts)

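Comparing only the first column (Lt Once A Year) with the last column (More Thn Once Wk), opinions on pre-marital sex run in opposite directions: Not Wrong At All is the most common response among those who attend less than once a year (1,456 of 2,558), while Always Wrong is the most common among those who attend more than once a week (1,870 of 2,547). Thus, there might be an association between the variables.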

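# Keep the two variables of interest and drop respondents with a missing answer
data <- gss[, c("premarsx", "attend")]
data_clean1 <- na.omit(data)
# Drop factor levels with no observations so they do not appear as empty rows or columns
data_clean1$premarsx <- droplevels(data_clean1$premarsx)
data_clean1$attend <- droplevels(data_clean1$attend)
# Two-way table of observed counts: opinion (rows) by attendance (columns)
table(data_clean1$premarsx, data_clean1$attend)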
                  Lt Once A Year  Once A Year  Sevrl Times A Yr  Once A Month  2-3X A Month  Nrly Every Week  Every Week  More Thn Once Wk
Always Wrong                 370          524               677           487           760              671        3130              1870
Almst Always Wrg             186          302               361           222           347              271         920               242
Sometimes Wrong              546         1069              1007           617           748              410        1339               201
Not Wrong At All            1456         2505              2001          1079          1158              460        1453               234
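Raw counts are hard to compare across columns because the attendance groups differ in size. A minimal sketch of the same comparison using column proportions follows. It re-enters the observed counts from the table above by hand, and the object names `observed` and `col_share` are illustrative.

```r
# Observed counts, copied from the two-way table above
observed <- matrix(
  c( 370,  524,  677,  487,  760, 671, 3130, 1870,
     186,  302,  361,  222,  347, 271,  920,  242,
     546, 1069, 1007,  617,  748, 410, 1339,  201,
    1456, 2505, 2001, 1079, 1158, 460, 1453,  234),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    premarsx = c("Always Wrong", "Almst Always Wrg",
                 "Sometimes Wrong", "Not Wrong At All"),
    attend = c("Lt Once A Year", "Once A Year", "Sevrl Times A Yr",
               "Once A Month", "2-3X A Month", "Nrly Every Week",
               "Every Week", "More Thn Once Wk")))

# Distribution of opinion within each attendance group (each column sums to 1)
col_share <- prop.table(observed, margin = 2)

# Compare the least and most frequent attendance groups
round(col_share[, c("Lt Once A Year", "More Thn Once Wk")], 2)
```

Among those attending less than once a year, about 57% answer Not Wrong At All and about 14% answer Always Wrong. Among those attending more than once a week, the pattern reverses to about 9% and 73%.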

Part 4: Inference

Now we need to evaluate the relationship between two categorical variables, both with more than 2 levels:

  1. Opinion on pre-marital sex
  2. How often does one attend religious events

Thus, we will perform a Chi-Square Test of Independence:

Ho: Opinion on pre-marital sex and number of times a person attends a religious event are independent of each other.

Ha: Opinion on pre-marital sex and number of times a person attends a religious event are not independent of each other.

Checking Conditions:

  1. Independence
    • Is it a random sample? Yes
    • Is the sample size (n) less than 10% of the population? Yes
    • Does each observation contribute to just 1 cell in the two-way table? Yes
  2. Sample Size
    • Does each cell have at least 5 expected cases? Yes
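The sample-size condition can be checked numerically rather than by eye. The sketch below assumes the observed counts from the two-way table in Part 3, re-entered by hand so the block runs on its own, and reads the expected counts from `chisq.test()`. The object names are illustrative.

```r
# Observed counts, copied from the two-way table in Part 3
observed <- matrix(
  c( 370,  524,  677,  487,  760, 671, 3130, 1870,
     186,  302,  361,  222,  347, 271,  920,  242,
     546, 1069, 1007,  617,  748, 410, 1339,  201,
    1456, 2505, 2001, 1079, 1158, 460, 1453,  234),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    premarsx = c("Always Wrong", "Almst Always Wrg",
                 "Sometimes Wrong", "Not Wrong At All"),
    attend = c("Lt Once A Year", "Once A Year", "Sevrl Times A Yr",
               "Once A Month", "2-3X A Month", "Nrly Every Week",
               "Every Week", "More Thn Once Wk")))

# Total number of respondents in the table
sum(observed)

# Expected counts under independence, as computed by chisq.test()
expected <- chisq.test(observed)$expected

# Smallest expected count, and whether every cell meets the threshold of 5
min(expected)
all(expected >= 5)
```

The smallest expected count is about 187 (Almst Always Wrg, Nrly Every Week), far above the threshold of 5.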

Chi-Square Test:

Finding the expected counts. The observed counts were calculated in Part 3: Exploratory data analysis:

chisq <- chisq.test(y = data_clean$attend,x = data_clean$premarsx)
round(chisq$expected,0)
                  Lt Once A Year  Once A Year  Sevrl Times A Yr  Once A Month  2-3X A Month  Nrly Every Week  Every Week  More Thn Once Wk
Always Wrong                 786         1352              1243           739           926              557        2103               783
Almst Always Wrg             264          454               418           248           311              187         706               263
Sometimes Wrong              550          946               870           517           648              389        1471               547
Not Wrong At All             958         1648              1515           901          1128              679        2563               954

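\[\chi^2 = \sum_{i=1}^{k} \frac{(O_i-E_i)^2}{E_i}\]

\[E_i = \frac{Row\;Total_i \times Column\;Total_i}{Table\;Total}\]

\[Degrees\;of\;Freedom = (Rows-1) \times (Columns-1)\]

Where:

  • \(O_i\): Observed count in cell i
  • \(E_i\): Expected count in cell i
  • \(Row\;Total_i\), \(Column\;Total_i\): Totals of the row and column that contain cell i
  • \(k\): Number of cells (here 4 × 8 = 32)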
print(chisq)
## 
##  Pearson's Chi-squared test
## 
## data:  data_clean$premarsx and data_clean$attend
## X-squared = 5624.8, df = 21, p-value < 2.2e-16
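The formulas above can also be applied by hand and compared with the built-in test. This is a sketch under the assumption that the observed counts are those printed in Part 3, re-entered manually here. If they match the data passed to `chisq.test()`, the manual statistic should agree with the reported one up to rounding. The object names are illustrative.

```r
# Observed counts, copied from the two-way table in Part 3
observed <- matrix(
  c( 370,  524,  677,  487,  760, 671, 3130, 1870,
     186,  302,  361,  222,  347, 271,  920,  242,
     546, 1069, 1007,  617,  748, 410, 1339,  201,
    1456, 2505, 2001, 1079, 1158, 460, 1453,  234),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    premarsx = c("Always Wrong", "Almst Always Wrg",
                 "Sometimes Wrong", "Not Wrong At All"),
    attend = c("Lt Once A Year", "Once A Year", "Sevrl Times A Yr",
               "Once A Month", "2-3X A Month", "Nrly Every Week",
               "Every Week", "More Thn Once Wk")))

# E = row total * column total / table total, for every cell at once
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Compare the manual statistic with the built-in test on the same counts
builtin <- chisq.test(observed)
c(manual = chi_sq, builtin = unname(builtin$statistic), df = df)
```

No continuity correction is involved, since `chisq.test()` applies it only to 2 × 2 tables.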

Observations:

  • The \(\chi^2\) statistic is 5624.8
  • The degrees of freedom are 21
  • The p-value, the probability of observing a \(\chi^2\) statistic at least as extreme as the one observed if Ho is true, is below 2.2e-16 (effectively 0)
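As a quick sanity check on these observations, the degrees of freedom and p-value can be recomputed from the reported statistic alone. The sketch below assumes the 4 × 8 table dimensions and the statistic of 5624.8 from the output above. The 5% critical value is added for scale and is not part of the original analysis.

```r
# Reported statistic and table dimensions (4 opinion levels, 8 attendance levels)
chi_sq <- 5624.8
df <- (4 - 1) * (8 - 1)

# Upper-tail probability under Ho; too small to be represented exactly
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
p_value < 2.2e-16

# Critical value at the 5% significance level, roughly 32.7 for df = 21
qchisq(0.95, df = df)
```

The threshold 2.2e-16 is R's machine epsilon (`.Machine$double.eps`), which is why the test output prints the p-value as `< 2.2e-16` rather than as an exact number.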

Conclusions:

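Since the p-value is below 0.05, we reject Ho at the 5% significance level. The data provide strong evidence that opinion on pre-marital sex and number of times a person attends a religious event are associated. Because this is an observational study, the result shows association only, not that one causes the other.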

Note: The chi-square test of independence does not estimate a single population parameter, so there is no associated confidence interval to report.