Statistical inference with the GSS data

Setup

Load packages

library(ggplot2)
library(dplyr)
library(statsr)
library(tidyr)
library(knitr)
library(tidyverse)

Load data

load("gss.Rdata")

Part 1: Data

The General Social Survey (GSS) by NORC at the University of Chicago claims to be a comprehensive, unbiased perspective on American opinion on contemporary issues and trends ¹. The survey ensued in 1972 and continues to date while trying to improve and ameliorate bias with each iteration. The GSS has employed a biennial, dual sample design since 1994, i.e., the GSS conducts the survey in even years with two samples.

Study Design

The GSS, since its inception, has been using a rotation scheme as a means to alleviate bias introduced due to question order. The intrinsic ordering of questions can have a primacy and recency effect, which distort response behavior. ² From a statistical standpoint, rotation is advantageous when data needs to be aggregated. In retrospect, rotation doesn’t work when a correlation analysis needs to be conducted. The recency effect will be more apparent as participants will tend to associate newer options with the presented scale. ³

The rotation scheme also allows the survey to include “Permanent items” and “Regular items” as each set of items only appeared two-thirds of the time. A noted disadvantage of this approach was that it results in gaps (or NAs) in the data. However, the impact of such gaps can be reduced during aggregation.

The rotation, the across-time design was discontinued in 1987 as it causes an imbalance in the time-series. According to GSS Appendix Q⁴, “a split-ballot design was introduced, which splits the rotations”across random ballots (or sub-samples) within each survey rather than across surveys (and years)." Figure 1 shows how split-ballot rotation took place, starting in 1988.

Fig. 1: Split-ballot designed introduced in 1988. Splits are done across ballots and not surveys/years. (Source: GSS Codebook Appendix Q)

Sampling design

According to Appendix A⁵, survey samples through 1972 to 1974 followed a modified probability sample which was then conducted alongside a full probability sample design for 1975 to 1977 surveys. The variable sample denotes this separation. The appendix also states, “investigators who have applied statistical tests to previous General Social Survey data should continue to apply those tests.” For simplicity in this project, we will assume a random sample for years 1972 through 1977. Finally, the years 1977 and above used a full probability sample rather than this transition methodology, making it safe to assume a random sample.

Causality

Since there is no random assignment, blocking, or control involved, the study can only be deemed as observation (as opposed to an experiment). Since there is no guarantee that all confounding variables can be examined or measured in an observational study, no causal relationship can be established.

Generalizability

The survey was only conducted in English till 2006, any statistics involving non-English speaking participants until 2006 should note this. Furthermore, the survey was conducted in Spanish in addition to English speakers, which is said to include “60-65%” of the language exclusions.

The sample is conducted as a multi-stage sample with further block quotas. Each block constitutes of a quota based on sex, age, and employment status. To reduce the bias introduced by “not-at-home” cases, interviewers are instructed to interview only after 1500 hrs on weekdays or during weekends and/or holidays. The timings are based on empirical evidence from previous surveys.

The primary sampling units are Standard Metropolitan Statistical Areas and counties, which are again stratified by race, region, and age. Within these primary sampling units, blocks were selected with probabilities proportional to size. Furthermore, interviewers have been advised to start from the northwest corner of the block and continue until the block’s quota has been met.

According to Appendix B ⁶, all interviews are trained by area supervisors to minimize bias. Appendix B also includes a holistic list of specifications to be followed by an interviewer. GSS also documents their survey programming and also includes instructions on handling unusual cases. ⁷. Figure 2 shows an example of a response.

Fig. 2: Survey Programming for Address confirmation (Source: GSS Questionnaires)

Part 2: Research question

A rise in support in same-sex marriages has changed the attitudes, and laws countries have had on homosexuality. Prior to 2003 in the United States, same-sex sexual activity was illegal in fourteen states ⁸. On June 26, 2015, the supreme court legalized same-sex marriages in all states and required states to recognize out-of-state same-sex marriage. ⁹ This was in response to the landmark civil rights case, now termed as, Obergefell v. Hodges ¹⁰

According to a Pew Research study¹¹, The United States has seen a rise of 12% in the acceptance of homosexuality from 74% in 2002 to 86% in 2019. In developing countries like India, laws against the LGBTQ community’s criminalization have been revoked as of 2019. ¹². Furthermore, the same Pew Research study quotes a 22% increase in the acceptance of homosexuality in India from 15% in 2002 to 37% in 2019, which is post the decriminalized period. In summary, this acceptance seems to entail any legalization or decriminalization of LGBTQ rights.

There is an apparent trend, as shown in figure 3, in the rise of the acceptance of homosexuality in the United States, since 2015. However, it would be statistically incorrect to say legalization and acceptance are associated without proper evidence. Legalization may be one of the several factors that influence ones decision of homosexuality. Other influential factors may include religion, education, political status, ideologies, and so on.

Acceptance of homosexuality in society (Source: Pew Research Center)

Unfortunately, despite the increase in net acceptance, there seems to be a divide among members of the religious community. The same study mentions “significant differences” across highly religious and less religious countries and their acceptance of homosexuality. Religious reforms and upbringing can have an adverse impact on people who associate themselves with the LGTBQ community. Countries that define laws based on religious majority such as India (predominately Hindu) and Indonesia (primarily Muslim), can negatively affect people who identify themselves as homosexual. Therefore, it is important to understand whether religion had a significant impact on policymakers and their decision to legalize same-sex marriage.

A research question can be drawn out as follows:

Is there a relationship between confidence in organized religion and acceptance of homosexual sexual relations in the United States in 2002?

Before the 2015 bill, several states hadn’t legalized same-sex marriage. It would be interesting to observe the association between these two categorical variables. Furthermore, according to figure 3, acceptance percentages are over 15% lower in 2003 than in 2019. Additionally, since the earliest date of enactment/ruling for the same-sex marriage law is November 18, 2003 (Massachusetts) and effective on May 17, 2004, the impact of legalization would be less apparent. However, the impact would not be entirely eradicated since any inferences will be drawn out based on observation.

It should be noted that this project aims to understand the impact of one religious factor on acceptance of another single element of the LGBTQ community. This study will enable us to understand whether people who tend to believe more in organized religion (a structured system of faith) tend not to accept sexual relationships among members of the same sex.

Part 3: Exploratory data analysis

The two variables of this project will use to address the research question are:

conclerg: confidence in organized religion
homosex: Homosexual sex relations

`conclerg`

This variable answers the following question in the interview:

“I am going to name some institutions in this country. As far as the people running these institutions are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them?”

The respondents then selects one of the following options:

VALUE	LABEL
NA	IAP
1	A GREAT DEAL
2	ONLY SOME
3	HARDLY ANY
NA	DK
NA	NA

These factors are labeled as 1, 2, and 3. The order or the value, in this case, doesn’t impact the weightage of each option.

`homosex`

This variable answers the following question in the interview:

“What about sexual relations between two adults of the same sex?”

The respondents then selects one of the following options:

VALUE	LABEL
NA	IAP
1	ALWAYS WRONG
2	ALMST ALWAYS WRG
3	SOMETIMES WRONG
4	NOT WRONG AT ALL
5	OTHER
NA	DK
NA	NA

Similarly to the previous variable, these factors are labeled as 1, 2, 3, 4 and 5. The order or the value, in this case, doesn’t impact the weightage of each option.

NAs

In order to address the research question, we’re not going to consider respondents who answered Inapplicable (IAP), Don’t Know (DK), nor Not Applicable (NA). According to the Release Notes Panel, “could include refusing the question, giving a garbled answer, or declining the remaining questions in a given module.” In this case, ignoring or dropping such values is not going to impact our analysis. Furthermore, respondents who were marked as IAP were not eligible for the question. Lastly, we can also ignore the “OTHER” factor for homosex as it isn’t clear what this factor refers to. These rows can also be dropped without affecting our analysis adversely.

Exploration

# dropping rows for the data
gss_clean <- gss %>%
  filter(!is.na(homosex), !is.na(conclerg), homosex!="Other")

# finding the max and min years
summary(gss_clean$year)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1973    1977    1987    1988    1998    2012

We’re going to start by taking a look at how the acceptance of homosexual sexual relations has changed over the years 1973 - 2012. For this we’ll plot categories 1 - 4 as stacked bar charts. This helps us understand how the proportions of each response factor changes over the years.

ggplot(data=gss_clean, aes(x = year, fill = homosex)) + 
  geom_bar(position="fill", na.rm = TRUE) +
  ylab("Proportions for acceptance of homosexual sexual relations") + 
  xlab("Year") +
  ggtitle("Stacked bar chart showing proportions of homosex with respect to each year")

There seems to be a general rise in the acceptance of homosexual sexual relations. The stacked bar plot above shows an increase in the proportions of “Not Wrong At All” and decrease in the proportions of “Always Wrong”. A majority of the responses for homosex seem to be in the extremes as the proportions of responses for “Almst Always Wrg” and “Sometimes Wrong” look much smaller. We can confirm this using a line plot of the proportions:

homosex_year_prop <- gss_clean %>%
  group_by(homosex, year) %>%
  summarize(freq = n()) %>%
  group_by(year) %>%
  mutate(prop=freq/sum(freq))

## `summarise()` regrouping output by 'homosex' (override with `.groups` argument)

ggplot(data=homosex_year_prop, aes(x=year, y=prop)) +
  geom_line(aes(color=homosex, linetype=homosex)) +
  ylab("Proportions for acceptance of homosexual sexual relations") + 
  xlab("Year") +
  ggtitle("Line plot showing proportions of conclerg with respect to each year")

This general increasing trend in “Always Wrong” proportions and general decrease trend in “Almst Always Wrg” proportions can be confirmed using the plot above. These continue to increase and decrease until “Not Wrong At All” proportion is higher than that of “Always Wrong”. Furthermore, the acceptance of homosexuality is in line with the research conducted by the Pew Research Center mentioned in Part 2 of this project.

Now, we’ll take a look at how the acceptance of confidence in organized religion has changed over the years 1973 - 2012. For this, we’ll plot all categories as stacked bar charts. This helps us understand how the proportions of each response category changes over the years.

ggplot(data=gss_clean, aes(x = year, fill = conclerg)) + 
  geom_bar(position="fill", na.rm = TRUE) +
  ylab("Proportions for confidence in organization religion") + 
  xlab("Year") +
  ggtitle("Stacked bar chart showing proportions of conclerg with respect to each year")

Although not apparent, there seems to be a general decrease in the proportion responding “A Great Deal” with a substantial drop in 2003 followed by another decline in this proportion. Furthermore, there seems to be a general increase in the proportions for “Hardly Any” except the late 1980s where the proportions peaked. The “Only some” response appears to be consistent across the years. We can confirm this using a line plot of the proportions:

conclerg_year_prop <- gss_clean %>%
  group_by(conclerg, year) %>%
  summarize(freq = n()) %>%
  group_by(year) %>%
  mutate(prop=freq/sum(freq))

## `summarise()` regrouping output by 'conclerg' (override with `.groups` argument)

ggplot(data=conclerg_year_prop, aes(x=year, y=prop)) +
  geom_line(aes(color=conclerg, linetype=conclerg)) + 
  ylab("Proportions for confidence in organization religion") + 
  xlab("Year") +
  ggtitle("Line plots showing proportions of conclerg with respect to each year")

Based on the line plot above, there seems to be no apparent trend in the “Only some” response as proportions seem to increase and decrease regardless of the passage in time. Furthermore, the general increase in “Hardly Any” proportions and a general decrease in the “A Great Deal” proportions are evident. However, “A Great Deal” constituents to a more substantial proportion of the sample.

We will now look at the year 2002 specifically, which is concerning our research question.

homosex_year_prop %>%
  filter(year==2002) %>%
  select(c(homosex, prop))

## Adding missing grouping variables: `year`

## # A tibble: 4 x 3
## # Groups:   year [1]
##    year homosex            prop
##   <int> <fct>             <dbl>
## 1  2002 Always Wrong     0.540 
## 2  2002 Almst Always Wrg 0.0595
## 3  2002 Sometimes Wrong  0.0709
## 4  2002 Not Wrong At All 0.330

Based on the one-way contingency table above, higher proportions of people in 2002 tend to be against homosexual sexual relationships.

conclerg_year_prop %>%
  filter(year==2002) %>%
  select(c(conclerg, prop))

## Adding missing grouping variables: `year`

## # A tibble: 3 x 3
## # Groups:   year [1]
##    year conclerg      prop
##   <int> <fct>        <dbl>
## 1  2002 A Great Deal 0.172
## 2  2002 Only Some    0.577
## 3  2002 Hardly Any   0.252

Based on the table above, the proportions of “Hardly Any” exceed those of “A Great Deal.” Furthermore, according to the line chart illustrating the change in proportions of confidence in organized religion, the year 2002 is when there’s a sudden change in these proportions. 2003 and 2004 seem to show proportions where “A Great Deal” exceeds those of “Hardly Any.”

Now to summarize the association, we will construct a two-way contingency table for the year 2002. The table below shows the distribution of responses by accepting homosexual sexual activity and confidence in organized religion:

conclerg_homosex <- gss_clean %>%
  filter(year==2002) %>%
  group_by(conclerg, homosex) %>%
  summarize(freq = n()) %>%
  pivot_wider(id_cols=homosex, names_from=conclerg, values_from=freq)

conclerg_homosex <- conclerg_homosex %>% 
  remove_rownames %>% 
  column_to_rownames(var="homosex")

conclerg_homosex <- as.table(as.matrix(conclerg_homosex))
dimnames(conclerg_homosex) <- list(homosex = rownames(conclerg_homosex), 
                                   conclerg = colnames(conclerg_homosex))
conclerg_homosex

##                   conclerg
## homosex            A Great Deal Only Some Hardly Any
##   Always Wrong               51       138         47
##   Almst Always Wrg            3        20          3
##   Sometimes Wrong             4        21          6
##   Not Wrong At All           17        73         54

# Proportions of homosex factors within each conclerg factor
prop.table(conclerg_homosex, 2)

##                   conclerg
## homosex            A Great Deal  Only Some Hardly Any
##   Always Wrong       0.68000000 0.54761905 0.42727273
##   Almst Always Wrg   0.04000000 0.07936508 0.02727273
##   Sometimes Wrong    0.05333333 0.08333333 0.05454545
##   Not Wrong At All   0.22666667 0.28968254 0.49090909

The following can also be used as a stacked bar graph for comparing proportions of our response variable and explanatory variable. The stacked bar graph also shows us the proportions of the response variable with the explanatory variable. The response variable is the acceptance of homosexual sexual relations, plotted on the y-axis. On the other hand, the explanatory variable is the confidence in organized religion, plotted on the x-axis.

ggplot(data=gss_clean %>% filter(year==2002), aes(x = conclerg, fill = homosex)) + 
  geom_bar(position="fill", na.rm = TRUE) +
  ylab("Proportions of acceptance of homosexual sexual relations") + 
  xlab("Confidence in organized religion") +
  ggtitle("Stacked bar plot chart showing the proportions for homosex factors within each conclerg factor")

There seems to be a more substantial proportion of people that find homosexual sexual relations “Always Wrong” for the group that has “A Great Deal” of confidence in organized religion. On the other hand, this “Always Wrong” proportion seems to be much smaller for the “Hardly Any” group.

To check if there is an association between confidence in organized religion and acceptance of homosexual sexual relations in 2002, we need to perform a hypothesis test.

Part 4: Inference

Since we’re trying to test whether the two variables are associated as mentioned in the research question, we need to perform a hypothesis test. We’ll be using the chi-square test of independence, since we’re evaluating the independence of two categorical variables where at least one of them has more than two levels. In our case, homosex has 4 levels while conlerg has three.

Hypotheses

Our hypotheses can this be defined as:

\(H_{0}\): The acceptance of homosexual sexual relations is independent of the confidence in organized religion in the United States in 2002. The observed difference is due to chance and they do not vary with each other.

\(H_{A}\): The acceptance of homosexual sexual relations is associated with the confidence in organized religion in the United States in 2002. The observed difference is not due to chance and they vary with each other.

Conditions for the chi-square test of independence

The following conditions need to be met for us to carry out the hypothesis test.

Independence:
- The data collected by GSS are random samples.
- Since GSS samples without replacement, the sample size is certainly less than 10% of the total population in the US.
- Furthermore, each case only contributes to one cell in the table due to the design in the survey. Each respondant can only give one answer for each question of the questions on conclerg and homosex.
- Thus we can say the independence criteria is satisfied.

Calculating expected count for the chi-square test of independence. The approach for this can be found on StackOverflow ¹³. We use the following formula for expected counts for the \(i^{th}\) row and the \(j^{th}\) column:

\(Expected\ Count_{row\ i, col\ j} = \frac{(row\ i\ total)\times(column\ j\ total)}{table\ total}\)

# run a chi-square test to get the expected values
Xsq <- chisq.test(conclerg_homosex)

## Warning in chisq.test(conclerg_homosex): Chi-squared approximation may be
## incorrect

conclerg_homosex_chi <- conclerg_homosex
conclerg_homosex_expected <- round(Xsq$expected)
conclerg_homosex_chi[] <- paste(conclerg_homosex, paste0("(",conclerg_homosex_expected,")"))

# addmargins for the row and column totals
conclerg_homosex_margins <- addmargins(conclerg_homosex)
conclerg_homosex_chi <- cbind(conclerg_homosex_chi, conclerg_homosex_margins[-5, 4])
conclerg_homosex_chi <- rbind(conclerg_homosex_chi, conclerg_homosex_margins[5,])

# rename the last column and last row as Total
colnames(conclerg_homosex_chi)[4] <- "Total"
rownames(conclerg_homosex_chi)[5] <- "Total"
dimnames(conclerg_homosex_chi) <- list(homosex = rownames(conclerg_homosex_chi),
                                       conclerg = colnames(conclerg_homosex_chi))
kable(conclerg_homosex_chi, caption='Observed counts and (the expected counts)')

Observed counts and (the expected counts)
	A Great Deal	Only Some	Hardly Any	Total
Always Wrong	51 (41)	138 (136)	47 (59)	236
Almst Always Wrg	3 (4)	20 (15)	3 (7)	26
Sometimes Wrong	4 (5)	21 (18)	6 (8)	31
Not Wrong At All	17 (25)	73 (83)	54 (36)	144
Total	75	252	110	437

Sample size: Since each cell does not have at least 5 expected cases, we cannot use the chi-square test for this. Instead we’ll have to use a simulation.
Degrees of Freedom:
- The degrees of freedom (dof) in this case should be greater than 2.
- This can be calculated as \(dof = (R-1) \times (C-1)\)
- where, \(R\) is the number of rows and \(C\) is the number of columns
- \(\implies dof = (4-1) \times (3-1) = 6\)
- Hence this condition is satisfied

Since we didn’t pass all conditions for performing a ch-square test, we’ll need to perform a simulation in order to construct our inference.

Chi-square test with simulation

We’ll used two methods for simulation of a two-way table - Only the row totals are fixed¹⁴ and the Monte Carlo Simulation ¹⁵. We want to identify the sampling distribution for each cell’s test-statistic (\(\chi^{2}\)) if the null hypothesis was true. In this case, we want to see how far the observed values are from expected values then use this information as evidence to reject the null hypothesis.

Patefield’s Algorithm

Patefield’s algorithm presents an efficient way of generating a table. It simulates under the assumption of independence for each random cell given that the margins are fixed. The number of replications in this case will be 15000.

## Simulate permutation test for independence based on the maximum
## Pearson residuals (rather than their sum).
rowTotals <- rowSums(conclerg_homosex)
colTotals <- colSums(conclerg_homosex)
nOfCases <- sum(rowTotals)
expected <- outer(rowTotals, colTotals, "*") / nOfCases

# expected values based on patefield's algorithm
conclerg_homosex_pat <- conclerg_homosex
conclerg_homosex_pat[] <- paste(conclerg_homosex, paste0("(",round(expected),")"))
kable(conclerg_homosex_pat, 
      caption='Observed counts and (the expected counts) generated using Patfields algorithm')

Observed counts and (the expected counts) generated using Patfields algorithm
	A Great Deal	Only Some	Hardly Any
Always Wrong	51 (41)	138 (136)	47 (59)
Almst Always Wrg	3 (4)	20 (15)	3 (7)
Sometimes Wrong	4 (5)	21 (18)	6 (8)
Not Wrong At All	17 (25)	73 (83)	54 (36)

maxSqResid <- function(x) max((x - expected) ^ 2 / expected)
simMaxSqResid <-
    sapply(r2dtable(15000, rowTotals, colTotals), maxSqResid)
p_value <- sum(simMaxSqResid >= maxSqResid(conclerg_homosex)) / 15000
p_value

## [1] 0.003333333

Since the p-value in this case is less than 0.05. We can reject the null hypothesis.

Monte Carlo Simulation

Monte-Carlo Simulation is to used to produce a reference distribution, based on randomly generated samples which will have the same size as the observed sample, in order to compute p-values when sample size conditions are not satisfied. The number of replications in this case will be 15000.

Xsq <- chisq.test(conclerg_homosex, simulate.p.value = TRUE, B = 15000)
# expected values based on monte carlo
conclerg_homosex_mc <- conclerg_homosex
conclerg_homosex_mc[] <- paste(conclerg_homosex, 
                               paste0("(",round(expected),")"))
kable(conclerg_homosex_mc, 
      caption='Observed counts and (the expected counts) generated using Monte Carlo Simulation')

Observed counts and (the expected counts) generated using Monte Carlo Simulation
	A Great Deal	Only Some	Hardly Any
Always Wrong	51 (41)	138 (136)	47 (59)
Almst Always Wrg	3 (4)	20 (15)	3 (7)
Sometimes Wrong	4 (5)	21 (18)	6 (8)
Not Wrong At All	17 (25)	73 (83)	54 (36)

Xsq

## 
##  Pearson's Chi-squared test with simulated p-value (based on 15000
##  replicates)
## 
## data:  conclerg_homosex
## X-squared = 23.015, df = NA, p-value = 0.001667

Since the p-value in this case is less than 0.05. We can reject the null hypothesis.

Inference

We can’t construct a Confidence Interval for this data as it is a two-way contingency table. Furthermore, there is no null value or sample estimate to construct a confidence interval around. Furthermore, we’re not calculating any means, medians nor proportions, instead, we’re trying to find whether the observed and expected values are significantly different. There are more than two values to compare. Finally, since we’re not estimating any population parameter, a confidence interval will not yield any benefits to us.

Since the p-value in both simulations is \(\le 0.05\), we can safely reject the null hypothesis, \(H_{0}\). Therefore, the data provide strong evidence that there is an association between the acceptance of homosexual sexual relations and the confidence in organized religion in the United States in 2002. Furthermore, the acceptance of homosexual sexual relationships tends to vary with a respondent’s confidence in organized religion.

Therefore, can we conclude from this analysis that respondents with a great deal of confidence in organized religion will find homosexual sexual relationships wrong?

No, the result of this could be a Type 1 error. The reasons why are as follows:

Marriage among the same sex wasn’t legalized in all states in the USA until 2015. Certain religions, such as Christianity, could view this as excessive intimacy out of wedlock, which is forbidden.
Based on the ballot of 2002, the question was asked with regards to “Christian Organizations,” which could mean the data is under representative.
A significant proportion (and number) of people with “Only Some” confidence in organized religion still exists, based on the graphs displayed above.
Since this is an observational study, causal effects cant be assumed. Furthermore, other confounding variables such as neighborhood, schools, age, religion, etc. have not been considered.
This is just an observational study based on a small sample since a few cells were below 5.
According to Hope’s paper on the Monte Carlo significance test procedure, “It is preferable to use a known test of good efficiency instead of a Monte-Carlo test procedure assuming that the alternative statistical hypothesis can be completely specified.” Therefore any results must be presented with skepticism.