Setup

Load packages

Packages used: ggplot2, dplyr, statsr.

Load data


Part 1: Data

The General Social Survey (GSS) gathers data on American society to monitor and explain trends in attitudes and behaviours. The data can also be used to study the country's social structure and how that structure is affected by subgroups and minorities, and to compare American society with other societies where comparable data is available.

The data contains 57061 observations of 114 variables. Because respondents were selected by random sampling, findings from this large sample can be generalized to the population the survey draws from. The data is observational, with no random assignment, so it can show association but not causation.


Part 2: Research question

Relationship between political standing and opinions on premarital sex

For this project, I am interested in finding out whether a relationship exists between opinions on premarital sex and political standing. This question is of interest because it is commonly believed that conservative individuals oppose premarital sex and that liberal individuals are more accepting of it.


Part 3: Exploratory data analysis

For this project, I shall be using the following variables:

  1. premarsx: Sex before marriage

##     Always Wrong Almst Always Wrg  Sometimes Wrong Not Wrong At All 
##             9244             3200             7044            14060 
##            Other             NA's 
##                0            23513

  2. polviews: Thinks of self as liberal or conservative

##     Extremely Liberal               Liberal      Slightly Liberal 
##                  1330                  5582                  6181 
##              Moderate Slightly Conservative          Conservative 
##                 18494                  7691                  7092 
##  Extrmly Conservative                  NA's 
##                  1506                  9185

All NAs will be removed from these two variables. In addition, the ‘Moderate’ level will be dropped from polviews, as it is not of interest to the research question. The remaining levels of polviews will be combined into two groups, ‘Conservative’ and ‘Liberal’.

Combining the conservative levels and the liberal levels together.
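
The filtering and recoding described above might look like the following minimal sketch in base R. The data frame `gss_toy` is a small hypothetical stand-in, not the survey data; the names `gss_polpmsx` and `pol_views` are taken from the R output later in this report, and everything else is an assumption made for illustration.

```r
# Hypothetical toy stand-in for the survey data (8 made-up respondents).
gss_toy <- data.frame(
  polviews = c("Extremely Liberal", "Liberal", "Slightly Liberal", "Moderate",
               "Slightly Conservative", "Conservative", "Extrmly Conservative", NA),
  premarsx = c("Not Wrong At All", "Sometimes Wrong", NA, "Always Wrong",
               "Always Wrong", "Almst Always Wrg", "Not Wrong At All", "Sometimes Wrong"),
  stringsAsFactors = FALSE
)

conservative_levels <- c("Slightly Conservative", "Conservative", "Extrmly Conservative")

# Keep rows with both answers present and a political view other than 'Moderate'.
keep <- !is.na(gss_toy$polviews) & !is.na(gss_toy$premarsx) &
  gss_toy$polviews != "Moderate"
gss_polpmsx <- gss_toy[keep, ]

# Collapse the six remaining levels into two groups.
gss_polpmsx$pol_views <- factor(
  ifelse(gss_polpmsx$polviews %in% conservative_levels, "Conservative", "Liberal"),
  levels = c("Conservative", "Liberal")
)

pol_counts <- table(gss_polpmsx$pol_views)
print(pol_counts)
```

Applied to the real data, the same steps would be expected to give a two-level table like the one shown under Summary.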

Summary

## Conservative      Liberal 
##        10169         8328

Visualization

From the plot above, we can see that as we go from “Always Wrong” to “Not Wrong At All”, the proportion of Conservatives decreases and the proportion of Liberals increases. This is consistent with the belief stated earlier that conservative individuals hold negative views on premarital sex. Religion may also influence this relationship, but its effects are not examined in this study.
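
The proportions behind such a plot can be computed with `prop.table`. The sketch below uses invented counts purely to show the mechanics; they are not the GSS counts, and the object names are hypothetical.

```r
# Invented counts for illustration only; these are NOT the GSS counts.
toy_counts <- matrix(
  c(30, 10,   # Always Wrong:     Conservative, Liberal
    20, 20,   # Sometimes Wrong:  Conservative, Liberal
    10, 30),  # Not Wrong At All: Conservative, Liberal
  nrow = 2,
  dimnames = list(
    pol_views = c("Conservative", "Liberal"),
    premarsx  = c("Always Wrong", "Sometimes Wrong", "Not Wrong At All")
  )
)

# margin = 2 divides each column by its total, giving the share of each
# political group within every opinion category.
col_props <- prop.table(toy_counts, margin = 2)
print(round(col_props, 2))
```

Passing `col_props` to base R's `barplot()` would draw one stacked bar per opinion category, which is one way to see how the Conservative share changes across categories.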


Part 4: Inference

I will be conducting a chi-square test of independence, since I am dealing with two categorical variables and wish to find out whether they are independent of each other.

Null hypothesis: Political views and views on premarital sex are independent of each other.

Alternative hypothesis: Political views and views on premarital sex are dependent.

Conditions

For a chi-square test of independence, the following conditions must be fulfilled:

  1. Independence: Since the data was collected using random sampling, we can assume that the observations are independent of each other. Additionally, the 18497 observations used in the test (out of 57061 in the full dataset) make up less than 10% of the population, and each observation contributes to only one cell of the table.
  2. Expected frequency counts are at least 5: from the R output below, we can see that this condition is satisfied.
##                      gss_polpmsx$premarsx
## gss_polpmsx$pol_views Always Wrong Almst Always Wrg Sometimes Wrong
##          Conservative     2741.677          965.387         2080.86
##          Liberal          2245.323          790.613         1704.14
##                      gss_polpmsx$premarsx
## gss_polpmsx$pol_views Not Wrong At All
##          Conservative         4381.076
##          Liberal              3587.924
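
Each expected count is the row total multiplied by the column total, divided by the overall sample size. The sketch below reproduces the table from its margins. The row totals come from the Summary output; the column totals are not printed in this report and are recovered here by summing the two expected counts in each column, so treat them as derived values. The object names are hypothetical.

```r
# Row totals from the Summary output.
row_totals <- c(Conservative = 10169, Liberal = 8328)

# Column totals recovered by summing each column of the expected-count table
# (e.g. 2741.677 + 2245.323 = 4987); derived, not printed in the report.
col_totals <- c("Always Wrong" = 4987, "Almst Always Wrg" = 1756,
                "Sometimes Wrong" = 3785, "Not Wrong At All" = 7969)

n <- sum(row_totals)  # observations used in the test

# Expected count for each cell = row total * column total / n.
expected <- outer(row_totals, col_totals) / n
print(round(expected, 3))

# Condition check: every expected count should be at least 5.
condition_met <- all(expected >= 5)
print(condition_met)
```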

Test for Independence

## 
##  Pearson's Chi-squared test
## 
## data:  gss_polpmsx$pol_views and gss_polpmsx$premarsx
## X-squared = 1562.1, df = 3, p-value < 2.2e-16
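
The degrees of freedom and p-value in this output can be checked by hand. For a table with r rows and c columns, df = (r - 1)(c - 1), and the p-value is the upper-tail probability of the chi-square distribution at the observed statistic. A minimal sketch, using the statistic reported above and hypothetical object names:

```r
n_rows <- 2  # Conservative, Liberal
n_cols <- 4  # the four premarsx levels that have observations

# Degrees of freedom for a test of independence.
df <- (n_rows - 1) * (n_cols - 1)

# Test statistic as reported in the output above.
x_squared <- 1562.1

# Upper-tail probability under the chi-square distribution with df degrees of freedom.
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

# Critical value at the 5% significance level, for comparison with the statistic.
critical_value <- qchisq(0.95, df = df)

print(df)
print(p_value < 2.2e-16)
print(x_squared > critical_value)
```

R prints `p-value < 2.2e-16` because the probability is smaller than the machine precision it uses for reporting, not because that is the exact value.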

Since the p-value (< 2.2e-16) is far below the 0.05 significance level, we reject the null hypothesis and conclude that opinions on premarital sex and political standing are dependent on each other.

It should be noted that other factors, such as religion, age, and country of origin, may also affect opinions on premarital sex. Additional studies and analysis would be required before any conclusive statements can be made about their relationships.