The General Social Survey (GSS) has been a significant source of sociological data since its inception in 1972. Conducted by NORC at the University of Chicago and funded by the National Science Foundation, the GSS collects data to monitor societal changes and study the growing complexity of American society.
The GSS is designed to collect high-quality data with as little bias as possible. Initially administered annually and then biennially, the survey interviews a representative sample of Americans on a wide range of topics, from beliefs and attitudes to behaviors and societal trends. Its broad array of questions (belief in God, confidence in government, race relations, and more) allows for a comprehensive analysis of American society.
The GSS’s rigorous methodological approach ensures that its findings are generalizable to the broader American population. The representative nature of the sample allows researchers and policymakers to draw conclusions about societal trends and attitudes with a high degree of confidence. However, like most surveys, the GSS is primarily observational, meaning it can identify correlations but not necessarily establish causality.
A key feature of the GSS is its commitment to making data accessible to a wide audience. The GSS Data Explorer, an online tool launched in 2015 and updated in 2021, allows users to search and analyze data, making the GSS a valuable resource for educators, policymakers, journalists, and students. The survey’s adaptability and incorporation of new collection methods, like the web mode starting in 2022, ensure that the GSS remains a vital tool for understanding American society.
The GSS participates in the International Social Survey Programme (ISSP), which facilitates cross-country comparisons. Researchers can compare responses from the United States with those from other countries, which extends the survey’s usefulness for global sociological research.
The General Social Survey’s methodological rigor and broad scope make it an indispensable tool for understanding societal trends and attitudes in the United States. While it offers excellent generalizability, the nature of its data collection primarily allows for observational insights, limiting the ability to infer causality.
For more detailed information on the GSS methodology, you can visit NORC’s GSS website and the GSS Methodological Reports.
Understanding Sociopolitical Polarization: In recent years, the United States has witnessed increasing sociopolitical polarization. Analyzing how political views correlate with opinions on social welfare spending can shed light on the ideological divides that characterize contemporary American politics. People’s attitudes towards social welfare spending are a key indicator of their broader socio-economic policy preferences. Understanding these attitudes in the context of political orientation can provide insights into public support or opposition to welfare policies, which is crucial for policymakers and political analysts.
Examining this relationship can help in understanding public opinion trends that are vital for developing responsive and representative welfare policies. Recognizing how different demographic groups view welfare spending, in relation to their political views, allows for more targeted and effective policy development. Attitudes toward welfare spending reflect broader opinions about the role of government in addressing social inequalities and economic challenges. Understanding these attitudes is key to addressing issues like poverty, healthcare, and education.
In this analysis we will investigate three different demographic factors and their relationship to opinions on welfare spending: income level, education, and political views; therefore, we will be testing three hypotheses.
Null Hypothesis (H0): There is no significant relationship between income level and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
Alternative Hypothesis (H1): There is a significant relationship between income level and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
Null Hypothesis (H0): There is no significant relationship between education and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
Alternative Hypothesis (H1): There is a significant relationship between education and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
Null Hypothesis (H0): There is no significant relationship between political views and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
Alternative Hypothesis (H1): There is a significant relationship between political views and opinions on social welfare spending in the United States across different demographic groups over the past two decades.
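# Load dplyr, ggplot2, and purrr; assumes the `gss` data frame is already in the workspace
library(tidyverse)
# Select relevant variables
gss_selected <- gss %>%
  select(year, polviews, natfare, age, sex, race, educ, income06) %>%
  filter(!is.na(natfare), !is.na(polviews)) # Ensure key variables are not missing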
# Check for missing values and data structure
summary(gss_selected)
##       year                     polviews            natfare
##  Min.   :1974   Extremely Liberal    :  762   Too Little : 5367
##  1st Qu.:1978   Liberal              : 3114   About Right: 8462
##  Median :1988   Slightly Liberal     : 3564   Too Much   :13189
##  Mean   :1990   Moderate             :10465
##  3rd Qu.:1998   Slightly Conservative: 4415
##  Max.   :2012   Conservative         : 3863
##                 Extrmly Conservative :  835
##       age            sex           race            educ
##  Min.   :18.00   Male  :12376   White:22438   Min.   : 0.00
##  1st Qu.:31.00   Female:14642   Black: 3505   1st Qu.:12.00
##  Median :42.00                  Other: 1075   Median :12.00
##  Mean   :45.15                                Mean   :12.77
##  3rd Qu.:58.00                                3rd Qu.:15.00
##  Max.   :89.00                                Max.   :20.00
##  NA's   :85                                   NA's   :51
##              income06
##  $60000 To 74999 :  387
##  $40000 To 49999 :  325
##  Refused         :  309
##  $50000 To 59999 :  302
##  $75000 To $89999:  295
##  (Other)         : 2413
##  NA's            :22987
str(gss_selected)
## 'data.frame':    27018 obs. of  8 variables:
##  $ year    : int  1974 1974 1974 1974 1974 1974 1974 1974 1974 1974 ...
##  $ polviews: Factor w/ 7 levels "Extremely Liberal",..: 4 5 6 6 6 5 5 5 6 4 ...
##  $ natfare : Factor w/ 3 levels "Too Little","About Right",..: 3 3 3 2 2 2 3 2 3 1 ...
##  $ age     : int  21 41 83 69 58 30 48 67 54 89 ...
##  $ sex     : Factor w/ 2 levels "Male","Female": 1 1 2 2 2 1 1 1 2 1 ...
##  $ race    : Factor w/ 3 levels "White","Black",..: 1 1 1 1 1 1 1 1 1 2 ...
##  $ educ    : int  14 16 10 10 12 16 17 10 11 6 ...
##  $ income06: Factor w/ 26 levels "Under $1 000",..: NA NA NA NA NA NA NA NA NA NA ...
# Preliminary exploration
gss_selected %>%
  group_by(year) %>%
  summarise(
    count = n(),
    avg_educ = mean(educ, na.rm = TRUE),
    mode_income = names(which.max(table(income06)))
  )
## # A tibble: 27 × 4
##     year count avg_educ mode_income
##    <int> <int>    <dbl> <chr>
##  1  1974  1359     11.9 Under $1 000
##  2  1975  1326     11.9 Under $1 000
##  3  1976  1344     12.0 Under $1 000
##  4  1977  1390     11.8 Under $1 000
##  5  1978  1392     12.1 Under $1 000
##  6  1980  1367     12.2 Under $1 000
##  7  1982  1668     12.1 Under $1 000
##  8  1983   747     12.6 Under $1 000
##  9  1984   456     12.5 Under $1 000
## 10  1985   691     12.7 Under $1 000
## # ℹ 17 more rows
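In the table above, `mode_income` reads "Under $1 000" for years in which `income06` appears to be entirely missing (the structure listing earlier shows only `NA` for the first 1974 rows). A likely explanation is that `table()` on an all-`NA` factor returns a zero count for every level, and `which.max()` then picks the first level. A minimal sketch of a guard follows; the helper `mode_or_na()` and the toy vectors are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: most frequent non-missing value of a vector,
# or NA when every value is missing.
mode_or_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  counts <- table(as.character(x))
  names(counts)[which.max(counts)]
}

# Made-up toy vectors: one "year" with no income answers, one with some
no_answers   <- factor(c(NA, NA), levels = c("Under $1 000", "Refused"))
some_answers <- factor(c("Refused", "Refused", "Under $1 000"),
                       levels = c("Under $1 000", "Refused"))

# The unguarded expression falls back to the first level
names(which.max(table(no_answers)))

# The guarded version reports NA instead
mode_or_na(no_answers)
mode_or_na(some_answers)
```

Inside the `summarise()` call, `mode_income = mode_or_na(income06)` would then report `NA` for years without income data.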
# Visualization of opinions on welfare spending over years
gss %>%
  filter(!is.na(natfare)) %>%
  count(year, natfare) %>%
  ggplot(aes(x = factor(year), y = n, fill = natfare)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Opinions on Welfare Spending Over Time",
       x = "Year",
       y = "Percentage",
       fill = "Opinion on Welfare") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
# Convert to factor if it's not already
gss$income06 <- as.factor(gss$income06)
income_levels <- levels(gss$income06)
income_levels
##  [1] "Under $1 000"       "$1 000 To 2 999"    "$3 000 To 3 999"
##  [4] "$4 000 To 4 999"    "$5 000 To 5 999"    "$6 000 To 6 999"
##  [7] "$7 000 To 7 999"    "$8 000 To 9 999"    "$10000 To 12499"
## [10] "$12500 To 14999"    "$15000 To 17499"    "$17500 To 19999"
## [13] "$20000 To 22499"    "$22500 To 24999"    "$25000 To 29999"
## [16] "$30000 To 34999"    "$35000 To 39999"    "$40000 To 49999"
## [19] "$50000 To 59999"    "$60000 To 74999"    "$75000 To $89999"
## [22] "$90000 To $109999"  "$110000 To $129999" "$130000 To $149999"
## [25] "$150000 Or Over"    "Refused"
# Create a named vector that maps each original income level to the new category.
# Note: the vector has 28 entries for 26 levels, and `income_map[.]` below is indexed
# with a factor, which R treats as integer level codes. The lookup is therefore
# positional (level order), so the per-line comments are only approximate.
income_map <- setNames(
  c(rep("Under $20,000", 13), # For income levels up to "$17500 To 19999"
    rep("$20,000 to $49,999", 5), # For the next five income levels
    rep("$50,000 to $74,999", 2), # For the next two income levels
    rep("$75,000 to $99,999", 2), # For the next two income levels
    "Unknown/Refused", # For the "Refused" category
    rep("$100,000 and over", 5)), # For the top five income levels
  unique(gss$income06)
)
# Map the 'income06' variable to the new categories
gss$income_category <- map_chr(gss$income06, ~ income_map[.])
# Handle the NAs if any exist after mapping
gss$income_category[is.na(gss$income_category)] <- "Unknown/Refused"
# Check the new income category variable
table(gss$income_category)
##
##  $100,000 and over $20,000 to $49,999 $50,000 to $74,999 $75,000 to $99,999
##               1692               2661               1625               1257
##      Under $20,000    Unknown/Refused
##               2456              47370
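Because `income_map[.]` is indexed with a factor, R uses the integer level codes, so the result depends on level order rather than on the names attached by `setNames()`. A name-keyed lookup avoids that dependence. The sketch below is one possible version, not the mapping that produced the counts above. The bracket assignments are assumptions (in particular, "$90000 To $109999" straddles $100,000 and is placed in the lower group here), and `recode_income()` is a hypothetical helper.

```r
# The 26 income06 levels, in the order printed earlier
income_levels <- c(
  "Under $1 000", "$1 000 To 2 999", "$3 000 To 3 999", "$4 000 To 4 999",
  "$5 000 To 5 999", "$6 000 To 6 999", "$7 000 To 7 999", "$8 000 To 9 999",
  "$10000 To 12499", "$12500 To 14999", "$15000 To 17499", "$17500 To 19999",
  "$20000 To 22499", "$22500 To 24999", "$25000 To 29999", "$30000 To 34999",
  "$35000 To 39999", "$40000 To 49999", "$50000 To 59999", "$60000 To 74999",
  "$75000 To $89999", "$90000 To $109999", "$110000 To $129999",
  "$130000 To $149999", "$150000 Or Over", "Refused"
)

# Assumed grouping: one new category per original level, same order
new_category <- c(
  rep("Under $20,000", 12),      # "Under $1 000" through "$17500 To 19999"
  rep("$20,000 to $49,999", 6),  # "$20000 To 22499" through "$40000 To 49999"
  rep("$50,000 to $74,999", 2),  # "$50000 To 59999", "$60000 To 74999"
  rep("$75,000 to $99,999", 2),  # includes "$90000 To $109999" (assumption)
  rep("$100,000 and over", 3),   # "$110000 To $129999" and above
  "Unknown/Refused"              # "Refused"
)
income_lookup <- setNames(new_category, income_levels)

# Hypothetical helper: look up by label (as.character), never by level code
recode_income <- function(x) {
  out <- unname(income_lookup[as.character(x)])
  out[is.na(out)] <- "Unknown/Refused"  # missing answers join "Unknown/Refused"
  out
}

recode_income(factor(c("$17500 To 19999", "$20000 To 22499", "Refused", NA)))
```

Applied as `gss$income_category <- recode_income(gss$income06)`, this would not necessarily reproduce the category counts printed above.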
# Visualization of opinions on welfare spending by income level
# Making sure the income_category is a factor and set the levels in the order we want
gss$income_category <- factor(gss$income_category, levels = c(
  "Under $20,000",
  "$20,000 to $49,999",
  "$50,000 to $74,999",
  "$75,000 to $99,999",
  "$100,000 and over",
  "Unknown/Refused"
))
# Filtering out NA opinions on welfare
gss_filtered <- gss %>%
  filter(!is.na(natfare))
# Creating a plot with the ordered income categories and excluding NA opinions
ggplot(gss_filtered, aes(x = income_category, fill = natfare)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Opinions on Welfare Spending by Income Category",
       x = "Income Category",
       y = "Percentage",
       fill = "Opinion on Welfare") +
  theme_minimal() +
  # Rotate x labels for readability
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# First, create and store the filtered and transformed dataset
gss_education_filtered <- gss %>%
  filter(!is.na(natfare)) %>%
  mutate(education_group = cut(educ,
                               breaks = c(-Inf, 11, 12, 14, 16, Inf),
                               labels = c("Less than High School", "High School Graduate",
                                          "Some College", "Bachelor's Degree", "Graduate Degree"),
                               include.lowest = TRUE))
# Visualization of opinions on welfare spending by education level
gss_education_filtered %>%
  count(education_group, natfare) %>%
  ggplot(aes(x = education_group, y = n, fill = natfare)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Opinions on Welfare Spending by Education Level",
       x = "Education Level",
       y = "Percentage",
       fill = "Opinion on Welfare") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x labels for readability
# Visualization of opinions on welfare spending by political views
# Filtering to remove rows with NA in either 'polviews' or 'natfare'
gss_clean <- gss %>%
  filter(!is.na(polviews) & !is.na(natfare)) %>%
  count(polviews, natfare)
# Creating the plot with filtered data
gss_clean %>%
  ggplot(aes(x = polviews, y = n, fill = natfare)) +
  geom_bar(stat = "identity", position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Opinions on Welfare Spending by Political Views",
       x = "Political Views",
       y = "Percentage",
       fill = "Opinion on Welfare") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x labels for readability
# Chi-squared test for income level and opinions on welfare spending
# Create a contingency table of counts for 'income_category' and 'natfare'
contingency_table_income <- table(gss_filtered$income_category, gss_filtered$natfare)
# Perform the chi-square test
chi_square_test_income <- chisq.test(contingency_table_income)
# Output the result of the chi-square test
print(chi_square_test_income)
##
## Pearson's Chi-squared test
##
## data: contingency_table_income
## X-squared = 258.16, df = 10, p-value < 2.2e-16
# Chi-squared test for education level and opinions on welfare spending
# Create a contingency table of counts for 'education_group' and 'natfare'
contingency_table_education <- table(gss_education_filtered$education_group, gss_education_filtered$natfare)
# Perform the chi-square test
chi_square_test_education <- chisq.test(contingency_table_education)
# Output the result of the chi-square test
print(chi_square_test_education)
##
## Pearson's Chi-squared test
##
## data: contingency_table_education
## X-squared = 347.21, df = 8, p-value < 2.2e-16
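The five education groups in this test come from `cut()`, which uses right-closed intervals by default. A quick way to see how particular years of schooling are binned is to run the same breaks on a few illustrative values (the `years_of_school` vector below is made up for the check):

```r
# Same breaks and labels as in the analysis; intervals are right-closed,
# e.g. (14, 16] for "Bachelor's Degree".
edu_labels <- c("Less than High School", "High School Graduate",
                "Some College", "Bachelor's Degree", "Graduate Degree")

years_of_school <- c(8, 11, 12, 13, 14, 15, 16, 17, 20)  # illustrative values

education_group <- cut(years_of_school,
                       breaks = c(-Inf, 11, 12, 14, 16, Inf),
                       labels = edu_labels,
                       include.lowest = TRUE)

data.frame(years_of_school, education_group)
```

Under these breaks, 13 to 14 years of schooling is labeled "Some College" and 15 to 16 years "Bachelor's Degree"; whether those cut points match actual degree attainment is a judgment call.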
# Chi-squared test for political views and opinions on welfare spending
# Create a dataset for the chi-square test without summarizing
gss_chi_test <- gss %>%
  filter(!is.na(polviews) & !is.na(natfare))
# Create a contingency table of counts for 'polviews' and 'natfare'
contingency_table <- table(gss_chi_test$polviews, gss_chi_test$natfare)
# Perform the chi-square test
chi_square_test <- chisq.test(contingency_table)
print(chi_square_test)
##
## Pearson's Chi-squared test
##
## data: contingency_table
## X-squared = 1051.9, df = 12, p-value < 2.2e-16
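A significant overall test does not show which cells drive the association. One common follow-up is to inspect the expected counts and standardized residuals that `chisq.test()` returns. The sketch below uses a small made-up table (the counts are invented for illustration and are not GSS data) with three collapsed political groups:

```r
# Invented counts for illustration only; not taken from the GSS
toy_table <- matrix(
  c(30, 40, 30,
    25, 45, 60,
    15, 35, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    polviews = c("Liberal", "Moderate", "Conservative"),
    natfare  = c("Too Little", "About Right", "Too Much")
  )
)

toy_test <- chisq.test(toy_table)

# Expected counts under independence (row total * column total / n)
round(toy_test$expected, 1)

# Standardized residuals: large positive values mark cells with more
# respondents than independence would predict, negative values fewer
round(toy_test$stdres, 2)
```

The same two components are available on `chi_square_test`, `chi_square_test_income`, and `chi_square_test_education`.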
Income Category: The chi-square test (\(\chi^2 = 258.16\), \(df = 10\), p-value < \(2.2 \times 10^{-16}\)) indicates a significant association between income category and opinions on welfare spending. The contingency table includes the Unknown/Refused category alongside the five income brackets.
Education Level: The test (\(\chi^2 = 347.21\), \(df = 8\), p-value < \(2.2 \times 10^{-16}\)) indicates a statistically significant relationship between education level and opinions on welfare spending.
Political Views: The test (\(\chi^2 = 1051.9\), \(df = 12\), p-value < \(2.2 \times 10^{-16}\)) indicates a significant association between political views and opinions on welfare spending.
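The degrees of freedom reported for the three tests follow from the dimensions of each contingency table. As a check, assuming the standard Pearson test of independence on an \(r \times c\) table of observed counts \(O_{ij}\), with three opinion categories in every case:

```latex
\[
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{(\text{row}_i \text{ total}) \, (\text{column}_j \text{ total})}{n},
\qquad
df = (r - 1)(c - 1)
\]
\[
df_{\text{income}} = (6 - 1)(3 - 1) = 10, \qquad
df_{\text{education}} = (5 - 1)(3 - 1) = 8, \qquad
df_{\text{political views}} = (7 - 1)(3 - 1) = 12
\]
```

The income table has six rows because Unknown/Refused is counted as a category.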
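These results indicate statistically significant associations for all three factors, so each null hypothesis is rejected. The three p-values are not literally identical: each falls below \(2.2 \times 10^{-16}\), the smallest value R reports by default, which is common with samples this large. Statistical significance at this sample size says little about the strength of an association, and these tests indicate association, not causality. Further analysis is needed to understand the underlying reasons.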
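One way to gauge the strength of an association, as opposed to its significance, is Cramér's V, which rescales the chi-square statistic by sample size and table dimensions. This measure is not part of the original analysis; the sketch below applies it only to the political views test, because that is the one test whose sample size can be read from the output (27018 observations with non-missing `polviews` and `natfare`). The function name `cramers_v()` is hypothetical.

```r
# Cramér's V = sqrt(chi_sq / (n * (min(rows, cols) - 1)))
# Ranges from 0 (no association) to 1 (perfect association).
cramers_v <- function(chi_sq, n, n_rows, n_cols) {
  sqrt(chi_sq / (n * (min(n_rows, n_cols) - 1)))
}

# Political views (7 categories) by welfare opinion (3 categories),
# using the statistic and sample size reported in the output above
v_polviews <- cramers_v(chi_sq = 1051.9, n = 27018, n_rows = 7, n_cols = 3)
round(v_polviews, 2)  # approximately 0.14
```

A value around 0.14 suggests that the association, while clearly not due to chance, is modest in size. The sample sizes for the income and education tests are not printed, so their values would need `sum(contingency_table_income)` and `sum(contingency_table_education)`.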
Social Cohesion and Democratic Engagement
Analyzing how opinions on welfare vary across demographic groups helps in identifying potential areas of social tension and cohesion. It’s essential for fostering a more inclusive society where diverse views are understood and respected. Understanding the public’s stance on significant issues like welfare spending is vital for healthy democratic engagement. It ensures that the voices of different demographic groups are heard and considered in the democratic process.