Final Project Proposal DATA 606

Research Question

What are the current policies and attitudes towards firearms ownership and regulation in the United States?

Cases

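The CSV file from FiveThirtyEight’s website contains 57 cases. Each case is a single poll question on a gun policy topic (for example, a minimum purchase age of 21, background checks, or a ban on assault weapons), recorded with overall support and with support among Republicans and Democrats.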

Data collection

The method of data collection appears to involve conducting polls or surveys on various topics related to guns. Each row in the CSV file represents a different poll or survey conducted by different organizations or entities.

Key columns such as “Question”, “Start”, “End”, “Pollster”, “Population”, “Support”, “Republican Support”, “Democratic Support”, and “URL” provide insight into the data collection process. Pollsters administered surveys to samples of either adults or registered voters and reported the share of respondents supporting each position on gun-related issues.

Type of study

This study appears to be an observational study. In an observational study, researchers observe and collect data on individuals without intervening or influencing the variables of interest. In this case, data is collected through surveys or polls administered to participants, but there is no manipulation of variables or assignment of participants to different conditions, which is characteristic of experimental studies. Instead, researchers are simply observing and recording responses to understand public opinion on gun-related topics.

Dependent Variable

The dependent variable in an observational study is the outcome or response that is being measured or observed. In this dataset, the dependent variable is the level of support for each gun-related policy, given as percentages in the “Support”, “Republican Support”, and “Democratic Support” columns.
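
As a minimal sketch of how these three response columns could be compared, the base R snippet below rebuilds the first four rows shown in the str() output (all “age-21” polls) and computes a partisan gap. The `polls` object and the `Gap` column are illustrative names and are not part of the dataset. Column names that contain spaces have to be quoted with backticks.

```r
# First four rows of the dataset, copied from the str() output.
# `polls` and `Gap` are illustrative names, not part of the original data.
polls <- data.frame(
  Question = rep("age-21", 4),
  Support = c(72, 82, 67, 84),
  `Republican Support` = c(61, 72, 59, 77),
  `Democratic Support` = c(86, 92, 76, 92),
  check.names = FALSE  # keep the spaces in the column names
)

# Partisan gap in percentage points: Democratic minus Republican support
polls$Gap <- polls$`Democratic Support` - polls$`Republican Support`

# Mean overall support and mean partisan gap across these four polls
mean_support <- mean(polls$Support)
mean_gap <- mean(polls$Gap)

print(polls)
```

The same two lines that create `Gap` and the means should work on the full data frame once it is loaded, since they rely only on the column names reported by read_csv().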

Independent Variable(s)

Potential independent variables include the topic posed in each survey (“Question” column), the population surveyed (“Population” column: adults or registered voters), and the timing of the survey (“Start” and “End” columns).

Additionally, societal or political events occurring during the survey period may affect respondents’ views. Because this is an observational study, none of these variables can be manipulated; survey question, population, and timing are instead treated as explanatory factors that may be associated with the level of support observed in the data.
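
Because “Start” and “End” are read in as character strings, timing can only serve as a variable after conversion to dates. The following is a minimal base R sketch using the first four rows from the str() output. The names `polls`, `FieldDays`, and `polls_by_date` are illustrative, and the month/day/two-digit-year format is an assumption inferred from values such as “2/20/18”.

```r
# First four polls, copied from the str() output (illustrative subset)
polls <- data.frame(
  Pollster = c("CNN/SSRS", "NPR/Ipsos", "Rasmussen", "Harris Interactive"),
  Start = c("2/20/18", "2/27/18", "3/1/18", "2/22/18"),
  End = c("2/23/18", "2/28/18", "3/4/18", "2/26/18"),
  stringsAsFactors = FALSE
)

# Convert the character columns to Date (assumed format: month/day/2-digit year)
polls$Start <- as.Date(polls$Start, format = "%m/%d/%y")
polls$End <- as.Date(polls$End, format = "%m/%d/%y")

# Number of days between the start and end of each poll
polls$FieldDays <- as.numeric(difftime(polls$End, polls$Start, units = "days"))

# Order the polls chronologically by start date
polls_by_date <- polls[order(polls$Start), ]

print(polls_by_date)
```

With real dates in place, support could be plotted against `Start` to look for changes over the polling period.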

Relevant summary statistics

# Load necessary libraries
library(readr)
library(dplyr)
library(ggplot2)

# Read the data
data <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/poll-quiz-guns/guns-polls.csv")
## Rows: 57 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (6): Question, Start, End, Pollster, Population, URL
## dbl (3): Support, Republican Support, Democratic Support
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Display the structure of the data
str(data)
## spc_tbl_ [57 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Question          : chr [1:57] "age-21" "age-21" "age-21" "age-21" ...
##  $ Start             : chr [1:57] "2/20/18" "2/27/18" "3/1/18" "2/22/18" ...
##  $ End               : chr [1:57] "2/23/18" "2/28/18" "3/4/18" "2/26/18" ...
##  $ Pollster          : chr [1:57] "CNN/SSRS" "NPR/Ipsos" "Rasmussen" "Harris Interactive" ...
##  $ Population        : chr [1:57] "Registered Voters" "Adults" "Adults" "Registered Voters" ...
##  $ Support           : num [1:57] 72 82 67 84 78 72 76 41 44 43 ...
##  $ Republican Support: num [1:57] 61 72 59 77 63 65 72 69 68 71 ...
##  $ Democratic Support: num [1:57] 86 92 76 92 93 80 86 20 20 24 ...
##  $ URL               : chr [1:57] "http://cdn.cnn.com/cnn/2018/images/02/25/rel3a.-.trump,.guns.pdf" "https://www.ipsos.com/en-us/npripsos-poll-majority-americans-support-policies-aimed-keep-guns-out-hands-dangerous-individuals" "http://www.rasmussenreports.com/public_content/lifestyle/general_lifestyle/march_2018/americans_favor_raising_a"| __truncated__ "http://thehill.com/opinion/civil-rights/375993-americans-support-no-gun-under-21" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Question = col_character(),
##   ..   Start = col_character(),
##   ..   End = col_character(),
##   ..   Pollster = col_character(),
##   ..   Population = col_character(),
##   ..   Support = col_double(),
##   ..   `Republican Support` = col_double(),
##   ..   `Democratic Support` = col_double(),
##   ..   URL = col_character()
##   .. )
##  - attr(*, "problems")=<externalptr>
# Summary statistics for quantitative variables
summary(data$Support)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.00   61.00   68.00   67.77   76.00   97.00
# Column names that contain spaces must be quoted with backticks
summary(data$`Republican Support`)
summary(data$`Democratic Support`)
# Summary statistics for categorical variables
table(data$Question)
## 
##                      age-21                arm-teachers 
##                           7                           6 
##           background-checks         ban-assault-weapons 
##                           7                          12 
## ban-high-capacity-magazines       mental-health-own-gun 
##                           7                           6 
##        repeal-2nd-amendment           stricter-gun-laws 
##                           1                          11
table(data$Population)
## 
##            Adults Registered Voters 
##                13                44
# Visualizations
# Histogram of Support
ggplot(data, aes(x = Support)) +
  geom_histogram(binwidth = 5, fill = "skyblue", color = "black") +
  labs(title = "Distribution of Support",
       x = "Support",
       y = "Frequency")

# Boxplot of Support by Population
ggplot(data, aes(x = Population, y = Support)) +
  geom_boxplot(fill = "lightgreen") +
  labs(title = "Support by Population",
       x = "Population",
       y = "Support")

# Bar plot of Question
ggplot(data, aes(x = Question)) +
  geom_bar(fill = "salmon") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Frequency of Questions",
       x = "Question",
       y = "Frequency")
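
The boxplot of support by population has a simple numeric counterpart: the mean support within each population group. The sketch below uses base R on the first four rows from the str() output. The names `polls` and `group_means` are illustrative, and the numbers describe only this four-row subset, not the full dataset.

```r
# First four polls, copied from the str() output (illustrative subset)
polls <- data.frame(
  Population = c("Registered Voters", "Adults", "Adults", "Registered Voters"),
  Support = c(72, 82, 67, 84),
  stringsAsFactors = FALSE
)

# Mean support within each population group (named numeric vector)
group_means <- tapply(polls$Support, polls$Population, mean)

# Number of polls in each group, to show how balanced the comparison is
group_counts <- table(polls$Population)

print(group_means)
print(group_counts)
```

On the full data, the equivalent call would be tapply(data$Support, data$Population, mean). The groups are unbalanced there (13 polls of adults versus 44 of registered voters), so any difference in means should be read with caution.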