Data Preparation:

library(RCurl)
library(psych)
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
git_guns <- getURL("https://raw.githubusercontent.com/cfalvarez36/MSDS_R_Programming/main/Guns.csv")
guns <- read.csv(text = git_guns)

General summary of the guns data set, showing the scale of each numeric variable:

summary(guns)
##        X             year         violent           murder      
##  Min.   :   1   Min.   :1977   Min.   :  47.0   Min.   : 0.200  
##  1st Qu.: 294   1st Qu.:1982   1st Qu.: 283.1   1st Qu.: 3.700  
##  Median : 587   Median :1988   Median : 443.0   Median : 6.400  
##  Mean   : 587   Mean   :1988   Mean   : 503.1   Mean   : 7.665  
##  3rd Qu.: 880   3rd Qu.:1994   3rd Qu.: 650.9   3rd Qu.: 9.800  
##  Max.   :1173   Max.   :1999   Max.   :2921.8   Max.   :80.600  
##     robbery         prisoners           afam              cauc      
##  Min.   :   6.4   Min.   :  19.0   Min.   : 0.2482   Min.   :21.78  
##  1st Qu.:  71.1   1st Qu.: 114.0   1st Qu.: 2.2022   1st Qu.:59.94  
##  Median : 124.1   Median : 187.0   Median : 4.0262   Median :65.06  
##  Mean   : 161.8   Mean   : 226.6   Mean   : 5.3362   Mean   :62.95  
##  3rd Qu.: 192.7   3rd Qu.: 291.0   3rd Qu.: 6.8507   3rd Qu.:69.20  
##  Max.   :1635.1   Max.   :1913.0   Max.   :26.9796   Max.   :76.53  
##       male         population          income         density         
##  Min.   :12.21   Min.   : 0.4027   Min.   : 8555   Min.   : 0.000707  
##  1st Qu.:14.65   1st Qu.: 1.1877   1st Qu.:11935   1st Qu.: 0.031911  
##  Median :15.90   Median : 3.2713   Median :13402   Median : 0.081569  
##  Mean   :16.08   Mean   : 4.8163   Mean   :13725   Mean   : 0.352038  
##  3rd Qu.:17.53   3rd Qu.: 5.6856   3rd Qu.:15271   3rd Qu.: 0.177718  
##  Max.   :22.35   Max.   :33.1451   Max.   :23647   Max.   :11.102120  
##     state               law           
##  Length:1173        Length:1173       
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 

Renaming the columns to better describe the data:

gun_edit <- setNames(guns, c("Index","Year","Violence", "Murder", "Theft", "Prisoners", "African_American", "Caucasian", "Male", "Population", "Income", "Density", "State", "Carry_Law"))
head(gun_edit)
##   Index Year Violence Murder Theft Prisoners African_American Caucasian
## 1     1 1977    414.4   14.2  96.8        83         8.384873  55.12291
## 2     2 1978    419.1   13.3  99.1        94         8.352101  55.14367
## 3     3 1979    413.3   13.2 109.5       144         8.329575  55.13586
## 4     4 1980    448.5   13.2 132.1       141         8.408386  54.91259
## 5     5 1981    470.5   11.9 126.5       149         8.483435  54.92513
## 6     6 1982    447.7   10.6 112.0       183         8.514000  54.89621
##       Male Population   Income   Density   State Carry_Law
## 1 18.17441   3.780403 9563.148 0.0745524 Alabama        no
## 2 17.99408   3.831838 9932.000 0.0755667 Alabama        no
## 3 17.83934   3.866248 9877.028 0.0762453 Alabama        no
## 4 17.73420   3.900368 9541.428 0.0768288 Alabama        no
## 5 17.67372   3.918531 9548.351 0.0771866 Alabama        no
## 6 17.51052   3.925229 9478.919 0.0773185 Alabama        no
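
The positional renaming above relies on the new names being listed in exactly the same order as the original columns. A minimal sketch of a name-keyed alternative, using a small hypothetical data frame `toy` (not the real data) and a lookup vector `rename_map` that is introduced here only for illustration:

```r
# Hypothetical three-column stand-in for the raw guns data.
toy <- data.frame(
  year    = c(1977, 1978),
  robbery = c(96.8, 99.1),
  law     = c("no", "no")
)

# Old name -> new name; the order of the entries does not matter.
rename_map <- c(law = "Carry_Law", year = "Year", robbery = "Theft")

# Stop early if any column lacks a mapping, then rename by lookup.
stopifnot(all(names(toy) %in% names(rename_map)))
names(toy) <- unname(rename_map[names(toy)])

names(toy)
```

Because each new name is looked up by its old name, a reordered or extended input file would fail loudly instead of silently mislabelling a column.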

Two subsets are created to focus on the research question:

open_carry <- subset(gun_edit, Carry_Law == "yes")
no_carry <- subset(gun_edit, Carry_Law == "no")

The mean and median of Theft are computed for each subset to show the difference between the two groups:

gun_robbery_OC_mean <- mean(open_carry$Theft)
gun_robbery_OC_median <- median(open_carry$Theft)
gun_theft_NC_mean <- mean(no_carry$Theft)
gun_theft_NC_median <- median(no_carry$Theft)

A data frame showing all four statistics together:

mean_median_df <- data.frame(gun_robbery_OC_mean, gun_robbery_OC_median, gun_theft_NC_mean, gun_theft_NC_median)
mean_median_df
##   gun_robbery_OC_mean gun_robbery_OC_median gun_theft_NC_mean
## 1             97.8986                  86.6          182.3356
##   gun_theft_NC_median
## 1              133.85
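
The same group comparison can also be sketched in a single call, without creating separate subsets. The example below is a minimal illustration on a hypothetical data frame `toy` with made-up values, not the real data; it assumes the grouping column holds the values "yes" and "no" as in Carry_Law:

```r
# Hypothetical data: three records without the law, three with it.
toy <- data.frame(
  Carry_Law = c("no", "no", "no", "yes", "yes", "yes"),
  Theft     = c(100, 150, 230, 60, 90, 120)
)

# One row per Carry_Law value; the Theft column becomes a small matrix
# with one column per statistic returned by FUN.
by_law <- aggregate(
  Theft ~ Carry_Law,
  data = toy,
  FUN  = function(x) c(mean = mean(x), median = median(x))
)

by_law
```

Keeping the statistics keyed by group makes it harder to mix up which number belongs to which subset.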

Research question:

Is having an open carry law associated with higher crime rates in the United States?

What are the cases, and how many are there?

Each case represents one year of statistics for one state, and each state has 23 records (1977 through 1999). There are 1173 observations (51 × 23), each with 13 variables plus a row index.
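
The case count can be checked with a line of arithmetic, and the balance of a panel can be checked by counting records per state. A minimal sketch, where `panel` is a hypothetical three-state stand-in rather than the real data:

```r
# 50 states plus the District of Columbia, observed once per year.
n_states <- 51
years <- 1977:1999
n_states * length(years)  # 1173

# Hypothetical balanced panel with three placeholder states.
panel <- expand.grid(State = c("A", "B", "C"), Year = years)

# A panel is balanced when every state has the same number of records.
records_per_state <- table(panel$State)
is_balanced <- all(records_per_state == length(years))
is_balanced
```

On the real data, the same `table()` call on the State column would be expected to show 23 for every state if the panel is balanced as described.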

Describe the method of data collection.

Guns is a balanced panel of data on 50 US states, plus the District of Columbia (51 state-level units in total), by year for 1977–1999. The dataset was taken from https://vincentarelbundock.github.io/Rdatasets/index.html

What type of study is this (observational/experiment)?

Observational, since the data were recorded as they occurred and no treatment was assigned or manipulated.

Data Source: If you collected the data, state self-collected. If not, provide a citation/link.

Data was taken from: https://raw.githubusercontent.com/cfalvarez36/MSDS_R_Programming/main/Guns.csv Variables are explained at: https://vincentarelbundock.github.io/Rdatasets/doc/AER/Guns.html

The formulas at https://oag.ca.gov/sites/all/files/agweb/pdfs/cjsc/prof10/formulas.pdf will be used to calculate crime rates for further analysis.
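
A crime rate of this kind is conventionally the number of crimes divided by the population, scaled to a fixed base such as 100,000 residents. The sketch below assumes that convention; the function name `crime_rate` and the example figures are hypothetical. The variable documentation linked above lists Population in millions, so that column would need rescaling before being used this way.

```r
# Crimes per `per` residents (default base: 100,000).
crime_rate <- function(crimes, population, per = 1e5) {
  stopifnot(all(population > 0))
  per * crimes / population
}

# Hypothetical example: 500 crimes in a population of one million.
crime_rate(crimes = 500, population = 1e6)  # 50 per 100,000 residents

# Population stored in millions must be converted to persons first.
crime_rate(crimes = 500, population = 1 * 1e6)
```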

Response:

What is the response variable, and what type is it (numerical/categorical)?

The response variable is the level of crime, and it is numerical.

Explanatory:

What are the explanatory variables, and what type are they (numerical/categorical)?

The explanatory variables will be population and the violence, murder, and theft counts, with additional analysis of income and prisoner numbers. These are all numerical.

Relevant summary statistics:

General statistics for the entire dataset. These will later be broken down by state.

describe(gun_edit)
##                  vars    n     mean      sd   median  trimmed     mad     min
## Index               1 1173   587.00  338.76   587.00   587.00  434.40    1.00
## Year                2 1173  1988.00    6.64  1988.00  1988.00    8.90 1977.00
## Violence            3 1173   503.07  334.28   443.00   464.50  266.42   47.00
## Murder              4 1173     7.67    7.52     6.40     6.72    4.30    0.20
## Theft               5 1173   161.82  170.51   124.10   133.84   91.18    6.40
## Prisoners           6 1173   226.58  178.89   187.00   202.38  123.06   19.00
## African_American    7 1173     5.34    4.89     4.03     4.53    3.14    0.25
## Caucasian           8 1173    62.95    9.76    65.06    64.48    6.68   21.78
## Male                9 1173    16.08    1.73    15.90    16.04    2.09   12.21
## Population         10 1173     4.82    5.25     3.27     3.78    3.20    0.40
## Income             11 1173 13724.80 2554.54 13401.55 13549.54 2406.96 8554.88
## Density            12 1173     0.35    1.36     0.08     0.12    0.09    0.00
## State*             13 1173    26.00   14.73    26.00    26.00   19.27    1.00
## Carry_Law*         14 1173     1.24    0.43     1.00     1.18    0.00    1.00
##                       max    range  skew kurtosis    se
## Index             1173.00  1172.00  0.00    -1.20  9.89
## Year              1999.00    22.00  0.00    -1.21  0.19
## Violence          2921.80  2874.80  2.54    11.85  9.76
## Murder              80.60    80.40  5.78    45.41  0.22
## Theft             1635.10  1628.70  3.88    21.28  4.98
## Prisoners         1913.00  1894.00  3.88    25.99  5.22
## African_American    26.98    26.73  2.35     6.69  0.14
## Caucasian           76.53    54.75 -2.22     6.05  0.29
## Male                22.35    10.14  0.27    -0.57  0.05
## Population          33.15    32.74  2.43     7.31  0.15
## Income           23646.71 15091.83  0.73     0.64 74.59
## Density             11.10    11.10  6.69    44.21  0.04
## State*              51.00    50.00  0.00    -1.20  0.43
## Carry_Law*           2.00     1.00  1.20    -0.57  0.01
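
A per-state breakdown of these statistics can be sketched with base R's `aggregate()`. The example uses a hypothetical data frame `toy` with made-up values for two states; only the column names mirror the renamed data. The psych package already loaded above also offers `describeBy()` for grouped descriptives.

```r
# Hypothetical data: two records for each of two states.
toy <- data.frame(
  State    = c("Alabama", "Alabama", "Alaska", "Alaska"),
  Violence = c(400, 420, 440, 500),
  Theft    = c(100, 110, 90, 70)
)

# Mean of each numeric column within each state.
state_means <- aggregate(
  cbind(Violence, Theft) ~ State,
  data = toy,
  FUN  = mean
)

state_means
```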