Hate Crimes in NY from 2010-2016

Hate Crimes Data set

This data set records hate crimes in New York counties from 2010 to 2016, broken down by type of hate crime.

Flawed hate crime data collection - we should know how the data was collected

Now that we know there is possible bias in the data set, what can we do with it?

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#tinytex::install_tinytex()
#library(tinytex)
setwd("~/Desktop/Data 110")
hatecrimes <- read_csv("hateCrimes2010.csv")
Rows: 423 Columns: 44
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (2): County, Crime Type
dbl (42): Year, Anti-Male, Anti-Female, Anti-Transgender, Anti-Gender Identi...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Clean up the data:

Make all headers lowercase and remove spaces

names(hatecrimes) <- tolower(names(hatecrimes))
names(hatecrimes) <- gsub(" ","_",names(hatecrimes))
head(hatecrimes)
# A tibble: 6 × 44
  county    year crime_type         `anti-male` `anti-female` `anti-transgender`
  <chr>    <dbl> <chr>                    <dbl>         <dbl>              <dbl>
1 Albany    2016 Crimes Against Pe…           0             0                  0
2 Albany    2016 Property Crimes              0             0                  0
3 Allegany  2016 Property Crimes              0             0                  0
4 Bronx     2016 Crimes Against Pe…           0             0                  4
5 Bronx     2016 Property Crimes              0             0                  0
6 Broome    2016 Crimes Against Pe…           0             0                  0
# ℹ 38 more variables: `anti-gender_identity_expression` <dbl>,
#   `anti-age*` <dbl>, `anti-white` <dbl>, `anti-black` <dbl>,
#   `anti-american_indian/alaskan_native` <dbl>, `anti-asian` <dbl>,
#   `anti-native_hawaiian/pacific_islander` <dbl>,
#   `anti-multi-racial_groups` <dbl>, `anti-other_race` <dbl>,
#   `anti-jewish` <dbl>, `anti-catholic` <dbl>, `anti-protestant` <dbl>,
#   `anti-islamic_(muslim)` <dbl>, `anti-multi-religious_groups` <dbl>, …
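
The two renaming steps can also be collapsed into a single call. A minimal sketch on a toy tibble, assuming dplyr's `rename_with()`; the `toy` data and its values are made up for illustration and are not rows from the file:

```r
library(dplyr)

# Toy data with headers shaped like the raw file (values are made up)
toy <- tibble(
  County = "Albany",
  `Crime Type` = "Property Crimes",
  `Anti-Gay Male` = 0
)

# Lowercase every header and replace spaces with underscores in one step
toy_clean <- toy |>
  rename_with(~ gsub(" ", "_", tolower(.x)))

names(toy_clean)
```

The hyphens are left in place here, as in the original cleanup, so these columns still need backticks or quotes when referenced.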

Select only certain hate crimes

summary(hatecrimes)
    county               year       crime_type          anti-male       
 Length:423         Min.   :2010   Length:423         Min.   :0.000000  
 Class :character   1st Qu.:2011   Class :character   1st Qu.:0.000000  
 Mode  :character   Median :2013   Mode  :character   Median :0.000000  
                    Mean   :2013                      Mean   :0.007092  
                    3rd Qu.:2015                      3rd Qu.:0.000000  
                    Max.   :2016                      Max.   :1.000000  
  anti-female      anti-transgender  anti-gender_identity_expression
 Min.   :0.00000   Min.   :0.00000   Min.   :0.00000                
 1st Qu.:0.00000   1st Qu.:0.00000   1st Qu.:0.00000                
 Median :0.00000   Median :0.00000   Median :0.00000                
 Mean   :0.01655   Mean   :0.04728   Mean   :0.05674                
 3rd Qu.:0.00000   3rd Qu.:0.00000   3rd Qu.:0.00000                
 Max.   :1.00000   Max.   :5.00000   Max.   :3.00000                
   anti-age*         anti-white        anti-black    
 Min.   :0.00000   Min.   : 0.0000   Min.   : 0.000  
 1st Qu.:0.00000   1st Qu.: 0.0000   1st Qu.: 0.000  
 Median :0.00000   Median : 0.0000   Median : 1.000  
 Mean   :0.05201   Mean   : 0.3357   Mean   : 1.761  
 3rd Qu.:0.00000   3rd Qu.: 0.0000   3rd Qu.: 2.000  
 Max.   :9.00000   Max.   :11.0000   Max.   :18.000  
 anti-american_indian/alaskan_native   anti-asian    
 Min.   :0.000000                    Min.   :0.0000  
 1st Qu.:0.000000                    1st Qu.:0.0000  
 Median :0.000000                    Median :0.0000  
 Mean   :0.007092                    Mean   :0.1773  
 3rd Qu.:0.000000                    3rd Qu.:0.0000  
 Max.   :1.000000                    Max.   :8.0000  
 anti-native_hawaiian/pacific_islander anti-multi-racial_groups anti-other_race
 Min.   :0                             Min.   :0.00000          Min.   :0      
 1st Qu.:0                             1st Qu.:0.00000          1st Qu.:0      
 Median :0                             Median :0.00000          Median :0      
 Mean   :0                             Mean   :0.08511          Mean   :0      
 3rd Qu.:0                             3rd Qu.:0.00000          3rd Qu.:0      
 Max.   :0                             Max.   :3.00000          Max.   :0      
  anti-jewish     anti-catholic     anti-protestant   anti-islamic_(muslim)
 Min.   : 0.000   Min.   : 0.0000   Min.   :0.00000   Min.   : 0.0000      
 1st Qu.: 0.000   1st Qu.: 0.0000   1st Qu.:0.00000   1st Qu.: 0.0000      
 Median : 0.000   Median : 0.0000   Median :0.00000   Median : 0.0000      
 Mean   : 3.981   Mean   : 0.2695   Mean   :0.02364   Mean   : 0.4704      
 3rd Qu.: 3.000   3rd Qu.: 0.0000   3rd Qu.:0.00000   3rd Qu.: 0.0000      
 Max.   :82.000   Max.   :12.0000   Max.   :1.00000   Max.   :10.0000      
 anti-multi-religious_groups anti-atheism/agnosticism
 Min.   : 0.00000            Min.   :0               
 1st Qu.: 0.00000            1st Qu.:0               
 Median : 0.00000            Median :0               
 Mean   : 0.07565            Mean   :0               
 3rd Qu.: 0.00000            3rd Qu.:0               
 Max.   :10.00000            Max.   :0               
 anti-religious_practice_generally anti-other_religion anti-buddhist
 Min.   :0.000000                  Min.   :0.000       Min.   :0    
 1st Qu.:0.000000                  1st Qu.:0.000       1st Qu.:0    
 Median :0.000000                  Median :0.000       Median :0    
 Mean   :0.007092                  Mean   :0.104       Mean   :0    
 3rd Qu.:0.000000                  3rd Qu.:0.000       3rd Qu.:0    
 Max.   :2.000000                  Max.   :4.000       Max.   :0    
 anti-eastern_orthodox_(greek,_russian,_etc.)   anti-hindu      
 Min.   :0.000000                             Min.   :0.000000  
 1st Qu.:0.000000                             1st Qu.:0.000000  
 Median :0.000000                             Median :0.000000  
 Mean   :0.002364                             Mean   :0.002364  
 3rd Qu.:0.000000                             3rd Qu.:0.000000  
 Max.   :1.000000                             Max.   :1.000000  
 anti-jehovahs_witness  anti-mormon anti-other_christian   anti-sikh
 Min.   :0             Min.   :0    Min.   :0.00000      Min.   :0  
 1st Qu.:0             1st Qu.:0    1st Qu.:0.00000      1st Qu.:0  
 Median :0             Median :0    Median :0.00000      Median :0  
 Mean   :0             Mean   :0    Mean   :0.01655      Mean   :0  
 3rd Qu.:0             3rd Qu.:0    3rd Qu.:0.00000      3rd Qu.:0  
 Max.   :0             Max.   :0    Max.   :3.00000      Max.   :0  
 anti-hispanic       anti-arab       anti-other_ethnicity/national_origin
 Min.   : 0.0000   Min.   :0.00000   Min.   : 0.0000                     
 1st Qu.: 0.0000   1st Qu.:0.00000   1st Qu.: 0.0000                     
 Median : 0.0000   Median :0.00000   Median : 0.0000                     
 Mean   : 0.3735   Mean   :0.06619   Mean   : 0.2837                     
 3rd Qu.: 0.0000   3rd Qu.:0.00000   3rd Qu.: 0.0000                     
 Max.   :17.0000   Max.   :2.00000   Max.   :19.0000                     
 anti-non-hispanic* anti-gay_male    anti-gay_female 
 Min.   :0          Min.   : 0.000   Min.   :0.0000  
 1st Qu.:0          1st Qu.: 0.000   1st Qu.:0.0000  
 Median :0          Median : 0.000   Median :0.0000  
 Mean   :0          Mean   : 1.499   Mean   :0.2411  
 3rd Qu.:0          3rd Qu.: 1.000   3rd Qu.:0.0000  
 Max.   :0          Max.   :36.000   Max.   :8.0000  
 anti-gay_(male_and_female) anti-heterosexual  anti-bisexual     
 Min.   :0.0000             Min.   :0.000000   Min.   :0.000000  
 1st Qu.:0.0000             1st Qu.:0.000000   1st Qu.:0.000000  
 Median :0.0000             Median :0.000000   Median :0.000000  
 Mean   :0.1017             Mean   :0.002364   Mean   :0.004728  
 3rd Qu.:0.0000             3rd Qu.:0.000000   3rd Qu.:0.000000  
 Max.   :4.0000             Max.   :1.000000   Max.   :1.000000  
 anti-physical_disability anti-mental_disability total_incidents 
 Min.   :0.00000          Min.   :0.000000       Min.   :  1.00  
 1st Qu.:0.00000          1st Qu.:0.000000       1st Qu.:  1.00  
 Median :0.00000          Median :0.000000       Median :  3.00  
 Mean   :0.01182          Mean   :0.009456       Mean   : 10.09  
 3rd Qu.:0.00000          3rd Qu.:0.000000       3rd Qu.: 10.00  
 Max.   :1.00000          Max.   :1.000000       Max.   :101.00  
 total_victims    total_offenders 
 Min.   :  1.00   Min.   :  1.00  
 1st Qu.:  1.00   1st Qu.:  1.00  
 Median :  3.00   Median :  3.00  
 Mean   : 10.48   Mean   : 11.77  
 3rd Qu.: 10.00   3rd Qu.: 11.00  
 Max.   :106.00   Max.   :113.00  
hatecrimes2 <- hatecrimes |>
  select(county, year, 'anti-black', 'anti-white', 'anti-jewish', 'anti-catholic', 'anti-gay_male', 'anti-hispanic') |>
  group_by(county, year)
head(hatecrimes2)
# A tibble: 6 × 8
# Groups:   county, year [4]
  county    year `anti-black` `anti-white` `anti-jewish` `anti-catholic`
  <chr>    <dbl>        <dbl>        <dbl>         <dbl>           <dbl>
1 Albany    2016            1            0             0               0
2 Albany    2016            2            0             0               0
3 Allegany  2016            1            0             0               0
4 Bronx     2016            0            1             0               0
5 Bronx     2016            0            1             1               0
6 Broome    2016            1            0             0               0
# ℹ 2 more variables: `anti-gay_male` <dbl>, `anti-hispanic` <dbl>

Check the dimensions and the summary to make sure there are no missing values

dim(hatecrimes2)
[1] 423   8
summary(hatecrimes2)
    county               year        anti-black       anti-white     
 Length:423         Min.   :2010   Min.   : 0.000   Min.   : 0.0000  
 Class :character   1st Qu.:2011   1st Qu.: 0.000   1st Qu.: 0.0000  
 Mode  :character   Median :2013   Median : 1.000   Median : 0.0000  
                    Mean   :2013   Mean   : 1.761   Mean   : 0.3357  
                    3rd Qu.:2015   3rd Qu.: 2.000   3rd Qu.: 0.0000  
                    Max.   :2016   Max.   :18.000   Max.   :11.0000  
  anti-jewish     anti-catholic     anti-gay_male    anti-hispanic    
 Min.   : 0.000   Min.   : 0.0000   Min.   : 0.000   Min.   : 0.0000  
 1st Qu.: 0.000   1st Qu.: 0.0000   1st Qu.: 0.000   1st Qu.: 0.0000  
 Median : 0.000   Median : 0.0000   Median : 0.000   Median : 0.0000  
 Mean   : 3.981   Mean   : 0.2695   Mean   : 1.499   Mean   : 0.3735  
 3rd Qu.: 3.000   3rd Qu.: 0.0000   3rd Qu.: 1.000   3rd Qu.: 0.0000  
 Max.   :82.000   Max.   :12.0000   Max.   :36.000   Max.   :17.0000  

Convert from wide to long format

hatelong <- hatecrimes2 |> 
    pivot_longer(
        cols =3:6,
        names_to = "victim_cat",
        values_to = "crimecount")
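
Selecting by position with `cols = 3:6` pivots only the first four of the six `anti-` columns, which is why `anti-gay_male` and `anti-hispanic` still appear as separate columns in the `hatenew` output further down. A sketch of selecting by name instead, on a made-up toy tibble (the names `toy` and `toy_long` are hypothetical):

```r
library(dplyr)
library(tidyr)

# Toy wide data (values are made up)
toy <- tibble(
  county = c("Albany", "Bronx"),
  year = c(2016, 2016),
  `anti-black` = c(1, 0),
  `anti-jewish` = c(0, 1),
  `anti-gay_male` = c(0, 2)
)

# Pivot every column whose name starts with "anti-", regardless of position
toy_long <- toy |>
  pivot_longer(
    cols = starts_with("anti-"),
    names_to = "victim_cat",
    values_to = "crimecount"
  )

toy_long
```

Selecting by name keeps the reshape correct if columns are added or reordered upstream.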

Now use the long format to create a facet plot

hatecrimplot <-hatelong |> 
  ggplot(aes(year, crimecount))+
  geom_point()+
  aes(color = victim_cat)+
  facet_wrap(~victim_cat)
hatecrimplot

hatenew <- hatelong |>
  filter( victim_cat %in% c("anti-black", "anti-jewish", "anti-gaymale"))|>
  group_by(year, county) |>
  arrange(desc(crimecount))
hatenew
# A tibble: 846 × 6
# Groups:   year, county [277]
   county   year `anti-gay_male` `anti-hispanic` victim_cat  crimecount
   <chr>   <dbl>           <dbl>           <dbl> <chr>            <dbl>
 1 Kings    2012               0               0 anti-jewish         82
 2 Kings    2016               2               1 anti-jewish         51
 3 Suffolk  2014               2               0 anti-jewish         48
 4 Suffolk  2012               2               2 anti-jewish         48
 5 Kings    2011               2               0 anti-jewish         44
 6 Kings    2013               1               1 anti-jewish         41
 7 Kings    2010               1               0 anti-jewish         39
 8 Nassau   2011               0               2 anti-jewish         38
 9 Suffolk  2013               1               0 anti-jewish         37
10 Nassau   2016               0               0 anti-jewish         36
# ℹ 836 more rows

Plot these three types of hate crimes together

plot2 <- hatenew |>
  ggplot() +
  geom_bar(aes(x=year, y=crimecount, fill = victim_cat),
      position = "dodge", stat = "identity") +
  labs(fill = "Hate Crime Type",
       y = "Number of Hate Crime Incidents",
       title = "Hate Crime Type in NY Counties Between 2010-2016",
       caption = "Source: NY State Division of Criminal Justice Services")
plot2

What about the counties?

plot3 <- hatenew |>
  ggplot() +
  geom_bar(aes(x=county, y=crimecount, fill = victim_cat),
      position = "dodge", stat = "identity") +
  labs(fill = "Hate Crime Type",
       y = "Number of Hate Crime Incidents",
       title = "Hate Crime Type in NY Counties Between 2010-2016",
       caption = "Source: NY State Division of Criminal Justice Services")
plot3

So many counties

counties <- hatenew |>
  group_by(year, county)|>
  summarize(sum = sum(crimecount)) |>
  arrange(desc(sum))
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
counties
# A tibble: 277 × 3
# Groups:   year [7]
    year county    sum
   <dbl> <chr>   <dbl>
 1  2012 Kings     124
 2  2010 Kings      89
 3  2016 Kings      83
 4  2012 Suffolk    80
 5  2014 Kings      69
 6  2015 Kings      69
 7  2011 Kings      68
 8  2013 Kings      68
 9  2014 Suffolk    65
10  2013 Suffolk    64
# ℹ 267 more rows
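
The `summarise()` message above appears because the result is still grouped by `year`. A minimal sketch, on made-up counts, of passing `.groups = "drop"` so the summary comes back ungrouped and the message is not printed:

```r
library(dplyr)

# Toy long-format counts (values are made up)
toy <- tibble(
  year = c(2012, 2012, 2012, 2013),
  county = c("Kings", "Kings", "Suffolk", "Kings"),
  crimecount = c(5, 3, 2, 4)
)

# Sum per year and county, then drop all grouping from the result
toy_counts <- toy |>
  group_by(year, county) |>
  summarize(sum = sum(crimecount), .groups = "drop") |>
  arrange(desc(sum))

toy_counts
```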

Top 5

counties2 <- hatenew |>
  group_by(county)|>
  summarize(sum = sum(crimecount)) |>
  slice_max(order_by = sum, n=5)
counties2
# A tibble: 5 × 2
  county     sum
  <chr>    <dbl>
1 Kings      570
2 Suffolk    317
3 Nassau     283
4 New York   266
5 Queens     175
plot4 <- hatenew |>
  filter(county %in% c("Kings", "New York", "Suffolk", "Nassau", "Queens")) |>
  ggplot() +
  geom_bar(aes(x=county, y=crimecount, fill = victim_cat),
      position = "dodge", stat = "identity") +
  labs(y = "Number of Hate Crime Incidents",
       title = "5 Counties in NY with Highest Incidents of Hate Crimes",
       subtitle = "Between 2010-2016", 
       fill = "Hate Crime Type",
      caption = "Source: NY State Division of Criminal Justice Services")
plot4
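
The five county names in the filter above are typed by hand from the `counties2` table. A sketch, on made-up totals, of pulling the names out of the summary so the filter follows the data; `toy_totals`, `toy_rows`, and `top_names` are hypothetical names:

```r
library(dplyr)

# Toy per-county totals and toy detail rows (values are made up)
toy_totals <- tibble(county = c("A", "B", "C", "D"), sum = c(10, 40, 30, 20))
toy_rows <- tibble(
  county = c("A", "B", "B", "C", "D"),
  crimecount = c(1, 2, 3, 4, 5)
)

# Take the top two counties by total and extract their names as a vector
top_names <- toy_totals |>
  slice_max(order_by = sum, n = 2) |>
  pull(county)

# Keep only the detail rows for those counties
toy_top <- toy_rows |>
  filter(county %in% top_names)

toy_top
```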

How would calculations be affected by looking at hate crimes in counties per year by population densities?

setwd("~/Desktop/Data 110")
nypop <- read_csv("newyorkpopulation.csv")
Rows: 62 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Geography
dbl (7): 2010, 2011, 2012, 2013, 2014, 2015, 2016

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Clean the county name to match the other dataset

nypop$Geography <- gsub(" , New York", "", nypop$Geography)
nypop$Geography <- gsub("County", "", nypop$Geography)
nypoplong <- nypop |>
  rename(county = Geography) |>
  gather("year", "population", 2:8) 
nypoplong$year <- as.double(nypoplong$year)
head(nypoplong)
# A tibble: 6 × 3
  county                  year population
  <chr>                  <dbl>      <dbl>
1 Albany , New York       2010     304078
2 Allegany , New York     2010      48949
3 Bronx , New York        2010    1388240
4 Broome , New York       2010     200469
5 Cattaraugus , New York  2010      80249
6 Cayuga , New York       2010      79844
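
The county names above still end in ` , New York`. Judging from that output, the raw values appear to look like `Albany County, New York`, so the first pattern cannot match until `County` has been removed. A sketch under that assumption, stripping the whole suffix with one pattern and reshaping with `pivot_longer()` (the 2010 figures are taken from the output above; the 2011 figures are made up):

```r
library(dplyr)
library(tidyr)

# Assumed raw shape: "<name> County, New York" (inferred, not confirmed)
toy_pop <- tibble(
  Geography = c("Albany County, New York", "Allegany County, New York"),
  `2010` = c(304078, 48949),
  `2011` = c(305000, 48800)  # made-up values
)

toy_pop_long <- toy_pop |>
  rename(county = Geography) |>
  # Remove the full suffix in one step so no stray " , New York" is left
  mutate(county = sub(" County, New York$", "", county)) |>
  pivot_longer(cols = -county, names_to = "year", values_to = "population") |>
  mutate(year = as.double(year))

toy_pop_long
```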

Focus on 2012

nypoplong12 <- nypoplong |>
  filter(year == 2012) |>
  arrange(desc(population)) |>
  head(10)
nypoplong12$county<-gsub(" , New York","",nypoplong12$county)
nypoplong12
# A tibble: 10 × 3
   county       year population
   <chr>       <dbl>      <dbl>
 1 Kings        2012    2572282
 2 Queens       2012    2278024
 3 New York     2012    1625121
 4 Suffolk      2012    1499382
 5 Bronx        2012    1414774
 6 Nassau       2012    1350748
 7 Westchester  2012     961073
 8 Erie         2012     920792
 9 Monroe       2012     748947
10 Richmond     2012     470978

Filter hate crimes just for 2012 as well

counties12 <- counties |>
  filter(year == 2012) |>
  arrange(desc(sum)) 
counties12
# A tibble: 41 × 3
# Groups:   year [1]
    year county     sum
   <dbl> <chr>    <dbl>
 1  2012 Kings      124
 2  2012 Suffolk     80
 3  2012 New York    54
 4  2012 Nassau      48
 5  2012 Queens      38
 6  2012 Erie        20
 7  2012 Bronx       19
 8  2012 Richmond    16
 9  2012 Multiple    14
10  2012 Dutchess     9
# ℹ 31 more rows

Join the Hate Crimes data with NY population data for 2012

datajoin <- counties12 |>
  full_join(nypoplong12, by=c("county", "year"))
datajoin
# A tibble: 41 × 4
# Groups:   year [1]
    year county     sum population
   <dbl> <chr>    <dbl>      <dbl>
 1  2012 Kings      124    2572282
 2  2012 Suffolk     80    1499382
 3  2012 New York    54    1625121
 4  2012 Nassau      48    1350748
 5  2012 Queens      38    2278024
 6  2012 Erie        20     920792
 7  2012 Bronx       19    1414774
 8  2012 Richmond    16     470978
 9  2012 Multiple    14         NA
10  2012 Dutchess     9         NA
# ℹ 31 more rows
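
Because `nypoplong12` holds only the ten most populous counties, every other county (and the `Multiple` row) gets `NA` for `population` after the join. A sketch, on a toy pair of tables built from a few figures shown above, of using `anti_join()` to list the rows with no population match before computing rates:

```r
library(dplyr)

# A few rows copied from the 2012 tables above
toy_crimes <- tibble(
  year = 2012,
  county = c("Kings", "Multiple", "Dutchess"),
  sum = c(124, 14, 9)
)
toy_pop <- tibble(
  year = 2012,
  county = c("Kings", "Queens"),
  population = c(2572282, 2278024)
)

# Rows of toy_crimes that have no partner in toy_pop
unmatched <- toy_crimes |>
  anti_join(toy_pop, by = c("county", "year"))

unmatched$county
```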

Calculate the rate of incidents per 100,000 residents, then arrange in descending order.

datajoinrate <- datajoin |>
  mutate(rate = sum/population*100000) |>
  arrange(desc(rate))
datajoinrate
# A tibble: 41 × 5
# Groups:   year [1]
    year county        sum population  rate
   <dbl> <chr>       <dbl>      <dbl> <dbl>
 1  2012 Suffolk        80    1499382 5.34 
 2  2012 Kings         124    2572282 4.82 
 3  2012 Nassau         48    1350748 3.55 
 4  2012 Richmond       16     470978 3.40 
 5  2012 New York       54    1625121 3.32 
 6  2012 Erie           20     920792 2.17 
 7  2012 Queens         38    2278024 1.67 
 8  2012 Bronx          19    1414774 1.34 
 9  2012 Monroe          5     748947 0.668
10  2012 Westchester     6     961073 0.624
# ℹ 31 more rows
dt <- datajoinrate[,c("county","rate")]
dt
# A tibble: 41 × 2
   county       rate
   <chr>       <dbl>
 1 Suffolk     5.34 
 2 Kings       4.82 
 3 Nassau      3.55 
 4 Richmond    3.40 
 5 New York    3.32 
 6 Erie        2.17 
 7 Queens      1.67 
 8 Bronx       1.34 
 9 Monroe      0.668
10 Westchester 0.624
# ℹ 31 more rows
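
As a check on the arithmetic, the rate is incidents divided by population, scaled to 100,000 residents. Using the Suffolk and Kings figures from the table above:

```r
# 2012 incident counts and populations, copied from the joined table
incidents <- c(Suffolk = 80, Kings = 124)
population <- c(Suffolk = 1499382, Kings = 2572282)

# Incidents per 100,000 residents
rate <- incidents / population * 100000
round(rate, 2)
```

Kings has the most incidents, but Suffolk has the higher rate once population is taken into account.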

Aggregating some of the categories

aggregategroups <- hatecrimes |>
  pivot_longer(
    cols = 4:44,
    names_to = "victim_cat",
    values_to = "crimecount"
  )
unique(aggregategroups$victim_cat)
 [1] "anti-male"                                   
 [2] "anti-female"                                 
 [3] "anti-transgender"                            
 [4] "anti-gender_identity_expression"             
 [5] "anti-age*"                                   
 [6] "anti-white"                                  
 [7] "anti-black"                                  
 [8] "anti-american_indian/alaskan_native"         
 [9] "anti-asian"                                  
[10] "anti-native_hawaiian/pacific_islander"       
[11] "anti-multi-racial_groups"                    
[12] "anti-other_race"                             
[13] "anti-jewish"                                 
[14] "anti-catholic"                               
[15] "anti-protestant"                             
[16] "anti-islamic_(muslim)"                       
[17] "anti-multi-religious_groups"                 
[18] "anti-atheism/agnosticism"                    
[19] "anti-religious_practice_generally"           
[20] "anti-other_religion"                         
[21] "anti-buddhist"                               
[22] "anti-eastern_orthodox_(greek,_russian,_etc.)"
[23] "anti-hindu"                                  
[24] "anti-jehovahs_witness"                       
[25] "anti-mormon"                                 
[26] "anti-other_christian"                        
[27] "anti-sikh"                                   
[28] "anti-hispanic"                               
[29] "anti-arab"                                   
[30] "anti-other_ethnicity/national_origin"        
[31] "anti-non-hispanic*"                          
[32] "anti-gay_male"                               
[33] "anti-gay_female"                             
[34] "anti-gay_(male_and_female)"                  
[35] "anti-heterosexual"                           
[36] "anti-bisexual"                               
[37] "anti-physical_disability"                    
[38] "anti-mental_disability"                      
[39] "total_incidents"                             
[40] "total_victims"                               
[41] "total_offenders"                             
aggregategroups <- aggregategroups |>
  mutate(group = case_when(
    victim_cat %in% c("anti-transgender", "anti-gayfemale", "anti-gendervictim_catendityexpression", "anti-gaymale", "anti-gay(maleandfemale", "anti-bisexual") ~ "anti-lgbtq",
    victim_cat %in% c("anti-multi-racialgroups", "anti-jewish", "anti-protestant", "anti-multi-religousgroups", "anti-religiouspracticegenerally", "anti-buddhist", "anti-hindu", "anti-mormon", "anti-sikh", "anti-catholic", "anti-islamic(muslim)", "anti-atheism/agnosticism", "anti-otherreligion", "anti-easternorthodox(greek,russian,etc.)", "anti-jehovahswitness", "anti-otherchristian") ~ "anti-religion", 
    victim_cat %in% c("anti-asian", "anti-arab", "anti-non-hispanic", "anti-white", "anti-americanindian/alaskannative", "anti-nativehawaiian/pacificislander", "anti-otherrace", "anti-hispanic", "anti-otherethnicity/nationalorigin") ~ "anti-ethnicity",
    victim_cat %in% c("anti-physicaldisability", "anti-mentaldisability") ~ "anti-disability",
    victim_cat %in% c("anti-female", "anti-male") ~ "anti-gender",
    TRUE ~ "others"))
aggregategroups
# A tibble: 17,343 × 6
   county  year crime_type             victim_cat               crimecount group
   <chr>  <dbl> <chr>                  <chr>                         <dbl> <chr>
 1 Albany  2016 Crimes Against Persons anti-male                         0 anti…
 2 Albany  2016 Crimes Against Persons anti-female                       0 anti…
 3 Albany  2016 Crimes Against Persons anti-transgender                  0 anti…
 4 Albany  2016 Crimes Against Persons anti-gender_identity_ex…          0 othe…
 5 Albany  2016 Crimes Against Persons anti-age*                         0 othe…
 6 Albany  2016 Crimes Against Persons anti-white                        0 anti…
 7 Albany  2016 Crimes Against Persons anti-black                        1 othe…
 8 Albany  2016 Crimes Against Persons anti-american_indian/al…          0 othe…
 9 Albany  2016 Crimes Against Persons anti-asian                        0 anti…
10 Albany  2016 Crimes Against Persons anti-native_hawaiian/pa…          0 othe…
# ℹ 17,333 more rows
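
In the output above, a row such as `anti-gender_identity_expression` falls into `others` because the strings passed to `case_when()` omit the underscores that the cleaned column names contain. A sketch of the same idea with names copied from the `unique()` listing; only three of the groups are shown, on a made-up toy column:

```r
library(dplyr)

# Toy column of category names, spelled as in the unique() listing
toy <- tibble(victim_cat = c(
  "anti-gay_male",
  "anti-gender_identity_expression",
  "anti-female",
  "anti-physical_disability",
  "anti-age*"
))

toy_grouped <- toy |>
  mutate(group = case_when(
    victim_cat %in% c("anti-transgender", "anti-gay_female",
                      "anti-gender_identity_expression", "anti-gay_male",
                      "anti-gay_(male_and_female)", "anti-bisexual") ~ "anti-lgbtq",
    victim_cat %in% c("anti-physical_disability",
                      "anti-mental_disability") ~ "anti-disability",
    victim_cat %in% c("anti-female", "anti-male") ~ "anti-gender",
    TRUE ~ "others"  # anything not listed above
  ))

toy_grouped
```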

Or create a subset with just LGBTQ

lgbtq <- hatecrimes |>
  pivot_longer(
      cols = 4:44,
      names_to = "victim_cat",
      values_to = "crimecount") |>
filter(victim_cat %in% c("anti-transgender", "anti-gayfemale", "anti-gendervictim_catendityexpression", "anti-gaymale", "anti-gay(maleandfemale", "anti-bisexual"))
lgbtq
# A tibble: 846 × 5
   county    year crime_type             victim_cat       crimecount
   <chr>    <dbl> <chr>                  <chr>                 <dbl>
 1 Albany    2016 Crimes Against Persons anti-transgender          0
 2 Albany    2016 Crimes Against Persons anti-bisexual             0
 3 Albany    2016 Property Crimes        anti-transgender          0
 4 Albany    2016 Property Crimes        anti-bisexual             0
 5 Allegany  2016 Property Crimes        anti-transgender          0
 6 Allegany  2016 Property Crimes        anti-bisexual             0
 7 Bronx     2016 Crimes Against Persons anti-transgender          4
 8 Bronx     2016 Crimes Against Persons anti-bisexual             0
 9 Bronx     2016 Property Crimes        anti-transgender          0
10 Bronx     2016 Property Crimes        anti-bisexual             0
# ℹ 836 more rows
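
Only `anti-transgender` and `anti-bisexual` appear in the output above because the other four strings in the filter do not match any column name. One way to catch this is to compare the requested names against the names actually present. A sketch using the strings from the filter and a subset of the names from the `unique()` listing:

```r
# Names as typed in the filter above
wanted <- c("anti-transgender", "anti-gayfemale",
            "anti-gendervictim_catendityexpression", "anti-gaymale",
            "anti-gay(maleandfemale", "anti-bisexual")

# Matching names as they appear in the unique() listing
present <- c("anti-transgender", "anti-gender_identity_expression",
             "anti-gay_male", "anti-gay_female",
             "anti-gay_(male_and_female)", "anti-bisexual")

# Requested names that do not exist in the data
missing_names <- setdiff(wanted, present)
missing_names
```

An empty result would mean every requested category exists; here four names come back as unmatched.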

The data set on hate crimes in New York from 2010 to 2016 provides valuable information on hate crimes committed across different counties. One advantage is its wide array of categories, including anti-male, anti-female, anti-Black, and anti-Jewish. Another is that it covers many counties, allowing for a broader analysis rather than focusing only on the most densely populated areas.

However, this data set also has limitations. One significant issue is that the data may be skewed and may not reflect the actual number of hate crimes: victims may choose not to report incidents out of fear or distrust of authorities. Another source of bias is that the New York Police Department has the authority to decide what qualifies as a hate crime, which may lead to inconsistent classification.

One potential research path I would explore is the relationship between crime type and anti-female hate crimes. By plotting crime type against anti-female incidents, I could see whether specific types of crime are more commonly committed against women. Another path would be to graph the anti-transgender and anti-gay hate crime data to see whether the two are correlated and share any patterns.

After reviewing the hate crimes tutorial, I would take two follow-up actions. First, I would investigate the reasons behind the spike in anti-Jewish hate crimes; researching the historical and political context could reveal factors contributing to the increase. Second, I would examine whether any community interventions were in place when hate crime rates decreased, as this could provide insight into effective strategies for reducing hate crimes.