Rows: 423 Columns: 44
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): County, Crime Type
dbl (42): Year, Anti-Male, Anti-Female, Anti-Transgender, Anti-Gender Identi...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
county year crime_type anti-male
Length:423 Min. :2010 Length:423 Min. :0.000000
Class :character 1st Qu.:2011 Class :character 1st Qu.:0.000000
Mode :character Median :2013 Mode :character Median :0.000000
Mean :2013 Mean :0.007092
3rd Qu.:2015 3rd Qu.:0.000000
Max. :2016 Max. :1.000000
anti-female anti-transgender anti-gender_identity_expression
Min. :0.00000 Min. :0.00000 Min. :0.00000
1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.00000
Median :0.00000 Median :0.00000 Median :0.00000
Mean :0.01655 Mean :0.04728 Mean :0.05674
3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.00000
Max. :1.00000 Max. :5.00000 Max. :3.00000
anti-age* anti-white anti-black
Min. :0.00000 Min. : 0.0000 Min. : 0.000
1st Qu.:0.00000 1st Qu.: 0.0000 1st Qu.: 0.000
Median :0.00000 Median : 0.0000 Median : 1.000
Mean :0.05201 Mean : 0.3357 Mean : 1.761
3rd Qu.:0.00000 3rd Qu.: 0.0000 3rd Qu.: 2.000
Max. :9.00000 Max. :11.0000 Max. :18.000
anti-american_indian/alaskan_native anti-asian
Min. :0.000000 Min. :0.0000
1st Qu.:0.000000 1st Qu.:0.0000
Median :0.000000 Median :0.0000
Mean :0.007092 Mean :0.1773
3rd Qu.:0.000000 3rd Qu.:0.0000
Max. :1.000000 Max. :8.0000
anti-native_hawaiian/pacific_islander anti-multi-racial_groups anti-other_race
Min. :0 Min. :0.00000 Min. :0
1st Qu.:0 1st Qu.:0.00000 1st Qu.:0
Median :0 Median :0.00000 Median :0
Mean :0 Mean :0.08511 Mean :0
3rd Qu.:0 3rd Qu.:0.00000 3rd Qu.:0
Max. :0 Max. :3.00000 Max. :0
anti-jewish anti-catholic anti-protestant anti-islamic_(muslim)
Min. : 0.000 Min. : 0.0000 Min. :0.00000 Min. : 0.0000
1st Qu.: 0.000 1st Qu.: 0.0000 1st Qu.:0.00000 1st Qu.: 0.0000
Median : 0.000 Median : 0.0000 Median :0.00000 Median : 0.0000
Mean : 3.981 Mean : 0.2695 Mean :0.02364 Mean : 0.4704
3rd Qu.: 3.000 3rd Qu.: 0.0000 3rd Qu.:0.00000 3rd Qu.: 0.0000
Max. :82.000 Max. :12.0000 Max. :1.00000 Max. :10.0000
anti-multi-religious_groups anti-atheism/agnosticism
Min. : 0.00000 Min. :0
1st Qu.: 0.00000 1st Qu.:0
Median : 0.00000 Median :0
Mean : 0.07565 Mean :0
3rd Qu.: 0.00000 3rd Qu.:0
Max. :10.00000 Max. :0
anti-religious_practice_generally anti-other_religion anti-buddhist
Min. :0.000000 Min. :0.000 Min. :0
1st Qu.:0.000000 1st Qu.:0.000 1st Qu.:0
Median :0.000000 Median :0.000 Median :0
Mean :0.007092 Mean :0.104 Mean :0
3rd Qu.:0.000000 3rd Qu.:0.000 3rd Qu.:0
Max. :2.000000 Max. :4.000 Max. :0
anti-eastern_orthodox_(greek,_russian,_etc.) anti-hindu
Min. :0.000000 Min. :0.000000
1st Qu.:0.000000 1st Qu.:0.000000
Median :0.000000 Median :0.000000
Mean :0.002364 Mean :0.002364
3rd Qu.:0.000000 3rd Qu.:0.000000
Max. :1.000000 Max. :1.000000
anti-jehovahs_witness anti-mormon anti-other_christian anti-sikh
Min. :0 Min. :0 Min. :0.00000 Min. :0
1st Qu.:0 1st Qu.:0 1st Qu.:0.00000 1st Qu.:0
Median :0 Median :0 Median :0.00000 Median :0
Mean :0 Mean :0 Mean :0.01655 Mean :0
3rd Qu.:0 3rd Qu.:0 3rd Qu.:0.00000 3rd Qu.:0
Max. :0 Max. :0 Max. :3.00000 Max. :0
anti-hispanic anti-arab anti-other_ethnicity/national_origin
Min. : 0.0000 Min. :0.00000 Min. : 0.0000
1st Qu.: 0.0000 1st Qu.:0.00000 1st Qu.: 0.0000
Median : 0.0000 Median :0.00000 Median : 0.0000
Mean : 0.3735 Mean :0.06619 Mean : 0.2837
3rd Qu.: 0.0000 3rd Qu.:0.00000 3rd Qu.: 0.0000
Max. :17.0000 Max. :2.00000 Max. :19.0000
anti-non-hispanic* anti-gay_male anti-gay_female
Min. :0 Min. : 0.000 Min. :0.0000
1st Qu.:0 1st Qu.: 0.000 1st Qu.:0.0000
Median :0 Median : 0.000 Median :0.0000
Mean :0 Mean : 1.499 Mean :0.2411
3rd Qu.:0 3rd Qu.: 1.000 3rd Qu.:0.0000
Max. :0 Max. :36.000 Max. :8.0000
anti-gay_(male_and_female) anti-heterosexual anti-bisexual
Min. :0.0000 Min. :0.000000 Min. :0.000000
1st Qu.:0.0000 1st Qu.:0.000000 1st Qu.:0.000000
Median :0.0000 Median :0.000000 Median :0.000000
Mean :0.1017 Mean :0.002364 Mean :0.004728
3rd Qu.:0.0000 3rd Qu.:0.000000 3rd Qu.:0.000000
Max. :4.0000 Max. :1.000000 Max. :1.000000
anti-physical_disability anti-mental_disability total_incidents
Min. :0.00000 Min. :0.000000 Min. : 1.00
1st Qu.:0.00000 1st Qu.:0.000000 1st Qu.: 1.00
Median :0.00000 Median :0.000000 Median : 3.00
Mean :0.01182 Mean :0.009456 Mean : 10.09
3rd Qu.:0.00000 3rd Qu.:0.000000 3rd Qu.: 10.00
Max. :1.00000 Max. :1.000000 Max. :101.00
total_victims total_offenders
Min. : 1.00 Min. : 1.00
1st Qu.: 1.00 1st Qu.: 1.00
Median : 3.00 Median : 3.00
Mean : 10.48 Mean : 11.77
3rd Qu.: 10.00 3rd Qu.: 11.00
Max. :106.00 Max. :113.00
county year anti-black anti-transgender
Length:423 Min. :2010 Min. : 0.000 Min. :0.00000
Class :character 1st Qu.:2011 1st Qu.: 0.000 1st Qu.:0.00000
Mode :character Median :2013 Median : 1.000 Median :0.00000
Mean :2013 Mean : 1.761 Mean :0.04728
3rd Qu.:2015 3rd Qu.: 2.000 3rd Qu.:0.00000
Max. :2016 Max. :18.000 Max. :5.00000
anti-jewish anti-bisexual anti-asian anti-catholic
Min. : 0.000 Min. :0.000000 Min. :0.0000 Min. : 0.0000
1st Qu.: 0.000 1st Qu.:0.000000 1st Qu.:0.0000 1st Qu.: 0.0000
Median : 0.000 Median :0.000000 Median :0.0000 Median : 0.0000
Mean : 3.981 Mean :0.004728 Mean :0.1773 Mean : 0.2695
3rd Qu.: 3.000 3rd Qu.:0.000000 3rd Qu.:0.0000 3rd Qu.: 0.0000
Max. :82.000 Max. :1.000000 Max. :8.0000 Max. :12.0000
anti-female anti-male anti-white
Min. :0.00000 Min. :0.000000 Min. : 0.0000
1st Qu.:0.00000 1st Qu.:0.000000 1st Qu.: 0.0000
Median :0.00000 Median :0.000000 Median : 0.0000
Mean :0.01655 Mean :0.007092 Mean : 0.3357
3rd Qu.:0.00000 3rd Qu.:0.000000 3rd Qu.: 0.0000
Max. :1.00000 Max. :1.000000 Max. :11.0000
plot2 <- hatenew |>
  ggplot() +
  geom_bar(aes(x = year, y = crimecount, fill = victim_cat),
           position = "dodge", stat = "identity") +
  labs(fill = "Hate Crime Type",
       y = "Number of Hate Crime Incidents",
       title = "Hate Crime Type in NY Counties Between 2010-2016",
       caption = "Source: NY State Division of Criminal Justice Services")
plot2
#let’s evaluate the counties
plot3 <- hatenew |>
  ggplot() +
  geom_bar(aes(x = county, y = crimecount, fill = victim_cat),
           position = "dodge", stat = "identity") +
  labs(fill = "Hate Crime Type",
       y = "Number of Hate Crime Incidents",
       title = "Hate Crime Type in NY Counties Between 2010-2016",
       caption = "Source: NY State Division of Criminal Justice Services")
plot3
#let’s reduce the number of counties
county <- hatenew |>
  group_by(year, county) |>
  summarize(sum = sum(crimecount)) |>
  arrange(desc(sum))
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
# A tibble: 5 × 2
county sum
<chr> <dbl>
1 Kings 595
2 Suffolk 342
3 Nassau 289
4 New York 278
5 Queens 187
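The five-row table above holds one total per county, while the code shown earlier groups by both year and county, so the step that produced it is not shown. A minimal sketch of one way to get such totals, assuming a long table with `county`, `year`, and `crimecount` columns like `hatenew`. The names `toy` and `top_counties` and all counts below are invented for illustration only.

```r
library(dplyr)

# Toy long-format data; county labels and counts are made up, not taken
# from the hate crimes dataset.
toy <- tibble(
  county     = c("A", "A", "B", "B", "C", "C"),
  year       = c(2010, 2011, 2010, 2011, 2010, 2011),
  crimecount = c(5, 7, 1, 2, 10, 0)
)

# Sum incidents over all years for each county, then keep the largest totals.
top_counties <- toy |>
  group_by(county) |>
  summarize(sum = sum(crimecount), .groups = "drop") |>
  slice_max(sum, n = 2)

top_counties
```

With the real data, `n = 5` would keep five counties, as in the table above.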
#let’s create a barplot for the top 5 counties
plot4 <- hatenew |>
  filter(county %in% c("Kings", "Suffolk", "Nassau", "New York", "Queens")) |>
  ggplot() +
  geom_bar(aes(x = county, y = crimecount, fill = victim_cat),
           position = "dodge", stat = "identity") +
  labs(y = "Number of Hate Crime Incidents",
       title = "5 Counties in NY with Highest Incidents of Hate Crimes",
       subtitle = "Between 2010-2016",
       fill = "Hate Crime Type",
       caption = "Source: NY State Division of Criminal Justice Services")
plot4
Rows: 62 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Geography
dbl (7): 2010, 2011, 2012, 2013, 2014, 2015, 2016
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#Rename the variable “Geography” to “county” so that it matches the other dataset.
# A tibble: 6 × 3
county year population
<chr> <dbl> <dbl>
1 Albany , New York 2010 304078
2 Allegany , New York 2010 48949
3 Bronx , New York 2010 1388240
4 Broome , New York 2010 200469
5 Cattaraugus , New York 2010 80249
6 Cayuga , New York 2010 79844
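The code for this renaming and reshaping step is not shown. A minimal sketch of how a wide table (one column per year) could be turned into the long `county` / `year` / `population` layout printed above, assuming the wide table is called `nypop` (a hypothetical name). The 2010 populations are copied from the output above; the 2011 values are left as `NA` placeholders rather than guessed.

```r
library(dplyr)
library(tidyr)

# Toy wide table in the shape reported by the CSV reader:
# one row per county, one column per year. `nypop` is an assumed name.
nypop <- tibble(
  Geography = c("Albany , New York", "Allegany , New York"),
  `2010`    = c(304078, 48949),
  `2011`    = c(NA_real_, NA_real_)   # placeholders, not real figures
)

nypoplong <- nypop |>
  rename(county = Geography) |>               # match the hate crimes table
  pivot_longer(cols = -county,
               names_to = "year",
               values_to = "population") |>
  mutate(year = as.numeric(year)) |>          # year columns arrive as text
  arrange(year, county)

nypoplong
```

Converting `year` to a number matters later, because a join on `year` needs the same type in both tables.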
#Let’s focus on 2014
nypoplong12 <- nypoplong |>
  filter(year == 2014) |>
  arrange(desc(population)) |>
  head(10)
nypoplong12$county <- gsub(" , New York", "", nypoplong12$county)
nypoplong12
# A tibble: 10 × 3
county year population
<chr> <dbl> <dbl>
1 Kings 2014 2612544
2 Queens 2014 2314149
3 New York 2014 1634468
4 Suffolk 2014 1500008
5 Bronx 2014 1437687
6 Nassau 2014 1357799
7 Westchester 2014 970255
8 Erie 2014 923702
9 Monroe 2014 750089
10 Richmond 2014 473142
#According to this result, Kings County had the largest population in New York State in 2014. Even though it records a lot of hate crimes, people still come to live in that area.
#let’s filter hate crimes to 2014 only.
county14 <- county |>
  filter(year == 2014) |>
  arrange(desc(sum))
county14
# A tibble: 36 × 3
# Groups: year [1]
year county sum
<dbl> <chr> <dbl>
1 2014 Kings 75
2 2014 Suffolk 66
3 2014 Nassau 36
4 2014 New York 29
5 2014 Queens 24
6 2014 Richmond 24
7 2014 Multiple 17
8 2014 Erie 13
9 2014 Bronx 12
10 2014 Orange 7
# ℹ 26 more rows
#let’s join the NY population and the 2014 hate crimes together
# A tibble: 36 × 5
# Groups: year [1]
year county sum population rate
<dbl> <chr> <dbl> <dbl> <dbl>
1 2014 Richmond 24 473142 5.07
2 2014 Suffolk 66 1500008 4.40
3 2014 Kings 75 2612544 2.87
4 2014 Nassau 36 1357799 2.65
5 2014 New York 29 1634468 1.77
6 2014 Erie 13 923702 1.41
7 2014 Queens 24 2314149 1.04
8 2014 Bronx 12 1437687 0.835
9 2014 Westchester 7 970255 0.721
10 2014 Monroe 0 750089 0
# ℹ 26 more rows
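The join itself is not shown. The printed rates are consistent with incidents per 100,000 residents (for Richmond, 24 / 473142 × 100000 ≈ 5.07). A minimal sketch under that assumption, reusing the object names `county14`, `nypoplong12`, and `datajoinrate` from this document and a few rows copied from the 2014 tables above. The choice of `left_join()` is an assumption; it keeps counties that have no population match, such as "Multiple".

```r
library(dplyr)

# A few rows copied from the 2014 outputs above.
county14 <- tibble(
  year   = 2014,
  county = c("Kings", "Suffolk", "Richmond", "Multiple"),
  sum    = c(75, 66, 24, 17)
)
nypoplong12 <- tibble(
  county     = c("Kings", "Suffolk", "Richmond"),
  year       = 2014,
  population = c(2612544, 1500008, 473142)
)

# Keep every hate crime row, attach population where the county matches,
# and express incidents per 100,000 residents.
datajoinrate <- county14 |>
  left_join(nypoplong12, by = c("county", "year")) |>
  mutate(rate = sum / population * 100000) |>
  arrange(desc(rate))   # rows with a missing rate sort to the end

datajoinrate
```

Rows without a population, such as "Multiple", get an `NA` rate and fall to the bottom of the table.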
#let’s see the highest rate in 2014
dt <- datajoinrate[, c("county", "rate")]
dt
# A tibble: 36 × 2
county rate
<chr> <dbl>
1 Richmond 5.07
2 Suffolk 4.40
3 Kings 2.87
4 Nassau 2.65
5 New York 1.77
6 Erie 1.41
7 Queens 1.04
8 Bronx 0.835
9 Westchester 0.721
10 Monroe 0
# ℹ 26 more rows
#Even though Kings had more hate crimes, the highest rate in 2014 (incidents per 100,000 residents) was in Richmond.
#essay
#This dataset is valuable because it brings together several years (2010 to 2016), which gives us more to learn from. It helps us pinpoint the most populous counties in New York, which is useful information. As a data scientist, I can use this dataset to understand how hate crimes are affecting our community.
#However, like every dataset, it has some negative aspects. It covers so many counties that they cannot all be studied at once, so I had to choose only a few to work on. I decided to evaluate the most populous ones.
#Hypothetically, I would like to explore the counties that do not have a lot of crime, and also to study a single county in detail. For example, it would be worthwhile to look at hate crimes in Kings County alone, to see which types of hate crime occur there.
#After seeing the output from the hate crimes analysis, I want to know whether the 2023 hate crimes data looks the same or whether there has been a change. I want to know whether New York today is more dangerous than it was in 2010. Finally, the second thing I would like to follow up on is whether the percentage of anti-Jewish incidents went down between 2016 and 2023.