Commissioner William Bratton and his executive staff could take satisfaction in the progress made by the New York Police Department (NYPD) toward the goals they had set at the outset of 1994 to reduce major crimes in the City. Their efforts had produced better results than even some of them had expected, better even than those portrayed in the popular television drama carrying the Department’s name. Bratton changed the way policing was handled in America: the CompStat system he introduced used data more extensively than before and gave the NYPD new ways to fight crime. Policing also shifted away from a reactive protocol and toward prevention. The purpose of this analysis is to compare the decrease in crime during Bratton's first term with crime statistics from his second term.
The data comes from Kaggle and covers crimes reported in all five boroughs of New York City in 2014 and 2015. Data from his first term is not widely available, so I converted a PDF file showing the overall crime numbers for the years in question. With this in mind, the analysis focuses primarily on 2014 and 2015, the periods for which detailed data is available.
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
library(knitr)
library(leaflet)
library(ggplot2)
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(jsonlite)
JSON_URL <- "https://raw.githubusercontent.com/Fyoun123/Data607/master/Final%20Project/RAW/NYPD.json" #File was converted from PDF.
json_df <- as.data.frame(fromJSON(JSON_URL))
colnames(json_df) = json_df[1, ] # The first row will be the header
json_df = json_df[-1, ] # Removing the first row.
json_df # Test
## City 1993 1995 + - %
## 2 New York 600,346 444,758 -25.9%
## 3 Los Angeles 312,790 266,204 -14.9%
## 4 Chicago * * *
## 5 Houston 141,179 131,602 -6.8%
## 6 Philadelphia 97,659 108,278 +10.9%
## 7 San Diego 85,227 64,235 -24.6%
## 8 Phoenix 96,476 118,126 +22.4%
## 9 Dallas 110,803 98,624 -11.0%
## 10 Detroit 122,329 119,065 -2.7%
## 11 San Antonio 97,671 79,931 -18.2%
## 12 San Jose 36,743 36,096 -1.8%
## 13 Indianapolis 33,530 30,775 -8.2%
## 14 Las Vegas 48,367 60,178 +24.4%
## 15 San Francisco 67,345 60,474 -10.2%
## 16 Baltimore 91,920 94,855 +3.2%
## 17 Jacksonville 67,494 61,129 -9.4%
## 18 Columbus 58,604 58,715 +1.9%
## 19 Milwaukee 50,435 52,679 +4.4%
## 20 Memphis 62,150 65,597 +5.5%
## 21 Washington, D.C. 66,758 67,402 +1.0%
## 22 El Paso 46,738 41,692 -10.8%
## 23 Boston 55,555 52,278 -6.0%
## 24 Seattle 62,679 55,507 -11.4%
## 25 Nashville 55,500 56,090 +1.0%
## 26 Austin 51,468 42,586 -17.6%
## 27 Denver 39,796 34,769 -12.6%
## 28 Cleveland 40,006 38,665 -3.4%
## 29 New Orleans 52,773 53,399 +1.2%
## 30 Fort Worth 49,801 39,667 -20.3%
NY_Crime <- filter(json_df,json_df$City=='New York')
U_P <- gather(NY_Crime, year, value, `1993`:`1995`)
U_P$`+ - %`<- NULL
U_P$City <- NULL
U_P$value <- as.numeric(gsub(",","",U_P$value))
U_P
## year value
## 1 1993 600346
## 2 1995 444758
# Basic barplot
ggplot(data=U_P, aes(x=year, y=value)) +
geom_bar(stat="identity") + scale_y_continuous(name="Crime", labels = scales::comma)
According to the sources used, crime decreased by 25.9% during William Bratton's first term as commissioner. This figure would become highly controversial later on, as people began to argue that the numbers had been manipulated to overstate the impact Bratton had.
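As a quick check on the 25.9% figure, the change can be recomputed from the two New York totals in the table above. This is a minimal sketch in base R; the helper name `pct_change` is introduced here for illustration and is not part of the original analysis.

```r
# Hypothetical helper (not from the original analysis):
# percent change going from `old` to `new`.
pct_change <- function(old, new) {
  (new - old) / old * 100
}

# New York totals as printed in the table above.
crime_1993 <- 600346
crime_1995 <- 444758

change_93_95 <- pct_change(crime_1993, crime_1995)
round(change_93_95, 1)
## [1] -25.9
```

The recomputed value agrees with the `+ - %` column of the source table, so the 25.9% decline follows directly from the two reported totals.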
(read.csv("Crime_Column_Description.csv"))
## Column
## 1 CMPLNT_NUM
## 2 CMPLNT_FR_DT
## 3 CMPLNT_FR_TM
## 4 CMPLNT_TO_DT
## 5 CMPLNT_TO_TM
## 6 RPT_DT
## 7 KY_CD
## 8 OFNS_DESC
## 9 PD_CD
## 10 PD_DESC
## 11 CRM_ATPT_CPTD_CD
## 12 LAW_CAT_CD
## 13 JURIS_DESC
## 14 BORO_NM
## 15 ADDR_PCT_CD
## 16 LOC_OF_OCCUR_DESC
## 17 PREM_TYP_DESC
## 18 PARKS_NM
## 19 HADEVELOPT
## 20 X_COORD_CD
## 21 Y_COORD_CD
## 22 Latitude
## 23 Longitude
## Description
## 1 Randomly generated persistent ID for each complaint
## 2 Exact date of occurrence for the reported event (or starting date of occurrence, if CMPLNT_TO_DT exists)
## 3 Exact time of occurrence for the reported event (or starting time of occurrence, if CMPLNT_TO_TM exists)
## 4 Ending date of occurrence for the reported event, if exact time of occurrence is unknown
## 5 Ending time of occurrence for the reported event, if exact time of occurrence is unknown
## 6 Date event was reported to police
## 7 Three digit offense classification code
## 8 Description of offense corresponding with key code
## 9 Three digit internal classification code (more granular than Key Code)
## 10 Description of internal classification corresponding with PD code (more granular than Offense Description)
## 11 Indicator of whether crime was successfully completed or attempted, but failed or was interrupted prematurely
## 12 Level of offense: felony, misdemeanor, violation
## 13 Jurisdiction responsible for incident. Either internal, like Police, Transit, and Housing; or external, like Correction, Port Authority, etc.
## 14 The name of the borough in which the incident occurred
## 15 The precinct in which the incident occurred
## 16 Specific location of occurrence in or around the premises; inside, opposite of, front of, rear of
## 17 Specific description of premises; grocery store, residence, street, etc.
## 18 Name of NYC park, playground or greenspace of occurrence, if applicable (state parks are not included)
## 19 Name of NYCHA housing development of occurrence, if applicable
## 20 X-coordinate for New York State Plane Coordinate System, Long Island Zone, NAD 83, units feet (FIPS 3104)
## 21 Y-coordinate for New York State Plane Coordinate System, Long Island Zone, NAD 83, units feet (FIPS 3104)
## 22 Latitude coordinate for Global Coordinate System, WGS 1984, decimal degrees (EPSG 4326)
## 23 Longitude coordinate for Global Coordinate System, WGS 1984, decimal degrees (EPSG 4326)
Import the CSV and rename the columns so they are more readily understandable.
NYPD <- read.csv("NYPD_Complaint_Data_Historic.csv")
colnames(NYPD) <- c("crime_id","occurance_date","occurance_time","ending_date","ending_time","reported_date","offense_classification_code","offense_classification_description","internal_classification_code","internal_classification_description","crime_status","level_of_offense","type_of_jurisdiction","borough","precienct","specific_location","type_of_location","park_name","housing_name","x_coordinate","y_coordinate","latitude","longitude","location")
NYPD <- NYPD[,-c(1,7,9,19,18,20,21,24)] # Drop crime_id, both classification codes, park_name, housing_name, x/y coordinates, and location
head(NYPD)
## occurance_date occurance_time ending_date ending_time reported_date
## 1 12/31/2015 23:45:00 12/31/2015
## 2 12/31/2015 23:36:00 12/31/2015
## 3 12/31/2015 23:30:00 12/31/2015
## 4 12/31/2015 23:30:00 12/31/2015
## 5 12/31/2015 23:25:00 12/31/2015 23:30:00 12/31/2015
## 6 12/31/2015 23:18:00 12/31/2015 23:25:00 12/31/2015
## offense_classification_description internal_classification_description
## 1 FORGERY FORGERY,ETC.,UNCLASSIFIED-FELO
## 2 MURDER & NON-NEGL. MANSLAUGHTER
## 3 DANGEROUS DRUGS CONTROLLED SUBSTANCE,INTENT TO
## 4 ASSAULT 3 & RELATED OFFENSES ASSAULT 3
## 5 ASSAULT 3 & RELATED OFFENSES ASSAULT 3
## 6 FELONY ASSAULT ASSAULT 2,1,UNCLASSIFIED
## crime_status level_of_offense type_of_jurisdiction borough precienct
## 1 COMPLETED FELONY N.Y. POLICE DEPT BRONX 44
## 2 COMPLETED FELONY N.Y. POLICE DEPT QUEENS 103
## 3 COMPLETED FELONY N.Y. POLICE DEPT MANHATTAN 28
## 4 COMPLETED MISDEMEANOR N.Y. POLICE DEPT QUEENS 105
## 5 COMPLETED MISDEMEANOR N.Y. POLICE DEPT MANHATTAN 13
## 6 ATTEMPTED FELONY N.Y. POLICE DEPT BROOKLYN 71
## specific_location type_of_location latitude longitude
## 1 INSIDE BAR/NIGHT CLUB 40.82885 -73.91666
## 2 OUTSIDE 40.69734 -73.78456
## 3 OTHER 40.80261 -73.94505
## 4 INSIDE RESIDENCE-HOUSE 40.65455 -73.72634
## 5 FRONT OF OTHER 40.73800 -73.98789
## 6 FRONT OF DRUG STORE 40.66502 -73.95711
We take occurance_date and derive more time-related columns from it (year, month, day, weekday, etc.), then filter the data to 2014 and 2015 to get rid of lagging rows from other years.
NYPD <- NYPD %>%
mutate(occurance_year = year(mdy(occurance_date)),
occurance_month = month(mdy(occurance_date)),
occurance_day = day(mdy(occurance_date)),
occurance_weekdays = weekdays(mdy(occurance_date)),
diff_reported.occurance = difftime(mdy(reported_date),mdy(occurance_date),units = "day"),
occurance_date_time = as.POSIXct(paste(occurance_date,occurance_time),format = "%m/%d/%Y %H:%M:%S"),
ending_date_time = as.POSIXct(paste(ending_date,ending_time),format = "%m/%d/%Y %H:%M:%S"),
diff_ending.occurance = round(difftime(ending_date_time,occurance_date_time,units = "hours"),digits = 2),
weekends = ifelse(occurance_weekdays %in% c("Sunday","Saturday"),"Yes","No")
) %>%
filter(occurance_year == "2014" | occurance_year == "2015")
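To make the derived columns easier to inspect, the same steps can be tried on two rows copied from the `head(NYPD)` output above (rows 1 and 5). This is a sketch under stated assumptions, not the pipeline used in the analysis: it uses base R instead of lubridate, parses the date once and reuses it, assumes the blank ending fields are empty strings, and introduces the names `toy` and `iso_weekday` purely for illustration.

```r
# Two rows copied from the head(NYPD) output (rows 1 and 5).
# Assumption: blank ending fields are stored as empty strings.
toy <- data.frame(
  occurance_date = c("12/31/2015", "12/31/2015"),
  occurance_time = c("23:45:00", "23:25:00"),
  ending_date    = c("", "12/31/2015"),
  ending_time    = c("", "23:30:00"),
  stringsAsFactors = FALSE
)

# Parse the occurrence date once and reuse the result.
occ_date <- as.Date(toy$occurance_date, format = "%m/%d/%Y")
toy$occurance_year  <- as.integer(format(occ_date, "%Y"))
toy$occurance_month <- as.integer(format(occ_date, "%m"))
toy$occurance_day   <- as.integer(format(occ_date, "%d"))

# ISO weekday number (1 = Monday ... 7 = Sunday); unlike weekdays(),
# this does not depend on the locale's day names.
toy$iso_weekday <- as.integer(format(occ_date, "%u"))
toy$weekends    <- ifelse(toy$iso_weekday >= 6, "Yes", "No")

# Combine date and time; rows without an ending timestamp become NA.
fmt <- "%m/%d/%Y %H:%M:%S"
start_time <- as.POSIXct(paste(toy$occurance_date, toy$occurance_time),
                         format = fmt, tz = "UTC")
end_time   <- as.POSIXct(paste(toy$ending_date, toy$ending_time),
                         format = fmt, tz = "UTC")
toy$diff_ending.occurance <- round(
  difftime(end_time, start_time, units = "hours"), digits = 2)

toy
```

Rows with no ending date and time produce `NA` in `diff_ending.occurance`, which is worth keeping in mind before averaging that column.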
NYPD1 <- NYPD %>%
group_by(occurance_year) %>%
summarise(total_crime = length(occurance_date))
NYPD1$total_crime <- as.numeric(NYPD1$total_crime)
NYPD1$occurance_year <- as.character(NYPD1$occurance_year)
ggplot(data=NYPD1, aes(x=occurance_year, y=total_crime)) +
geom_bar(stat="identity") + scale_y_continuous(name="Crime", labels = scales::comma)
H1 <- ggplot(data=NYPD1, aes(x=occurance_year, y=total_crime)) +
geom_bar(stat="identity") + scale_y_continuous(name="Crime", labels = scales::comma)
H2 <- ggplot(data=U_P, aes(x=year, y=value)) +
geom_bar(stat="identity") + scale_y_continuous(name="Crime", labels = scales::comma)
grid.arrange(H1, H2, ncol = 2)
NYPD1
## # A tibble: 2 x 2
## occurance_year total_crime
## <chr> <dbl>
## 1 2014 490363
## 2 2015 468576
Compared with 1993, crime in NYC has declined over the past two decades, though not by a staggering amount: the 2015 total is about 22% below the 1993 figure. Total crime was reported at 490,363 in 2014 and came down to 468,576 in 2015, a decrease of about 4.4%. However, the 2015 total is about 5.4% higher than the reported 1995 total of 444,758. This might be due to a decrease in “creative” reporting of CompStat data in the past decade.
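The comparisons in this paragraph can be reproduced from the four yearly totals reported so far. The sketch below uses base R only, and the object names are illustrative. It also assumes that the 1993 and 1995 totals from the converted PDF and the 2014 and 2015 complaint counts from the Kaggle file measure comparable things, which the sources do not confirm.

```r
# Yearly totals as reported above: 1993 and 1995 from the converted PDF,
# 2014 and 2015 from the complaint data.
totals <- c("1993" = 600346, "1995" = 444758,
            "2014" = 490363, "2015" = 468576)

# Percent change between two named years in `totals`.
change_between <- function(from, to) {
  unname((totals[to] - totals[from]) / totals[from] * 100)
}

comparison <- data.frame(
  from = c("2014", "1995", "1993"),
  to   = c("2015", "2015", "2015"),
  stringsAsFactors = FALSE
)
comparison$pct_change <- round(
  mapply(change_between, comparison$from, comparison$to), 1)

comparison
```

A negative `pct_change` means fewer reported crimes in the later year. Only the 1995 to 2015 comparison comes out positive.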
NYPD2014 <- filter(NYPD,occurance_year == "2014")
NYPD2015 <- filter(NYPD,occurance_year == "2015")
T20141 <- NYPD2014 %>%
ggplot(aes(x = as.factor(occurance_month))) +
geom_bar() +
ggtitle("Monthly New York Crime - 2014") + xlab("Month") + ylab("number of crime")
T20151 <- NYPD2015 %>%
ggplot(aes(x = as.factor(occurance_month))) +
geom_bar() +
ggtitle("Monthly New York Crime - 2015") + xlab("Month") + ylab("number of crime")
grid.arrange(T20141, T20151, ncol = 2)
(NYPD %>%
ggplot(aes(x = as.factor(occurance_month), fill = as.factor(occurance_year))) +
geom_bar(position = "fill") +
xlab("Month") + ylab("proportion") +
scale_fill_discrete(guide = guide_legend(title = "year")))
Here crime appears to rise as the weather becomes warmer and to slow down in the colder winter months. There is a marked decline in crime in January, possibly due to ramped-up efforts around the New Year.
NYPD20142 <- NYPD2014 %>%
ggplot(aes(x = borough, fill = level_of_offense)) +
geom_bar(position = "dodge") + ggtitle("Borough - 2014")
NYPD20152 <- NYPD2015 %>%
ggplot(aes(x = borough, fill = level_of_offense)) +
geom_bar(position = "dodge") + ggtitle("Borough - 2015")
grid.arrange(NYPD20142, NYPD20152, ncol = 1)
(NYPDB1 <- NYPD2014 %>%
group_by(borough) %>%
summarise(total_crime = length(occurance_date)))
## # A tibble: 5 x 2
## borough total_crime
## <fct> <int>
## 1 BRONX 106016
## 2 BROOKLYN 148453
## 3 MANHATTAN 113096
## 4 QUEENS 100087
## 5 STATEN ISLAND 22711
(NYPDB2 <- NYPD2015 %>%
group_by(borough) %>%
summarise(total_crime = length(occurance_date)))
## # A tibble: 5 x 2
## borough total_crime
## <fct> <int>
## 1 BRONX 102950
## 2 BROOKLYN 140351
## 3 MANHATTAN 110580
## 4 QUEENS 92981
## 5 STATEN ISLAND 21714
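The two borough tables above can be put side by side to see how much each borough changed from 2014 to 2015. This is a minimal base R sketch that retypes the printed counts; the object names (`b2014`, `b2015`, `borough_change`) are illustrative and not part of the original analysis.

```r
boroughs <- c("BRONX", "BROOKLYN", "MANHATTAN", "QUEENS", "STATEN ISLAND")

# Counts copied from the two tibbles printed above.
b2014 <- data.frame(borough = boroughs,
                    total_2014 = c(106016, 148453, 113096, 100087, 22711),
                    stringsAsFactors = FALSE)
b2015 <- data.frame(borough = boroughs,
                    total_2015 = c(102950, 140351, 110580, 92981, 21714),
                    stringsAsFactors = FALSE)

# Join on borough and compute the year-over-year percent change.
borough_change <- merge(b2014, b2015, by = "borough")
borough_change$pct_change <- round(
  (borough_change$total_2015 - borough_change$total_2014) /
    borough_change$total_2014 * 100, 1)

# Largest decline first.
borough_change[order(borough_change$pct_change), ]
```

Every borough shows a negative change, and the borough totals add up to the citywide figures of 490,363 and 468,576 reported earlier.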
borough.crime <- NYPD %>%
group_by(borough) %>%
summarise(n_borough = n())
borough.crime2014 <- NYPD2014 %>%
group_by(borough) %>%
summarise(n_borough = n())
borough.crime2015 <- NYPD2015 %>%
group_by(borough) %>%
summarise(n_borough = n())
borough.crime2014 %>%
ggplot(aes(x = borough, y = n_borough)) +
geom_bar(stat = "identity") +
ggtitle("Number of crime in different borough - 2014") +
xlab("Borough") + ylab("number of crime") +
geom_text(aes(label = ..y..), vjust = -.5)
borough.crime2015 %>%
ggplot(aes(x = borough, y = n_borough)) +
geom_bar(stat = "identity") +
ggtitle("Number of crime in different borough - 2015") +
xlab("Borough") + ylab("number of crime") +
geom_text(aes(label = ..y..), vjust = -.5)
T1A<- NYPD2014 %>%
group_by(offense_classification_description) %>%
summarise(n_class = n()) %>%
arrange(desc(n_class)) %>%
head(n = 5) %>%
ggplot(aes(x = reorder(offense_classification_description, -n_class), y = n_class)) +
geom_bar(stat = "identity") +
ggtitle("Top 5 crime type in New York - 2014") +
xlab("offense classification") + ylab("number of crime") +
geom_text(aes(label = ..y..), vjust = 0.5) +
coord_flip()
T1B<- NYPD2015 %>%
group_by(offense_classification_description) %>%
summarise(n_class = n()) %>%
arrange(desc(n_class)) %>%
head(n = 5) %>%
ggplot(aes(x = reorder(offense_classification_description, -n_class), y = n_class)) +
geom_bar(stat = "identity") +
ggtitle("Top 5 crime type in New York - 2015") +
xlab("offense classification") + ylab("number of crime") +
geom_text(aes(label = ..y..), vjust = 0.5) +
coord_flip()
grid.arrange(T1A, T1B, ncol = 1)
NYPD2014DRUGS <- filter(NYPD2014,offense_classification_description == "DANGEROUS DRUGS")
DRUG.distribution2014 <- sample_n(NYPD2014DRUGS, 10e3)
leaflet(data = DRUG.distribution2014) %>%
addProviderTiles("Stamen.TonerLite",
group = "Toner",
options = providerTileOptions(minZoom = 1, maxZoom = 100)) %>%
addCircleMarkers(~ longitude, ~latitude, radius = 0.0001, color = "orange", fillOpacity = .00001)
## Warning in validateCoords(lng, lat, funcName): Data contains 214 rows with
## either missing or invalid lat/lon values and will be ignored
NYPD2015DRUGS <- filter(NYPD2015,offense_classification_description == "DANGEROUS DRUGS")
DRUG.distribution2015 <- sample_n(NYPD2015DRUGS, 10e3)
leaflet(data = DRUG.distribution2015) %>%
addProviderTiles("Stamen.TonerLite",
group = "Toner",
options = providerTileOptions(minZoom = 1, maxZoom = 100)) %>%
addCircleMarkers(~ longitude, ~latitude, radius = 0.0001, color = "red", fillOpacity = .00001)
## Warning in validateCoords(lng, lat, funcName): Data contains 40 rows with
## either missing or invalid lat/lon values and will be ignored
include_graphics("S1.png")
The shaded portion of the image represents the areas targeted in the war against drugs as of 1993. Compared with the 2014 and 2015 maps, those areas still show residual impacts of the past: they remain drug hotspots.
William Bratton has had a major impact on the way crime is fought today. He led a data revolution that gave insights into fighting crime that were not possible in the past. Although there may have been some creativity at play in the large impact reported for his first tenure as NYC commissioner, his second term was still impactful, with crime decreasing across the boroughs.