
1.1 Problem Statement

According to a report by the San Diego Association of Governments (SANDAG), the San Diego region saw a 3 percent increase in domestic violence in the first half of 2020 over the same period in 2019. SANDAG’s data shows more notable increases in certain local communities: Santee (18 percent), El Cajon (18 percent), and National City (74 percent).

News source: https://www.justice.gov/usao-sdca/pr/us-attorneys-across-california-join-district-attorneys-help-victims-domestic-violence

1.2 Focus of Analysis

National City is selected for this research analysis.
Domestic violence is the general category of this research.

1.3 Data source:

Crime victims in San Diego, from 2016 through July 2020, with UCR codes for the crime and the age, race, and sex of the victim. The data below is a complete data set for victims of violent crime from ARJIS (Automated Regional Justice Information System) for San Diego County, the nation’s fifth-largest county, from 2016 through 2020: https://data.sandiegodata.org/dataset/arjis-org-crime-victims-pra/

1.4 Caution:

The general category classification “SEXUAL ASSAULT” is subdivided into 6 different categories.

1.5 Caution:

Each incident could have multiple reports: one for the suspect and one for each victim.
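
Because one incident can appear on several rows, counting rows overstates the number of incidents. The following is a minimal sketch of how that can be handled, assuming that rows sharing the same `ID` value belong to one incident. The `reports` data frame is hypothetical toy data, not taken from the ARJIS file, and base R is used so the snippet runs without extra packages.

```r
# Hypothetical toy data: three incidents spread over six person-level rows
reports <- data.frame(
  ID = c(101, 101, 102, 103, 103, 103),
  personrole = c("SUSPECT", "VICTIM", "SUSPECT", "SUSPECT", "VICTIM", "VICTIM"),
  stringsAsFactors = FALSE
)

# Counting rows counts people, not incidents
n_rows <- nrow(reports)

# Keep only the first row for each ID so every incident is counted once
incidents <- reports[!duplicated(reports$ID), ]
n_incidents <- nrow(incidents)

# Rows per role, to see how the reports split between suspects and victims
role_counts <- table(reports$personrole)
```

Filtering on a single `personrole` value before counting, as the later sections do, is another way to avoid mixing suspect and victim rows.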

#GPS coordinates of National City, California, United States. Latitude: 32.6692 Longitude: -117.0890.

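# Packages used in this report. The original setup chunk is not shown,
# so this list is inferred from the functions called below.
library(dplyr)
library(lubridate)
library(knitr)
library(kableExtra)
library(ggplot2)
library(xts)
library(highcharter)
library(leaflet)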
## 
## Attaching package: 'leaflet'
## The following object is masked from 'package:xts':
## 
##     addLegend
leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  setView(-117.0890, 32.6692, zoom = 12) %>%
  addMarkers(lng=-117.0890, lat=32.6692, popup="National City, CA")
# Load the ARJIS extract (adjust the path to your local copy)
all <- read.csv("C:/Users/Joe/Downloads/sdcrime_16_20.csv")

# Derive date parts from the activity timestamp
all$date <- ymd_hms(all$activitydate)
all$year <- year(all$date)
all$month <- month(all$date, label = TRUE)
all$weekday <- wday(all$date, label = TRUE)
all$hour <- hour(all$date)
# Truncate the timestamp to the calendar day
all$date <- floor_date(all$date, "day")
kable(head(all,n=5))
| pk | activitynumber | activitydate | year | agency | violationsection | violationtype | chargedescription | chargelevel | codeucr | crimecategory | personrole | race | age | sex | zipcode | censusblock | censustract | city | census_race | tract_geoid | block_geoid | intptlat | intptlon | geometry | date | month | weekday | hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12144871 | ‘01600014’ | 2016-01-01 00:00:00 | 2016 | NATIONAL CITY | 10851 | VC | TAKE VEHICLE W/O OWNER’S CONSENT/VEHICLE THEFT | FELONY | 7A0 | Vehicle Theft | VICTIM | OTHER | NA | FEMALE | 91950 | 2000 | 22000 | NATIONAL CITY | other | 14000US06073022000 | 10100US060730220002000 | 32.67887 | -117.0875 | POINT (-117.0875061797541 32.67887038013) | 2016-01-01 | Jan | Fri | 0 |
| 12127327 | ‘16000042’ | 2016-01-01 00:00:00 | 2016 | SAN DIEGO | 459 | PC | BURGLARY/UNSPECIFIED | FELONY | 5A6 | Non Res Burglary | nan | nan | NA | nan | 92109 | 20000 | 7907 | SAN DIEGO | unknown | 14000US06073007907 | 10100US0607300790720000 | NaN | NaN | nan | 2016-01-01 | Jan | Fri | 0 |
| 12278698 | ‘16005661’ | 2016-01-01 00:00:00 | 2016 | SAN DIEGO | 488 | PC | PETTY THEFT | MISDEMEANOR | 6DG | Larceny < $400 | SUSPECT | WHITE | 28 | MALE | 92115 | 20230 | 2902 | SAN DIEGO | nhwhite | 14000US06073002902 | 10100US0607300290220230 | NaN | NaN | nan | 2016-01-01 | Jan | Fri | 0 |
| 12278698 | ‘16005661’ | 2016-01-01 00:00:00 | 2016 | SAN DIEGO | 488 | PC | PETTY THEFT | MISDEMEANOR | 6DG | Larceny < $400 | VICTIM/WITNESS | HISPANIC | 74 | MALE | 92115 | 20230 | 2902 | SAN DIEGO | hisp | 14000US06073002902 | 10100US0607300290220230 | NaN | NaN | nan | 2016-01-01 | Jan | Fri | 0 |
| 12364997 | ‘16008822’ | 2016-01-01 00:00:00 | 2016 | SAN DIEGO | 487(A) | PC | GRAND THEFT:MONEY/LABOR/PROPERTY OVER $950 | FELONY | 6AE | Larceny >= $400 | VICTIM/WITNESS | WHITE | 70 | MALE | 92109 | 20120 | 7905 | SAN DIEGO | nhwhite | 14000US06073007905 | 10100US0607300790520120 | NaN | NaN | nan | 2016-01-01 | Jan | Fri | 0 |
colnames(all)
##  [1] "pk"                "activitynumber"    "activitydate"     
##  [4] "year"              "agency"            "violationsection" 
##  [7] "violationtype"     "chargedescription" "chargelevel"      
## [10] "codeucr"           "crimecategory"     "personrole"       
## [13] "race"              "age"               "sex"              
## [16] "zipcode"           "censusblock"       "censustract"      
## [19] "city"              "census_race"       "tract_geoid"      
## [22] "block_geoid"       "intptlat"          "intptlon"         
## [25] "geometry"          "date"              "month"            
## [28] "weekday"           "hour"
all <- all[,c(1,8:9,11:20,23:25,4,26:29)]
names(all)[c(1,14:15)] <- c("ID","latitude","longitute")
colnames(all)
##  [1] "ID"                "chargedescription" "chargelevel"      
##  [4] "crimecategory"     "personrole"        "race"             
##  [7] "age"               "sex"               "zipcode"          
## [10] "censusblock"       "censustract"       "city"             
## [13] "census_race"       "latitude"          "longitute"        
## [16] "geometry"          "year"              "date"             
## [19] "month"             "weekday"           "hour"
kable(head(all,n=5))
| ID | chargedescription | chargelevel | crimecategory | personrole | race | age | sex | zipcode | censusblock | censustract | city | census_race | latitude | longitute | geometry | year | date | month | weekday | hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12144871 | TAKE VEHICLE W/O OWNER’S CONSENT/VEHICLE THEFT | FELONY | Vehicle Theft | VICTIM | OTHER | NA | FEMALE | 91950 | 2000 | 22000 | NATIONAL CITY | other | 32.67887 | -117.0875 | POINT (-117.0875061797541 32.67887038013) | 2016 | 2016-01-01 | Jan | Fri | 0 |
| 12127327 | BURGLARY/UNSPECIFIED | FELONY | Non Res Burglary | nan | nan | NA | nan | 92109 | 20000 | 7907 | SAN DIEGO | unknown | NaN | NaN | nan | 2016 | 2016-01-01 | Jan | Fri | 0 |
| 12278698 | PETTY THEFT | MISDEMEANOR | Larceny < $400 | SUSPECT | WHITE | 28 | MALE | 92115 | 20230 | 2902 | SAN DIEGO | nhwhite | NaN | NaN | nan | 2016 | 2016-01-01 | Jan | Fri | 0 |
| 12278698 | PETTY THEFT | MISDEMEANOR | Larceny < $400 | VICTIM/WITNESS | HISPANIC | 74 | MALE | 92115 | 20230 | 2902 | SAN DIEGO | hisp | NaN | NaN | nan | 2016 | 2016-01-01 | Jan | Fri | 0 |
| 12364997 | GRAND THEFT:MONEY/LABOR/PROPERTY OVER $950 | FELONY | Larceny >= $400 | VICTIM/WITNESS | WHITE | 70 | MALE | 92109 | 20120 | 7905 | SAN DIEGO | nhwhite | NaN | NaN | nan | 2016 | 2016-01-01 | Jan | Fri | 0 |

2. Data selection:

# Suspect rows in National City for the four domestic battery charge descriptions
battery <- all %>%
  filter(city == "NATIONAL CITY", personrole == "SUSPECT") %>%
  filter(chargedescription %in% c(
    "INFLICT CORPORAL INJURY ON SPOUSE/COHABITANT",
    "BATTERY:SPOUSE/EX SPOUSE/DATE/ETC (M)",
    "SPOUSAL/COHABITANT ABUSE WITH SERIOUS INJURY (F)",
    "SPOUSAL/COHABITANT ABUSE WITH MINOR INJURY (F)"
  ))

kable(table(battery$sex)) 
Var1 Freq
FEMALE 250
MALE 734
nan 0
NONBINARY 0
UNKNOWN 3
kable(table(battery$race)) 
Var1 Freq
ASIAN INDIAN 1
BLACK 162
CAMBODIAN 0
CHINESE 2
EAST AFRICAN 0
FILIPINO 33
GUAMANIAN 0
HAWAIIAN 0
HISPANIC 650
INDIAN 1
JAPANESE 0
KOREAN 0
LAOTIAN 0
MIDDLE EASTERN 1
nan 3
OTHER 9
OTHER ASIAN 9
PACIFIC ISLANDER 4
SAMOAN 0
VIETNAMESE 0
WHITE 112
nc_month <- battery %>% 
   group_by(year,month) %>%
           summarise(Total = n())
## `summarise()` regrouping output by 'year' (override with `.groups` argument)
nc_month$date <- paste(nc_month$year, nc_month$month, sep="-")
nc_month$date <- parse_date_time(nc_month$date,"Ym")
tseries <- xts(nc_month$Total, order.by=as.POSIXct(nc_month$date))

hchart(tseries, name = "Domestic Battery") %>%
  hc_add_theme(hc_theme_darkunica()) %>%
  hc_credits(enabled = TRUE, text = "Sources: SANDAG", 
             style = list(fontSize = "12px")) %>%
  hc_title(text = "Domestic Battery in National City, California") %>%
  hc_subtitle(text="Monthly counts from January 2016 to July 2020")
## Warning: `as_data_frame()` is deprecated as of tibble 2.0.0.
## Please use `as_tibble()` instead.
## The signature and semantics have changed, see `?as_tibble`.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
  #hc_legend(enabled = TRUE)
  
nc_month %>% ggplot(aes(x=date,y=Total))  +
             geom_bar(stat="identity",fill="darkblue",alpha=.7) +
             geom_smooth(color="red")+
             labs(x="Monthly",y="Count by Suspect", title = "Domestic Battery Incidents in National City, California", 
               subtitle = "Monthly counts from January 2016 to July 2020",
               caption = "Data source: SANDAG\nIllustration by @JoeLongSanDiego ")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
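
The problem statement compares the first half of 2020 with the same period of the previous year. A minimal sketch of that arithmetic follows, using a table shaped like `nc_month` (columns `year`, `month`, `Total`). The `monthly` data frame and its numbers are made up for illustration and are not results from this data set.

```r
# Hypothetical monthly counts for January to June of two consecutive years
monthly <- data.frame(
  year  = rep(c(2019, 2020), each = 6),
  month = rep(1:6, times = 2),
  Total = c(10, 12, 8, 10, 9, 11,    # 2019: sums to 60
            15, 14, 16, 13, 17, 15)  # 2020: sums to 90
)

# Sum the first-half counts for each year
h1 <- tapply(monthly$Total, monthly$year, sum)

# Percent change from the earlier year to the later year
pct_change <- (h1[["2020"]] - h1[["2019"]]) / h1[["2019"]] * 100
```

With these made-up numbers the change is 50 percent. Running the same calculation on the real monthly counts would show how closely this data set tracks the SANDAG figure, keeping in mind that the counts here are by suspect row rather than by incident.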

Domestic Violence incidents in 2020 in National City, CA

battery_2020 <- filter(battery,year==2020)

ggplot(data=battery_2020,aes(x=month,fill=chargedescription)) + geom_bar() +
   labs(title = "Charge Description",subtitle = "2020 in National City, CA ",x="Month",y="Count" )

ggplot(data=battery_2020,aes(x=month,fill=sex)) + geom_bar() + 
   labs(title = "Gender of suspects",subtitle = "2020 in National City, CA ",x="Month",y="Count" )

battery_2020race <- battery_2020 %>% group_by(race) %>% summarise(count=n())  %>% 
                          arrange(desc(count)) 
## `summarise()` ungrouping output (override with `.groups` argument)
ggplot(data=battery_2020race,aes(x=reorder(race,-count),y=count,fill=race)) + 
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90))

Each domestic violence incident might have more than one victim.
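
One way to see how often this happens is to count victim rows per incident. This is a sketch only, assuming `ID` identifies an incident; the `victims` data frame is hypothetical toy data.

```r
# Hypothetical toy data: victim rows only, keyed by incident ID
victims <- data.frame(ID = c(201, 202, 202, 203, 203, 203))

# Number of victim rows recorded for each incident
victims_per_incident <- table(victims$ID)

# IDs of incidents with more than one victim
multi_victim_ids <- names(victims_per_incident)[victims_per_incident > 1]
```

Here `multi_victim_ids` holds the two incidents with repeated IDs, which is why victim counts and incident counts should not be read as the same quantity.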

# Victim rows in National City for the same four domestic battery charge descriptions
v_battery <- all %>%
  filter(city == "NATIONAL CITY", personrole == "VICTIM") %>%
  filter(chargedescription %in% c(
    "INFLICT CORPORAL INJURY ON SPOUSE/COHABITANT",
    "BATTERY:SPOUSE/EX SPOUSE/DATE/ETC (M)",
    "SPOUSAL/COHABITANT ABUSE WITH SERIOUS INJURY (F)",
    "SPOUSAL/COHABITANT ABUSE WITH MINOR INJURY (F)"
  ))

v_battery_2020 <- filter(v_battery, year==2020)

nc_month <- v_battery_2020 %>% 
   group_by(month,chargedescription) %>%
           summarise(Total = n())
## `summarise()` regrouping output by 'month' (override with `.groups` argument)
nc_month %>% ggplot(aes(x=month,y=Total,fill=chargedescription))  +
             geom_bar(stat = "identity") +
             
             labs(x="Monthly",y="Count by Victims", title = "Victims of Domestic Violence in National City, California", 
               subtitle = "Monthly counts from January 2020 to July 2020",
               caption = "Data source: SANDAG\nIllustration by @JoeLongSanDiego ")

#-------------------


race_nc <- v_battery %>% group_by(race) %>% summarise(count=n())  %>% arrange(desc(count)) 
## `summarise()` ungrouping output (override with `.groups` argument)
race_nc %>%
  kable(caption = "Race of victim _ Domestic Violence") %>%
   kable_styling(bootstrap_options = "striped", full_width = F, position = "center") %>%
   column_spec(1:2, color = "black", background = "lightblue") 
Race of victim _ Domestic Violence
race count
HISPANIC 668
WHITE 122
BLACK 112
FILIPINO 47
OTHER 27
OTHER ASIAN 9
PACIFIC ISLANDER 3
ASIAN INDIAN 2
CHINESE 2
MIDDLE EASTERN 2
SAMOAN 1
vrace_nc <- battery %>% group_by(race) %>% summarise(count=n())  %>% arrange(desc(count)) 
## `summarise()` ungrouping output (override with `.groups` argument)
vrace_nc %>%
  kable(caption = "Race of SUSPECT _ Domestic Violence") %>%
   kable_styling(bootstrap_options = "striped", full_width = F, position = "center") %>%
   column_spec(1:2, color = "black", background = "lightblue") 
Race of SUSPECT _ Domestic Violence
race count
HISPANIC 650
BLACK 162
WHITE 112
FILIPINO 33
OTHER 9
OTHER ASIAN 9
PACIFIC ISLANDER 4
nan 3
CHINESE 2
ASIAN INDIAN 1
INDIAN 1
MIDDLE EASTERN 1
#---- suspect
age_suspect <- battery %>%
  group_by(age) %>%
  summarise(total = n()) %>%
  na.omit()
## `summarise()` ungrouping output (override with `.groups` argument)
age_suspect %>%
  ggplot(aes(x = age, y = total)) +
  geom_line(color="blue") +
  geom_point(size=age_suspect$total/4,color="red") +
  labs(title = "Age of Suspects _ Domestic Violence Incidents\nNational City", 
       x = "Age", y = "Total", subtitle  = "Youngest = 16      Oldest = 85")

#---------------     VICTIMS IN 2020

victim_2020 <- filter(v_battery, year==2020)
age <- victim_2020 %>%
  group_by(age) %>%
  summarise(total = n()) %>%
  na.omit()
## `summarise()` ungrouping output (override with `.groups` argument)
age %>%
  ggplot(aes(x = age, y = total)) +
  geom_line(color="blue") +
  geom_point(size=age$total,color="red") +
  labs(title = "Age of Victim _ 2020 Domestic Violence", 
       x = "Age", y = "Total", subtitle  = "Youngest = 0      Oldest = 75\nNational City")

#---------SUSPECT IN 2020

suspect_2020 <- filter(battery, year==2020)
age <- suspect_2020 %>%
  group_by(age) %>%
  summarise(total = n()) %>%
  na.omit()
## `summarise()` ungrouping output (override with `.groups` argument)
age %>%
  ggplot(aes(x = age, y = total)) +
  geom_line(color="blue") +
  geom_point(size=age$total,color="red") +
  labs(title = "Age of Suspect _ 2020 Domestic Violence in National city, CA", 
       x = "Age", y = "Total", subtitle  = "Youngest = 17      Oldest = 76\nCircle sizes reflect count totals")

Incident Mapping of Domestic Violence in 2020 in National City, CA

library(leaflet)
suspect_2020 %>%
    leaflet() %>%
    addTiles() %>%
    addMarkers(~longitute, ~latitude,clusterOptions = markerClusterOptions())
## Warning in validateCoords(lng, lat, funcName): Data contains 5 rows with either
## missing or invalid lat/lon values and will be ignored

Overall Incident reports in 2020 from National City, CA

charge_2020 <-  all %>% 
    filter(city == "NATIONAL CITY" & year==2020 )  
charge_2020ID <- charge_2020[!duplicated(charge_2020$ID),]
                 
charge_2020 <- charge_2020ID %>% group_by(chargedescription) %>% summarise(Total=n()) %>%
                           arrange(desc(Total))
## `summarise()` ungrouping output (override with `.groups` argument)
head(charge_2020,n=20) %>%
  kable() %>%
   kable_styling(bootstrap_options = "striped", full_width = F, position = "center") %>%
   column_spec(1:2, color = "black", background = "lightblue") %>%
   row_spec(c(4,6,13),bold=TRUE) %>%
   footnote(general = "Three descriptions could be re-classified under DOMESTIC VIOLENCE
                     * if combining the three charges, total count could be higher")
chargedescription Total
TAKE VEHICLE W/O OWNER’S CONSENT/VEHICLE THEFT (F) 162
BURGLARY (VEHICLE) (F) 133
PETTY THEFT(All Other Larceny) (M) 72
BATTERY:SPOUSE/EX SPOUSE/DATE/ETC (M) 67
SIMPLE BATTERY (M) 54
SPOUSAL/COHABITANT ABUSE WITH MINOR INJURY (F) 54
ROBBERY (F) 53
GRAND THEFT:MONEY/LABOR/PROPERTY (F) 45
PETTY THEFT(from Veh) (M) 44
BURGLARY (COMMERCIAL) (F) 41
ASSAULT W/DEADLY WEAPON:NOT F/ARM (F) 38
SHOPLIFTING (M) 32
SPOUSAL/COHABITANT ABUSE WITH SERIOUS INJURY (F) 31
PETTY THEFT(from Building) (M) 30
BURGLARY (RESIDENTIAL) (F) 25
PETTY THEFT(Shoplift) (M) 24
OTHER AGENCY VEHICLE THEFT/RECOVERY (F) 21
WILLFUL CRUELTY TO CHILD WITHOUT INJURY OR DEATH (F) 17
GRAND THEFT (Theft From Mot Veh) (F) 13
BATTERY ON PERSON (M) 12
Note:
Three descriptions could be re-classified under DOMESTIC VIOLENCE
* if combining the three charges, total count could be higher
# Horizontal bar chart of the ten most frequent charge descriptions
ggplot(data=head(charge_2020,n=10), aes(x=reorder(chargedescription,Total), y=Total)) +
  geom_bar(stat = "identity",fill="blue") +
  labs(title = "Top Ten Crime Incidents _ 2020 National City, CA",
       x="",y="Count") +
  coord_flip()