library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.3.0      ✔ stringr 1.5.0 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(odbc)
## Warning: package 'odbc' was built under R version 4.2.3
library(DBI)
library(RSQLite)
## Warning: package 'RSQLite' was built under R version 4.2.3
library(readr)

Read in the whole dataset in csv format

lapd_crime_2020_to_20231112 <- read_csv("C:/Users/joel.torres/Documents/Data Analytics/LAPD_crime/Crime_Data_from_2020_to_Present_20231112.csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 834320 Columns: 28
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (17): DR_NO, Date Rptd, DATE OCC, TIME OCC, AREA, AREA NAME, Rpt Dist No...
## dbl (10): Part 1-2, Crm Cd, Vict Age, Premis Cd, Weapon Used Cd, Crm Cd 1, C...
## lgl  (1): Crm Cd 4
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Exploratory Data Analysis

Initial thoughts about the data

I would like to look into what age group victims are more susceptible to crime in general. Are certain age groups more prone to specific type of crimes. What are the differences in crime by gender and victim age. Does victim descent play a role in the counts of crime. What geographic area based off of police district zones have higher counts of crime.

First focus on victim age groups, gender, and types of crime associated with each. I’m thinking of classifying crimes into one category of violent crimes which include murder, manslaughter, rape, robbery, and assault. The other group would then just be classified as non-violent crimes.

Calling the dataset to create a tibble

Basic Data Exploration

lapd_crime_2020_to_20231112
## # A tibble: 834,320 × 28
##    DR_NO   Date …¹ DATE …² TIME …³ AREA  AREA …⁴ Rpt D…⁵ Part …⁶ Crm C…⁷ Crm C…⁸
##    <chr>   <chr>   <chr>   <chr>   <chr> <chr>   <chr>     <dbl>   <dbl> <chr>  
##  1 010304… 01/08/… 01/08/… 2230    03    Southw… 0377          2     624 BATTER…
##  2 190101… 01/02/… 01/01/… 0330    01    Central 0163          2     624 BATTER…
##  3 200110… 04/14/… 02/13/… 1200    01    Central 0155          2     845 SEX OF…
##  4 191501… 01/01/… 01/01/… 1730    15    N Holl… 1543          2     745 VANDAL…
##  5 191921… 01/01/… 01/01/… 0415    19    Mission 1998          2     740 VANDAL…
##  6 200100… 01/02/… 01/01/… 0030    01    Central 0163          1     121 RAPE, …
##  7 200100… 01/02/… 01/02/… 1315    01    Central 0161          1     442 SHOPLI…
##  8 200100… 01/04/… 01/04/… 0040    01    Central 0155          2     946 OTHER …
##  9 200100… 01/04/… 01/04/… 0200    01    Central 0101          1     341 THEFT-…
## 10 201710… 06/19/… 05/26/… 1925    17    Devons… 1708          1     341 THEFT-…
## # … with 834,310 more rows, 18 more variables: Mocodes <chr>, `Vict Age` <dbl>,
## #   `Vict Sex` <chr>, `Vict Descent` <chr>, `Premis Cd` <dbl>,
## #   `Premis Desc` <chr>, `Weapon Used Cd` <dbl>, `Weapon Desc` <chr>,
## #   Status <chr>, `Status Desc` <chr>, `Crm Cd 1` <dbl>, `Crm Cd 2` <dbl>,
## #   `Crm Cd 3` <dbl>, `Crm Cd 4` <lgl>, LOCATION <chr>, `Cross Street` <chr>,
## #   LAT <dbl>, LON <dbl>, and abbreviated variable names ¹​`Date Rptd`,
## #   ²​`DATE OCC`, ³​`TIME OCC`, ⁴​`AREA NAME`, ⁵​`Rpt Dist No`, ⁶​`Part 1-2`, …

Based off of this crime summary tibble I can conclude there are at least 834,320 rows of data and/or crime incidents.

The lapd_crime main dataset is too large to view all columns within a tibble in markdown. To view full dataset click the following link

LACity.org

Here is an iframe of the LA City LAPD crime dataset in Power BI

knitr::include_url("https://app.powerbi.com/view?r=eyJrIjoiMzRiZGM2Y2MtOTBmNS00YTRkLTljY2EtMDJhODRiNzY4ZmQzIiwidCI6IjFjMmFhOTkyLTgzNjItNGNiZi1hOTYxLTk1Njk5NWMzNDM3MyIsImMiOjZ9&embedImagePlaceholder=true")

To view the Power BI visualization on another page click the following

Power BI

Read in a csv file of a basic summary of the crime data.

lapd_crime_summary <- read.csv("C:/Users/joel.torres/Documents/Data Analytics/Rstudio/LAPD_crime/crime_basic_summary.csv")

Calling a csv file produced a chart that was not readable.

lapd_crime_summary
##   total_ct_crime.avg_age.min_vict_age.max_vict_age
## 1                                  834320;29;0;120

Calling a xls file produced a chart that was better formatted to read.

library(readxl)
crime_basic_summary_2 <- read_excel("C:/Users/joel.torres/Documents/Data Analytics/Rstudio/LAPD_crime/crime_basic_summary_2.xls")
crime_basic_summary_2
## # A tibble: 1 × 4
##   total_ct_crime avg_age min_vict_age max_vict_age
##            <dbl>   <dbl>        <dbl>        <dbl>
## 1         834320      29            0          120

An average age of a victim to be 29 years old. Minimum age for victim at 0 years of age and maximum age of a victim at 120 years of age.

Use SQLite to query database from memory

con <- dbConnect(SQLite(), ":memory:")
copy_to(con, crime_basic_summary_2)

dbDisconnect(con)

con <- dbConnect(SQLite(), ":memory:")
copy_to(con, lapd_crime_2020_to_20231112)

Explore victim age and crime description

dbGetQuery(con, '
  SELECT "Vict Age", "Crm Cd Desc"
  FROM lapd_crime_2020_to_20231112
  LIMIT 50
')
##    Vict Age                                              Crm Cd Desc
## 1        36                                 BATTERY - SIMPLE ASSAULT
## 2        25                                 BATTERY - SIMPLE ASSAULT
## 3         0                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 4        76                 VANDALISM - MISDEAMEANOR ($399 OR UNDER)
## 5        31  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 6        25                                           RAPE, FORCIBLE
## 7        23                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 8         0                                OTHER MISCELLANEOUS CRIME
## 9        23 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 10        0 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 11       29                                    BURGLARY FROM VEHICLE
## 12       35                   CRIMINAL THREATS - NO WEAPON DISPLAYED
## 13       41 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 14        0                                                    ARSON
## 15       24                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 16       24                        INTIMATE PARTNER - SIMPLE ASSAULT
## 17       34                        INTIMATE PARTNER - SIMPLE ASSAULT
## 18       29                       THEFT PLAIN - PETTY ($950 & UNDER)
## 19       46                                        THEFT OF IDENTITY
## 20       66                                                  ROBBERY
## 21       40                                        THEFT OF IDENTITY
## 22       31 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 23       27                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 24       46                                        THEFT OF IDENTITY
## 25       62           ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT
## 26       34                                        THEFT OF IDENTITY
## 27       43                                    BURGLARY FROM VEHICLE
## 28       71           ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT
## 29       50                                 BATTERY - SIMPLE ASSAULT
## 30       19                                           RAPE, FORCIBLE
## 31       51                                                 BURGLARY
## 32       27                                        THEFT OF IDENTITY
## 33       33                        INTIMATE PARTNER - SIMPLE ASSAULT
## 34        0                                         VEHICLE - STOLEN
## 35       69                   CRIMINAL THREATS - NO WEAPON DISPLAYED
## 36       39                                        THEFT OF IDENTITY
## 37       19                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 38       57                                 BATTERY - SIMPLE ASSAULT
## 39        0  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 40       78                                 BATTERY - SIMPLE ASSAULT
## 41        0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 42       24                                    BURGLARY FROM VEHICLE
## 43       52           ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT
## 44       38           ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT
## 45       55                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 46        0          THEFT FROM MOTOR VEHICLE - PETTY ($950 & UNDER)
## 47       66  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 48       34                                          BRANDISH WEAPON
## 49       50                    INTIMATE PARTNER - AGGRAVATED ASSAULT
## 50       35                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)

At first glance age value 0 for “Vict Age” column seems like a problem in data reporting. The crime description does not fit the age.

Explore crime description for NULL and age 0 values

dbGetQuery(con, '
  SELECT "Vict Age", "Crm Cd Desc"
  FROM lapd_crime_2020_to_20231112
  ORDER BY "Vict Age" Asc
  LIMIT 100
')
##     Vict Age                                              Crm Cd Desc
## 1         -3                 EMBEZZLEMENT, PETTY THEFT ($950 & UNDER)
## 2         -2                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 3         -2                                        THEFT OF IDENTITY
## 4         -2                                        THEFT OF IDENTITY
## 5         -2                                OTHER MISCELLANEOUS CRIME
## 6         -2 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 7         -2                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 8         -2                                                 BURGLARY
## 9         -2                   CRIMINAL THREATS - NO WEAPON DISPLAYED
## 10        -2                                        THEFT OF IDENTITY
## 11        -2                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 12        -2                                         VEHICLE - STOLEN
## 13        -2                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 14        -2 SEX,UNLAWFUL(INC MUTUAL CONSENT, PENETRATION W/ FRGN OBJ
## 15        -1                                                 BURGLARY
## 16        -1                                                 BURGLARY
## 17        -1                                        THEFT OF IDENTITY
## 18        -1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 19        -1 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 20        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 21        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 22        -1                 VANDALISM - MISDEAMEANOR ($399 OR UNDER)
## 23        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 24        -1                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 25        -1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 26        -1                                                 BURGLARY
## 27        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 28        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 29        -1                                        THEFT OF IDENTITY
## 30        -1                         DOCUMENT FORGERY / STOLEN FELONY
## 31        -1      THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND OVER)
## 32        -1                                        THEFT OF IDENTITY
## 33        -1                                                 BURGLARY
## 34        -1                       THEFT PLAIN - PETTY ($950 & UNDER)
## 35        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 36        -1                       THEFT PLAIN - PETTY ($950 & UNDER)
## 37        -1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 38        -1                                        THEFT OF IDENTITY
## 39        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 40        -1                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 41        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 42        -1                                              TRESPASSING
## 43        -1                         DOCUMENT FORGERY / STOLEN FELONY
## 44        -1                                                 BURGLARY
## 45        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 46        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 47        -1      THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND OVER)
## 48        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 49        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 50        -1                       THEFT PLAIN - PETTY ($950 & UNDER)
## 51        -1                                        THEFT OF IDENTITY
## 52        -1 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 53        -1 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 54        -1                                       BUNCO, GRAND THEFT
## 55        -1                                          BRANDISH WEAPON
## 56        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 57        -1                       THEFT PLAIN - PETTY ($950 & UNDER)
## 58        -1                             UNAUTHORIZED COMPUTER ACCESS
## 59        -1                                       BUNCO, GRAND THEFT
## 60        -1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 61        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 62        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 63        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 64        -1 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 65        -1               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 66        -1                       THEFT PLAIN - PETTY ($950 & UNDER)
## 67        -1                                        THEFT OF IDENTITY
## 68        -1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 69        -1                                        THEFT OF IDENTITY
## 70        -1                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 71        -1                                        THEFT OF IDENTITY
## 72        -1                                       BUNCO, GRAND THEFT
## 73         0                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 74         0                                OTHER MISCELLANEOUS CRIME
## 75         0 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 76         0                                                    ARSON
## 77         0                                         VEHICLE - STOLEN
## 78         0  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 79         0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 80         0          THEFT FROM MOTOR VEHICLE - PETTY ($950 & UNDER)
## 81         0                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 82         0                                  BATTERY POLICE (SIMPLE)
## 83         0                                  BATTERY POLICE (SIMPLE)
## 84         0                 VANDALISM - MISDEAMEANOR ($399 OR UNDER)
## 85         0                                                 BURGLARY
## 86         0                                         VEHICLE - STOLEN
## 87         0                                                  ROBBERY
## 88         0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 89         0                                         VEHICLE - STOLEN
## 90         0                                         VEHICLE - STOLEN
## 91         0                 VANDALISM - MISDEAMEANOR ($399 OR UNDER)
## 92         0          THEFT FROM MOTOR VEHICLE - PETTY ($950 & UNDER)
## 93         0                                    BURGLARY FROM VEHICLE
## 94         0                                         VEHICLE - STOLEN
## 95         0                                         VEHICLE - STOLEN
## 96         0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 97         0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 98         0                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 99         0                                         VEHICLE - STOLEN
## 100        0          THEFT FROM MOTOR VEHICLE - PETTY ($950 & UNDER)

In RMarkdown the NULL values are coming in as negative numbers. I know this because in SQL Server those same negative values are NULL.

Upon looking further at the data set with just the first top 100 rows and in ascending order by age I could see there are quite a few values for Null/negative values and 0 values.Taking a closer look at the victim age columns and Crime descriptions we can gather that crimes that have NULL, negative (-) values, and 0 value have crime descriptions which are victimless or where the crime committed for the age group of 0 seems to be out of the norm.For example crimes of burglary, theft, trespassing, and out of compliance are a few of the crimes for which there is no direct victim. If there was a victim but the office reporting could not get victims age then reporting a 0 was possible. Due to all these various possibilities I will be removing value 0 for victim age column.

Cleaning up the age column data

This data set is going to come with quite a bit of reporting errors. The data set as described was inputted from crime incident reports and was not cleaned up for any type of formatting errors. When exploring and summarizing for age I will not be including the null/negative and 0 values. The first summary for table 1.1 is therefore inaccurate.

Let’s get a count of crime with age reported and crime with no age reported (0 value).

dbGetQuery(con, '
  SELECT COUNT("Vict Age") AS "Count_of_0age_victim_crime"
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" <= 0
  ORDER BY "Vict Age" ASC
')
##   Count_of_0age_victim_crime
## 1                     206407
dbGetQuery(con, '
  SELECT COUNT("Vict Age") AS "Count_of_victim_crime_wAge"
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" > 0
  ORDER BY "Vict Age" ASC
')
##   Count_of_victim_crime_wAge
## 1                     627913

Exploring age distributions for crime by box plot

library(tidyverse)

crime_by_vict_age <- read_xlsx("C:/Users/joel.torres/Documents/Data Analytics/LAPD_crime/crime_by_vict_age.xlsx")

lapd_crime_df <- data.frame(lapd_crime_2020_to_20231112)
lapd_victim_crime_df <- data.frame(crime_by_vict_age)

ggplot(data = lapd_victim_crime_df, aes(x = Vict_Age, y= "")) +
  geom_boxplot()

The dataset for this boxplot did not include Nulls/Negative numbers nor 0 values. I was curious about the IQR range and if they’ll be any outliers. Based on the results there are definitely quite a few outliers that I’d like to address.

Quantile values for victim age data set

q_marks <- quantile(crime_by_vict_age$`Vict_Age`)
iqr <- IQR(crime_by_vict_age$Vict_Age)
q_marks[2]
## 25% 
##  28
q_marks[4]
## 75% 
##  50
iqr
## [1] 22
up_range <- q_marks[4] + 1.5*iqr # Upper Range

up_range
## 75% 
##  83

Our results indicate that quartile 1 is 28, quartile 3 is 50. The Interquartile Range (IQR) is 22. When calculating the quartile range outside of the IQR box but within the whisker range I get a result of 83. This means that any values 84 and above are considered outliers.

dbGetQuery(con, '
  SELECT COUNT("Vict Age") as vict_total_above83
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" > 83
')
##   vict_total_above83
## 1               3785

Summary for victim_age column

834,220 total crime cases. 206407 total crime for incidents where victim age of 0 is not applicable. Which leaves 627913 crimes with victim ages are greater than 1 after filtering out 0 values. We also see that there are 3785 outliers in our dataset. These outliers are victim age values that are greater than 83. I will have to further analyse the outlier values to see if there is any merit to keeping the values or removing the values for the sake of doing a summary of statistics on the victim age. To do so we’ll need to analyze the crime description for the age outliers.

Exploration of age outliers in victim data set

dbGetQuery(con, '
  SELECT COUNT(DISTINCT "Crm Cd Desc") AS Desc_crime_count_greater_than_age_83
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" > 83
')
##   Desc_crime_count_greater_than_age_83
## 1                                   76
dbGetQuery(con, '
  SELECT DISTINCT "Crm Cd Desc" AS Crime_description
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" > 83
')
##                                           Crime_description
## 1                                             THEFT, PERSON
## 2                                         ATTEMPTED ROBBERY
## 3                            CHILD NEGLECT (SEE 300 W.I.C.)
## 4                        THEFT PLAIN - PETTY ($950 & UNDER)
## 5                                         THEFT OF IDENTITY
## 6                                  BATTERY - SIMPLE ASSAULT
## 7                                                  BURGLARY
## 8       THEFT FROM MOTOR VEHICLE - GRAND ($950.01 AND OVER)
## 9                                                   ROBBERY
## 10                                      BURGLARY, ATTEMPTED
## 11                                       BUNCO, GRAND THEFT
## 12                                    BURGLARY FROM VEHICLE
## 13 THEFT-GRAND ($950.01 & OVER)EXCPT,GUNS,FOWL,LIVESTK,PROD
## 14                 VANDALISM - MISDEAMEANOR ($399 OR UNDER)
## 15           ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT
## 16  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 17                                            BIKE - STOLEN
## 18                         DOCUMENT FORGERY / STOLEN FELONY
## 19          THEFT FROM MOTOR VEHICLE - PETTY ($950 & UNDER)
## 20                    CREDIT CARDS, FRAUD USE ($950 & UNDER
## 21                                OTHER MISCELLANEOUS CRIME
## 22                                               KIDNAPPING
## 23                           VIOLATION OF RESTRAINING ORDER
## 24                                 VIOLATION OF COURT ORDER
## 25                      SEXUAL PENETRATION W/FOREIGN OBJECT
## 26                                                 STALKING
## 27                                        INDECENT EXPOSURE
## 28                                              TRESPASSING
## 29                                       BUNCO, PETTY THEFT
## 30                                     DISTURBING THE PEACE
## 31                                          BRANDISH WEAPON
## 32                        INTIMATE PARTNER - SIMPLE ASSAULT
## 33                         BURGLARY FROM VEHICLE, ATTEMPTED
## 34                           CHILD ANNOYING (17YRS & UNDER)
## 35 VEHICLE, STOLEN - OTHER (MOTORIZED SCOOTERS, BIKES, ETC)
## 36                    INTIMATE PARTNER - AGGRAVATED ASSAULT
## 37                                            OTHER ASSAULT
## 38                  LETTERS, LEWD  -  TELEPHONE CALLS, LEWD
## 39                        SHOTS FIRED AT INHABITED DWELLING
## 40                  CHILD ABUSE (PHYSICAL) - SIMPLE ASSAULT
## 41                   CRIMINAL THREATS - NO WEAPON DISPLAYED
## 42                                 VEHICLE - ATTEMPT STOLEN
## 43                                               PICKPOCKET
## 44                             UNAUTHORIZED COMPUTER ACCESS
## 45                                                    ARSON
## 46                                    THEFT PLAIN - ATTEMPT
## 47                              BATTERY WITH SEXUAL CONTACT
## 48                                        CONTEMPT OF COURT
## 49                          THREATENING PHONE CALLS/LETTERS
## 50                                        CRIMINAL HOMICIDE
## 51                                                  PROWLER
## 52                                           BUNCO, ATTEMPT
## 53                 VIOLATION OF TEMPORARY RESTRAINING ORDER
## 54                                           CHILD STEALING
## 55                 SHOPLIFTING - PETTY THEFT ($950 & UNDER)
## 56                                           RAPE, FORCIBLE
## 57                HUMAN TRAFFICKING - INVOLUNTARY SERVITUDE
## 58                               KIDNAPPING - GRAND ATTEMPT
## 59                                         VEHICLE - STOLEN
## 60                         DISHONEST EMPLOYEE - GRAND THEFT
## 61                      DOCUMENT WORTHLESS ($200.01 & OVER)
## 62                                  BATTERY POLICE (SIMPLE)
## 63                                              COUNTERFEIT
## 64                                                EXTORTION
## 65                 CREDIT CARDS, FRAUD USE ($950.01 & OVER)
## 66                SEX OFFENDER REGISTRANT OUT OF COMPLIANCE
## 67 CRM AGNST CHLD (13 OR UNDER) (14-15 & SUSP 10 YRS OLDER)
## 68               EMBEZZLEMENT, GRAND THEFT ($950.01 & OVER)
## 69                           DISCHARGE FIREARMS/SHOTS FIRED
## 70  SODOMY/SEXUAL CONTACT B/W PENIS OF ONE PERS TO ANUS OTH
## 71                 SHOPLIFTING-GRAND THEFT ($950.01 & OVER)
## 72         SHOTS FIRED AT MOVING VEHICLE, TRAIN OR AIRCRAFT
## 73                              THEFT FROM PERSON - ATTEMPT
## 74                       THEFT FROM MOTOR VEHICLE - ATTEMPT
## 75                               WEAPONS POSSESSION/BOMBING
## 76                                          PURSE SNATCHING

Summary of results:

Based off of the crime description for victim ages 84 and above you see that some crimes are what might be considered crimes with no particular individual but that of against a group. However, the number of crimes equal to or greater than 84 are 3,785 compared to that of crimes equal to 0 or null are 206,335.It is apparent that the 0 value for age was meant as a way to signify that a crime according to the police report either was a victimless crime, had no value to input (null), or also a group crime, or was just missing. Either way when calculating a mean summary statistics for the victim_age variable we cannot use those 206,335 rows for 0 age values.

At first I was going to use the victim_age column as a way to differentiate between crimes with victims and non-victim crimes. However, the way this column value was reported in the data set makes it difficult to distinguish. For one instead of using null for crimes with no victim they used the value of 0. Also, the interpretation of a victim is defined differently in the reporting as seen by the data. For example for the same crime description of auto theft some reports have a value of 0 indicating the officer felt this was a victimless crime, however, in other incidences of the same crime it was reported with an age of an adult. This means that other officers felt that the person who’s car got stolen was a victim. For the purposes of this study I will evaluate all crime as having a victim when age is reported greater than 0. When making summary calculations for age variable I will rule out the age 0 value.

The idea here is that crimes that were reported as having an age value of 99 were seen as crimes against a group of people. Some crimes for example theft/vandalism to a vehicle would be easier to report an age value to because you can pinpoint the crime to an individual. However, a crime of theft/burglary to a church is a crime that does have a victim, not an individual, but to a group of people.

When I calculate for mean age for crimes I need to take out the 0 values for reasons stated earlier. But what about for the values that are outliers? I believe that we can rationaly make the conclusion that most of the outliers (84 and above), definitely age value 99, were values used for crimes against groups.

For clarification, I am summarizing that all crimes have a victim, however, when summarizing the mean I will not be using the 0 age values nor the outliers. That being said using the mean to describe these crimes might not be as appropriate compared to using the boxplot IQR, median, and mode values.

I will first analyse the crime descriptions and crime codes to check how many crime categories we have. I will check those categories for crime descriptions that are more applicable to crimes which might not apply to a singular victim but more so have an effect on a group.

Crimes that would effect a group or collective whole

Crimes that involve vandalism and burglary can possibly be distinguished as being crimes that in our specific requirements fall under crimes that effect a group as whole. Burglary for example is typical of crimes to property whereas crimes involving a robbery are crimes that require a threat or physical nature. However, for our intent and purposes even a crime such as burglary can in fact indirectly have an individual victim. For example if a persons home is burglarized then they have become a victim even though they might not have directly been threatened physically or verbally or have been present but in general said crime can in fact effect one person. If this is the case for such crimes then it’s possible the officer reporting the crime might have put the age of the victim that reported/called in the crime or who it effected indirectly. Vandalism therefore as well can fall under the same rationale as that of burglary. About the only crime for victim age outliers that I can definitively say the age reported by the officer is not representative of the individual or the group is VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS). But lets look at the Crm_Cd_Desc and see how it matches with Premis_Desc.

dbGetQuery(con, '
  SELECT "Crm Cd Desc" AS Crime_description, "Premis Desc", "Vict Age"
  FROM lapd_crime_2020_to_20231112
  WHERE "Vict Age" > 83
  AND "Crm Cd Desc" LIKE "%CHURCH VANDALISMS%"
  ORDER BY "Vict Age" DESC
  LIMIT 50
')
##                                          Crime_description
## 1  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 2  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 3  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 4  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 5  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 6  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 7  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 8  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 9  VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 10 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 11 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 12 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 13 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 14 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 15 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 16 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 17 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 18 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 19 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 20 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 21 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 22 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 23 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 24 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 25 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 26 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 27 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 28 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 29 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 30 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 31 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 32 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 33 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 34 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 35 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 36 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 37 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 38 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 39 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 40 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 41 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 42 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 43 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 44 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 45 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 46 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 47 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 48 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 49 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
## 50 VANDALISM - FELONY ($400 & OVER, ALL CHURCH VANDALISMS)
##                                     Premis Desc Vict Age
## 1                        SINGLE FAMILY DWELLING       99
## 2                      VEHICLE, PASSENGER/TRUCK       99
## 3                      VEHICLE, PASSENGER/TRUCK       99
## 4                      VEHICLE, PASSENGER/TRUCK       99
## 5  MULTI-UNIT DWELLING (APARTMENT, DUPLEX, ETC)       99
## 6  MULTI-UNIT DWELLING (APARTMENT, DUPLEX, ETC)       99
## 7                      VEHICLE, PASSENGER/TRUCK       99
## 8                                        STREET       99
## 9                              DEPARTMENT STORE       99
## 10                         RESTAURANT/FAST FOOD       99
## 11                                       STREET       99
## 12                         RESTAURANT/FAST FOOD       99
## 13                     VEHICLE, PASSENGER/TRUCK       99
## 14                               OTHER BUSINESS       99
## 15                                       STREET       99
## 16                                       STREET       99
## 17                       SINGLE FAMILY DWELLING       98
## 18                                  OTHER STORE       98
## 19                     VEHICLE, PASSENGER/TRUCK       98
## 20                               OTHER BUSINESS       97
## 21                       SINGLE FAMILY DWELLING       97
## 22 MULTI-UNIT DWELLING (APARTMENT, DUPLEX, ETC)       96
## 23                                  PARKING LOT       96
## 24                     VEHICLE, PASSENGER/TRUCK       95
## 25                       SINGLE FAMILY DWELLING       95
## 26                               OTHER BUSINESS       94
## 27                       SINGLE FAMILY DWELLING       94
## 28                     VEHICLE, PASSENGER/TRUCK       94
## 29                     VEHICLE, PASSENGER/TRUCK       94
## 30                                       STREET       94
## 31                       SINGLE FAMILY DWELLING       93
## 32                       SINGLE FAMILY DWELLING       93
## 33                       SINGLE FAMILY DWELLING       93
## 34                       SINGLE FAMILY DWELLING       92
## 35                       SINGLE FAMILY DWELLING       92
## 36                       SINGLE FAMILY DWELLING       92
## 37                               OTHER BUSINESS       92
## 38                       SINGLE FAMILY DWELLING       91
## 39                       SINGLE FAMILY DWELLING       91
## 40                       SINGLE FAMILY DWELLING       91
## 41                              OTHER RESIDENCE       91
## 42                       SINGLE FAMILY DWELLING       91
## 43                                        ALLEY       91
## 44                       SINGLE FAMILY DWELLING       90
## 45                                       STREET       90
## 46                                     DRIVEWAY       90
## 47                       SINGLE FAMILY DWELLING       90
## 48                     VEHICLE, PASSENGER/TRUCK       90
## 49 MULTI-UNIT DWELLING (APARTMENT, DUPLEX, ETC)       90
## 50                       SINGLE FAMILY DWELLING       90

I was trying to make the assumption that the outlier Victim Age values which are values greater than 83 are prone to values which do not represent the correct age of a victim. If I could surmise that the crime description was a crime for a non specific individual, like a group, then I could remove such data from creating a statistal summary. If the age values are not valid then that would effect the mean age values from crime. However, upon inspecting the Premise Description along with it’s crime code descriptions there doesn’t seem to be a correlation that crimes to property such as vandalism are only effecting the general group and not individuals (that might have a legitimate age value). There are vandalism crimes with Premis Desc that are happening in a single family dwelling or vehicle which can imply that the victim age is possibly correct.

Summary of the victim age values

For this dataset because there are so few crimes above the age of 83, and because of the possibility that a higher age was used to denote an unknown age, or a crime to public property, and because with the use of standard deviation the calculated values greater than 83 are outliers. Therefore I will not include these outliers, along with value age 0, to calculate statistical summaries.

Transform data into age groups

dbGetQuery(con, '
  SELECT 
  "DR_NO", "Vict Age",
  IIF("Vict Age" between 1 and 4, "Toddler",
    IIF("Vict Age" between 5 and 12, "Kid", 
    IIF("Vict Age" between 13 and 17, "Teenager", 
    IIF("Vict Age" between 18 and 25, "Young Adult", 
    IIF("Vict Age" between 26 and 39, "Adult", 
    IIF("Vict Age" between 40 and 65, "Middle Age", "Senior")))))) as Age_Group
FROM lapd_crime_2020_to_20231112
where "Vict Age" between 1 and 83
order by "Vict Age" asc
LIMIT 10;
')
##        DR_NO Vict Age Age_Group
## 1  210816791        2   Toddler
## 2  200907120        2   Toddler
## 3  211304234        2   Toddler
## 4  200715304        2   Toddler
## 5  221806711        2   Toddler
## 6  211219616        2   Toddler
## 7  201320179        2   Toddler
## 8  200516029        2   Toddler
## 9  200307249        2   Toddler
## 10 200108589        2   Toddler

A csv file was created from the sql query above but for the whole data set result not just limited to the 10 rows. Which gives me a category age grouping for every victim age except the outliers and the 0 values. Lets take a look at the distinct category age groups and their respective age range.

lapd_vict_age_bins <- read_excel("C:/Users/joel.torres/Documents/Data Analytics/LAPD_crime/lapd_vict_age_groups_4.xlsx")
dbDisconnect(con)
con <- dbConnect(SQLite(), ":memory:")
copy_to(con, lapd_vict_age_bins)
dbGetQuery(con, '
  SELECT DISTINCT "Age_Group",
  IIF("Age_Group" = "Toddler", "a) 1-4",
    IIF("Age_Group" = "Kid", "b) 5-12", 
    IIF("Age_Group" = "Teenager", "c) 13-17", 
    IIF("Age_Group" = "Young Adult", "d) 18-25", 
    IIF("Age_Group" = "Adult", "e) 26-39", 
    IIF("Age_Group" = "Middle Age", "f) 40-65", "g) 66-plus")))))) as Age_Range
FROM lapd_vict_age_bins


')
##     Age_Group  Age_Range
## 1     Toddler     a) 1-4
## 2         Kid    b) 5-12
## 3    Teenager   c) 13-17
## 4 Young Adult   d) 18-25
## 5       Adult   e) 26-39
## 6  Middle Age   f) 40-65
## 7      Senior g) 66-plus

I have 7 different categories with their respective age ranges.

Lets take a look at total count of crime for each age group

dbGetQuery(con, '
  SELECT "Age_Group", COUNT("Age_Group") as Age_Group_Total
  FROM lapd_vict_age_bins
  GROUP BY "Age_Group"
  ORDER BY Age_Group_Total ASC;
')
##     Age_Group Age_Group_Total
## 1     Toddler            1246
## 2         Kid            5964
## 3    Teenager           14935
## 4      Senior           38578
## 5 Young Adult           96117
## 6  Middle Age          231759
## 7       Adult          235528

Here we can see that most crime in Los Angeles effects the middle age and Adult populations.