How does the crime rate change over time in Montgomery County, Howard County, Prince George’s County, Anne Arundel County, and Baltimore County?

Introduction

How does the crime rate change over time in Maryland? I focus on the five most populous counties in Maryland: Montgomery County, Howard County, Prince George’s County, Anne Arundel County, and Baltimore County. The data set tracks crime in each Maryland jurisdiction from 1975 to 2020. It covers violent crime (murder, rape, robbery, and aggravated assault) and property crime (breaking and entering, larceny theft, and motor vehicle theft). It also records each jurisdiction's population, the percent change for each crime, and the number of crimes per 100,000 people. To find the crime rate, I use grand_total, which is the total of all crimes counted together, divide it by the population, and multiply by 100,000. I then look at how this rate changes over the 46 years the data set covers.
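As a quick check of the formula, the sketch below recomputes the rate for the first five Allegany County rows (1975 to 1979). The population, grand_total, and reported rate values are copied from the str() output of the data set shown in the Data Analysis section. The helper name crime_rate is my own and is not part of the data set or any package.

```r
# Hypothetical helper (not from any package): crimes per 100,000 residents.
crime_rate <- function(total_crimes, population) {
  (total_crimes / population) * 100000
}

# Allegany County, 1975 to 1979, copied from the str() output of the data set.
allegany <- data.frame(
  year        = 1975:1979,
  population  = c(79655, 83923, 82102, 79966, 79721),
  grand_total = c(2329, 2125, 2211, 2131, 2322),
  reported    = c(2924, 2532, 2693, 2665, 2913)  # rates as abbreviated by str()
)

allegany$computed <- crime_rate(allegany$grand_total, allegany$population)

# Compare the recomputed rate (rounded) with the rate printed for the data set.
allegany$matches <- round(allegany$computed) == allegany$reported
print(allegany)
```

For these five rows the recomputed rate agrees with the reported overall crime rate after rounding, which suggests that grand_total divided by population is the same definition the data set itself uses.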

Data Analysis

First, I set the working directory and read in the data set. Next, I use tolower and gsub to clean up the column names, replacing spaces, periods, parentheses, slashes, and hyphens with underscores (commas in the names are left in place). I then look at the structure and dimensions of the data set. After that, I use filter to keep only Montgomery, Prince George’s, Howard, Anne Arundel, and Baltimore counties. The mutate function creates the crime rate by dividing the grand total of crimes by the population and multiplying by 100,000. With the select function, I keep only jurisdiction, year, population, and crime rate, which simplifies the visualization. Finally, I check for any NA values and visualize the data using a line graph.
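To make the name-cleaning step concrete, the sketch below applies the same tolower and gsub calls to a few of the original column names, taken from the column specification that str() prints for the data set. The vector names raw_names, clean, and alt are only for illustration. The last two lines show a possible alternative pattern, which is an assumption on my part and not what the analysis uses.

```r
# A few original column names, as listed in the column specification.
raw_names <- c("JURISDICTION", "AGG. ASSAULT", "B & E", "M/V THEFT",
               "OVERALL CRIME RATE PER 100,000 PEOPLE")

# Same three steps as in the analysis.
clean <- tolower(raw_names)             # lower-case everything
clean <- gsub(" ", "_", clean)          # spaces become underscores
clean <- gsub("[(). //-]", "_", clean)  # periods, parentheses, slashes, hyphens
print(clean)

# Possible alternative (an assumption, not used in the analysis):
# collapse every run of non-alphanumeric characters into one underscore.
alt <- gsub("[^a-z0-9]+", "_", tolower(raw_names))
print(alt)
```

With the original pattern, names such as b_&_e and overall_crime_rate_per_100,000_people still contain an ampersand or a comma, so they need backticks when used in code. The alternative pattern would avoid that, at the cost of changing the column names used in the rest of the analysis.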

# load the libraries
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'ggplot2' was built under R version 4.5.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
setwd("C:/Users/rjzavaleta/Downloads/Data 101")
crime_county <- read_csv("Violent_Crime___Property_Crime_by_County__1975_to_Present.csv")
# cleaning
names(crime_county) <- tolower(names(crime_county))
names(crime_county) <- gsub(" ","_",names(crime_county))
names(crime_county) <- gsub("[(). //-]", "_", names(crime_county))
head(crime_county)
## # A tibble: 6 × 38
##   jurisdiction     year population murder  rape robbery agg__assault `b_&_e`
##   <chr>           <dbl>      <dbl>  <dbl> <dbl>   <dbl>        <dbl>   <dbl>
## 1 Allegany County  1975      79655      3     5      20          114     669
## 2 Allegany County  1976      83923      2     2      24           59     581
## 3 Allegany County  1977      82102      3     7      32           85     592
## 4 Allegany County  1978      79966      1     2      18           81     539
## 5 Allegany County  1979      79721      1     7      18           84     502
## 6 Allegany County  1980      80461      2    12      26           79     541
## # ℹ 30 more variables: larceny_theft <dbl>, m_v_theft <dbl>, grand_total <dbl>,
## #   percent_change <dbl>, violent_crime_total <dbl>,
## #   violent_crime_percent <dbl>, violent_crime_percent_change <dbl>,
## #   property_crime_totals <dbl>, property_crime_percent <dbl>,
## #   property_crime_percent_change <dbl>,
## #   `overall_crime_rate_per_100,000_people` <dbl>,
## #   `overall_percent_change_per_100,000_people` <dbl>, …
# check the dataset
str(crime_county)
## spc_tbl_ [1,104 × 38] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ jurisdiction                                         : chr [1:1104] "Allegany County" "Allegany County" "Allegany County" "Allegany County" ...
##  $ year                                                 : num [1:1104] 1975 1976 1977 1978 1979 ...
##  $ population                                           : num [1:1104] 79655 83923 82102 79966 79721 ...
##  $ murder                                               : num [1:1104] 3 2 3 1 1 2 11 1 5 2 ...
##  $ rape                                                 : num [1:1104] 5 2 7 2 7 12 13 18 9 15 ...
##  $ robbery                                              : num [1:1104] 20 24 32 18 18 26 24 18 19 6 ...
##  $ agg__assault                                         : num [1:1104] 114 59 85 81 84 79 101 80 89 67 ...
##  $ b_&_e                                                : num [1:1104] 669 581 592 539 502 541 539 447 347 361 ...
##  $ larceny_theft                                        : num [1:1104] 1425 1384 1390 1390 1611 ...
##  $ m_v_theft                                            : num [1:1104] 93 73 102 100 99 108 88 55 67 68 ...
##  $ grand_total                                          : num [1:1104] 2329 2125 2211 2131 2322 ...
##  $ percent_change                                       : num [1:1104] NA -8.8 4 -3.6 9 6.5 0 -11.5 -11 -4.7 ...
##  $ violent_crime_total                                  : num [1:1104] 142 87 127 102 110 119 149 117 122 90 ...
##  $ violent_crime_percent                                : num [1:1104] 6.1 4.1 5.7 4.8 4.7 4.8 6 5.3 6.3 4.8 ...
##  $ violent_crime_percent_change                         : num [1:1104] NA -38.7 46 -19.7 7.8 8.2 25.2 -21.5 4.3 -26.2 ...
##  $ property_crime_totals                                : num [1:1104] 2187 2038 2084 2029 2212 ...
##  $ property_crime_percent                               : num [1:1104] 93.9 95.9 94.3 95.2 95.3 95.2 94 94.7 93.7 95.2 ...
##  $ property_crime_percent_change                        : num [1:1104] NA -6.8 2.3 -2.6 9 6.5 -1.3 -10.8 -11.9 -3.2 ...
##  $ overall_crime_rate_per_100,000_people                : num [1:1104] 2924 2532 2693 2665 2913 ...
##  $ overall_percent_change_per_100,000_people            : num [1:1104] NA -13.4 6.4 -1 9.3 5.6 -1.7 -11.6 -11.8 -2.6 ...
##  $ violent_crime_rate_per_100,000_people                : num [1:1104] 178 104 155 128 138 ...
##  $ violent_crime_rate_percent_change_per_100,000_people : num [1:1104] NA -41.8 49.2 -17.5 8.2 7.2 23.2 -21.6 3.3 -24.6 ...
##  $ property_crime_rate_per_100,000_people               : num [1:1104] 2746 2428 2538 2537 2775 ...
##  $ property_crime_rate_percent_change_per_100,000_people: num [1:1104] NA -11.6 4.5 0 9.4 5.5 -2.9 -10.9 -12.7 -1.1 ...
##  $ murder_per_100,000_people                            : num [1:1104] 3.8 2.4 3.7 1.3 1.3 2.5 13.5 1.2 6.1 2.5 ...
##  $ rape_per_100,000_people                              : num [1:1104] 6.3 2.4 8.5 2.5 8.8 14.9 15.9 22 10.9 18.6 ...
##  $ robbery_per_100,000_people                           : num [1:1104] 25.1 28.6 39 22.5 22.6 32.3 29.3 22 23 7.4 ...
##  $ agg__assault_per_100,000_people                      : num [1:1104] 143.1 70.3 103.5 101.3 105.4 ...
##  $ b_&_e_per_100,000_people                             : num [1:1104] 840 692 721 674 630 ...
##  $ larceny_theft_per_100,000_people                     : num [1:1104] 1789 1649 1693 1738 2021 ...
##  $ m_v_theft_per_100,000_people                         : num [1:1104] 117 87 124 125 124 ...
##  $ murder__rate_percent_change_per_100,000_people       : num [1:1104] NA -36.7 53.3 -65.8 0.3 ...
##  $ rape_rate_percent_change_per_100,000_people          : num [1:1104] NA -62 257.8 -70.7 251.1 ...
##  $ robbery_rate_percent_change_per_100,000_people       : num [1:1104] NA 13.9 36.3 -42.2 0.3 43.1 -9.2 -25.1 4.6 -67.7 ...
##  $ agg__assault__rate_percent_change_per_100,000_people : num [1:1104] NA -50.9 47.3 -2.2 4 -6.8 25.8 -20.9 10.2 -23.1 ...
##  $ b_&_e_rate_percent_change_per_100,000_people         : num [1:1104] NA -17.6 4.2 -6.5 -6.6 6.8 -2 -17.1 -23.1 6.3 ...
##  $ larceny_theft__rate_percent_change_per_100,000_people: num [1:1104] NA -7.8 2.7 2.7 16.3 4.9 -2.1 -7.6 -10.9 -3.2 ...
##  $ m_v_theft__rate_percent_change_per_100,000_people    : num [1:1104] NA -25.5 42.8 0.7 -0.7 8.1 -19.8 -37.6 20.7 3.7 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   JURISDICTION = col_character(),
##   ..   YEAR = col_double(),
##   ..   POPULATION = col_double(),
##   ..   MURDER = col_double(),
##   ..   RAPE = col_double(),
##   ..   ROBBERY = col_double(),
##   ..   `AGG. ASSAULT` = col_double(),
##   ..   `B & E` = col_double(),
##   ..   `LARCENY THEFT` = col_double(),
##   ..   `M/V THEFT` = col_double(),
##   ..   `GRAND TOTAL` = col_double(),
##   ..   `PERCENT CHANGE` = col_double(),
##   ..   `VIOLENT CRIME TOTAL` = col_double(),
##   ..   `VIOLENT CRIME PERCENT` = col_double(),
##   ..   `VIOLENT CRIME PERCENT CHANGE` = col_double(),
##   ..   `PROPERTY CRIME TOTALS` = col_double(),
##   ..   `PROPERTY CRIME PERCENT` = col_double(),
##   ..   `PROPERTY CRIME PERCENT CHANGE` = col_double(),
##   ..   `OVERALL CRIME RATE PER 100,000 PEOPLE` = col_double(),
##   ..   `OVERALL PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `VIOLENT CRIME RATE PER 100,000 PEOPLE` = col_double(),
##   ..   `VIOLENT CRIME RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `PROPERTY CRIME RATE PER 100,000 PEOPLE` = col_double(),
##   ..   `PROPERTY CRIME RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `MURDER PER 100,000 PEOPLE` = col_double(),
##   ..   `RAPE PER 100,000 PEOPLE` = col_double(),
##   ..   `ROBBERY PER 100,000 PEOPLE` = col_double(),
##   ..   `AGG. ASSAULT PER 100,000 PEOPLE` = col_double(),
##   ..   `B & E PER 100,000 PEOPLE` = col_double(),
##   ..   `LARCENY THEFT PER 100,000 PEOPLE` = col_double(),
##   ..   `M/V THEFT PER 100,000 PEOPLE` = col_double(),
##   ..   `MURDER  RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `RAPE RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `ROBBERY RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `AGG. ASSAULT  RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `B & E RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `LARCENY THEFT  RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double(),
##   ..   `M/V THEFT  RATE PERCENT CHANGE PER 100,000 PEOPLE` = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
dim(crime_county)
## [1] 1104   38
top_five <- crime_county |>
  filter(jurisdiction %in% c("Howard County","Montgomery County", "Prince George's County","Baltimore County","Anne Arundel County")) |> # select only the five counties we are looking at 
  mutate(crime_population_rate = (grand_total/population)*100000) |> #create a variable that shows crime rate
  select(jurisdiction,year,population, crime_population_rate) # shows only jurisdiction, year, population, and crime rate
head(top_five)
## # A tibble: 6 × 4
##   jurisdiction         year population crime_population_rate
##   <chr>               <dbl>      <dbl>                 <dbl>
## 1 Anne Arundel County  1975     331390                 6760.
## 2 Anne Arundel County  1976     340345                 5507.
## 3 Anne Arundel County  1977     347538                 5322.
## 4 Anne Arundel County  1978     363169                 4714.
## 5 Anne Arundel County  1979     361749                 4825.
## 6 Anne Arundel County  1980     370099                 5489.
colSums(is.na(top_five)) # count the NA values in each column
##          jurisdiction                  year            population 
##                     0                     0                     0 
## crime_population_rate 
##                     0
options(scipen = 999)
top_five |>
  ggplot(aes(x = year, y = crime_population_rate,
             fill = jurisdiction, colour = jurisdiction)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(title = "Crime Rate Per 100,000 in Maryland (MOCO, PG, HOWARD, AA, BALT)")

Conclusion

Based on the graph, the crime rate shows a downward trend in all five counties. For most of the period Prince George’s County had the highest crime rate, but by the end Baltimore County has the highest crime rate per 100,000. Howard County and Montgomery County usually have the lowest crime rates, and Anne Arundel County usually sits in the middle of the five. All five counties show a big spike in crime around 1980, and what happened during that time period could be researched further. However, the bigger question is how the five counties were able to bring the crime rate down: whether it came from more law enforcement, from safety rules and laws, or from something different entirely. I believe that is the most important thing to research after seeing this graph.
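One way to put a number on the trend described above is to compare the first and last year for each county. The sketch below does this in base R using only the six rounded Anne Arundel County rates printed by head(top_five), so it covers 1975 to 1980 and says nothing about the full 1975 to 2020 trend. The function name pct_change_by_county is hypothetical.

```r
# Rounded Anne Arundel County rates, copied from the head(top_five) output.
sample_rates <- data.frame(
  jurisdiction          = "Anne Arundel County",
  year                  = 1975:1980,
  crime_population_rate = c(6760, 5507, 5322, 4714, 4825, 5489)
)

# Hypothetical helper: percent change in the rate between the first and
# last year available for each jurisdiction.
pct_change_by_county <- function(df) {
  pieces <- split(df, df$jurisdiction)
  sapply(pieces, function(d) {
    d <- d[order(d$year), ]                      # make sure years are in order
    first <- d$crime_population_rate[1]
    last  <- d$crime_population_rate[nrow(d)]
    (last - first) / first * 100
  })
}

change <- pct_change_by_county(sample_rates)
print(round(change, 1))
```

For these six values the change is roughly -18.8 percent. Applying the same function to the full top_five data would give one percent change per county, which could support or qualify what the graph suggests.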

References

To find the populations of the counties: https://worldpopulationreview.com/us-counties/maryland

To find the formula of crime rate: https://www.criminaljustice.ny.gov/crimnet/ojsa/countycrimestats.htm

The visualization approach came from past knowledge in DATA-110