The purpose of this document is to compare visually the crime situation, and some possible correlations between crime-related variables, in two American cities: San Francisco, CA and Seattle, WA.
We will show that Seattle is a ‘hotter’ city than San Francisco, in terms of both temperature and crime. We will also show that no correlation between crime incidents and temperature is immediately evident (at least during the summer of 2014).
According to the FBI, index crime in the United States includes violent crime and property crime. Violent crime consists of four criminal offenses: murder and non-negligent manslaughter, forcible rape, robbery, and aggravated assault; property crime consists of burglary, larceny, motor vehicle theft, and arson.
Inspecting the available datasets, we note that the two police departments use different offense categories in their reports. There are 34 different crime categories in the San Francisco dataset, while the Seattle dataset includes 48.
If we want to compare statistics on different categories of crime across both cities, we first need to establish a common definition for the different kinds of offenses.
A common ground for classifying criminal activities can be found in the FBI Uniform Crime Reporting (UCR) program. The program was conceived in 1929 by the International Association of Chiefs of Police to meet the need for reliable, uniform crime statistics for the nation, and it is now administered by the FBI.
Each UCR offense belongs to one of three categories: Crimes Against Persons, Crimes Against Property, and Crimes Against Society.
We load the UCR crime categories from the FBI UCR website:
| Offense | Type |
|---|---|
| ASSAULT | Person |
| HOMICIDE | Person |
| KIDNAPPING/ABDUCTION | Person |
| SEX , FORCIBLE | Person |
| SEX , NONFORCIBLE | Person |
| ARSON | Property |
| BRIBERY | Property |
| BURGLARY/BREAKING & ENTERING | Property |
| COUNTERFEITING/FORGERY | Property |
| DESTRUCTION/DAMAGE/VANDALISM OF PROPERTY | Property |
| EMBEZZLEMENT | Property |
| EXTORTION/BLACKMAIL | Property |
| FRAUD | Property |
| LARCENY/THEFT | Property |
| MOTOR VEHICLE THEFT | Property |
| STOLEN PROPERTY | Property |
| ROBBERY | Property |
| BAD CHECKS | Property |
| DRUG/NARCOTIC | Society |
| GAMBLING | Society |
| PORNOGRAPHY/OBSCENE MATERIAL | Society |
| WEAPON LAW VIOLATIONS | Society |
| CURFEW/LOITERING/VAGRANCY | Society |
| DISORDERLY CONDUCT | Society |
| DRIVING UNDER THE INFLUENCE | Society |
| DRUNKENNESS | Society |
| FAMILY , NONVIOLENT | Society |
| LIQUOR LAW VIOLATIONS | Society |
| PEEPING TOM | Society |
| TRESPASS OF REAL PROPERTY | Society |
| RUNAWAY | Not a Crime |
| NOT A CRIME | Not a Crime |
| PROSTITUTION | Society |
Crimes Against Persons, e.g., murder, rape, and assault, are those whose victims are always individuals. The object of Crimes Against Property, e.g., robbery, bribery, and burglary, is to obtain money, property, or some other benefit. Crimes Against Society, e.g., gambling, prostitution, and drug violations, represent society’s prohibition against engaging in certain types of activity; they are typically victimless crimes in which property is not the object.
We have taken the liberty of adding two new categories (‘NOT A CRIME’ and ‘PROSTITUTION’) to the original FBI offense categories, since this will prove useful for classifying and comparing criminal activities later.
After assigning each offense category used by the local police departments to the corresponding one in the UCR dataset, we get the following conversion tables, first for San Francisco (34 categories) and then for Seattle (48 categories).
| Cat | UCR |
|---|---|
| ARSON | ARSON |
| NON-CRIMINAL | NOT A CRIME |
| LARCENY/THEFT | LARCENY/THEFT |
| DRUG/NARCOTIC | DRUG/NARCOTIC |
| DRIVING UNDER THE INFLUENCE | DRIVING UNDER THE INFLUENCE |
| OTHER OFFENSES | NA |
| TRESPASS | TRESPASS OF REAL PROPERTY |
| VEHICLE THEFT | MOTOR VEHICLE THEFT |
| ASSAULT | ASSAULT |
| FRAUD | FRAUD |
| SUSPICIOUS OCC | NOT A CRIME |
| SECONDARY CODES | NA |
| WEAPON LAWS | WEAPON LAW VIOLATIONS |
| MISSING PERSON | NOT A CRIME |
| WARRANTS | NOT A CRIME |
| ROBBERY | ROBBERY |
| DRUNKENNESS | DRUNKENNESS |
| PROSTITUTION | PROSTITUTION |
| LIQUOR LAWS | LIQUOR LAW VIOLATIONS |
| KIDNAPPING | KIDNAPPING/ABDUCTION |
| FAMILY OFFENSES | FAMILY , NONVIOLENT |
| LOITERING | CURFEW/LOITERING/VAGRANCY |
| DISORDERLY CONDUCT | DISORDERLY CONDUCT |
| FORGERY/COUNTERFEITING | COUNTERFEITING/FORGERY |
| EMBEZZLEMENT | EMBEZZLEMENT |
| BURGLARY | BURGLARY/BREAKING & ENTERING |
| SUICIDE | NOT A CRIME |
| VANDALISM | DESTRUCTION/DAMAGE/VANDALISM OF PROPERTY |
| STOLEN PROPERTY | STOLEN PROPERTY |
| RUNAWAY | NOT A CRIME |
| GAMBLING | GAMBLING |
| EXTORTION | EXTORTION/BLACKMAIL |
| PORNOGRAPHY/OBSCENE MAT | PORNOGRAPHY/OBSCENE MATERIAL |
| BRIBERY | BRIBERY |
| Cat | UCR |
|---|---|
| BURGLARY | BURGLARY/BREAKING & ENTERING |
| FRAUD | FRAUD |
| MAIL THEFT | LARCENY/THEFT |
| COUNTERFEIT | COUNTERFEITING/FORGERY |
| OTHER PROPERTY | NA |
| EMBEZZLE | EMBEZZLEMENT |
| CAR PROWL | LARCENY/THEFT |
| THREATS | DISORDERLY CONDUCT |
| PROPERTY DAMAGE | DESTRUCTION/DAMAGE/VANDALISM OF PROPERTY |
| LOST PROPERTY | NOT A CRIME |
| FORGERY | COUNTERFEITING/FORGERY |
| VEHICLE THEFT | MOTOR VEHICLE THEFT |
| BURGLARY-SECURE PARKING-RES | LARCENY/THEFT |
| PICKPOCKET | LARCENY/THEFT |
| BIKE THEFT | LARCENY/THEFT |
| NARCOTICS | DRUG/NARCOTIC |
| DISPUTE | DISORDERLY CONDUCT |
| ASSAULT | ASSAULT |
| STOLEN PROPERTY | STOLEN PROPERTY |
| WARRANT ARREST | NOT A CRIME |
| TRAFFIC | NOT A CRIME |
| SHOPLIFTING | LARCENY/THEFT |
| DISTURBANCE | DISORDERLY CONDUCT |
| VIOLATION OF COURT ORDER | DISORDERLY CONDUCT |
| ILLEGAL DUMPING | NA |
| PROSTITUTION | PROSTITUTION |
| ROBBERY | ROBBERY |
| TRESPASS | TRESPASS OF REAL PROPERTY |
| LIQUOR VIOLATION | LIQUOR LAW VIOLATIONS |
| BIAS INCIDENT | DISORDERLY CONDUCT |
| THEFT OF SERVICES | LARCENY/THEFT |
| HOMICIDE | HOMICIDE |
| RECOVERED PROPERTY | NOT A CRIME |
| OBSTRUCT | NA |
| RECKLESS BURNING | DISORDERLY CONDUCT |
| INJURY | NOT A CRIME |
| WEAPON | WEAPON LAW VIOLATIONS |
| PURSE SNATCH | LARCENY/THEFT |
| FALSE REPORT | NA |
| ELUDING | NA |
| ANIMAL COMPLAINT | NOT A CRIME |
| PORNOGRAPHY | PORNOGRAPHY/OBSCENE MATERIAL |
| DUI | DRIVING UNDER THE INFLUENCE |
| FIREWORK | DISORDERLY CONDUCT |
| [INC - CASE DC USE ONLY] | ASSAULT |
| PUBLIC NUISANCE | DISORDERLY CONDUCT |
| DISORDERLY CONDUCT | DISORDERLY CONDUCT |
| ESCAPE | NA |
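The conversion step can be sketched as two dictionary lookups: local category to UCR offense, then UCR offense to type. This is a minimal illustration rather than the code used for the report; the names `SEATTLE_TO_UCR`, `UCR_TYPE` and `classify` are hypothetical, and only a few rows of the tables above are included.

```python
# Excerpt of the Seattle conversion table above; None stands for the NA rows.
# (Dictionary and function names are hypothetical, chosen for illustration.)
SEATTLE_TO_UCR = {
    "CAR PROWL": "LARCENY/THEFT",
    "BIKE THEFT": "LARCENY/THEFT",
    "NARCOTICS": "DRUG/NARCOTIC",
    "DUI": "DRIVING UNDER THE INFLUENCE",
    "LOST PROPERTY": "NOT A CRIME",
    "ILLEGAL DUMPING": None,
}

# Excerpt of the UCR offense table: offense -> type.
UCR_TYPE = {
    "LARCENY/THEFT": "Property",
    "DRUG/NARCOTIC": "Society",
    "DRIVING UNDER THE INFLUENCE": "Society",
    "NOT A CRIME": "Not a Crime",
}


def classify(category, conversion=SEATTLE_TO_UCR, types=UCR_TYPE):
    """Return (ucr_offense, ucr_type), or (None, None) if the category is unmapped."""
    ucr = conversion.get(category)
    if ucr is None:
        return None, None
    return ucr, types.get(ucr)


for cat in ("CAR PROWL", "DUI", "ILLEGAL DUMPING"):
    print(cat, "->", classify(cat))
```

Keeping the NA rows in the dictionary as `None` makes it possible to tell a category that was deliberately left unmapped from one that is simply missing from the table, if that distinction is needed later.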
As a last step in the transformation of the datasets, we need common date and time formats so that we can study criminal activity over time. We create the variable TimeStamp in both datasets. Finally, we drop the variables we will not use in this study. Below is a sample record from each dataset.
## A sf record contains:
## UCR: BURGLARY/BREAKING & ENTERING
## Category: BURGLARY
## Descript: SAFE BURGLARY
## PdDistrict: TENDERLOIN
## Location: (37.7838066631424, -122.409129633669)
## Type: Property
## TimeStamp: 2014-06-04 01:30:00
## A seattle record contains:
## UCR: ASSAULT
## Category: [INC - CASE DC USE ONLY]
## District.Sector: L
## Zone.Beat: L1
## Location: (47.726097796, -122.290899625)
## Type: Person
## TimeStamp: 2014-07-26 10:00:00
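A common TimeStamp can be built by parsing each city's raw date fields into the same datetime type. The sketch below assumes raw layouts that are not shown in this document: separate `MM/DD/YYYY` and 24-hour `HH:MM` fields for San Francisco, and a single `MM/DD/YYYY hh:mm:ss AM/PM` field for Seattle. Both function names are hypothetical.

```python
from datetime import datetime


def sf_timestamp(date_str, time_str):
    # Assumed raw layout: separate "MM/DD/YYYY" date and 24-hour "HH:MM" time fields.
    return datetime.strptime(f"{date_str} {time_str}", "%m/%d/%Y %H:%M")


def seattle_timestamp(stamp_str):
    # Assumed raw layout: a single "MM/DD/YYYY hh:mm:ss AM/PM" field.
    return datetime.strptime(stamp_str, "%m/%d/%Y %I:%M:%S %p")


sf = sf_timestamp("06/04/2014", "01:30")
sea = seattle_timestamp("07/26/2014 10:00:00 AM")

# Both values now share one type and print in the same format
# as the TimeStamp fields of the sample records.
print(sf)
print(sea)
```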
In order to make fair comparisons we need to take into account some demographic information, such as the population of each city. Population data has been taken from the Wikipedia pages dedicated to each city.
| City | Population |
|---|---|
| San Francisco | 852469 |
| Seattle | 662400 |
We can now compare values for each standardized offense category. As the table below shows, each category deserves individual study, and many intriguing differences remain to be explained (e.g. categories with considerable values in one city and no cases, shown as NA, in the other).
| Offense | SF | Seattle | Type |
|---|---|---|---|
| ARSON | 63 | NA | Property |
| ASSAULT | 2882 | 2023 | Person |
| BRIBERY | 1 | NA | Property |
| BURGLARY/BREAKING & ENTERING | 6 | 3212 | Property |
| COUNTERFEITING/FORGERY | 18 | 218 | Property |
| CURFEW/LOITERING/VAGRANCY | 3 | NA | Society |
| DESTRUCTION/DAMAGE/VANDALISM OF PROPERTY | 17 | 2365 | Property |
| DISORDERLY CONDUCT | 31 | 2830 | Society |
| DRIVING UNDER THE INFLUENCE | 100 | 34 | Society |
| DRUG/NARCOTIC | 1345 | 391 | Society |
| DRUNKENNESS | 147 | NA | Society |
| EMBEZZLEMENT | 10 | 57 | Property |
| EXTORTION/BLACKMAIL | 7 | NA | Property |
| FAMILY , NONVIOLENT | 10 | NA | Society |
| FRAUD | 242 | 1473 | Property |
| GAMBLING | 1 | NA | Society |
| HOMICIDE | NA | 8 | Person |
| KIDNAPPING/ABDUCTION | 117 | NA | Person |
| LARCENY/THEFT | 9466 | 8874 | Property |
| LIQUOR LAW VIOLATIONS | 42 | 48 | Society |
| MOTOR VEHICLE THEFT | 1966 | 3057 | Property |
| NOT A CRIME | 7446 | 1636 | Not a Crime |
| PORNOGRAPHY/OBSCENE MATERIAL | 1 | 3 | Society |
| PROSTITUTION | 112 | 202 | Society |
| ROBBERY | 308 | 736 | Property |
| STOLEN PROPERTY | 8 | 1136 | Property |
| TRESPASS OF REAL PROPERTY | 281 | 486 | Society |
| WEAPON LAW VIOLATIONS | 354 | 137 | Society |
Since we are interested in comparing crime in general and in discovering crime patterns related to external variables, we will from now on group offenses according to their types: Crimes against Persons, Crimes against Property and Crimes against Society.
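Grouping by type amounts to summing the per-offense counts within each type. The following sketch uses only a handful of rows copied from the comparison table above, so its totals are not the full-table totals. It assumes that NA counts as zero and that ‘Not a Crime’ rows are left out; the function name `totals_by_type` is hypothetical.

```python
from collections import defaultdict

# A few rows copied from the comparison table: (offense, SF, Seattle, type).
# None stands for NA. This is a subset, so the totals are partial.
ROWS = [
    ("ASSAULT", 2882, 2023, "Person"),
    ("HOMICIDE", None, 8, "Person"),
    ("KIDNAPPING/ABDUCTION", 117, None, "Person"),
    ("LARCENY/THEFT", 9466, 8874, "Property"),
    ("MOTOR VEHICLE THEFT", 1966, 3057, "Property"),
    ("DRUG/NARCOTIC", 1345, 391, "Society"),
    ("NOT A CRIME", 7446, 1636, "Not a Crime"),
]


def totals_by_type(rows):
    """Sum counts per type for each city, skipping 'Not a Crime' rows."""
    totals = defaultdict(lambda: {"SF": 0, "Seattle": 0})
    for _offense, sf, seattle, kind in rows:
        if kind == "Not a Crime":
            continue
        totals[kind]["SF"] += sf or 0            # assumption: NA counts as zero
        totals[kind]["Seattle"] += seattle or 0
    return dict(totals)


for kind, counts in totals_by_type(ROWS).items():
    print(kind, counts)
```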
Comparing the total number of cases of each type for both cities, we get:
Except for Crimes against Property, the situation seems similar for both cities. However, we should take their respective populations into account.
Adjusting by population gives crime rates, a more realistic basis for comparison. We present each figure as the number of offenses in a city divided by that city’s population, expressed per 100,000 inhabitants.
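As a worked example of this adjustment, the sketch below applies the rate formula to the ASSAULT counts and the population figures from the tables above. The function name `rate_per_100k` is hypothetical.

```python
# Population figures from the table above.
POPULATION = {"San Francisco": 852469, "Seattle": 662400}

# ASSAULT counts from the offense comparison table.
ASSAULT = {"San Francisco": 2882, "Seattle": 2023}


def rate_per_100k(count, population):
    """Number of offenses per 100,000 inhabitants."""
    return count / population * 100_000


for city, count in ASSAULT.items():
    rate = rate_per_100k(count, POPULATION[city])
    print(f"{city}: {rate:.1f} assaults per 100,000 inhabitants")
```

The same function applies unchanged to per-type totals or to daily counts.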
We can now easily see that the crime rate in Seattle is almost double that of San Francisco, mostly because of Seattle’s high rates of offenses against property and society.
A different approach is to see how the different types of offenses evolve over time in each city. This could help reveal patterns related to common characteristics such as day of the week, holidays, etc.
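One way to prepare such a time view is to count records per day and per type, indexing days from the start of the period. The sketch below assumes that day 1 is June 1, 2014. The first record is the San Francisco sample shown earlier; the other two are invented purely to illustrate the counting. The names `day_number` and `daily_counts` are hypothetical.

```python
from collections import Counter
from datetime import date, datetime

START = date(2014, 6, 1)  # assumption: day 1 of the study period


def day_number(ts):
    """1-based day index of a timestamp relative to START."""
    return (ts.date() - START).days + 1


def daily_counts(records):
    """Count records per (day number, offense type)."""
    return Counter((day_number(ts), kind) for ts, kind in records)


records = [
    (datetime(2014, 6, 4, 1, 30), "Property"),   # the SF sample record
    (datetime(2014, 6, 20, 22, 15), "Society"),  # invented, for illustration
    (datetime(2014, 6, 20, 23, 5), "Society"),   # invented, for illustration
]

counts = daily_counts(records)
print(counts[(20, "Society")])
print(date(2014, 6, 20).strftime("%A"))  # weekday of day number 20
```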
Again, it is easy to see from these plots that, except for Crimes against Persons, Seattle almost doubles San Francisco’s crime rates. The added value of this kind of plot is that we can see how crime rates relate to each other at a particular time.
For instance, we note two spikes on the same day, close to day number 20, in the graphs for offenses against society. Day 20 is Friday, June 20, 2014. It is not a national holiday. Are there any hidden common factors that make this day special?
We will expand on this idea and investigate possible relationships between crime variables and external ones. In our case, we will do so with temperature. Since we are analyzing data from the summer season, we will focus on the impact of high temperatures on crime rates.
The University of Dayton’s site contains files of daily average temperatures for 157 U.S. and 167 international cities. The files are updated on a regular basis and contain data from January 1, 1995 to the present. Source data for this site are from the National Climatic Data Center. The data is available for research and non-commercial purposes only.
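Reading such a file could look like the sketch below. It rests on assumptions not confirmed in this document: that each line holds four whitespace-separated fields (month, day, year, average temperature), that -99 marks a missing reading, and that the summer period of interest is June through August 2014. The sample lines are placeholders, not real readings, and `parse_daily_temps` is a hypothetical name.

```python
from datetime import date

MISSING = -99.0  # assumed missing-value marker


def parse_daily_temps(lines, year=2014, months=(6, 7, 8)):
    """Return {date: temperature} for the requested year and months."""
    temps = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        month, day, yr = (int(p) for p in parts[:3])
        temp = float(parts[3])
        if yr == year and month in months and temp != MISSING:
            temps[date(yr, month, day)] = temp
    return temps


# Placeholder lines in the assumed layout (values are not real readings).
sample = [
    " 5  31  2014  60.0",
    " 6   1  2014  61.5",
    " 6   2  2014  -99",
    " 6   3  2014  63.0",
]

print(parse_daily_temps(sample))
```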
Which city is hotter in the summertime? Below is a comparison of daily average temperatures in both cities.
Boxplots are well suited to comparing central tendency statistics such as the median, as well as the variability of the data points. We can easily see in the graph above that, contrary to what the latitudes would suggest, the summer of 2014 was hotter in Seattle than in San Francisco.
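The numbers a boxplot draws (minimum, quartiles, median, maximum) can also be computed directly. This sketch uses Python’s standard `statistics` module; the input values are made up for illustration and are not taken from the temperature files, and `box_summary` is a hypothetical name.

```python
import statistics


def box_summary(values):
    """Five-number summary of the kind displayed by a boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Made-up daily averages, for illustration only.
toy_temps = [60, 62, 63, 65, 66, 68, 70]
print(box_summary(toy_temps))
```

Note that plotting libraries may use slightly different quartile conventions, so the box edges in a plot need not match these values exactly.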
If we need details, we can resort to a line graph including daily averages for both cities.
We can now see in the graph above that June 20 was not a particularly hot day in either city; it was well below average in both.
We hope the preceding plots have shown clearly enough that Seattle is a ‘hotter’ city than San Francisco, in terms of both temperature and crime. They also show that no correlation between crime incidents and temperature is immediately evident (at least during the summer of 2014).
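This document relies on visual comparison only. If a number were wanted, one option would be the Pearson correlation coefficient between daily temperatures and daily offense counts. The sketch below shows the computation on made-up toy values; it is not a result obtained from these datasets, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Toy values only: temperatures and counts that rise together.
toy_temps = [60, 62, 64, 66]
toy_counts = [10, 12, 14, 16]
print(pearson(toy_temps, toy_counts))
```

A coefficient near zero would be consistent with the lack of an evident relationship, while values near 1 or -1 would indicate a strong linear one.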