The second project of this course aimed to reinforce a basic understanding of common statistical measures and analyses applied to geospatial problems. We applied these concepts to point data on crime occurrences in St. Louis, Missouri, recorded in 2013 and 2014. The analysis was done in the statistical software R with the accompanying RStudio environment.
We aimed to achieve three goals over the course of this analysis:
1) Discuss the overall findings (e.g. patterns, trends, clustering) of the crime data set
2) Describe the statistical distribution, variance, values, and outliers of the crime data set
3) Create visuals (e.g. plots and maps) to support the overall findings and descriptive statistics of the data set
Meeting these goals requires the use and practice of both descriptive (exploratory) and inferential (confirmatory) spatial analytic methods (Rogerson, P., 2001).
To perform our analysis we used R and RStudio to manipulate our data set, create graphics, compute descriptive statistics, infer statistically relevant patterns and trends, and discuss the overall findings. Several R packages supported this work, including dplyr for data manipulation and leaflet for interactive mapping.
The procedural steps for this project followed the nature of statistical thinking. The first two considerations concerned the data: its relevance and how to obtain it (Rogerson, P., 2001). Our initial data set was provided through the course website, so obtaining it required no further work. To make the data relevant to our questions, we used the dplyr package to group, filter, and summarize the crime data set into a more useful structure for analysis (i.e. grouping by crime type, temporal metrics, and crime count). Next, we established the basis for our assumptions. The data set was unfamiliar to the analysts, which helped keep our assumptions unbiased, and we took an objective, exploratory approach to build an understanding of its nature. We then laid out our arguments, which concerned crime type patterns over the given time period in the specified location. Lastly, we formulated questions about the attributes of the data set. The descriptive questions were: what are common crime count values, what is the range of crime count values, are there any outliers, and what does the distribution look like? Beyond these descriptive measures, we asked how the different types of crime in St. Louis compare and whether there are monthly or yearly temporal patterns (Rogerson, P., 2001).
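The grouping and summarizing step described above can be sketched as follows. This is a minimal illustration, not the report's actual code: the data frame `crimes` and its columns `crimetype` and `month` are hypothetical names, and the rows are made up.

```r
library(dplyr)

# Hypothetical incident-level data: one row per crime (values are invented).
crimes <- data.frame(
  crimetype = rep(c("arson", "dui", "homicide"), times = c(5, 4, 3)),
  month = c("2013-09", "2013-09", "2013-09", "2013-10", "2013-10",
            "2013-09", "2013-10", "2013-10", "2013-10",
            "2013-09", "2013-09", "2013-10")
)

# Count incidents per crime type and month.
monthly_counts <- crimes %>%
  filter(!is.na(crimetype)) %>%
  group_by(crimetype, month) %>%
  summarise(count = n(), .groups = "drop")

# Summarize the monthly counts for each crime type.
type_stats <- monthly_counts %>%
  group_by(crimetype) %>%
  summarise(
    mean = mean(count),
    median = median(count),
    sd = sd(count),
    min = min(count),
    Q1 = quantile(count, 0.25, names = FALSE),
    Q3 = quantile(count, 0.75, names = FALSE),
    max = max(count)
  )

print(monthly_counts)
print(type_stats)
```

Counting first and summarizing second keeps the two aggregation levels separate, so the same `monthly_counts` table can feed both the descriptive statistics and the temporal plots.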
This methodology allowed us to answer the above questions about crime in St. Louis, Missouri, in 2013 and 2014 from an unbiased vantage point.
We began with the core application of statistics: describing and understanding our crime data set (O’Sullivan, D. & Unwin, D., 2010). We wanted to understand common measures of central tendency (mean and median) and spread (range, standard deviation, and interquartile range).
We examined the distribution of the data at multiple scales, both global and local:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   143.0   164.5   186.0   185.7   207.0   228.0
From the above global summary and the corresponding Figure 1, it is fair to infer that the distribution of the data is approximately normal. The mean is approximately equal to the median, the range of values is small, and the quartiles are balanced between the minimum, median, and maximum values. The similarity of the mean and median indicates no significant skew in the overall distribution (O’Sullivan, D. & Unwin, D., 2010). We can also infer that the typical value (median or mean) is approximately 186 crimes, the range in crime count is 85 (228 minus 143), and there is no obvious evidence of outliers at the global scale.
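The global summary can be checked with a few lines of base R. This sketch assumes the summary was computed over the three per-type totals given in the findings (228 arson, 186 DUI, 143 homicide). That is an inference rather than something the report states, although those inputs do reproduce the printed values.

```r
# Assumed input: total incidents per crime type, as reported in the findings.
totals <- c(arson = 228, dui = 186, homicide = 143)

# Six-number summary: minimum, quartiles, median, mean, maximum.
global_summary <- summary(totals)
print(global_summary)

# Range of the totals: 228 - 143 = 85.
count_range <- diff(range(totals))

# A mean close to the median suggests little skew.
mean_minus_median <- mean(totals) - median(totals)

print(count_range)
print(mean_minus_median)
```

With only three values behind the summary, the shape of the distribution is weakly determined, so the normality reading is best treated as tentative.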
Below, in Figure 2, we examine the data at a finer scale: the distributions of monthly counts for each crime type. The summary statistics and corresponding figure show more variation within each crime type. The median monthly counts are 20, 17, and 11 for arson, DUI, and homicide respectively. The range of values is fairly small except for arson, which has a clear outlier at the maximum end, 45 incidents, in the month of August. Overall, the distributions of monthly counts per crime type are fairly normal, with a slight skew for arson due to this extreme outlier.
## # A tibble: 3 × 8
##   crimetype  mean median    sd   min    Q1    Q3   max
##   <chr>     <dbl>  <int> <dbl> <int> <dbl> <dbl> <int>
## 1 arson      20.7     20  9.96     9  15.5  21.5    45
## 2 dui        16.9     17  6.27     8  13    20.5    27
## 3 homicide   13       11  6.32     7   8    16      28
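One way to check the outlier reading against the table is the 1.5 × IQR fence rule. The report does not say which rule was used, so the rule itself is an assumption here. The sketch below takes the quartiles, minimum, and maximum directly from the table above.

```r
# Values copied from the per-type summary table.
stats <- data.frame(
  crimetype = c("arson", "dui", "homicide"),
  min = c(9, 8, 7),
  Q1  = c(15.5, 13, 8),
  Q3  = c(21.5, 20.5, 16),
  max = c(45, 27, 28)
)

# Fences sit 1.5 interquartile ranges beyond the quartiles.
iqr <- stats$Q3 - stats$Q1
stats$lower_fence <- stats$Q1 - 1.5 * iqr
stats$upper_fence <- stats$Q3 + 1.5 * iqr

# Flag a type when its extreme value falls strictly outside a fence.
stats$high_outlier <- stats$max > stats$upper_fence
stats$low_outlier  <- stats$min < stats$lower_fence

print(stats)
```

Under this rule only arson is flagged: its maximum of 45 exceeds the upper fence of 30.5. The homicide maximum of 28 sits exactly on its fence and is not flagged.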
The overall findings of the data set revealed some notable patterns
and attributes from both a spatial and temporal vantage point. First,
the data set documented three crime types between 2013 and 2014 in
St. Louis, Missouri. Arson was the most frequent crime with 228
documented instances, followed by DUI with 186 and homicide with 143
(Figure 3).
After examining the global count per crime type over the entire
time period, we took a more in-depth look at the temporal variability
across crime types to detect any patterns or trends. Figure 4, below,
depicts the total crime count for each month between 2013 and 2014. The
variability in total crime incidents across months is minimal, with
the exception of a sharp spike in August. This spike is largely due to
the increased occurrence of arson incidents noted earlier. Upon further
research, this spike, specifically in 2014, could be related to the
Ferguson unrest, a series of protests and riots in response to a fatal
shooting by a police officer (The New York Times, 2015). Spatial
analysis is still necessary to support this hypothesis.
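To see which crime type drives a monthly spike, a month by crime type cross-tabulation is a simple check. The sketch below uses base R on a small made-up data frame. The column names `date` and `crimetype` are hypothetical, and the rows are not the St. Louis data.

```r
# Hypothetical incidents (dates and types are invented for illustration).
incidents <- data.frame(
  date = as.Date(c("2014-06-03", "2014-06-17", "2014-07-08", "2014-07-21",
                   "2014-08-02", "2014-08-11", "2014-08-12", "2014-08-19",
                   "2014-08-25")),
  crimetype = c("dui", "homicide", "arson", "dui",
                "arson", "arson", "arson", "dui", "homicide")
)

# Label each incident with its year and month.
incidents$month <- format(incidents$date, "%Y-%m")

# Rows are months, columns are crime types, cells are incident counts.
by_month_type <- xtabs(~ month + crimetype, data = incidents)

# Total incidents per month and the month with the highest total.
monthly_total <- rowSums(by_month_type)
peak_month <- names(which.max(monthly_total))

# Crime type contributing the most incidents in the peak month.
peak_driver <- names(which.max(by_month_type[peak_month, ]))

print(by_month_type)
print(peak_month)
print(peak_driver)
```

Keeping the year in the month label separates, for example, August 2013 from August 2014, which matters when a spike is attributed to an event in a specific year.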
Figure 5, below, examines the crime count for each month of both years
in the data set. Arson has its peak incident count in September 2013,
followed by a rapid decline through the end of the year. Homicides and
DUIs maintain a seemingly normal fluctuating pattern throughout 2013,
with a notable decline in DUIs as the year ends. The 2014 monthly
counts tell a different story. DUIs show a sharp rise at the beginning
of the year, peaking in April, followed by another rapid decline. Arson
incidents rise steeply from the start of 2014 until March and then hold
relatively steady until the data ends in August 2014. Homicides
continue to fluctuate throughout. Given this year-by-year examination
of monthly counts, our hypothesis of increasing arson incidents around
the time of the Ferguson unrest may not align with the data.
Lastly, we examined any underlying spatial patterns and trends
within the crime data set for St. Louis, Missouri. To do so, we used a
crime data set aggregated by neighborhood in the hope of discerning
more representative spatial patterns. Additionally, we included a
St. Louis city boundary shapefile to contain our crime incidents and
support a more in-depth understanding of the spatial distribution of
our point features. Figure 6 shows a grid of four crime incident maps
bounded by the St. Louis city boundary. From upper left to lower right,
they show total crimes, arson incidents, DUI incidents, and homicide
incidents aggregated by neighborhood. Arson incidents appear to cluster
strongly in the northwest and south regions of St. Louis. The northwest
region corresponds to the side of the city nearest the Ferguson area.
DUIs are fairly evenly dispersed throughout the city, and homicides
form a linear cluster in the northern region along an east-west band.
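Aggregating point incidents to neighborhoods amounts to a point-in-polygon count. The sketch below uses the sf package, which is an assumption since the report does not name its spatial package. The two square "neighborhoods" and the incident points are toy geometries in arbitrary planar coordinates.

```r
library(sf)

# Helper that builds a unit square polygon with its lower-left corner at (x0, y0).
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Two hypothetical neighborhoods side by side.
neighborhoods <- st_sf(
  name = c("A", "B"),
  geometry = st_sfc(square(0, 0), square(1, 0))
)

# Made-up incident points with a crime type attribute.
incidents <- st_as_sf(
  data.frame(
    crimetype = c("arson", "arson", "dui", "homicide"),
    x = c(0.2, 0.7, 1.5, 1.8),
    y = c(0.5, 0.3, 0.5, 0.9)
  ),
  coords = c("x", "y")
)

# st_intersects() lists, for each polygon, the points it contains;
# lengths() turns that list into a count per polygon.
neighborhoods$total <- lengths(st_intersects(neighborhoods, incidents))

# The same count restricted to one crime type.
arson_pts <- incidents[incidents$crimetype == "arson", ]
neighborhoods$arson <- lengths(st_intersects(neighborhoods, arson_pts))

print(st_drop_geometry(neighborhoods))
```

With real data the polygons would come from a neighborhood boundary file read with `st_read()`, and both layers would need a shared coordinate reference system before counting.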
Our last spatial analysis culminated in a synchronized, side-by-side map view of the 2013 and 2014 crime data. We felt this was an important addition that would let users more easily explore the aggregated crime data across time. To achieve this, we grouped the crime type subsets by year and used leaflet maps with distinctly colored circle symbols to distinguish the different crimes. Figure 7, below, is the result. In this figure red represents arson, blue DUI, and green homicide. The visualization shows that homicide and arson follow similar spatial patterns in both years, while DUIs show a clear increase in the southern region of the city in 2014 compared to 2013. Unfortunately, due to formatting issues, the synchronized map would not work in the final report. The screenshot below depicts its capabilities, with a link in the caption to the HTML R Markdown report.
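A per-year leaflet map with the color scheme described above could be built roughly as follows. This is a sketch under assumptions: the data frame, its column names, and the coordinates are invented. The leafsync package is also an assumption, shown as one possible way to link two maps because the report does not say how the synchronization was done.

```r
library(leaflet)

# Hypothetical incident points (coordinates are made up).
crimes <- data.frame(
  crimetype = c("arson", "dui", "homicide", "arson", "dui", "homicide"),
  year = c(2013, 2013, 2013, 2014, 2014, 2014),
  lon = c(-90.25, -90.20, -90.23, -90.27, -90.24, -90.21),
  lat = c(38.66, 38.63, 38.67, 38.65, 38.58, 38.68)
)

# Colors follow the report: red = arson, blue = dui, green = homicide.
pal <- colorFactor(
  palette = c("red", "blue", "green"),
  domain = c("arson", "dui", "homicide")
)

# Build one map for a single year of incidents.
make_map <- function(yr) {
  leaflet(crimes[crimes$year == yr, ]) %>%
    addTiles() %>%
    addCircleMarkers(lng = ~lon, lat = ~lat, color = ~pal(crimetype),
                     radius = 5, stroke = FALSE, fillOpacity = 0.8) %>%
    addLegend(pal = pal, values = ~crimetype, title = as.character(yr))
}

map_2013 <- make_map(2013)
map_2014 <- make_map(2014)

# Optional: link panning and zooming across both maps if leafsync is installed.
if (requireNamespace("leafsync", quietly = TRUE)) {
  synced <- leafsync::sync(map_2013, map_2014)
}
```

Building both maps from one function keeps the symbology identical across years, so differences between the panels reflect the data rather than the styling.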
This project highlights the importance of spatial and temporal aggregation in geospatial analysis. Different aggregation approaches (grouping crime incidents by month, year, or crime type) revealed distinct patterns and trends in the data set. While overall crime counts appeared relatively stable, temporal analysis identified notable spikes in certain months, particularly driven by arson incidents. Spatial visualizations further revealed clustering patterns across neighborhoods and differences between crime types.
These results demonstrate how spatial analytical techniques coupled with visualization tools can provide insight into crime distributions and trends. More advanced spatial statistical methods, such as kernel density estimation, spatial autocorrelation measures, or hot spot detection, could further improve the understanding of clustering patterns within the city.
O’Sullivan, D. & Unwin, D. (2010). Appendix A. Geographic Information Analysis (Appendix A). John Wiley & Sons.
Rogerson, P. (2001). Introduction to Statistical Analysis in Geography. Statistical Methods for Geography (Chapter 1). London: Sage Publications.
The New York Times (August 10th, 2015). What Happened in Ferguson? Retrieved March 7th, 2026, from https://www.nytimes.com/interactive/2014/08/13/us/ferguson-missouri-town-under-siege-after-police-shooting.html/