Paper 2 - “Crime Prediction & Monitoring Framework Based on Spatial Analysis”
PAPER 2 - Proposal
- Predict crimes using data mining techniques and theories from criminology such as Rational Choice Theory and Routine Activity Theory.
PAPER 2 - Data
Dataset obtained from UK police department website (data from 2015-17)
- 5 relevant attributes for crime prediction:
crime type, location, date, latitude, and longitude
The data was preprocessed by removing inconsistent records and transforming the remaining data for prediction
PAPER 2 - Data Visualization (1)
PAPER 2 - Data Visualization (2)
PAPER 2 - Data Visualization (3)
PAPER 2 - kNN (1)
k-NN classifies an item by its similarity to known items. It looks at the training instances closest to the one you are trying to classify and assigns it to the class that is most common among them. This can be used to predict where crimes are likely to happen based on where they have happened before. It takes location and date into account and computes the distance between different areas to group them together.
PAPER 2 - kNN (2)
First step, compute the distances between a test instance and all training instances
For this, take the latitude, longitude, and the number of days as the coordinates and compute the distance factor as
\(d_i = \sqrt{(x_i - \bar{x})^2 + (y_i - \bar{y})^2}\) for location alone, or \(d_i = \sqrt{(x_i - \bar{x})^2 + (y_i - \bar{y})^2 + (z_i)^2}\) when the number of days \(z_i\) is included
To avoid the squaring and square root, the Manhattan distance was computed
\[ d_i = |x_i - \bar{x}| + |y_i - \bar{y}| + |z_i|\]
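The two distance measures above can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: it assumes each instance is a (latitude, longitude, days) tuple, and the function names and sample points are made up for the example.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Manhattan distance: sum of absolute coordinate differences,
    which avoids the squaring and the square root."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical points (not real coordinates), chosen so the results are easy to check.
test_instance = (0.0, 0.0, 0.0)
training_instance = (3.0, 4.0, 0.0)

d_euclidean = euclidean(test_instance, training_instance)
d_manhattan = manhattan(test_instance, training_instance)
```

Both functions rank neighbours by closeness, but they can order them differently, so the choice of distance may change which instances end up among the k nearest.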
PAPER 2 - kNN (3)
The next step is to identify the k nearest neighbors of the test instance based on their distances. The value of k is a pre-defined parameter of the k-NN algorithm, and it determines the number of nearest neighbors considered for classification.
| latitude | longitude |
| --- | --- |
| 53.74933 | -2.000177 |
| 52.85565 | -2.799866 |
| 53.98675 | -2.548269 |
| 53.97359 | -2.828916 |
| 53.72188 | -2.161456 |
| 53.84521 | -2.546358 |
| 52.12237 | -2.252581 |
| 52.25856 | -2.534608 |
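Selecting the k nearest neighbors can be sketched as below. The test point is the first row of the table; the training rows, their labels, and the helper name `k_nearest` are hypothetical and only serve to make the example runnable. Manhattan distance is assumed.

```python
def k_nearest(test_point, training, k):
    """Return the k training rows closest to test_point (Manhattan distance).

    Each training row is a (coordinates, label) pair.
    """
    def distance(row):
        coords, _label = row
        return sum(abs(a - b) for a, b in zip(coords, test_point))

    # Sort all training rows by distance and keep the first k.
    return sorted(training, key=distance)[:k]

# Made-up training rows for illustration only.
training = [
    ((53.70, -2.10), "Drugs"),
    ((53.80, -2.00), "Drugs"),
    ((52.90, -2.80), "Shoplifting"),
    ((53.75, -2.01), "Shoplifting"),
]

neighbours = k_nearest((53.74933, -2.000177), training, k=3)
neighbour_labels = [label for _coords, label in neighbours]
```

Sorting the whole training set is the simplest approach; for large datasets a spatial index would normally replace it.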
PAPER 2 - kNN (4)
The class labels (i.e., types of crime) of the k nearest neighbors are used to determine the class membership of the test instance. The class membership is determined by a majority vote of the k nearest neighbors: the class label assigned to the test instance is the one that occurs most frequently among them.
| latitude | longitude | prediction | probability | textAddress |
| --- | --- | --- | --- | --- |
| 53.74933 | -2.000177 | Drugs | 0.67 | Walker Ln, Hebden Bridge HX7, UK |
| 52.85565 | -2.799866 | Shoplifting | 0.40 | B4397, Shrewsbury SY4 5ST, UK |
| 53.98675 | -2.548269 | Shoplifting | 0.40 | Whitendale Road, Clitheroe BB7 3BL, UK |
| 53.97359 | -2.828916 | Shoplifting | 0.40 | Hillam Ln, Lancaster LA2 0DX, UK |
| 53.72188 | -2.161456 | Shoplifting | 0.67 | Dark Red, Todmorden OL14 7ER, UK |
| 53.84521 | -2.546358 | Shoplifting | 0.40 | NA |
| 52.12237 | -2.252581 | Criminal damage and arson | 0.67 | 115 B4424, Callow End, Worcester WR2 4TH, UK |
| 52.25856 | -2.534608 | Shoplifting | 0.40 | Bromyard Rd, Tenbury Wells WR15, UK |
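The majority vote can be sketched as follows. The slides do not define the probability column; one reading consistent with values such as 0.67 and 0.40 is the winning label's share of the k votes (2 of 3, 2 of 5), and the sketch assumes that reading. The function name and the label lists are illustrative.

```python
from collections import Counter

def majority_vote(labels):
    """Return (winning_label, vote_share) for a list of neighbour labels.

    Ties are broken by first occurrence, which is how Counter.most_common
    orders labels with equal counts.
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical neighbour labels for k = 3 and k = 5.
label3, share3 = majority_vote(["Drugs", "Shoplifting", "Drugs"])
label5, share5 = majority_vote(
    ["Shoplifting", "Burglary", "Shoplifting", "Drugs", "Robbery"]
)
```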
PAPER 2 - Naïve Bayes (1)
Naïve Bayes is based on Bayes' theorem, which describes the probability of an event based on prior knowledge of conditions that might be related to the event.
\[\operatorname{Pr}(h|x) = \frac{\operatorname{Pr}(x|h) \operatorname{Pr}(h)}{\operatorname{Pr}(x)}\]
The Naïve Bayes classifier classifies a new instance x by assigning it the most probable target value, i.e. the class with the highest posterior probability:
\[\arg\max_{d_i \in d} \operatorname{Pr}(d_i) \prod_{k=1}^n \operatorname{Pr}(x_k|d_i)\]
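The step from Bayes' theorem to this rule can be spelled out as a short derivation. It assumes the attributes \(x_1, \dots, x_n\) of the instance are conditionally independent given the class, which is the "naïve" assumption:

\[\operatorname{Pr}(d_i|x) = \frac{\operatorname{Pr}(x|d_i) \operatorname{Pr}(d_i)}{\operatorname{Pr}(x)} \propto \operatorname{Pr}(d_i) \prod_{k=1}^n \operatorname{Pr}(x_k|d_i)\]

The denominator \(\operatorname{Pr}(x)\) is the same for every class \(d_i\), so dropping it does not change which class attains the maximum.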
PAPER 2 - Naïve Bayes (2)
- Naïve Bayes can classify crime types by calculating the conditional probabilities of each attribute given each class.
- The model is trained using a crime dataset, and once trained, can predict the class of a new instance based on its attributes.
- The classifier calculates the posterior probability of each class given the attribute values and assigns the class with the highest probability as the predicted class.
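The three points above can be sketched as a small categorical Naïve Bayes in Python. This is an illustrative sketch rather than the paper's implementation: the attribute names (`area`, `month`), the toy rows, and the function names are hypothetical, and no smoothing is applied, so an attribute value never seen with a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows):
    """Count classes and attribute values per class.

    rows is a list of (attributes_dict, label) pairs.
    """
    class_counts = Counter(label for _attrs, label in rows)
    cond_counts = defaultdict(Counter)  # (label, attribute) -> value counts
    for attrs, label in rows:
        for name, value in attrs.items():
            cond_counts[(label, name)][value] += 1
    return class_counts, cond_counts

def posterior(attrs, class_counts, cond_counts):
    """Normalised posterior probability of each class given attrs."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior Pr(class)
        for name, value in attrs.items():
            score *= cond_counts[(label, name)][value] / n  # Pr(value | class)
        scores[label] = score
    norm = sum(scores.values())
    return {label: (s / norm if norm else 0.0) for label, s in scores.items()}

# Toy training data, invented for the example.
rows = [
    ({"area": "A", "month": "Jan"}, "Drugs"),
    ({"area": "A", "month": "Feb"}, "Drugs"),
    ({"area": "A", "month": "Jan"}, "Drugs"),
    ({"area": "B", "month": "Jan"}, "Shoplifting"),
    ({"area": "A", "month": "Jan"}, "Shoplifting"),
]

class_counts, cond_counts = train_naive_bayes(rows)
probs = posterior({"area": "A", "month": "Jan"}, class_counts, cond_counts)
predicted = max(probs, key=probs.get)
```

For the query above, the unnormalised scores are 3/5 × 1 × 2/3 = 0.4 for Drugs and 2/5 × 1/2 × 1 = 0.2 for Shoplifting, so Drugs is predicted with posterior 2/3.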
PAPER 2 - Naïve Bayes (3)
| crime | probability |
| --- | --- |
| Anti-social behaviour | 0.3750000 |
| Criminal damage and arson | 0.1607143 |
| Violence and sexual offences | 0.1071429 |
| Burglary | 0.1071429 |
| Shoplifting | 0.0892857 |
| Other theft | 0.0712857 |
| Drugs | 0.0357143 |
| Public order | 0.0357143 |
| Vehicle crime | 0.0178571 |
| Bicycle theft | 0.0000000 |
| Other crime | 0.0000000 |
| Robbery | 0.0000000 |
| Possession of weapons | 0.0000000 |
| Theft from the person | 0.0000000 |