This paper examines whether different kinds of crime tend to be reported together across the jurisdictions of a country in a given year, by applying association rules to crime report data. The data was gathered from the FBI website, from the Uniform Crime Reporting statistics. The website holds a record of all violent and property crimes from 1985 to 2014. You can find the website and more detailed information on the dataset here: links. Here, however, I have analysed only the data for 1985. After the data was downloaded, it had to be processed so that association rules could be applied to it. The data looked something like this:
LocalCrimeOneYearofData <- read.csv("C:/Users/PC-CATHERINE/Downloads/LocalCrimeOneYearofData.csv", stringsAsFactors=FALSE)
LocalCrimeOneYearofData[1:5,]
## State Months Population Violent.crime.total
## 1 TX 12 111317 355
## 2 ID 12 NA 111
## 3 SC 12 NA 612
## 4 OH 12 226704 1873
## 5 CA 12 NA 427
## Murder.and.nonnegligent.Manslaughter Legacy.rape..1 Revised.rape..2 Robbery
## 1 8 36 NA 96
## 2 3 8 NA 10
## 3 10 32 NA 49
## 4 17 158 NA 511
## 5 3 27 NA 166
## Aggravated.assault Property.crime.total Burglary Larceny.theft
## 1 215 6156 1623 4116
## 2 90 1797 533 1190
## 3 521 2218 946 1117
## 4 1187 13261 3197 9126
## 5 231 3964 1483 2128
## Motor.vehicle.theft
## 1 417
## 2 74
## 3 155
## 4 938
## 5 353
Looking at the data, we see that each cell gives the total number of times a particular kind of crime was committed in a given jurisdiction/agency in 1985. But since this paper does not explore each jurisdiction separately and instead takes a composite look at the crime scenario of the country as a whole, we will not take into account how many times each crime was committed in a given jurisdiction. Since association rules are commonly used to find associations between goods bought in a supermarket, let’s draw an analogy between our data and supermarket transaction data to clarify what we are trying to achieve here. We treat each jurisdiction as a transaction ID and each kind of crime as an item purchased in the supermarket. Usually, in simple market basket analysis, we don’t take into account the quantity of each item purchased in a particular transaction. So, we are going to do the same with our crime data. We don’t care how many times a particular crime was committed in a given jurisdiction; we only care whether it happened. So, we will transform the data in the following way:
install.packages("arulesViz",repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/PC-CATHERINE/Documents/R/win-library/3.6'
## (as 'lib' is unspecified)
## package 'arulesViz' successfully unpacked and MD5 sums checked
##
## The downloaded binary packages are in
## C:\Users\PC-CATHERINE\AppData\Local\Temp\RtmpiMevA8\downloaded_packages
install.packages("arules",repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/PC-CATHERINE/Documents/R/win-library/3.6'
## (as 'lib' is unspecified)
## package 'arules' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'arules'
## Warning in file.copy(savedcopy, lib, recursive = TRUE):
## problem copying C:\Users\PC-CATHERINE\Documents\R\win-
## library\3.6\00LOCK\arules\libs\x64\arules.dll to C:\Users\PC-
## CATHERINE\Documents\R\win-library\3.6\arules\libs\x64\arules.dll: Permission
## denied
## Warning: restored 'arules'
##
## The downloaded binary packages are in
## C:\Users\PC-CATHERINE\AppData\Local\Temp\RtmpiMevA8\downloaded_packages
install.packages("dplyr",repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/PC-CATHERINE/Documents/R/win-library/3.6'
## (as 'lib' is unspecified)
##
## There is a binary version available but the source version is later:
## binary source needs_compilation
## dplyr 0.8.3 0.8.4 TRUE
##
## Binaries will be installed
## package 'dplyr' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'dplyr'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying C:
## \Users\PC-CATHERINE\Documents\R\win-library\3.6\00LOCK\dplyr\libs\x64\dplyr.dll
## to C:\Users\PC-CATHERINE\Documents\R\win-library\3.6\dplyr\libs\x64\dplyr.dll:
## Permission denied
## Warning: restored 'dplyr'
##
## The downloaded binary packages are in
## C:\Users\PC-CATHERINE\AppData\Local\Temp\RtmpiMevA8\downloaded_packages
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(arules)
## Warning: package 'arules' was built under R version 3.6.2
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following object is masked from 'package:dplyr':
##
## recode
## The following objects are masked from 'package:base':
##
## abbreviate, write
library(arulesViz)
## Warning: package 'arulesViz' was built under R version 3.6.2
## Loading required package: grid
## Registered S3 method overwritten by 'seriation':
## method from
## reorder.hclust gclus
#Removing the months column as we know the data is for the entire year
LocalCrimeOneYearofData<-LocalCrimeOneYearofData[,-c(2)]
#Transforming the columns
LocalCrimeOneYearofData$Murder.and.nonnegligent.Manslaughter<-ifelse(is.na(LocalCrimeOneYearofData$Murder.and.nonnegligent.Manslaughter) | LocalCrimeOneYearofData$Murder.and.nonnegligent.Manslaughter==0, "", "Murder.and.nonnegligent.Manslaughter")
LocalCrimeOneYearofData$Legacy.rape..1<-ifelse(is.na(LocalCrimeOneYearofData$Legacy.rape..1) | LocalCrimeOneYearofData$Legacy.rape..1==0, "", "Legacy.rape_1")
LocalCrimeOneYearofData$Revised.rape..2<-ifelse(is.na(LocalCrimeOneYearofData$Revised.rape..2) | LocalCrimeOneYearofData$Revised.rape..2==0, "", "Revised.rape_2")
LocalCrimeOneYearofData$Robbery<-ifelse(is.na(LocalCrimeOneYearofData$Robbery) | LocalCrimeOneYearofData$Robbery==0, "", "Robbery")
LocalCrimeOneYearofData$Aggravated.assault<-ifelse(is.na(LocalCrimeOneYearofData$Aggravated.assault) | LocalCrimeOneYearofData$Aggravated.assault==0, "", "Aggravated.assault")
LocalCrimeOneYearofData$Burglary<-ifelse(is.na(LocalCrimeOneYearofData$Burglary) | LocalCrimeOneYearofData$Burglary==0, "", "Burglary")
LocalCrimeOneYearofData$Larceny.theft<-ifelse(is.na(LocalCrimeOneYearofData$Larceny.theft) | LocalCrimeOneYearofData$Larceny.theft==0, "", "Larceny.theft")
LocalCrimeOneYearofData$Motor.vehicle.theft<-ifelse(is.na(LocalCrimeOneYearofData$Motor.vehicle.theft) | LocalCrimeOneYearofData$Motor.vehicle.theft==0, "", "Motor.vehicle.theft")
#Removing the population and the count of total number of violent and property crimes committed
LocalCrimeOneYearofData<-LocalCrimeOneYearofData[,-c(2,3,9)]
#Saving the state column and then removing it
State<-c(LocalCrimeOneYearofData$State)
LocalCrimeOneYearofData<-LocalCrimeOneYearofData[,-c(1)]
# Writing the current dataframe as a table without columns and then reading it as transactions data
write.table(LocalCrimeOneYearofData, file = "crime.csv", col.names = FALSE, row.names = FALSE, sep = ",")
trans<- read.transactions("crime.csv", sep =",", format = "basket", rm.duplicates = TRUE)
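As a minimal sketch of the same presence/absence idea, the toy data frame below (hypothetical counts, not taken from the FBI file) is turned into a logical matrix with one row per jurisdiction and one column per crime type. The helper name `to_presence` is introduced here for illustration only.

```r
# Toy counts for three hypothetical jurisdictions (illustrative values only)
counts <- data.frame(
  Robbery  = c(96, 0, NA),
  Burglary = c(1623, 533, 946)
)

# TRUE when the crime was reported at least once; NA and 0 both count as absent
to_presence <- function(x) !is.na(x) & x > 0

# Apply the rule column by column and collect the result as a logical matrix
presence <- as.matrix(as.data.frame(lapply(counts, to_presence)))
print(presence)
```

A logical matrix of this shape can, as far as I know, be coerced with `as(presence, "transactions")` in arules, which would avoid the round trip through a CSV file.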
After we have done that, let’s take a look at what we have:
items
[1] {Aggravated.assault,
Burglary,
Larceny.theft,
Legacy.rape_1,
Motor.vehicle.theft,
Murder.and.nonnegligent.Manslaughter, Robbery}
[2] {Aggravated.assault,
Burglary,
Larceny.theft,
Legacy.rape_1,
Motor.vehicle.theft,
Murder.and.nonnegligent.Manslaughter, Robbery}
[3] {Aggravated.assault,
Burglary,
Larceny.theft,
Legacy.rape_1,
Motor.vehicle.theft,
Murder.and.nonnegligent.Manslaughter, Robbery}
[4] {Aggravated.assault,
Burglary,
Larceny.theft,
Legacy.rape_1,
Motor.vehicle.theft,
Murder.and.nonnegligent.Manslaughter, Robbery}
[5] {Aggravated.assault,
Burglary,
Larceny.theft,
Legacy.rape_1,
Motor.vehicle.theft,
Murder.and.nonnegligent.Manslaughter, Robbery}
Now that we have the data in transaction form, we can easily apply association rules to it.
#Let's start by creating the frequency table
freq_items<-eclat(trans, parameter=list(supp=0.50, maxlen=8))
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.5 1 8 frequent itemsets FALSE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 227
##
## create itemset ...
## set transactions ...[7 item(s), 455 transaction(s)] done [0.00s].
## sorting and recoding items ... [7 item(s)] done [0.00s].
## creating bit matrix ... [7 row(s), 455 column(s)] done [0.00s].
## writing ... [127 set(s)] done [0.00s].
## Creating S4 object ... done [0.00s].
inspect(freq_items[1:5,])
## items support count
## [1] {Aggravated.assault,
## Burglary,
## Larceny.theft,
## Legacy.rape_1,
## Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} 0.9098901 414
## [2] {Aggravated.assault,
## Burglary,
## Legacy.rape_1,
## Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} 0.9098901 414
## [3] {Aggravated.assault,
## Larceny.theft,
## Legacy.rape_1,
## Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} 0.9098901 414
## [4] {Burglary,
## Larceny.theft,
## Legacy.rape_1,
## Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} 0.9098901 414
## [5] {Burglary,
## Legacy.rape_1,
## Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} 0.9098901 414
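The absolute minimum support count is 227: with 455 jurisdictions and a minimum support of 0.50, an itemset has to appear in about half of the jurisdictions to count as frequent (0.50 × 455 = 227.5, which eclat reports as an absolute count of 227). All five itemsets shown above clear this threshold easily, since each of them appears in 414 jurisdictions (about 91%).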
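To make the notion of support concrete, here is a small base R sketch on a made-up presence matrix (four hypothetical jurisdictions, three crime types). The function name `itemset_support` is an assumption of this sketch and not part of arules.

```r
# Toy presence matrix: 4 hypothetical jurisdictions x 3 crime types
presence <- matrix(
  c(TRUE, TRUE,  TRUE,
    TRUE, TRUE,  FALSE,
    TRUE, FALSE, FALSE,
    TRUE, TRUE,  TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Burglary", "Robbery", "Murder"))
)

# Support of an itemset = share of rows in which every listed item is present
itemset_support <- function(m, items) {
  mean(rowSums(m[, items, drop = FALSE]) == length(items))
}

min_support <- 0.5
supp_br  <- itemset_support(presence, c("Burglary", "Robbery"))            # 3 of 4 rows
supp_all <- itemset_support(presence, c("Burglary", "Robbery", "Murder"))  # 2 of 4 rows

# Compare each support value with the chosen threshold
print(c(supp_br, supp_all) >= min_support)
```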
freq_rules<-ruleInduction(freq_items, trans, confidence=0.95)
freq_rules<-sort(freq_rules, by="lift", decreasing=TRUE)
inspect(freq_rules[1:5,])
## lhs rhs support confidence lift itemset
## [1] {Aggravated.assault,
## Burglary,
## Larceny.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 16
## [2] {Aggravated.assault,
## Burglary,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 17
## [3] {Aggravated.assault,
## Larceny.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 18
## [4] {Burglary,
## Larceny.theft,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 19
## [5] {Burglary,
## Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 20
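For reference, the support, confidence and lift columns in the listing above follow the standard definitions of these measures. Writing T for the set of transactions (here, jurisdictions) and X, Y for itemsets (these symbols are notation introduced for this note):

```latex
\[
\operatorname{supp}(X) = \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|}, \qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}, \qquad
\operatorname{lift}(X \Rightarrow Y) = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
\]
```

The support shown for a rule is supp(X ∪ Y). A lift of exactly 1 means that X and Y occur together as often as would be expected if they were independent.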
Violent crimes consist of murder and manslaughter, legacy rape, revised rape, robbery and aggravated assault, whereas property crimes consist of burglary, larceny theft and motor vehicle theft. Looking at the results, the rules with the highest lift all have legacy rape on the rhs, with combinations of aggravated assault, burglary, larceny theft, murder and robbery on the lhs. The support (about 0.91) and confidence (about 0.97) are high, which tells us that these crimes are reported together in most jurisdictions. Note, however, that the lift is only about 1.005, barely above 1, so the lhs adds very little beyond the fact that legacy rape is reported in most jurisdictions anyway. But what does this mean given our data? When I say these crimes are more likely to happen together, I don’t mean that they will happen at the same time or will be reported in one crime incident report. This is because, even though we drew the analogy of the jurisdiction being a transaction ID, it is not one. A lift value above 1 therefore has a more aggregate interpretation: it implies that if a jurisdiction has the crimes on the lhs reported, then it is more likely (or probable) for that jurisdiction to also have the crime on the rhs reported. So, we are not talking about each reported incident, but about a jurisdiction as a whole. Let’s look deeper and see what the situation is for each of these crimes when it is on the rhs:
rules_Legacy.rape_1<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Legacy.rape_1"), control=list(verbose=F))
rules_rape_byconf<-sort(rules_Legacy.rape_1, by="confidence", decreasing=TRUE)
inspect(rules_rape_byconf[1:5])
## lhs rhs support confidence lift count
## [1] {Murder.and.nonnegligent.Manslaughter} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 416
## [2] {Murder.and.nonnegligent.Manslaughter,
## Robbery} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 416
## [3] {Aggravated.assault,
## Murder.and.nonnegligent.Manslaughter} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 416
## [4] {Burglary,
## Murder.and.nonnegligent.Manslaughter} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 416
## [5] {Larceny.theft,
## Murder.and.nonnegligent.Manslaughter} => {Legacy.rape_1} 0.9142857 0.9719626 1.005098 416
Looking at the confidence, support and lift values, we can say that the first five rules state that the presence of crimes like murder, robbery, assault, burglary and larceny theft in a jurisdiction implies the presence of legacy rape in that jurisdiction as well, in the USA in 1985. Note that murder appears on the lhs of every one of these rules.
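The figures for the first rule, {Murder.and.nonnegligent.Manslaughter} => {Legacy.rape_1}, can be reproduced by hand from counts. The sketch below assumes 455 jurisdictions and a joint count of 416, and takes the single-item counts 428 (murder) and 440 (legacy rape) from the count columns of the other rule listings in this analysis.

```r
n        <- 455  # jurisdictions (transactions)
n_murder <- 428  # jurisdictions reporting murder
n_rape   <- 440  # jurisdictions reporting legacy rape
n_both   <- 416  # jurisdictions reporting both

support    <- n_both / n                  # share of jurisdictions with both crimes
confidence <- n_both / n_murder           # share of murder jurisdictions that also report rape
lift       <- confidence / (n_rape / n)   # confidence relative to the base rate of rape

print(round(c(support = support, confidence = confidence, lift = lift), 6))
```

The lift stays close to 1 because legacy rape is already reported in roughly 97% of all jurisdictions, so knowing that murder was reported raises that share only slightly.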
rules_Murder<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Murder.and.nonnegligent.Manslaughter"), control=list(verbose=F))
rules_murder_byconf<-sort(rules_Murder, by="confidence", decreasing=TRUE)
inspect(rules_murder_byconf[1:5])
## lhs rhs support confidence lift count
## [1] {Legacy.rape_1,
## Robbery} => {Murder.and.nonnegligent.Manslaughter} 0.9142857 0.9476082 1.007387 416
## [2] {Aggravated.assault,
## Legacy.rape_1,
## Robbery} => {Murder.and.nonnegligent.Manslaughter} 0.9142857 0.9476082 1.007387 416
## [3] {Burglary,
## Legacy.rape_1,
## Robbery} => {Murder.and.nonnegligent.Manslaughter} 0.9142857 0.9476082 1.007387 416
## [4] {Larceny.theft,
## Legacy.rape_1,
## Robbery} => {Murder.and.nonnegligent.Manslaughter} 0.9142857 0.9476082 1.007387 416
## [5] {Aggravated.assault,
## Burglary,
## Legacy.rape_1,
## Robbery} => {Murder.and.nonnegligent.Manslaughter} 0.9142857 0.9476082 1.007387 416
The support, confidence and lift again show that rape, robbery, assault, burglary and larceny imply the presence of murder and manslaughter in a jurisdiction as well. But notice that, while the support is the same as in the previous case (0.914), the confidence is lower (0.948 against 0.972) and the lift is only marginally higher (1.007 against 1.005). Judged by confidence, this rule is not as strong as the previous one. Hence we can’t say that murder being implied by rape is as strong a rule as rape being implied by murder.
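The asymmetry comes from the denominators. As a rough illustration using counts from the rule listings (416 jurisdictions reporting both murder and legacy rape, 428 reporting murder, 440 reporting legacy rape), the two directions of the simplest pairwise rule can be compared. The second value is computed here for illustration and does not appear in the output above.

```r
n_both   <- 416  # jurisdictions reporting both murder and legacy rape
n_murder <- 428  # jurisdictions reporting murder
n_rape   <- 440  # jurisdictions reporting legacy rape

# Same numerator, different denominators
conf_murder_to_rape <- n_both / n_murder  # about 0.972
conf_rape_to_murder <- n_both / n_rape    # about 0.945 (computed here, not reported above)

print(c(murder_to_rape = conf_murder_to_rape, rape_to_murder = conf_rape_to_murder))
```

Because murder is reported in fewer jurisdictions than legacy rape, the rule with murder on the lhs has the smaller denominator and hence the higher confidence.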
rules_Burglary<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Burglary"), control=list(verbose=F))
rules_burglary_byconf<-sort(rules_Burglary, by="confidence", decreasing=TRUE)
inspect(rules_burglary_byconf[1:5])
## lhs rhs support confidence
## [1] {} => {Burglary} 1.0000000 1
## [2] {Murder.and.nonnegligent.Manslaughter} => {Burglary} 0.9406593 1
## [3] {Legacy.rape_1} => {Burglary} 0.9670330 1
## [4] {Motor.vehicle.theft} => {Burglary} 0.9956044 1
## [5] {Robbery} => {Burglary} 0.9956044 1
## lift count
## [1] 1 455
## [2] 1 428
## [3] 1 440
## [4] 1 453
## [5] 1 453
The first rule has an empty set on the lhs, which means it has no precondition: its confidence is simply the share of jurisdictions that reported burglary. Since both support and confidence are 1, burglary was reported in all 455 jurisdictions. For the same reason, every rule with burglary on the rhs has a confidence of 1 and a lift of 1, so the presence of other crimes tells us nothing extra about burglary. In other words, burglary does not depend on any other crime being reported.
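A short sketch of why an item reported everywhere always yields confidence 1 and lift 1, using the counts from the listing above (455 jurisdictions, burglary in all 455, murder in 428). The variable names are chosen for this illustration only.

```r
n          <- 455  # jurisdictions
n_burglary <- 455  # jurisdictions reporting burglary
n_murder   <- 428  # jurisdictions reporting murder
n_murder_and_burglary <- 428  # burglary is everywhere, so this equals n_murder

supp_burglary <- n_burglary / n

# Rule {} => {Burglary}: with no precondition, confidence is just supp(Burglary)
conf_empty <- supp_burglary
lift_empty <- conf_empty / supp_burglary

# Rule {Murder} => {Burglary}
conf_murder <- n_murder_and_burglary / n_murder
lift_murder <- conf_murder / supp_burglary

print(c(conf_empty, lift_empty, conf_murder, lift_murder))
```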
rules_Robbery<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Robbery"), control=list(verbose=F))
rules_robbery_byconf<-sort(rules_Robbery, by="confidence", decreasing=TRUE)
inspect(rules_robbery_byconf[1:5])
## lhs rhs support confidence lift count
## [1] {Murder.and.nonnegligent.Manslaughter} => {Robbery} 0.9406593 1 1.004415 428
## [2] {Legacy.rape_1,
## Murder.and.nonnegligent.Manslaughter} => {Robbery} 0.9142857 1 1.004415 416
## [3] {Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter} => {Robbery} 0.9362637 1 1.004415 426
## [4] {Aggravated.assault,
## Murder.and.nonnegligent.Manslaughter} => {Robbery} 0.9406593 1 1.004415 428
## [5] {Burglary,
## Murder.and.nonnegligent.Manslaughter} => {Robbery} 0.9406593 1 1.004415 428
This is an interesting result. The confidence of 1 tells us that every jurisdiction that reported murder and manslaughter also reported robbery. This matches what we still see in the news today. Keep in mind, though, that robbery itself was reported in 453 of the 455 jurisdictions, which is why the lift is only 1.004.
rules_Larceny.theft<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Larceny.theft"), control=list(verbose=F))
rules_Larceny_byconf<-sort(rules_Larceny.theft, by="confidence", decreasing=TRUE)
inspect(rules_Larceny_byconf[1:5])
## lhs rhs support confidence lift count
## [1] {} => {Larceny.theft} 1.0000000 1 1 455
## [2] {Murder.and.nonnegligent.Manslaughter} => {Larceny.theft} 0.9406593 1 1 428
## [3] {Legacy.rape_1} => {Larceny.theft} 0.9670330 1 1 440
## [4] {Motor.vehicle.theft} => {Larceny.theft} 0.9956044 1 1 453
## [5] {Robbery} => {Larceny.theft} 0.9956044 1 1 453
Like burglary, larceny theft was reported in every jurisdiction (support 1 for the rule with the empty lhs), so it too does not depend on any of the other crimes being reported. Let’s look at assault next:
rules_Aggravated.assault<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Aggravated.assault"), control=list(verbose=F))
rules_assault_byconf<-sort(rules_Aggravated.assault, by="confidence", decreasing=TRUE)
inspect(rules_assault_byconf[1:5])
## lhs rhs support confidence lift count
## [1] {Murder.and.nonnegligent.Manslaughter} => {Aggravated.assault} 0.9406593 1 1.002203 428
## [2] {Legacy.rape_1} => {Aggravated.assault} 0.9670330 1 1.002203 440
## [3] {Robbery} => {Aggravated.assault} 0.9956044 1 1.002203 453
## [4] {Legacy.rape_1,
## Murder.and.nonnegligent.Manslaughter} => {Aggravated.assault} 0.9142857 1 1.002203 416
## [5] {Motor.vehicle.theft,
## Murder.and.nonnegligent.Manslaughter} => {Aggravated.assault} 0.9362637 1 1.002203 426
These results too are intuitive and understandable. They say that aggravated assault is reported in every jurisdiction where murder, rape or robbery has been reported (confidence 1).
rules_Motor.vehicle.theft<-apriori(data=trans, parameter=list(supp=0.001,conf = 0.08),
appearance=list(default="lhs", rhs="Motor.vehicle.theft"), control=list(verbose=F))
rules_vehicle_theft_byconf<-sort(rules_Motor.vehicle.theft, by="confidence", decreasing=TRUE)
inspect(rules_vehicle_theft_byconf[1:5])
## lhs rhs support confidence
## [1] {} => {Motor.vehicle.theft} 0.9956044 0.9956044
## [2] {Burglary} => {Motor.vehicle.theft} 0.9956044 0.9956044
## [3] {Larceny.theft} => {Motor.vehicle.theft} 0.9956044 0.9956044
## [4] {Burglary,Larceny.theft} => {Motor.vehicle.theft} 0.9956044 0.9956044
## [5] {Aggravated.assault} => {Motor.vehicle.theft} 0.9934066 0.9955947
## lift count
## [1] 1.0000000 453
## [2] 1.0000000 453
## [3] 1.0000000 453
## [4] 1.0000000 453
## [5] 0.9999903 452
Lastly, let’s look at motor vehicle theft. It behaves very much like burglary and larceny theft and has similar implications. From this we can say that property crimes are most likely not determined by any other crimes and can be reported in jurisdictions where violent crimes are not present at all. But it is not the other way around: wherever violent crimes are reported, property crimes are reported too.
Now, we know that these crimes can be classified into two types: violent crimes and property crimes. Let’s add that as a level and see what the results look like then.
names<-c("Aggravated.assault","Burglary","Larceny.theft","Legacy.rape_1","Motor.vehicle.theft","Murder.and.nonnegligent.Manslaughter","Robbery")
level_label<-c("Violent Crime","Property Crime" , "Property Crime","Violent Crime","Property Crime","Violent Crime","Violent Crime")
itemInfo(trans) <- data.frame(labels = names, level1 = level_label)
trans_level2<-aggregate(trans, by="level1")
freq_items_2<-eclat(trans_level2, parameter=list(supp=0.50, maxlen=8))
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.5 1 8 frequent itemsets FALSE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 227
##
## create itemset ...
## set transactions ...[2 item(s), 455 transaction(s)] done [0.00s].
## sorting and recoding items ... [2 item(s)] done [0.00s].
## creating bit matrix ... [2 row(s), 455 column(s)] done [0.00s].
## writing ... [3 set(s)] done [0.00s].
## Creating S4 object ... done [0.00s].
freq_rules_l2<-ruleInduction(freq_items_2, trans_level2, confidence=0.95)
freq_rules_l2<-sort(freq_rules_l2, by="lift", decreasing=TRUE)
inspect(freq_rules_l2)
## lhs rhs support confidence lift itemset
## [1] {Violent Crime} => {Property Crime} 0.9978022 1.0000000 1 1
## [2] {Property Crime} => {Violent Crime} 0.9978022 0.9978022 1 1
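From the support value, we can see that 99.8% of the jurisdictions (454 out of 455) reported both property and violent crimes. The confidence of 1 for the first rule means that every jurisdiction reporting a violent crime also reported a property crime, whereas one jurisdiction reported property crime without any violent crime. This is not surprising because we have only these two classifications for the entire set of crimes, and the lift of 1 shows that neither category tells us more about the other than its overall frequency already does.
Lastly, let’s take the very frequently occurring crimes (by that I mean crimes which are reported in a larger number of jurisdictions, not crimes which happen a larger number of times) and see if there is a hierarchy amongst them.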
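The idea behind the level aggregation can be sketched in base R: a category counts as present in a jurisdiction as soon as any of its member crimes is present. The matrix below is made up for illustration, and the names `category` and `by_category` are assumptions of this sketch rather than arules objects.

```r
# Toy presence matrix: 3 hypothetical jurisdictions x 4 crime types
presence <- matrix(
  c(TRUE,  FALSE, TRUE, TRUE,
    FALSE, FALSE, TRUE, FALSE,
    TRUE,  TRUE,  TRUE, TRUE),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("Robbery", "Aggravated.assault", "Burglary", "Larceny.theft"))
)

# Map each crime to its higher-level category
category <- c(Robbery = "Violent Crime", Aggravated.assault = "Violent Crime",
              Burglary = "Property Crime", Larceny.theft = "Property Crime")

# A category is present if at least one of its member crimes is present
by_category <- sapply(unique(category), function(k) {
  rowSums(presence[, names(category)[category == k], drop = FALSE]) > 0
})

both <- by_category[, "Violent Crime"] & by_category[, "Property Crime"]
conf_violent_to_property <- sum(both) / sum(by_category[, "Violent Crime"])
conf_property_to_violent <- sum(both) / sum(by_category[, "Property Crime"])
print(by_category)
```

In this toy example the second jurisdiction reports only property crime, so the confidence of Violent => Property is 1 while the confidence of Property => Violent is 2/3, the same kind of asymmetry seen in the real output.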
# The stats package ships with base R and arules is already attached above, so nothing needs to be installed or reloaded here
trans_sel<-trans[,itemFrequency(trans)>0.70]
d_jac_item<-dissimilarity(trans_sel, which="items")
plot(hclust(d_jac_item, method = "ward.D2"), main = "Dendrogram for items")
Violent crimes like rape and murder and manslaughter sit on separate branches, but assault and robbery, along with burglary and larceny theft, form the last branches. This is in line with our results and also with the occurrence of these crimes in real life. We have all heard numerous reports of robberies gone wrong where the inhabitants were assaulted or injured. This suggests that association rules can help in mapping out which crimes are more likely to happen in a particular place, given the crimes already reported there.
This paper can be extended to analyse not just the “crime basket” of a place but also that of criminals. We all know that a past criminal record implies a potential future one. But for which crimes exactly? This question can be answered by collecting the criminal records of offenders and checking whether any combination of past crimes makes it more likely that a particular offender commits another crime.
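The clustering step rests on a Jaccard dissimilarity between items, which, as far as I understand, is the default measure used by `dissimilarity()` in arules. The sketch below swaps in base R’s `dist(..., method = "binary")`, which computes the same quantity, and applies it to a small made-up 0/1 matrix. The data and object names are illustrative only.

```r
# Toy presence matrix (1 = reported): 5 hypothetical jurisdictions x 4 crime types
presence <- matrix(
  c(1, 1, 1, 0,
    1, 1, 0, 0,
    1, 1, 1, 1,
    1, 0, 0, 0,
    1, 1, 1, 0),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("Burglary", "Larceny.theft", "Robbery", "Murder"))
)

# Jaccard dissimilarity between two items: 1 - |A and B| / |A or B|
jaccard <- function(a, b) 1 - sum(a & b) / sum(a | b)
j_manual <- jaccard(presence[, "Burglary"] == 1, presence[, "Larceny.theft"] == 1)

# dist() compares rows, so transpose to compare items (columns)
d_items <- dist(t(presence), method = "binary")

# Ward clustering on the item dissimilarities, as in the analysis above
tree <- hclust(d_items, method = "ward.D2")
print(round(as.matrix(d_items), 2))
# plot(tree, main = "Dendrogram for toy items") would draw the dendrogram
```

Items that are reported in nearly the same set of jurisdictions get a dissimilarity close to 0 and are merged early, which is why very common crimes end up on neighbouring branches.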