# Data: Arrests made between 1997 and 2002, recorded by colour, age, sex, employment, citizenship, release status, and number of checks.
# GitHub location: https://raw.githubusercontent.com/Vishal0229/R_BridgeCourse/master/Arrests.csv
# Description: The data set records arrests made from 1997 to 2002, with each person's age, colour, and sex, whether they were released, whether they were employed at the time, whether they were citizens, and how many checks were performed on them.

# Introduction: This project analyses the arrests made between 1997 and 2002 across sex, age, and colour, to see which group (age/colour/sex) accounts for the most arrests.
#setwd("C:/Users/ARORA/Desktop/Gittie/CUNY/R_Course/Week2/vincentarelbundock-Rdatasets-028956a/csv/carData")
#getwd()
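# stringsAsFactors = TRUE keeps the text columns as factors, so summary() prints the level counts shown below (R >= 4.0 reads them as character by default).
readObj <- read.csv("https://raw.githubusercontent.com/Vishal0229/R_BridgeCourse/master/Arrests.csv", stringsAsFactors = TRUE)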
#fix(readObj)
head(readObj)
##   X released colour year age    sex employed citizen checks
## 1 1      Yes  White 2002  21   Male      Yes     Yes      3
## 2 2       No  Black 1999  17   Male      Yes     Yes      3
## 3 3      Yes  White 2000  24   Male      Yes     Yes      3
## 4 4       No  Black 2000  46   Male      Yes     Yes      1
## 5 5      Yes  Black 1999  27 Female      Yes     Yes      1
## 6 6      Yes  Black 1998  16 Female      Yes     Yes      0
summary(readObj)
##        X        released     colour          year           age       
##  Min.   :   1   No : 892   Black:1288   Min.   :1997   Min.   :12.00  
##  1st Qu.:1307   Yes:4334   White:3938   1st Qu.:1998   1st Qu.:18.00  
##  Median :2614                           Median :2000   Median :21.00  
##  Mean   :2614                           Mean   :2000   Mean   :23.85  
##  3rd Qu.:3920                           3rd Qu.:2001   3rd Qu.:27.00  
##  Max.   :5226                           Max.   :2002   Max.   :66.00  
##      sex       employed   citizen        checks     
##  Female: 443   No :1115   No : 771   Min.   :0.000  
##  Male  :4783   Yes:4111   Yes:4455   1st Qu.:0.000  
##                                      Median :1.000  
##                                      Mean   :1.636  
##                                      3rd Qu.:3.000  
##                                      Max.   :6.000
hist(readObj$age, xlab="Age", main="Histogram of age at arrest")

# The years run from 1997 to 2002, so the x-axis limits must include 1997.
plot(readObj$year, readObj$age, xlab="Year", ylab="Age", main="Scatter plot (year vs age)", xlim=c(1997,2002), ylim=c(10,70), pch=20, col=2)

colorGroups <- table(readObj$colour, readObj$sex)

colorGroups
##        
##         Female Male
##   Black     72 1216
##   White    371 3567
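# Each bar is one sex, stacked by colour; legend.text=TRUE labels the stacks with the table's row names.
barplot(colorGroups, xlab="Sex", ylab="Frequency", main="Arrests by sex, stacked by colour", legend.text=TRUE)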
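
# A minimal sketch, not part of the original analysis: raw counts hide how the
# sex split differs between the two colour groups, so this converts the table
# printed above into row and column shares with prop.table(). The counts are
# retyped from that output and the names colourBySex, rowShares and colShares
# are made up for this example. With the data loaded,
# prop.table(colorGroups, margin = 1) should give the same row shares.
colourBySex <- as.table(matrix(
  c(72, 371, 1216, 3567), nrow = 2,
  dimnames = list(colour = c("Black", "White"), sex = c("Female", "Male"))
))
rowShares <- prop.table(colourBySex, margin = 1)  # each colour group sums to 1
colShares <- prop.table(colourBySex, margin = 2)  # each sex sums to 1
round(100 * rowShares, 1)  # percentage of each colour group that is female/male
round(100 * colShares, 1)  # percentage of each sex that is Black/White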

ageGroups <- table(readObj$age)

ageGroups
## 
##  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29 
##   4  18  85 202 307 443 476 473 398 382 287 240 219 153 142 119 111  90 
##  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47 
##  97  70  71  89  84  63  79  67  51  60  53  45  35  30  29  34  15  16 
##  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  64  66 
##  19  12   6  10   6   8   8   3   1   1   1   4   2   2   1   3   2
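# No xlim here: barplot() positions bars on its own scale, and xlim=c(0,60) would clip the oldest ages.
barplot(ageGroups, xlab="Age", ylab="Frequency", main="Arrests by age")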
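
# A rough sketch, assuming the age counts printed above: it groups single ages
# into bands to check how concentrated the arrests are among the young. The
# vectors are retyped from that output; the names ageValues, ageCounts,
# ageBands, bandCounts and bandShares, and the band boundaries themselves, are
# choices made for this example and not part of the original script.
ageValues <- c(12:62, 64, 66)  # ages 63 and 65 do not occur in the data
ageCounts <- c(
  4, 18, 85, 202, 307, 443, 476, 473, 398, 382, 287, 240, 219, 153, 142, 119, 111, 90,
  97, 70, 71, 89, 84, 63, 79, 67, 51, 60, 53, 45, 35, 30, 29, 34, 15, 16,
  19, 12, 6, 10, 6, 8, 8, 3, 1, 1, 1, 4, 2, 2, 1, 3, 2
)
stopifnot(length(ageValues) == length(ageCounts))
# cut() uses right-closed intervals by default, so (17,22] covers ages 18 to 22.
ageBands <- cut(ageValues, breaks = c(11, 17, 22, 29, 39, 66),
                labels = c("12-17", "18-22", "23-29", "30-39", "40+"))
bandCounts <- tapply(ageCounts, ageBands, sum)            # arrests per age band
bandShares <- round(100 * bandCounts / sum(bandCounts), 1)  # percentage per band
bandCounts
bandShares
# Sanity check: the count-weighted mean should agree with the mean age from summary().
weighted.mean(ageValues, ageCounts)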

<!-- Conclusion: The data and plots above give two distinct pictures. The bar plot of colorGroups shows how many arrests fall under each sex, split by colour (Black/White): males account for 4,783 of the 5,226 arrests.

The main point, evident from the histogram and the ageGroups bar plot, is that arrests peak between ages 18 and 22. That five-year band holds 2,016 of the 5,226 arrests (about 39%), and the counts tail off at older ages. Age is therefore an important factor: far more arrests are made at younger ages than at later ones.

-->