Importing the RCurl library to read from a GitHub URL
library(RCurl)
Reading the CSV from my GitHub and importing it as a data frame
# Note: ssl.verifypeer = 0L turns off SSL certificate verification
data <- getURL("https://raw.githubusercontent.com/mianshariq/SPS_Bridge/main/MplsStops.csv?token=AQ2YTI45J2CA2CL3G5UNK4TBA5OGM", ssl.verifypeer = 0L, followlocation = 1L)
df <- read.csv(text = data)
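The `text` argument of `read.csv` parses CSV content that is already held in memory as a string, which is what `getURL` returns. A minimal sketch of that step, assuming a made-up three-column CSV string (`csv_text` and `toy` are hypothetical names, and the values are not taken from the real file):

```r
# Hypothetical CSV content standing in for the string returned by getURL()
csv_text <- "X,problem,race\n6823,suspicious,Unknown\n6824,traffic,Black\n"

# Parse the in-memory string; keep text columns as character, not factor
toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the column types of the parsed data frame
str(toy)
```

Setting `stringsAsFactors = FALSE` explicitly keeps the behaviour the same on R versions older than 4.0, where text columns were converted to factors by default.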
First 6 rows of my data (the default for head())
head(df)
Summary of my data
summary(df)
##        X              idNum               date             problem
##  Min.   : 6823   Length:51920       Length:51920       Length:51920
##  1st Qu.:20379   Class :character   Class :character   Class :character
##  Median :33864   Mode  :character   Mode  :character   Mode  :character
##  Mean   :33861
##  3rd Qu.:47387
##  Max.   :60838
##        MDC           citationIssued      personSearch      vehicleSearch
##  Length:51920       Length:51920       Length:51920       Length:51920
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##      preRace              race              gender             lat
##  Length:51920       Length:51920       Length:51920       Min.   :44.89
##  Class :character   Class :character   Class :character   1st Qu.:44.95
##  Mode  :character   Mode  :character   Mode  :character   Median :44.98
##                                                           Mean   :44.97
##                                                           3rd Qu.:45.00
##                                                           Max.   :45.05
##       long        policePrecinct     neighborhood
##  Min.   :-93.33   Min.   :1.000    Length:51920
##  1st Qu.:-93.29   1st Qu.:2.000    Class :character
##  Median :-93.28   Median :3.000    Mode  :character
##  Mean   :-93.27   Mean   :3.257
##  3rd Qu.:-93.25   3rd Qu.:4.000
##  Max.   :-93.20   Max.   :5.000
Creating a subset of my data with the columns X, problem, race, gender, policePrecinct, and neighborhood
df1=df[,c(1,4,10,11,14,15)]
head(df1)
Selecting only the first 100 rows
df100=df1[1:100,]
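Column positions such as `c(1,4,10,11,14,15)` break silently if the column order of the file changes, so selecting by name can be easier to read and maintain. A minimal sketch on a toy data frame (the values are invented, and `toy`, `keep`, `toy_sub`, and `toy_first3` are hypothetical names used only for illustration):

```r
# Toy data frame with invented values; only the column names echo the real data
toy <- data.frame(
  X = 1:5,
  idNum = c("a", "b", "c", "d", "e"),
  problem = c("suspicious", "traffic", "traffic", "suspicious", "traffic"),
  race = c("White", "Black", "East African", "Unknown", "Black"),
  stringsAsFactors = FALSE
)

# Select columns by name instead of by position
keep <- c("X", "problem", "race")
toy_sub <- toy[, keep]

# head(x, n) returns the first n rows, equivalent to toy_sub[1:3, ] here
toy_first3 <- head(toy_sub, 3)
print(toy_first3)
```

Applied to the real data, the same pattern would presumably be `df[, c("X", "problem", "race", "gender", "policePrecinct", "neighborhood")]` followed by `head(df1, 100)`.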
Summary of the new table
summary(df100)
##        X             problem              race              gender
##  Min.   :6823   Length:100         Length:100         Length:100
##  1st Qu.:6848   Class :character   Class :character   Class :character
##  Median :6874   Mode  :character   Mode  :character   Mode  :character
##  Mean   :6873
##  3rd Qu.:6898
##  Max.   :6923
##  policePrecinct     neighborhood
##  Min.   :1.00     Length:100
##  1st Qu.:2.00     Class :character
##  Median :4.00     Mode  :character
##  Mean   :3.33
##  3rd Qu.:5.00
##  Max.   :5.00
Recoding certain values: "East African" becomes "Black" in race, and "suspicious" becomes "SuspiciousStop" in problem
df100$race = ifelse(df100$race %in% c("East African"),"Black", df100$race)
df100$problem = ifelse(df100$problem %in% c("suspicious"),"SuspiciousStop", df100$problem)
df100[1:20,]
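The same recoding can also be written as an assignment into a logical subset, which changes only the matching rows and leaves the rest untouched. A minimal sketch on a toy data frame (invented values; `toy` is a hypothetical name), offered as an alternative to the `ifelse` approach rather than a replacement for it:

```r
# Toy data frame with invented values in the two recoded columns
toy <- data.frame(
  race = c("East African", "White", "Black"),
  problem = c("suspicious", "traffic", "suspicious"),
  stringsAsFactors = FALSE
)

# %in% returns TRUE/FALSE for every row (never NA), so it is safe as an index
toy$race[toy$race %in% "East African"] <- "Black"
toy$problem[toy$problem %in% "suspicious"] <- "SuspiciousStop"

print(toy)
```

Both forms assume the columns are character vectors, as the summary output reports; on factor columns the new labels would first have to be added as levels.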