Two datasets were used for this project. The first is the gas station dataset, which contains 72,798 observations and 31 variables describing the locations of gas stations in the US. The variables include state, county, city, longitude, latitude, and diesel availability, among others.
The second is the Philly Crime Data (2015 - 2024), a longitudinal dataset of crime cases in the Philadelphia area since 2015. It has 15,243 observations and 19 variables, including race, sex, date of crime, whether the crime was fatal, street name, block number, and zip code.
The datasets can be found here:
Gas Station Dataset: https://github.com/chinwex/STA553/raw/main/w07/POC.csv
Philly Crime Dataset: https://github.com/chinwex/STA553/raw/main/w07/PhillyCrimeSince2015.csv
The table below gives information about the first six gas stations in the dataset.
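library(knitr)  # kable() for table output
# Read the gas station data and drop the first column of the file
dt <- read.csv(file="https://github.com/chinwex/STA553/raw/main/w07/POC.csv")[,-1]
kable(head(dt))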
| site_row_id | STATE | county | ADDRESS | CITY | ycoord | xcoord | SITE_DESCRIPTION | service_or_fuel | diesel | twentyfour_hour_flag | car_wash | truckstop_flag | description | PUMP_TECH | POC | HIFCA | ZIPnew | POCAGE | POCGAP | ZIPPOC | HFG | MSA | dist.to.poc | cate.poc.density | cate.poc.age | cate.poc.age.20 | cate.poc.intensity | cate.poc.intensity.tot | MSA_POC | MSA_POC.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-3R8J-494 | CA | Los Angeles | 37120 47TH ST E | PALMDALE | 34.55584 | -118.0452 | Los Angeles-Long Beach-Santa Ana CA | Fuel | Y | Y | N | Y | URBAN | O | 0 | 0 | 93552 | NA | NA | 0 | 1 | 4480 | 8.2275601 | (-1e-06,1] | (15,140] | (15,140] | (5,Inf] | (8,Inf] | 1 | 1 |
| 1-3R8J-362 | WA | Franklin | 1212 N 4TH AVE | PASCO | 46.23890 | -119.0950 | Kennewick-Pasco-Richland WA | Fuel | N | N | N | N | URBAN | O | 0 | 1 | 99301 | NA | NA | 1 | 0 | 6740 | 0.2788194 | (-1e-06,1] | (15,140] | (15,140] | (5,Inf] | (8,Inf] | 0 | 0 |
| 1-3R8J-199 | NV | Washoe | 99 DAMONTE RANCH PKWY | RENO | 39.41961 | -119.7549 | Reno-Sparks NV | Fuel | Y | Y | N | N | URBAN | O | 0 | 1 | 89521 | NA | NA | 1 | 0 | 6720 | 1.3055498 | (1,5] | (0,15] | (0,15] | (0,5] | (0,8] | 0 | 0 |
| 1-3R8J-261 | UT | Salt Lake | 5404 S 4200 W | SALT LAKE CITY | 40.65107 | -112.0101 | Salt Lake City UT | Fuel | N | Y | N | N | URBAN | O | 0 | 1 | 84118 | NA | NA | 0 | 0 | 7160 | 8.2792641 | (-1e-06,1] | (15,140] | (15,140] | (0,5] | (0,8] | 0 | 0 |
| 1-3R8J-493 | CA | Los Angeles | 1731 E AVE J | LANCASTER | 34.68966 | -118.0984 | Los Angeles-Long Beach-Santa Ana CA | Fuel | N | Y | N | N | URBAN | O | 0 | 0 | 93535 | NA | NA | 0 | 1 | 4480 | 17.6058504 | (-1e-06,1] | (140,Inf] | (15,140] | (-0.0001,0] | (-0.0001,0] | 1 | 1 |
| 1-3R8J-508 | WA | Benton | 2707 S QUILLAN ST | KENNEWICK | 46.18435 | -119.1739 | Kennewick-Pasco-Richland WA | Fuel | Y | Y | N | N | URBAN | O | 0 | 1 | 99337 | NA | NA | 0 | 0 | 6740 | 8.7976927 | (-1e-06,1] | (15,140] | (15,140] | (5,Inf] | (8,Inf] | 0 | 0 |
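The `[,-1]` in the `read.csv()` call drops the first column of the file before the table is built. A minimal sketch of that step on an inline CSV follows. It assumes, since the source does not say, that the dropped column is a row index; the column name `X` and the three rows are hypothetical stand-ins.

```r
# Hypothetical three-row CSV whose first column is a row index
csv_text <- "X,STATE,CITY\n1,CA,PALMDALE\n2,WA,PASCO\n3,NV,RENO"

# text = lets read.csv() parse a string instead of a file or URL
raw <- read.csv(text = csv_text)

# A negative column index removes that column and keeps the rest
dt_toy <- raw[, -1]

names(raw)     # columns before dropping
names(dt_toy)  # columns after dropping
```

Dropping by position is compact, but it silently removes the wrong column if the file layout changes; dropping by name is a safer alternative when the column name is known.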
A random sample of 500 gas stations was drawn from the dataset and plotted on the map.
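library(leaflet)    # interactive maps
library(htmltools)  # tags$div() and HTML() for the map title
# Fix the seed so the same 500 stations are drawn on every run
set.seed(100)
dt500 <- dt[sample(nrow(dt), size = 500), ]
title1 <- tags$div(HTML('<font color = "darkred" size =5><b>A Map of 500 Gas Stations in the US</b></font>'))
leaflet(dt500) %>%
  addTiles() %>%
  # The initial view is centered on southeastern Pennsylvania; zoom out to see the rest of the sample
  setView(lng = -75.5978, lat = 39.9522, zoom = 9) %>%
  addMarkers(~xcoord, ~ycoord,
             popup = ~paste("State: ", STATE,
                            "<br>County: ", county,
                            "<br>Address: ", ADDRESS,
                            "<br>Zipcode: ", ZIPnew)) %>%
  addControl(title1, position = "topright", className = "map-title")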
A Map of 500 Gas Stations in the US
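The sampling step for the map relies on `set.seed()` to make the draw repeatable and on `sample()` to pick row indices without replacement. The sketch below illustrates both properties on a toy table; the `stations` data frame and the `draw_sample()` helper are hypothetical and are not part of the original analysis.

```r
# Toy stand-in for the gas station table (invented values, not the real data)
stations <- data.frame(
  id     = sprintf("S%04d", 1:1000),
  xcoord = seq(-125, -67, length.out = 1000),
  ycoord = seq(25, 49, length.out = 1000)
)

# Hypothetical helper: draw n rows after fixing the random seed
draw_sample <- function(df, n, seed = 100) {
  set.seed(seed)                     # same seed gives the same draw
  df[sample(nrow(df), size = n), ]   # row indices sampled without replacement
}

s1 <- draw_sample(stations, 500)
s2 <- draw_sample(stations, 500)

identical(s1, s2)           # the two draws contain the same rows in the same order
anyDuplicated(s1$id) == 0   # no station appears twice
```

Because `sample()` defaults to `replace = FALSE`, each station can appear at most once in the sample.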
The table below shows the first 6 crime cases in the dataset.
# Read the crime data and drop every row that has a missing value
phily <- na.omit(read.csv(file = "https://github.com/chinwex/STA553/raw/main/w07/PhillyCrimeSince2015.csv"))
kable(head(phily))
| dc_key | race | sex | fatal | date | has_court_case | age | street_name | block_number | zip_code | council_district | police_district | neighborhood | house_district | senate_district | school_catchment | lng | lat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.02422E+11 | Black (Non-Hispanic) | Female | Nonfatal | 3/3/2024 14:49 | No | 20 | N COLORADO ST | 2500 | 19132 | 5 | 22 | Sharswood-Stanton | 181 | 3 | Tanner G. Duckrey School | -75.16060 | 39.99166 |
| 2.02426E+11 | Hispanic (Black or White) | Male | Nonfatal | 3/1/2024 22:18 | No | 58 | N FRANKLIN ST | 2600 | 19133 | 5 | 26 | Northern Liberties-West Kensington | 197 | 3 | John F. Hartranft School | -75.14468 | 39.99152 |
| 2.02422E+11 | Black (Non-Hispanic) | Male | Fatal | 2/29/2024 22:59 | No | 49 | MOUNT PLEASANT DR | 3700 | 19121 | 4 | 22 | Park | 190 | 7 | James G. Blaine School | -75.20027 | 39.98462 |
| 2.02422E+11 | Black (Non-Hispanic) | Female | Fatal | 2/29/2024 22:59 | No | 38 | MOUNT PLEASANT DR | 3700 | 19121 | 4 | 22 | Park | 190 | 7 | James G. Blaine School | -75.20027 | 39.98462 |
| 2.02419E+11 | Black (Non-Hispanic) | Male | Nonfatal | 2/29/2024 19:30 | No | 19 | MASTER ST | 5600 | 19131 | 4 | 19 | Haddington-Overbrook | 192 | 7 | Universal Charter School at Bluford | -75.23338 | 39.97346 |
| 2.02439E+11 | Black (Non-Hispanic) | Male | Fatal | 2/29/2024 1:53 | No | 31 | PULASKI AVE | 5500 | 19144 | 8 | 39 | East Falls-Westside | 198 | 3 | John B. Kelly School | -75.17899 | 40.02939 |
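Because the crime data pass through `na.omit()` when they are read, any row with a missing value in any column is removed, so the cleaned table can have fewer rows than the raw file. The sketch below shows one way to see how many rows are lost and which columns cause the loss. The `toy` data frame is hypothetical; only its column names are borrowed from the table above.

```r
# Hypothetical records with some missing values
toy <- data.frame(
  age = c(20, NA, 49, 31),
  lng = c(-75.16, -75.14, NA, -75.18),
  lat = c(39.99, 39.99, 39.98, 40.03)
)

# na.omit() keeps only the rows that are complete in every column
clean <- na.omit(toy)

# Number of rows removed by the cleaning step
rows_dropped <- nrow(toy) - nrow(clean)

# Missing values per column, to see which variables drive the loss
missing_by_column <- colSums(is.na(toy))

rows_dropped
missing_by_column
```

If only the coordinates are needed for mapping, restricting the check to `lng` and `lat` would retain rows that are missing other fields.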
All 2023 crime cases were filtered from the dataset and are shown in the map below.
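library(dplyr)       # filter()
library(leaflegend)  # addLegendSize()
# Extract the four-digit year from the "month/day/year hour:minute" date strings
phily$year <- format(as.Date(phily$date, format = "%m/%d/%Y"), "%Y")
phily2023 <- filter(phily, year == "2023")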
# Stroke color: blue for nonfatal cases, orange for fatal cases
philcolor <- rep("#0072B2", length(phily2023$fatal))
philcolor[which(phily2023$fatal == "Fatal")] <- "#D55E00"
# Hover labels, wrapped in HTML() so the <br> tags render as line breaks
labels <- paste("Street Name: ", phily2023$street_name,
                "<br>Block Number: ", phily2023$block_number,
                "<br>Neighborhood: ", phily2023$neighborhood,
                "<br>Zipcode: ", phily2023$zip_code) %>%
  lapply(htmltools::HTML)
title <- tags$div(HTML('<font color = "purple" size =5><b>2023 Philly Crime Locations</b></font>'))
Annotat <- tags$div(HTML('<center><font color = "blue" size =3>The circle sizes are proportional to the ages</font></center>'))
leaflet(phily2023) %>%
  addTiles() %>%
  setView(lng = mean(phily2023$lng), lat = mean(phily2023$lat), zoom = 15) %>%
  addCircleMarkers(
    ~lng,
    ~lat,
    color = philcolor,
    fillColor = ifelse(phily2023$fatal == "Nonfatal", "#ffff66", "#ff99ff"),
    radius = ~(phily2023$age/10)*2,
    opacity = 1,
    # stroke = FALSE,
    fillOpacity = 0.25,
    label = ~labels) %>%
  addLegend(position = "bottomright",
            colors = c("#0072B2", "#D55E00"),
            labels = c("Nonfatal", "Fatal"),
            title = "Crime Type",
            opacity = 0.5) %>%
  addControl(title, position = "topright", className = "map-title") %>%
  addControl(Annotat, position = "bottomleft") %>%
  addLegendSize(position = 'bottomright',
                values = (phily2023$age/10)*2,
                color = 'gray',
                fillColor = 'gray',
                opacity = .5,
                title = 'Age',
                shape = 'circle',
                orientation = 'vertical',
                breaks = 4)
A Map of Philly Crime Locations in 2023
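The 2023 subset behind this map depends on pulling the year out of date strings that also carry a time of day. The sketch below shows that parsing step in isolation. The three date strings are hypothetical but follow the layout of the `date` column; converting the year to an integer is a small variation on the code above, not the original approach.

```r
# Hypothetical date strings in "month/day/year hour:minute" layout
dates <- c("3/3/2024 14:49", "12/31/2023 23:59", "1/1/2023 0:05")

# as.Date() parses the leading date; characters after the matched format are ignored
parsed <- as.Date(dates, format = "%m/%d/%Y")

# Store the year as an integer so numeric comparisons need no coercion
year <- as.integer(format(parsed, "%Y"))

# Keep only the 2023 entries
dates[year == 2023]
```

Checking `anyNA(parsed)` after this step is a cheap way to catch rows whose dates do not match the expected format.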
The map above shows crime locations in Philadelphia in 2023, with orange circles marking fatal crimes and blue circles marking nonfatal crimes. From the plot, nonfatal cases clearly outnumber fatal ones. The circle sizes are proportional to the ages recorded in the dataset, and most of the fatal crimes are associated with younger ages. The majority of cases are concentrated in North Philadelphia.
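The visual comparisons described above can also be checked numerically. The sketch below counts cases by outcome and compares ages between the two groups. The `cases` data frame is invented for illustration; on the real data, `phily2023` would take its place.

```r
# Hypothetical records using the same column names as the crime table
cases <- data.frame(
  fatal = c("Nonfatal", "Nonfatal", "Fatal", "Nonfatal", "Fatal", "Nonfatal"),
  age   = c(20, 58, 49, 19, 31, 27)
)

# Number of cases per outcome
counts <- table(cases$fatal)

# Share of cases per outcome
shares <- prop.table(counts)

# Median age within each outcome
median_age <- tapply(cases$age, cases$fatal, median)

counts
round(shares, 2)
median_age
```

A table of counts and a per-group age summary give a more reliable basis for these statements than circle sizes, which are hard to compare by eye where markers overlap.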