The final project uses the City of Chicago's open data on traffic crashes, available at https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if. Each record describes one crash and includes important parameters such as the crash date, posted speed limit, weather condition, number of lanes, and trafficway type.
As part of this visualization project, we will build a user-friendly visualization in R Shiny using Plotly to show how crash frequency and severity vary with important parameters such as the month of the year, posted speed limit, and number of lanes.
Below are some visualizations I prepared with Plotly. I will reuse them in a Shiny app that lets end users explore how Chicago crashes are distributed across various categories.
library(RSocrata)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following objects are masked from 'package:plyr':
##
## arrange, mutate, rename, summarise
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
chicago_accidents <- read.socrata("https://data.cityofchicago.org/resource/85ca-t3if.json")
print(dim(chicago_accidents))
## [1] 362077 49
colnames(chicago_accidents)
## [1] "rd_no" "crash_date"
## [3] "posted_speed_limit" "traffic_control_device"
## [5] "device_condition" "weather_condition"
## [7] "lighting_condition" "first_crash_type"
## [9] "trafficway_type" "alignment"
## [11] "roadway_surface_cond" "road_defect"
## [13] "report_type" "crash_type"
## [15] "intersection_related_i" "damage"
## [17] "date_police_notified" "prim_contributory_cause"
## [19] "sec_contributory_cause" "street_no"
## [21] "street_direction" "street_name"
## [23] "beat_of_occurrence" "num_units"
## [25] "most_severe_injury" "injuries_total"
## [27] "injuries_fatal" "injuries_incapacitating"
## [29] "injuries_non_incapacitating" "injuries_reported_not_evident"
## [31] "injuries_no_indication" "injuries_unknown"
## [33] "crash_hour" "crash_day_of_week"
## [35] "crash_month" "latitude"
## [37] "longitude" "location.type"
## [39] "location.coordinates" "lane_cnt"
## [41] "hit_and_run_i" "statements_taken_i"
## [43] "crash_date_est_i" "private_property_i"
## [45] "photos_taken_i" "work_zone_i"
## [47] "work_zone_type" "dooring_i"
## [49] "workers_present_i"
chicago_accidents %>% head(5) %>% kable() %>% kable_styling()
| rd_no | crash_date | posted_speed_limit | traffic_control_device | device_condition | weather_condition | lighting_condition | first_crash_type | trafficway_type | alignment | roadway_surface_cond | road_defect | report_type | crash_type | intersection_related_i | damage | date_police_notified | prim_contributory_cause | sec_contributory_cause | street_no | street_direction | street_name | beat_of_occurrence | num_units | most_severe_injury | injuries_total | injuries_fatal | injuries_incapacitating | injuries_non_incapacitating | injuries_reported_not_evident | injuries_no_indication | injuries_unknown | crash_hour | crash_day_of_week | crash_month | latitude | longitude | location.type | location.coordinates | lane_cnt | hit_and_run_i | statements_taken_i | crash_date_est_i | private_property_i | photos_taken_i | work_zone_i | work_zone_type | dooring_i | workers_present_i |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AJ101349 | 1483369980 | 30 | NO CONTROLS | NO CONTROLS | CLEAR | DAYLIGHT | SIDESWIPE SAME DIRECTION | PARKING LOT | STRAIGHT AND LEVEL | DRY | NO DEFECTS | NOT ON SCENE (DESK REPORT) | NO INJURY / DRIVE AWAY | N | $501 - $1,500 | 1483373700 | IMPROPER LANE USAGE | NOT APPLICABLE | 450 | E | 35TH ST | 211 | 2 | NO INDICATION OF INJURY | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 9 | 2 | 1 | 41.831296077 | -87.614925683 | Point | c(-87.614925683354, 41.831296076845) | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| AJ103671 | 1483541100 | 35 | NO CONTROLS | NO CONTROLS | CLEAR | DAYLIGHT | SIDESWIPE SAME DIRECTION | NOT DIVIDED | STRAIGHT AND LEVEL | DRY | NO DEFECTS | NOT ON SCENE (DESK REPORT) | NO INJURY / DRIVE AWAY | NA | $501 - $1,500 | 1483542600 | UNABLE TO DETERMINE | ANIMAL | 7144 | N | RIDGE BLVD | 2411 | 2 | NO INDICATION OF INJURY | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 8 | 4 | 1 | 42.012292006 | -87.683227658 | Point | c(-87.683227657543, 42.012292006227) | 2 | Y | NA | NA | NA | NA | NA | NA | NA | NA |
| AJ114251 | 1484323200 | 30 | TRAFFIC SIGNAL | FUNCTIONING PROPERLY | CLOUDY/OVERCAST | DAYLIGHT | SIDESWIPE SAME DIRECTION | NOT DIVIDED | STRAIGHT AND LEVEL | ICE | NO DEFECTS | ON SCENE | NO INJURY / DRIVE AWAY | Y | OVER $1,500 | 1484323380 | IMPROPER LANE USAGE | IMPROPER OVERTAKING/PASSING | 700 | S | CICERO AVE | 1533 | 2 | NO INDICATION OF INJURY | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 10 | 6 | 1 | 41.87205805 | -87.745079343 | Point | c(-87.745079343101, 41.872058050459) | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| AJ123519 | 1484961180 | 15 | STOP SIGN/FLASHER | FUNCTIONING PROPERLY | CLEAR | DARKNESS, LIGHTED ROAD | ANGLE | NOT DIVIDED | STRAIGHT AND LEVEL | DRY | NO DEFECTS | NOT ON SCENE (DESK REPORT) | NO INJURY / DRIVE AWAY | NA | OVER $1,500 | 1484961900 | FAILING TO YIELD RIGHT-OF-WAY | UNABLE TO DETERMINE | 8700 | S | HARPER AVE | 412 | 2 | NO INDICATION OF INJURY | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 19 | 6 | 1 | 41.736802907 | -87.587057965 | Point | c(-87.587057964619, 41.736802906927) | 2 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| AJ390611 | 1502668800 | 30 | NO CONTROLS | NO CONTROLS | CLEAR | DARKNESS | PARKED MOTOR VEHICLE | ONE-WAY | STRAIGHT AND LEVEL | DRY | NO DEFECTS | ON SCENE | INJURY AND / OR TOW DUE TO CRASH | NA | OVER $1,500 | 1502731800 | IMPROPER BACKING | UNABLE TO DETERMINE | 8454 | S | COLFAX AVE | 423 | 2 | NO INDICATION OF INJURY | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 19 | 1 | 8 | 41.741094906 | -87.561352197 | Point | c(-87.561352197263, 41.741094905698) | 1 | N | NA | NA | NA | NA | NA | NA | NA | NA |
Plotting total crashes per month, keeping only months after September 2015:
chicago_accidents$crash_month_year <- substr(chicago_accidents$crash_date, 1,7)
plyr::count(chicago_accidents, "crash_month_year") %>%
subset(crash_month_year > '2015-09') %>%
plot_ly(x = ~crash_month_year, y = ~freq, mode = 'lines', type = 'scatter')
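The monthly aggregation above relies on two details: the first seven characters of a timestamp give a "YYYY-MM" key, and such keys sort correctly as plain strings, which is why the comparison against '2015-09' works. A minimal base R sketch of the same idea follows. The timestamps are made-up toy values, not rows from the dataset, and `format()` is used here in place of `substr()` as one possible alternative.

```r
# Toy timestamps standing in for chicago_accidents$crash_date (hypothetical values).
crash_date <- as.POSIXct(
  c("2015-08-30 10:00:00", "2015-10-02 08:15:00",
    "2015-10-19 23:40:00", "2016-01-05 07:05:00"),
  tz = "UTC"
)

# Build the "YYYY-MM" key explicitly instead of relying on substr() coercion.
crash_month_year <- format(crash_date, "%Y-%m", tz = "UTC")

# Count crashes per month; table() sorts the keys alphabetically,
# which for "YYYY-MM" strings is also chronological order.
monthly <- as.data.frame(
  table(crash_month_year = crash_month_year),
  responseName = "freq",
  stringsAsFactors = FALSE
)

# Keep only months after September 2015 using plain string comparison.
monthly <- monthly[monthly$crash_month_year > "2015-09", ]
print(monthly)
```

The resulting data frame has the same two columns (`crash_month_year`, `freq`) that the `plot_ly()` call above expects.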
Plotting the distribution of non-fatal crashes by road surface condition for one of the 12 months (January here), using the full dataset:
# Displaying the distribution of the crashes based on the road conditions for the month of January.
chicago_accidents %>% subset(injuries_fatal == 0) %>%
subset(crash_month == 1) %>%
plyr::count("roadway_surface_cond") %>%
plot_ly(x = ~roadway_surface_cond, y = ~freq, type = 'bar') %>%
layout(yaxis = list(title = 'value'), barmode = 'stack')
Similarly, the distribution of fatal crashes (injuries_fatal > 0):
chicago_accidents %>% subset(injuries_fatal > 0) %>%
subset(crash_month == 1) %>%
plyr::count("roadway_surface_cond") %>%
plot_ly(x = ~roadway_surface_cond, y = ~freq, type = 'bar') %>%
layout(yaxis = list(title = 'value'), barmode = 'stack')
We will use the above graphs to present plots based on what the user chooses: fatal or non-fatal crashes, and then the month of the year the user wants to analyze.
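One way this selection could be wired up in Shiny is sketched below. It is only an illustration under stated assumptions: the helper `crashes_by_surface()`, the wrapper `build_crash_app()`, and the input IDs `severity` and `month` are hypothetical names, not part of the analysis above. The column names `injuries_fatal`, `crash_month`, and `roadway_surface_cond` come from the dataset. Package functions are called with `shiny::` and `plotly::` prefixes so the helper can be used on its own.

```r
# Hypothetical helper: filter by severity and month, then count by road surface.
crashes_by_surface <- function(crashes, severity = c("fatal", "non-fatal"), month) {
  severity <- match.arg(severity)
  # The JSON feed may deliver numbers as text, so coerce before comparing.
  fatal <- as.numeric(crashes$injuries_fatal)
  in_month <- as.numeric(crashes$crash_month) == month
  matches_severity <- if (severity == "fatal") fatal > 0 else fatal == 0
  # which() drops rows where either condition is NA.
  selected <- crashes[which(in_month & matches_severity), ]
  as.data.frame(
    table(roadway_surface_cond = selected$roadway_surface_cond),
    responseName = "freq",
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper that builds the app around a crashes data frame.
build_crash_app <- function(crashes) {
  ui <- shiny::fluidPage(
    shiny::radioButtons("severity", "Crash type", c("fatal", "non-fatal")),
    shiny::selectInput("month", "Month", setNames(1:12, month.name)),
    plotly::plotlyOutput("surface_plot")
  )
  server <- function(input, output) {
    output$surface_plot <- plotly::renderPlotly({
      # Shiny returns the selected month as text, so convert it to a number.
      counts <- crashes_by_surface(crashes, input$severity, as.integer(input$month))
      p <- plotly::plot_ly(counts, x = ~roadway_surface_cond, y = ~freq, type = "bar")
      plotly::layout(p, yaxis = list(title = "Number of crashes"))
    })
  }
  shiny::shinyApp(ui, server)
}

# To launch (requires the shiny and plotly packages and the loaded data):
# shiny::runApp(build_crash_app(chicago_accidents))
```

Keeping the filtering and counting in a plain function separates the data logic from the UI, so it can be checked on a small data frame before the app is run.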