Introduction to farsdata Package

Lian Hu

2018-11-05

This package is built primarily for educational purposes. It lets you explore fatal traffic accidents from 2013 to 2015 using data from the National Highway Traffic Safety Administration (NHTSA) Fatality Analysis Reporting System (FARS).

The Data

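The data in this package come from the National Highway Traffic Safety Administration (NHTSA) Fatality Analysis Reporting System (FARS). Each yearly file holds one row per fatal accident; the 2013 file, for example, has 30,202 rows and 50 columns.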

fars_2013_fn <- make_filename(2013)
fars_2013 <- fars_read(fars_2013_fn) 
dim(fars_2013)
## [1] 30202    50
fars_2013
## # A tibble: 30,202 x 50
##    STATE ST_CASE VE_TOTAL VE_FORMS PVH_INVL  PEDS PERNOTMVIT PERMVIT
##    <int>   <int>    <int>    <int>    <int> <int>      <int>   <int>
##  1     1   10001        1        1        0     0          0       8
##  2     1   10002        2        2        0     0          0       2
##  3     1   10003        1        1        0     0          0       1
##  4     1   10004        1        1        0     0          0       3
##  5     1   10005        2        2        0     0          0       3
##  6     1   10006        2        2        0     0          0       3
##  7     1   10007        1        1        0     0          0       1
##  8     1   10008        2        2        0     0          0       2
##  9     1   10009        1        1        0     0          0       1
## 10     1   10010        2        2        0     0          0       4
## # ... with 30,192 more rows, and 42 more variables: PERSONS <int>,
## #   COUNTY <int>, CITY <int>, DAY <int>, MONTH <int>, YEAR <int>,
## #   DAY_WEEK <int>, HOUR <int>, MINUTE <int>, NHS <int>, ROAD_FNC <int>,
## #   ROUTE <int>, TWAY_ID <chr>, TWAY_ID2 <chr>, MILEPT <int>,
## #   LATITUDE <dbl>, LONGITUD <dbl>, SP_JUR <int>, HARM_EV <int>,
## #   MAN_COLL <int>, RELJCT1 <int>, RELJCT2 <int>, TYP_INT <int>,
## #   WRK_ZONE <int>, REL_ROAD <int>, LGT_COND <int>, WEATHER1 <int>,
## #   WEATHER2 <int>, WEATHER <int>, SCH_BUS <int>, RAIL <chr>,
## #   NOT_HOUR <int>, NOT_MIN <int>, ARR_HOUR <int>, ARR_MIN <int>,
## #   HOSP_HR <int>, HOSP_MN <int>, CF1 <int>, CF2 <int>, CF3 <int>,
## #   FATALS <int>, DRUNK_DR <int>

For detailed information about the data, see the NHTSA FARS Manuals & Documentation page.

Loading FARS Data

To load all of the data for a given year, use the make_filename() and fars_read() functions, as shown in the previous section.

About the Filename

Use the make_filename() function to build the file name for a given year, either to locate data already stored on your machine or to name a file for saving or loading new data.

BEWARE: re-installing the package may cause your data to be overwritten.

fars_2013_fn <- make_filename(2013)
fars_2013_fn
## [1] "accident_2013.csv.bz2"
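The naming pattern in the output above can be reproduced with a short sketch. The helper name sketch_make_filename() is hypothetical and is not the package's own source; it only assumes that the file name follows the "accident_<year>.csv.bz2" pattern shown above.

```r
# Hypothetical stand-in for make_filename(); NOT the package's actual code.
# Assumption: file names follow the pattern "accident_<year>.csv.bz2".
sketch_make_filename <- function(year) {
  # Coerce to integer so inputs such as "2013" or 2013.0 format cleanly
  year <- as.integer(year)
  sprintf("accident_%d.csv.bz2", year)
}

fname <- sketch_make_filename(2013)
print(fname)

# Check whether a file with that name exists in the current working directory
print(file.exists(fname))
```

Because the result is a bare file name with no directory, whether the file is found depends on the current working directory.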

Single Year

If you wish to look at fatality data for a single year, use the fars_read_years() function with a single year as input. Only the MONTH and year columns are kept. The function returns a list of length one whose first element is a tbl_df (the tidyverse data frame) listing the month and year of each fatal accident.

fars_2014 <- fars_read_years(years = 2014)
fars_2014[[1]]
## # A tibble: 30,056 x 2
##    MONTH  year
##    <int> <dbl>
##  1     1  2014
##  2     1  2014
##  3     1  2014
##  4     1  2014
##  5     1  2014
##  6     1  2014
##  7     1  2014
##  8     1  2014
##  9     1  2014
## 10     1  2014
## # ... with 30,046 more rows
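Since the result is a list even for a single year, the way it is indexed matters. The sketch below uses a small made-up list (toy_result, a hypothetical stand-in for the value returned by fars_read_years()) to show the difference between double and single brackets.

```r
# Toy stand-in for the list returned by fars_read_years(); values are made up
toy_result <- list(data.frame(MONTH = c(1L, 1L, 2L), year = 2014L))

# Double brackets extract the element itself: a data frame
single <- toy_result[[1]]

# Single brackets keep the list wrapper: still a list of length one
still_list <- toy_result[1]

print(is.data.frame(single))
print(is.data.frame(still_list))
print(length(still_list))
```

Use the double-bracket form, as in fars_2014[[1]] above, whenever you need the data frame itself.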

Multiple Years

If you wish to look at fatalities for multiple years, pass a vector of years to the fars_read_years() function, for example fars_read_years(years = c(2013, 2015)) or fars_read_years(2013:2015). Again, this returns a list of tbl_dfs, with each element of the list showing the month and year of each fatal accident in one of the requested years.

fars_3yrs <- fars_read_years(years = 2013:2015)
fars_3yrs
## [[1]]
## # A tibble: 30,202 x 2
##    MONTH  year
##    <int> <int>
##  1     1  2013
##  2     1  2013
##  3     1  2013
##  4     1  2013
##  5     1  2013
##  6     1  2013
##  7     1  2013
##  8     1  2013
##  9     1  2013
## 10     1  2013
## # ... with 30,192 more rows
## 
## [[2]]
## # A tibble: 30,056 x 2
##    MONTH  year
##    <int> <int>
##  1     1  2014
##  2     1  2014
##  3     1  2014
##  4     1  2014
##  5     1  2014
##  6     1  2014
##  7     1  2014
##  8     1  2014
##  9     1  2014
## 10     1  2014
## # ... with 30,046 more rows
## 
## [[3]]
## # A tibble: 32,166 x 2
##    MONTH  year
##    <int> <int>
##  1     1  2015
##  2     1  2015
##  3     1  2015
##  4     1  2015
##  5     1  2015
##  6     1  2015
##  7     1  2015
##  8     1  2015
##  9     1  2015
## 10     1  2015
## # ... with 32,156 more rows
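A list of per-year data frames is often easier to handle once it is stacked into a single table. The following is a minimal sketch in base R; toy_years is a made-up stand-in for the list returned by fars_read_years(), not real FARS data.

```r
# Toy stand-ins for the per-year data frames; values are made up
toy_years <- list(
  data.frame(MONTH = c(1L, 1L, 2L), year = 2013L),
  data.frame(MONTH = c(1L, 3L), year = 2014L)
)

# Stack the per-year data frames into one data frame
combined <- do.call(rbind, toy_years)
print(nrow(combined))

# Number of records contributed by each list element
rows_per_year <- vapply(toy_years, nrow, integer(1))
print(rows_per_year)
```

With the real data, the same pattern applied to fars_3yrs should give one table holding every record from the three years.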

Summarizing FARS Data

The fars_summarize_years() function takes the same argument as fars_read_years() and produces a summary of the number of fatal accidents by month and year:

fars_summary <- fars_summarize_years(2013:2015)
fars_summary
## # A tibble: 12 x 4
##    MONTH `2013` `2014` `2015`
##    <int>  <int>  <int>  <int>
##  1     1   2230   2168   2368
##  2     2   1952   1893   1968
##  3     3   2356   2245   2385
##  4     4   2300   2308   2430
##  5     5   2532   2596   2847
##  6     6   2692   2583   2765
##  7     7   2660   2696   2998
##  8     8   2899   2800   3016
##  9     9   2741   2618   2865
## 10    10   2768   2831   3019
## 11    11   2615   2714   2724
## 12    12   2457   2604   2781
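Each column of the summary adds up to the number of rows in that year's data: the 2013, 2014 and 2015 columns sum to 30,202, 30,056 and 32,166, matching the tibble sizes shown under Multiple Years. The sketch below illustrates that kind of month-by-year count with base R's table() on made-up data; it is an illustration of the idea, not the package's implementation.

```r
# Toy records, one row per accident; values are made up
toy <- data.frame(
  MONTH = c(1L, 1L, 2L, 1L, 2L, 2L),
  year  = c(2013L, 2013L, 2013L, 2014L, 2014L, 2014L)
)

# Cross-tabulate: rows are months, columns are years
counts <- table(MONTH = toy$MONTH, year = toy$year)
print(counts)

# Column sums give yearly totals, which equal the number of rows per year
yearly_totals <- colSums(counts)
print(yearly_totals)
```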

Mapping Fatal Crashes

Finally, the fars_map_state() function takes a state ID number and a year, and maps that state’s fatal accidents, drawing a dot at each accident location. Note that to use this function you will likely need to load the mapdata package.

For a list of the state ID numbers, see page 26 of the FARS Analytical User’s Guide (2015).

library(mapdata)
fars_map_state(53, 2014)  # state ID 53: Washington

fars_map_state(36, 2014)  # state ID 36: New York
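The data preparation behind such a map can be sketched as follows. This is an assumption about the general approach, not the package's actual code: it filters made-up records by the STATE column and uses the LATITUDE and LONGITUD columns listed in The Data section to derive plot limits. The drawing step uses maps::map() and points(), and runs only in an interactive session with the maps package installed.

```r
# Toy records; coordinates are made up for illustration
toy <- data.frame(
  STATE    = c(53L, 53L, 36L),
  LATITUDE = c(47.6, 46.2, 40.7),
  LONGITUD = c(-122.3, -119.1, -74.0)
)

# Keep only the records for the requested state ID
state_id <- 53L
in_state <- toy[toy$STATE == state_id, ]

# Bounding box of the remaining points, used as map limits
lon_range <- range(in_state$LONGITUD)
lat_range <- range(in_state$LATITUDE)
print(lon_range)
print(lat_range)

# Draw state outlines and one small dot per record (assumed approach)
if (interactive() && requireNamespace("maps", quietly = TRUE)) {
  maps::map("state", xlim = lon_range, ylim = lat_range)
  points(in_state$LONGITUD, in_state$LATITUDE, pch = 46)
}
```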