Analyzing the Impact of Commercial Freight Vehicles on Traffic Safety in Metro Atlanta
Background
As an Atlanta resident, I have frequently experienced the heavy highway traffic that the metro area is well known for, and I’ve seen the many accidents that occur on our highways. Thinking more deeply about possible causes of and solutions to this problem, I hypothesized, based on anecdotal evidence, that the heavy presence of freight traffic (commercial vehicles, semi trucks) within the metro area could be one of the biggest causes of congestion and accidents. Once, I was even involved in an accident on I-75 just miles from Midtown Atlanta, where another driver swerved into me after merging too late because they had been stuck next to a semi truck for too long. I wondered how often accidents like this occur.
Earlier this semester, I read that GDOT is working on a Commercial Vehicle Lane (CVL) Project [1] that would separate truck traffic into dedicated lanes alongside I-75 North between Macon and McDonough. Upon researching this concept further, I found other examples of this type of system from across the world. In New Jersey, a study [2] found that accidents were more likely overall in mixed-use lanes than in segregated car-only lanes, and that trucks were involved in a disproportionately high share of accidents in the mixed-use lanes. A study from Dallas, Texas [3] used a simulation model to predict that segregated truck-only lanes could reduce travel times in all lanes. A rendered photo of GDOT’s proposed Commercial Vehicle Lanes can be seen below.
Here in Atlanta, regulations prohibit trucks from using the interstates and highways within the perimeter (I-285), which heavily reduces the number of trucks on those roads and keeps truck flow very high on the I-285 bypass. However, as the metro has grown immensely in the past few decades, I-285 no longer serves as an effective “bypass” of the city. The urbanized region has extended far beyond the I-285 boundary and deep into the suburban counties surrounding the city. Areas like Cumberland & Perimeter, located near interchanges of I-285, have also grown into central business districts that generate heavy commuter and consumer traffic. As a result, I-285 is used very heavily today for local traffic while also being the only bypass option for through traffic like trucks. I suggest that, given the sprawl of the city, we must think about how to better serve the needs of metro Atlanta residents (local traffic), and that projects like Commercial Vehicle Lanes or truck bypasses could be a solution. In this project, though, I will look deeper into just one aspect of this larger problem: road safety.
Research Goals
In this project, I wish to better understand the impact that commercial vehicles have on the safety of drivers in the Atlanta metro, and whether the implementation of truck-only lanes (as seen in GDOT’s upcoming I-75 Commercial Vehicle Lanes project) or a truck bypass of the entire metro could lead to safer driving conditions for metro Atlanta residents. To do so, I will analyze accident data from different segments of highway around Atlanta. I would like to find out not only the impact that trucks have when they are involved in accidents, but also whether the frequency and severity of all accidents differ on routes with varying rates of truck traffic flow. In doing so, I am essentially comparing outcomes between routes like GA400 and any highway within the perimeter (I-285), where truck flow is legally limited, and routes like I-285 and the highways that extend outward from it, where truck flow is much higher. With this information, I would like to model the difference in projected accident frequency and severity in the event that truck flow was reduced in the metro through one of the methods mentioned above.
Importing, Cleaning, Merging, & Calculating the Data
Importing the Accident Report Data
I used GDOT’s Crash Data Dashboard to download a dataset including all accidents reported in the Atlanta Metro between 2021 & 2023. In the code below, I cleaned the data for analysis.
| date_time | county | road | fatalities | serious_injuries | visible_injuries | complaint_injuries | num_vehicles | latitude | longitude |
|---|---|---|---|---|---|---|---|---|---|
| 03/20/2023 06:43 AM | Jackson | Sr 60 | 0 | 0 | 0 | 1 | 2 | 34.12859 | -83.71811 |
| 03/29/2023 09:14 PM | Fulton | Woodstock Rd | 0 | 0 | 0 | 0 | 2 | 34.05890 | -84.38351 |
| 03/24/2023 06:45 PM | DeKalb | Chamblee Dunwoody Rd | 0 | 0 | 0 | 0 | 2 | 33.95150 | -84.33661 |
| 03/24/2023 08:45 AM | Fulton | I 75 | 0 | 0 | 0 | 0 | 2 | 33.74518 | -84.39013 |
| 03/25/2023 07:20 PM | Hall | Capitola Farm Rd | 0 | 0 | 1 | 2 | 2 | 34.15721 | -83.89668 |
In the following code, I spatially joined the two sf dataframes to produce a final dataframe with a binary column ‘involved_truck’ that specifies, for every accident in the full dataset, whether a truck was involved.
Next, I used a search pattern to filter the dataset so that it only includes accidents that occurred on one of the metro Atlanta highways I am studying. The method was very effective at filtering out accidents on other roads, but it is not perfect because road names are recorded inconsistently in GDOT’s data. However, the results of the filtering are sufficient for their purpose: a map-based visualization of accidents.
| date_time | county | road | fatalities | serious_injuries | visible_injuries | complaint_injuries | num_vehicles | geometry | involved_truck |
|---|---|---|---|---|---|---|---|---|---|
| 03/24/2023 08:45 AM | Fulton | I 75 | 0 | 0 | 0 | 0 | 2 | POINT (-84.39013 33.74518) | 1 |
| 03/22/2023 08:48 PM | Fulton | 285 | 0 | 0 | 0 | 0 | 2 | POINT (-84.42885 33.90928) | 1 |
| 03/20/2023 12:47 AM | Fulton | I-85 | 0 | 0 | 0 | 2 | 2 | POINT (-84.39241 33.79218) | 0 |
| 03/17/2023 09:15 AM | DeKalb | 285 | 0 | 0 | 0 | 0 | 2 | POINT (-84.23135 33.73162) | 1 |
| 03/10/2023 08:16 AM | Fulton | 285 | 0 | 0 | 0 | 0 | 2 | POINT (-84.35884 33.9108) | 0 |
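The search-pattern filter described above can be sketched with a regular expression. This is only an illustration in Python, not the code used for this report: the pattern, the function name `is_study_highway`, and the list of route numbers (taken from the highways named in this report) are all assumptions.

```python
import re

# Hypothetical pattern (an assumption, not the report's actual filter):
# an optional "I", "I-" or "Interstate" prefix followed by a studied route
# number, or a GA/SR prefix followed by 400.
HIGHWAY_PATTERN = re.compile(
    r"^\s*(?:(?:i|interstate)[\s-]*)?(75|85|285)\s*$"
    r"|^\s*(?:ga|sr|state route)[\s-]*400\s*$",
    re.IGNORECASE,
)

def is_study_highway(road_name: str) -> bool:
    """Return True when the road name looks like one of the studied highways."""
    return bool(HIGHWAY_PATTERN.match(road_name))

# Road names as they appear in the sample rows, plus "GA400" from the text.
roads = ["I 75", "285", "I-85", "Sr 60", "Woodstock Rd",
         "Chamblee Dunwoody Rd", "GA400"]
kept = [road for road in roads if is_study_highway(road)]
print(kept)
```

A pattern like this handles the spelling variants visible in the sample rows ("I 75", "I-85", bare "285"), but a bare number such as "85" could also refer to a state route, which is one way inconsistent naming can let a few wrong rows through.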
GDOT Traffic Flow Data
I used GDOT’s Traffic Analysis & Data Application to download a dataset that includes average traffic flow (daily number of vehicles) and the percentage of that flow made up of trucks (freight vehicles) at specified sensors along different highways in metro Atlanta.
Each point was used to estimate flow & truck percentage values for short segments of highway defined later in this report. Note that each point covers only one direction of the highway, and GDOT did not provide values for both directions on several highways. Therefore, in this report, I assume that flow rates are the same in each direction of traffic. In the code below, I clean GDOT’s traffic flow dataset and select the sensors relevant to the highways I am studying.
A small number of the sensors I’m studying had missing values for the truck percentage. For these points, I used imputation to estimate the percentage based on values from the same highway farther from the city center and on the difference between points near and far from the city on other highways.
| id | class | geometry | avg_flow | avg_truck_per | avg_num_truck |
|---|---|---|---|---|---|
| 015-0276 | 1U : Urban Principal Arterial - Interstate | POINT (-84.75219 34.22141) | 83566.67 | 25.76667 | 21521 |
| 035-0127 | 1R : Rural Principal Arterial - Interstate | POINT (-84.07545 33.22593) | 91766.67 | 25.53333 | 23440 |
| 063-1192 | 1U : Urban Principal Arterial - Interstate | POINT (-84.3896 33.59806) | 196666.67 | 11.60000 | 22813 |
| 063-1201 | 1U : Urban Principal Arterial - Interstate | POINT (-84.3764 33.64036) | 145666.67 | 15.03333 | 22034 |
| 063-1207 | 1U : Urban Principal Arterial - Interstate | POINT (-84.43306 33.61893) | 182333.33 | 17.40000 | 31726 |
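One possible reading of the imputation described above is additive: take the truck percentage at a sensor farther out on the same highway and shift it by the average near-minus-far difference seen on other highways. The sketch below (in Python) illustrates that reading only. The function name, the additive form, and every number in it are assumptions, not values from the GDOT data or the method actually used.

```python
from statistics import mean

def impute_truck_pct(far_pct_same_highway, near_far_pairs_other_highways):
    """Estimate a missing near-city truck percentage (hypothetical approach).

    far_pct_same_highway: truck % at a sensor farther out on the same highway.
    near_far_pairs_other_highways: (near_pct, far_pct) pairs observed on
        other highways.
    """
    # Average change in truck % when moving from a far sensor to a near one.
    avg_shift = mean(near - far for near, far in near_far_pairs_other_highways)
    return far_pct_same_highway + avg_shift

# Made-up illustrative values, not taken from the dataset.
pairs = [(12.0, 20.0), (15.0, 25.0)]
estimate = impute_truck_pct(24.0, pairs)
print(estimate)
```

A ratio-based shift (scaling instead of adding) would be an equally plausible variant; the choice matters most when truck percentages differ a lot between highways.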
Highway Segments
I imported a shapefile of all highways in Georgia for mapping purposes. I then created a linestring spatial feature object for each segment of highway I will be analyzing.
| distance | id | geometry |
|---|---|---|
| 5842.47 | 1 | LINESTRING (-84.47792 33.62… |
| 5015.05 | 2 | LINESTRING (-84.37872 33.63… |
| 8052.01 | 3 | LINESTRING (-84.31866 33.67… |
| 19894.57 | 4 | LINESTRING (-84.23868 33.71… |
| 6883.27 | 5 | LINESTRING (-84.27076 33.90… |
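The distance column appears to hold each segment’s length in meters. As a rough illustration of how such a length can be derived from a linestring’s coordinates, here is a Python sketch that sums great-circle (haversine) distances between consecutive points. The coordinates are hypothetical, and a spherical Earth is assumed, so the results would differ slightly from lengths computed by a GIS library on an ellipsoid.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def linestring_length_m(coords):
    """Sum of haversine distances along a list of (lon, lat) points."""
    return sum(haversine_m(*p, *q) for p, q in zip(coords, coords[1:]))

def meters_to_miles(meters):
    return meters / 1609.344

# Hypothetical coordinates, not one of the report's segments.
segment = [(-84.40, 33.60), (-84.39, 33.61), (-84.38, 33.63)]
print(round(linestring_length_m(segment), 1))

# Converting the first segment's length from the table to miles.
print(round(meters_to_miles(5842.47), 2))
```

The unit conversion on the last line is what links a length in meters to the per-mile rates used later in the report.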
TomTom Traffic Speed Data
My next data source is TomTom, which measures traffic backups and speeds based on cell phone GPS data. From their website, I specified 44 segments of highway (in both directions of traffic) from across the metro area in order to study the impact of traffic speeds.
First, I read each of the 44 shapefiles from a folder and assigned IDs.
Next, I imported and cleaned a data file that included variable data for each segment.
| route_id | route | route_description | time | length_m | miles | avg_time | avg_speed | differential | distance | geometry |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 285 Counter, 85 to 75 | heavy_truck | overnight | 5842.47 | 3.63 | 00:03:20 | 65.22 | 1.00 | 5842.47 | LINESTRING (-84.47792 33.62… |
| 1 | 285 Counter, 85 to 75 | heavy_truck | am_rush | 5842.47 | 3.63 | 00:03:21 | 64.93 | 1.00 | 5842.47 | LINESTRING (-84.47792 33.62… |
| 1 | 285 Counter, 85 to 75 | heavy_truck | midday | 5842.47 | 3.63 | 00:03:27 | 63.20 | 1.03 | 5842.47 | LINESTRING (-84.47792 33.62… |
| 1 | 285 Counter, 85 to 75 | heavy_truck | pm_rush | 5842.47 | 3.63 | 00:03:38 | 59.84 | 1.09 | 5842.47 | LINESTRING (-84.47792 33.62… |
Note: The data I gathered is from the entirety of August 2024, as my free trial only allowed for the download of one month’s data.
Note: From the TomTom data, the most useful variable I will include in my models is the speed differential value. In the dataset I downloaded, I gathered travel times (and speeds) for four time periods each day: morning rush hour (7am-10am), midday (12pm-2pm), evening rush hour (3:30pm-7pm), and overnight (12am-2am). The differential value for each row is the ratio of the base speed, defined as the overnight speed, to the average speed in the given time period. For example, a morning rush value of 1.5 would mean that the overnight speed divided by the morning speed equals 1.5. A higher differential value means a bigger gap between overnight and daytime travel times, or in other words, more traffic and delays. This data will allow me to use a measure of traffic as an independent variable in modeling.
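As a quick check of the differential definition, the short Python sketch below recomputes it from the average speeds shown for route 1 in the table above. The function name is my own; only the speeds come from the table.

```python
def speed_differential(overnight_speed, period_speed):
    """Ratio of the overnight (base) speed to the speed in a given period."""
    return overnight_speed / period_speed

# Average speeds for route 1 ("285 Counter, 85 to 75") from the table above.
speeds = {"overnight": 65.22, "am_rush": 64.93, "midday": 63.20, "pm_rush": 59.84}

differentials = {
    period: round(speed_differential(speeds["overnight"], speed), 2)
    for period, speed in speeds.items()
}
print(differentials)
```

Rounded to two decimals, these ratios match the differential column for that route (1.00, 1.00, 1.03, 1.09).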
Merging the Traffic, Accident & Flow Data
| route_id | route | route_description | time | length_m | miles | avg_time | avg_speed | differential | distance | geometry | fatalities_mi_yr | accidents_mi_yr | vehicles_mi_yr | injuries_mi_yr | avg_truck_per | avg_flow | fatalities_mi_yr_flow | accidents_mi_yr_flow | vehicles_mi_yr_flow | injuries_mi_yr_flow | is_285 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 285 Counter, 85 to 75 | heavy_truck | overnight | 5842.47 | 3.63 | 00:03:20 | 65.22 | 1.00 | 5842.47 | LINESTRING (-84.47792 33.62… | 2.020202 | 406.9789 | 829.2011 | 11.20294 | 17.4 | 182333.3 | 1.11e-05 | 0.0022321 | 0.0045477 | 6.14e-05 | 1 |
| 1 | 285 Counter, 85 to 75 | heavy_truck | am_rush | 5842.47 | 3.63 | 00:03:21 | 64.93 | 1.00 | 5842.47 | LINESTRING (-84.47792 33.62… | 2.020202 | 406.9789 | 829.2011 | 11.20294 | 17.4 | 182333.3 | 1.11e-05 | 0.0022321 | 0.0045477 | 6.14e-05 | 1 |
| 1 | 285 Counter, 85 to 75 | heavy_truck | midday | 5842.47 | 3.63 | 00:03:27 | 63.20 | 1.03 | 5842.47 | LINESTRING (-84.47792 33.62… | 2.020202 | 406.9789 | 829.2011 | 11.20294 | 17.4 | 182333.3 | 1.11e-05 | 0.0022321 | 0.0045477 | 6.14e-05 | 1 |
| 1 | 285 Counter, 85 to 75 | heavy_truck | pm_rush | 5842.47 | 3.63 | 00:03:38 | 59.84 | 1.09 | 5842.47 | LINESTRING (-84.47792 33.62… | 2.020202 | 406.9789 | 829.2011 | 11.20294 | 17.4 | 182333.3 | 1.11e-05 | 0.0022321 | 0.0045477 | 6.14e-05 | 1 |
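The column names and values in the merged table suggest two normalization steps: counts are expressed per mile per year, and each of those rates is then divided by the average daily flow to give the "_flow" columns. The report does not state these formulas, so the Python sketch below is an inference from the table, with function names of my own. The example count passed to `rate_per_mile_year` is made up purely to show the arithmetic.

```python
def rate_per_mile_year(count, miles, years):
    """Events per mile per year for a segment (assumed definition)."""
    return count / (miles * years)

def rate_per_unit_flow(rate_mi_yr, avg_daily_flow):
    """Per-mile-per-year rate divided by average daily flow (assumed definition)."""
    return rate_mi_yr / avg_daily_flow

# Made-up count, only to illustrate the first step.
print(rate_per_mile_year(count=30, miles=2.0, years=3))

# Values for route 1 taken from the merged table above.
accidents_mi_yr = 406.9789
avg_flow = 182333.3
accidents_mi_yr_flow = rate_per_unit_flow(accidents_mi_yr, avg_flow)
print(round(accidents_mi_yr_flow, 7))
```

Dividing by flow makes segments with very different traffic volumes comparable, since a busier road would be expected to see more accidents even at the same per-vehicle risk.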
Maps
Because I now have spatial data, the best way to analyze it is by mapping the data to visualize where accidents are occurring, where traffic is backed up, and which segments of highway have a lot of trucks on them!
Mapping Traffic on Atlanta Highways:
In these maps, the truck flow percentage is indicated by segment line width.
By mapping the speed differentials, I can get a better view of where traffic backups occur. I mapped differentials in the evening rush hour for traffic exiting Atlanta and in the morning rush hour for traffic entering Atlanta. The morning differentials are much higher on the highways on the north side of Atlanta than on the south side, possibly indicating a larger number of commuters using those roadways. Differentials are also far higher on the portions of highway outside of the perimeter, and lower inside the perimeter where truck traffic is limited. In the afternoon, the differentials are more evenly distributed across the region, although they are again higher outside of the perimeter than inside. Interestingly, the highest and lowest differentials happen to be on I-85 north of I-285 and on I-85 south of I-285. These two maps indicate that there could be a relationship between the truck percentage on a highway and the buildup of rush hour traffic. Additionally, the fact that differentials are higher going towards the city rather than away could be a result of heavy congestion at each of the I-285 interchanges, which are also heavily impacted by the number of trucks on I-285. Lastly, I note that speed differentials are actually highest on many portions of I-285, indicating that it is the highway most impacted by traffic delays in the metro.
Mapping Accident Frequency:
In these maps, the truck flow percentage is indicated by segment line width.