Introduction

Is an increase in the average age of drivers in the USA associated with lower risk of car fatalities?
According to the Centers for Disease Control and Prevention (CDC), young drivers account for 14% of the population but 30% of all healthcare bills related to car accident injuries. In 2019 alone, 2,400 young drivers in the USA died in car accidents and another 258,000 were admitted to hospitals for treatment of their injuries (Center for Disease Control, 2019). These statistics indicate that young drivers face a higher risk of accidents. This paper therefore examines whether the car fatality rate falls as the average age of drivers rises.

Conceptual framework

According to the Centers for Disease Control and Prevention, young drivers face the highest risk of being involved in a car accident, because they are less experienced and more prone to errors of judgement while driving. In addition, young drivers often drive at night and on weekends, times that are considered more dangerous than weekday and daytime driving. According to the 2019 National Youth Risk Behavior Survey, 39% of young drivers texted or emailed while driving, which points to dangerous driving habits among the youth. Given these behavioral patterns, this paper hypothesizes that age is a key determinant of car fatalities and that an increase in age should be associated with less risky behavior and thus fewer car fatalities.

Econometric (regression) model

An OLS model will be run on unbalanced panel data for all US states to answer this paper’s research question: “Is an increase in the average age of drivers in the USA associated with lower risk of car fatalities?”
Other variables that determine car fatalities need to be controlled for. Since this is panel data, the model incorporates dummies for the US states to control for differences between states and dummies for the years to control for changes over time. In addition, the Center for Disease Control (2019), Cohen & Einav (2001), Kufera et al. (2006) and Aarts & Schagen (2006) identify seat belt usage, primary and secondary seat belt enforcement laws, blood alcohol level and car speed, respectively, as key determinants of car fatality rates, so these variables are controlled for in the model. The fatality rate is log transformed and used as the dependent variable, and the age variable is the key explanatory variable that answers the research question. A dummy variable for a blood alcohol limit of 0.08% is incorporated as an explanatory variable to capture the effect of that limit on car fatalities. The other explanatory variables are the seat belt usage rate, dummy variables for 65 and 70 (or higher) mile per hour speed limits, dummy variables for primary and secondary enforcement of seat belt laws, and dummy variables for all US states and years.

The model to be estimated is as follows:

\(log(fatalityrate) = \beta_0 + \beta_1 age + \beta_2 sb\_useage + \delta_1 speed65 + \delta_2 speed70 + \delta_3 primary + \delta_4 secondary + \delta_5 ba08 + \text{(state dummies)} + \text{(year dummies)} + \mu_0\)
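
How the state and year dummies enter the model can be illustrated with a small sketch. The toy data frame below is invented for illustration (three states, two years, made-up ages); only the formula terms mirror the model above. It shows that wrapping a variable in factor() expands it into one dummy column per level, with the first level left out as the baseline.

```r
# Toy panel: 3 hypothetical states observed over 2 years (made-up values).
toy <- data.frame(
  state = rep(c("AK", "AL", "AR"), each = 2),
  year  = rep(c(1983, 1984), times = 3),
  age   = c(34.1, 34.3, 35.0, 35.2, 36.1, 36.0)
)

# model.matrix() builds the design matrix that lm() would use for this formula.
X <- model.matrix(~ age + factor(state) + factor(year), data = toy)

# One column per non-baseline state and year; AK and 1983 are absorbed
# into the intercept because they are the first factor levels.
print(colnames(X))
```

Under this default coding, each state or year coefficient is read relative to the omitted baseline level.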

Data

The unbalanced panel data used in this paper covers the 50 US states and the District of Columbia. A description of the variables is shown in Table 1 below.

data<-read.csv("C:\\Users\\Kevin Meng\\OneDrive\\Desktop\\book1.csv")  #pulling the spreadsheet with the data 

data<-cbind(data$state,data$year,data[,5:8],data[,10], data[,12:14])   #subsetting the data columns I need and dropping the rest

#renaming my data columns
colnames(data)<-c("state","year","fatalityrate","sb_useage","speed65","speed70","ba08","age","primary","secondary") 



#pulling the spreadsheet that describes the variables
descrip<-read.csv("C:\\Users\\Kevin Meng\\OneDrive\\Desktop\\variable1.csv") 

#adding some kable styles to my table
library(kableExtra)  #provides kbl(), the kable_* styles and the %>% pipe
descrip %>%
  kbl(caption = "Table 1: Variable Description") %>%
  kable_classic(full_width = F, html_font = "Cambria")%>%
  kable_classic_2(full_width =F) %>%
  kable_material(c("striped", "hover"))%>%
  kable_styling(bootstrap_options = "striped", full_width = F, position = "left")
Table 1: Variable Description
Variable           Description
log(fatalityrate)  Number of fatalities per million traffic miles (log transformed)
age                Mean age of drivers
sb_useage          Seat belt usage rate
speed65            Dummy variable for 65 miles per hour speed limit
speed70            Dummy variable for 70 or higher miles per hour speed limit
primary            Dummy variable for primary enforcement of seat belt laws
secondary          Dummy variable for secondary enforcement of seat belt laws
ba08               Dummy variable for blood alcohol limit = .08%
state              U.S. state abbreviation
year               Years from 1983 to 1997

The summary statistics of all the variables are shown in Table 2.

summary<-read.csv("C:\\Users\\Kevin Meng\\OneDrive\\Desktop\\summary1.csv")  #pulling the spreadsheet with the data summary


#adding some kable styles to my table

summary %>%
  kbl(caption = "Table 2: Summary Statistics") %>%
  kable_classic(full_width = F, html_font = "Cambria")%>%
  kable_classic_2(full_width =F) %>%
  kable_material(c("striped", "hover"))%>%
  kable_styling(bootstrap_options = "striped", full_width = F, position = "left") %>%
  footnote(general = "Number of Observations: 765")
Table 2: Summary Statistics
Variable      Mean      Standard Deviation  Minimum   Maximum
fatalityrate  0.02149   0.006               0.008327  0.04547
age           35.14     1.698               28.23     39.17
sb_useage     0.5289    0.17                0.06      1
speed65       0.6458    0.479               0         1
speed70       0.07059   0.256               0         1
primary       0.1216    0.327               0         1
secondary     0.4954    0.5                 0         1
ba08          0.1163    0.321               0         1
state         N/A       N/A                 N/A       N/A
year          N/A       N/A                 1983      1997
Note: Number of Observations: 765
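
Summary statistics of this kind can also be computed directly from a data frame rather than read from a prepared spreadsheet. The sketch below uses an invented toy data frame and a hypothetical helper, summarise_column(); neither is part of the original analysis. It also counts the non-missing values per column, which is useful in an unbalanced panel.

```r
# Toy data with one missing seat belt value (made-up numbers).
toy <- data.frame(
  fatalityrate = c(0.020, 0.025, 0.018, 0.030),
  sb_useage    = c(0.40, NA, 0.60, 0.80)
)

# Hypothetical helper: mean, standard deviation, minimum, maximum and
# the number of non-missing observations, ignoring NA values.
summarise_column <- function(x) {
  c(Mean = mean(x, na.rm = TRUE),
    SD   = sd(x, na.rm = TRUE),
    Min  = min(x, na.rm = TRUE),
    Max  = max(x, na.rm = TRUE),
    N    = sum(!is.na(x)))
}

# One row per variable, one column per statistic.
stats <- t(sapply(toy, summarise_column))
print(round(stats, 4))
```

Reporting N per variable makes it visible when a regressor is observed for fewer rows than the full sample.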

Results

To answer the research question, a baseline OLS model is first estimated to examine the association between age and the car accident fatality rate.

\(log(fatalityrate) = \beta_0+\beta_1age+\mu_0\)

However, to control for other variables that the literature reports as determinants of car accident fatalities, the final regression includes the additional independent variables shown below.

\(log(fatalityrate) = \beta_0 + \beta_1 age + \beta_2 sb\_useage + \delta_1 speed65 + \delta_2 speed70 + \delta_3 primary + \delta_4 secondary + \delta_5 ba08 + \text{(state dummies)} + \text{(year dummies)} + \mu_0\)

The results of the OLS model are reported in Table 3.

#running all the eight linear models one at a time

model1<-lm(log(fatalityrate)~age,data)

model2<-lm(log(fatalityrate)~age+sb_useage,data)

model3<-lm(log(fatalityrate)~age+sb_useage+speed65,data)

model4<-lm(log(fatalityrate)~age+sb_useage+speed65+speed70,data)

model5<-lm(log(fatalityrate)~age+sb_useage+speed65+speed70+primary,data)

model6<-lm(log(fatalityrate)~age+sb_useage+speed65+speed70+primary+secondary,data)

model7<-lm(log(fatalityrate)~age+sb_useage+speed65+speed70+primary+secondary+ba08,data)

model8<-lm(log(fatalityrate)~age+sb_useage+speed65+speed70+primary+secondary+ba08+factor(state)+factor(year),data)

#outputting regression results in table format
library(stargazer)  #provides stargazer()

stargazer(model1,model2,model3,model4,model5,model6,model7, model8, type="html",title = "Table 3: OLS Regression Results",out="C:\\Users\\Kevin Meng\\OneDrive\\Desktop\\mymodel1.htm")
Table 3: OLS Regression Results
Dependent variable: log(fatalityrate)
              (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
age           -0.064***  -0.043***  -0.042***  -0.042***  -0.041***  -0.040***  -0.041***  0.091***
              (0.006)    (0.007)    (0.007)    (0.007)    (0.007)    (0.007)    (0.007)    (0.018)
sb_useage                -0.503***  -0.537***  -0.557***  -0.737***  -0.869***  -0.807***  -0.141**
                         (0.059)    (0.063)    (0.064)    (0.073)    (0.089)    (0.091)    (0.063)
speed65                             0.043      0.037      0.084***   0.077***   0.090***   -0.056***
                                    (0.026)    (0.026)    (0.027)    (0.027)    (0.027)    (0.020)
speed70                                        0.067*     0.083**    0.080**    0.101***   0.073***
                                               (0.034)    (0.034)    (0.034)    (0.034)    (0.017)
primary                                                   0.156***   0.243***   0.225***   0.003
                                                          (0.033)    (0.047)    (0.047)    (0.040)
secondary                                                            0.082**    0.058*     -0.014
                                                                     (0.032)    (0.033)    (0.016)
ba08                                                                            -0.096***  -0.044**
                                                                                (0.031)    (0.017)
factor(state)AL -0.537***
(0.105)
factor(state)AR -0.478***
(0.117)
factor(state)AZ -0.431***
(0.096)
factor(state)CA -0.626***
(0.078)
factor(state)CO -0.615***
(0.084)
factor(state)CT -1.294***
(0.130)
factor(state)DC -0.917***
(0.131)
factor(state)DE -0.777***
(0.105)
factor(state)FL -0.796***
(0.162)
factor(state)GA -0.578***
(0.078)
factor(state)HI -0.794***
(0.096)
factor(state)IA -0.800***
(0.127)
factor(state)ID -0.345***
(0.074)
factor(state)IL -0.780***
(0.099)
factor(state)IN -0.814***
(0.101)
factor(state)KS -0.732***
(0.106)
factor(state)KY -0.597***
(0.103)
factor(state)LA -0.323***
(0.075)
factor(state)MA -1.352***
(0.123)
factor(state)MD -0.817***
(0.096)
factor(state)ME -0.924***
(0.123)
factor(state)MI -0.735***
(0.094)
factor(state)MN -0.986***
(0.095)
factor(state)MO -0.697***
(0.114)
factor(state)MS -0.199**
(0.089)
factor(state)MT -0.485***
(0.100)
factor(state)NC -0.572***
(0.108)
factor(state)ND -1.001***
(0.111)
factor(state)NE -0.814***
(0.108)
factor(state)NH -0.999***
(0.099)
factor(state)NJ -1.173***
(0.122)
factor(state)NM -0.237***
(0.083)
factor(state)NV -0.397***
(0.096)
factor(state)NY -0.908***
(0.120)
factor(state)OH -0.882***
(0.106)
factor(state)OK -0.780***
(0.106)
factor(state)OR -0.696***
(0.120)
factor(state)PA -0.972***
(0.136)
factor(state)RI -1.444***
(0.129)
factor(state)SC -0.366***
(0.093)
factor(state)SD -0.631***
(0.105)
factor(state)TN -0.519***
(0.109)
factor(state)TX -0.497***
(0.078)
factor(state)UT -0.252***
(0.045)
factor(state)VA -0.882***
(0.094)
factor(state)VT -0.787***
(0.102)
factor(state)WA -0.888***
(0.094)
factor(state)WI -0.909***
(0.104)
factor(state)WV -0.631***
(0.137)
factor(state)WY -0.487***
(0.081)
factor(year)1984 0.002
(0.055)
factor(year)1985 -0.022
(0.051)
factor(year)1986 -0.008
(0.051)
factor(year)1987 0.004
(0.055)
factor(year)1988 -0.028
(0.056)
factor(year)1989 -0.110*
(0.057)
factor(year)1990 -0.164***
(0.057)
factor(year)1991 -0.226***
(0.059)
factor(year)1992 -0.316***
(0.060)
factor(year)1993 -0.328***
(0.062)
factor(year)1994 -0.355***
(0.063)
factor(year)1995 -0.382***
(0.066)
factor(year)1996 -0.421***
(0.067)
factor(year)1997 -0.438***
(0.069)
Constant      -1.634***  -2.169***  -2.201***  -2.224***  -2.225***  -2.247***  -2.211***  -6.070***
              (0.198)    (0.246)    (0.247)    (0.246)    (0.242)    (0.240)    (0.239)    (0.532)
Observations  765        556        556        556        556        556        556        556
R2            0.144      0.189      0.193      0.198      0.230      0.239      0.252      0.919
Adjusted R2   0.143      0.186      0.189      0.193      0.223      0.231      0.243      0.908
Residual Std. Error 0.264 (df = 763) 0.235 (df = 553) 0.235 (df = 552) 0.234 (df = 551) 0.230 (df = 550) 0.229 (df = 549) 0.227 (df = 548) 0.079 (df = 484)
F Statistic 128.814*** (df = 1; 763) 64.417*** (df = 2; 553) 43.979*** (df = 3; 552) 34.102*** (df = 4; 551) 32.840*** (df = 5; 550) 28.741*** (df = 6; 549) 26.430*** (df = 7; 548) 77.762*** (df = 71; 484)
Note: *p<0.1; **p<0.05; ***p<0.01
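
The number of observations in Table 3 falls from 765 in model 1 to 556 in models 2 to 8. One plausible reason, which is an assumption and not stated in the data description, is that sb_useage is missing for some state-years, since lm() silently drops rows with missing values in any model variable. The sketch below uses invented toy data to show this behavior and how to refit the baseline on the common sample so that models are compared on the same rows.

```r
# Toy data: sb_useage is missing in two of six rows (made-up values).
toy <- data.frame(
  y         = c(1.0, 1.2, 0.9, 1.5, 1.1, 1.3),
  age       = c(34, 35, 33, 36, 35, 37),
  sb_useage = c(0.5, NA, 0.6, NA, 0.7, 0.8)
)

m1 <- lm(y ~ age, data = toy)              # uses all 6 rows
m2 <- lm(y ~ age + sb_useage, data = toy)  # rows with NA are dropped

# Refit the baseline on the rows that every model can use.
common <- toy[complete.cases(toy), ]
m1_cc  <- lm(y ~ age, data = common)

print(c(m1 = nobs(m1), m2 = nobs(m2), m1_common = nobs(m1_cc)))
```

Estimating every specification on the common sample separates the effect of adding controls from the effect of changing the sample.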

From the results in Table 3, the estimated age coefficient is negative and statistically significant at the 1% level in models 1 to 7 (between -0.064 and -0.040), so a one-year increase in the average age of drivers is associated with a fatality rate that is roughly 4% to 6% lower, which supports this paper’s hypothesis. In model 8, which adds the state and year dummies, the coefficient is again significant at the 1% level and its magnitude is approximately 9% per year of average age, but Table 3 reports it with a positive sign (0.091), so the direction of the association in the full specification does not match models 1 to 7 and should be interpreted with caution. The hypothesized mechanism is that as age increases, drivers gain more driving experience, are less likely to drive at night and on weekends, and are less likely to engage in distracting, risky behavior such as texting while driving. According to the CDC, these are the main risk factors for young drivers, and they are expected to fade with experience.
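
Because the dependent variable is in logs, a coefficient is only approximately a percentage change. As a worked check, using the age coefficients reported in Table 3 for models 7 and 8, the exact change in the fatality rate for a one-year increase in average age is:

```latex
\[
\%\Delta\, fatalityrate = 100\left(e^{\beta_1 \Delta age} - 1\right)
\]
\[
\beta_1 = -0.041:\quad 100\left(e^{-0.041} - 1\right) \approx -4.0\%
\qquad
\beta_1 = 0.091:\quad 100\left(e^{0.091} - 1\right) \approx 9.5\%
\]
```

The usual shortcut of reading 100 times the coefficient gives -4.1% and 9.1%, so it is close for coefficients of this size. The sign of the coefficient gives the direction of the change.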

Approximately 91% (adjusted R-squared of 0.908) of the variation in the log fatality rate is explained by the independent variables in model 8.

In model 8, the seat belt usage coefficient is -0.141 and is statistically significant at the 5% level. Because usage is measured as a proportion between 0 and 1, a 1 percentage point increase in seat belt usage is associated with a fatality rate that is about 0.14% lower. This is in line with the findings of the Center for Disease Control and Cohen & Einav (2001) that seat belts do indeed reduce road fatalities. In agreement with Cohen & Einav (2001), the model provides no evidence for the Risk Compensation Theory. According to Dr. Vincent Ho, “people tend to adjust their behavior in response to perceived change in risk”. On this evidence, the theory does not hold for seat belts: seat belt usage does not appear to have an indirect adverse effect that encourages careless driving.
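
The size of the seat belt effect depends on the units of sb_useage. Table 2 reports it as a proportion between 0.06 and 1, so, assuming that scale, a 1 percentage point increase is a change of 0.01 in the regressor. A worked sketch with the model 8 coefficient from Table 3:

```latex
\[
\Delta \log(fatalityrate) = \beta_2 \,\Delta sb\_useage = -0.141 \times 0.01 = -0.00141 \approx -0.14\%
\]
\[
\text{full change from 0 to 1:}\quad 100\left(e^{-0.141} - 1\right) \approx -13.2\%
\]
```

The coefficient of -0.141 therefore describes a move from no usage to full usage, not a single percentage point.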

Conclusion

In conclusion, there is statistical evidence at the 1% significance level that an increase in the average age of drivers is associated with a lower car fatality rate in the USA after controlling for the other variables that the literature identifies as determinants of car fatality rates. This is because younger drivers are less experienced and tend to engage in risky behavior while driving, behavior that is expected to decline as they mature.

References

Aarts, L., & Schagen, I. (2006). Driving speed and the risk of road crashes: A review. Accident Analysis and Prevention, 38, 215-224. https://doi.org/10.1016/j.aap.2005.07.004

Center for Disease Control. (n.d.). Teen Drivers: Get the Facts | Motor Vehicle Safety | CDC Injury Center. Centers for Disease Control and Prevention. Retrieved July 5, 2022, from https://www.cdc.gov/transportationsafety/teen_drivers/teendrivers_factsheet.html

Cohen, A., & Einav, L. (2001, October). The Effects Of Mandatory Seat Belt Laws On Driving Behavior And Traffic Fatalities. Harvard Law School. Retrieved July 5, 2022, from http://www.law.harvard.edu/programs/olin_center/papers/pdf/341.pdf

Kufera, J. A., Soderstrom, C. A., Dischinger, P. C., Ho, S. M., & Shepard, A. (2006). Crash culpability and the role of driver blood alcohol levels. Annu Proc Assoc Adv Automot Med, 50, 91-106. PMID: 16968631; PMCID: PMC3217472.

Safe Corner & Ho, V. (n.d.). What is the Risk Compensation Theory? HKARMS. Retrieved July 5, 2022, from https://www.hkarms.org/Safety_Corner/2014-02%20What%20is%20the%20risk%20compensation%20theory.pdf
