Abstract

Using a logistic regression model that controls for neighbourhood characteristics and traffic changes over time in Toronto, along with collision-level controls, I find a statistically significant effect of uncontrolled intersections on major pedestrian injuries, but not of stop signs. Without any controls, the odds that a pedestrian injury is major are about 1.6 times as large at an uncontrolled intersection as at a traffic signal intersection, that is, about 62% higher. This rises to roughly 2.6 times, or about 160% higher, when collision-level controls and neighbourhood and time fixed effects are included. Furthermore, aggressive driving and pedestrian condition play an important role in major injuries. On average, the odds of a major injury are roughly 3.6 times as large when the driver was driving aggressively and roughly 3.9 times as large when the pedestrian had, at minimum, been drinking. Policy recommendations based on this data set and analysis include higher penalties for aggressive drivers and, given recent cannabis legalization, perhaps applying to cannabis public consumption laws similar to those for alcohol, which could reduce major pedestrian injuries. However, more research should be conducted for each policy recommendation.

Introduction

The economic question of traffic controls can be simplified to one key point: whether a stop sign will decrease major pedestrian injuries, given that installing a stop sign at an intersection costs less than constructing a traffic signal. Generally, one would expect uncontrolled intersections to be more dangerous, as the motorist is given more freedom in decision making in the hope of improving traffic flow. The trade-off relative to an uncontrolled intersection is that a stop sign would reduce traffic flow, and this cost might outweigh the benefit of a statistical decrease in major injuries. However, one must also be conscious of the healthcare benefits of decreasing major injuries. Because Canada’s healthcare system is public, a decrease in major pedestrian injuries would allow for shorter wait times and savings on healthcare expenditure. The injury rate from collisions at uncontrolled intersections is therefore an important consideration.

In this paper, I aim to find whether uncontrolled intersections have a causal effect on the occurrence of a major pedestrian injury in Toronto. The method I employ relies on selection on observables: conditional on observed characteristics, the estimated effect of uncontrolled intersections on major pedestrian injuries should be causal. I use a logistic model, with an ordinary least squares model as a benchmark, to estimate this effect. Furthermore, I use neighbourhood fixed effects to control for differences in traffic characteristics across the city, which helps control for heterogeneity. Time fixed effects are also included to control for changes in traffic characteristics over time. The idea is that, conditional on the above, traffic controls should be as good as randomly assigned to areas of the city. Another important variable that is missing is the driver’s skill, or rather ability. I use the age of the driver involved in the accident as a proxy for the driver’s ability. The rationale is that driving on public roads should not be overly difficult, and thus experience should be important. Therefore, the older the driver, the more experienced the driver should generally be. This, however, is not entirely true, as the drivers in this data set could have obtained their driver’s licenses at different ages, so experience need not depend on age specifically. A better proxy could have been each driver’s road test score; however, I do not have access to this data.

The main results are significant: collisions at uncontrolled intersections are, on average, more likely to result in major pedestrian injuries. If an incident occurs at an uncontrolled intersection, the odds of a major injury to the pedestrian are about 1.6 times the odds for collisions at traffic signals, or 62% higher, and roughly 2.6 times after including collision-level controls and neighbourhood and time fixed effects. Moreover, two large predictors of major pedestrian injuries are driver negligence and pedestrian substance use.

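# Packages that provide the functions used in the code below
library(tidyverse)   # dplyr, tidyr, forcats, ggplot2
library(lubridate)   # month(), ymd()
library(knitr)       # kable()
library(plotly)      # ggplotly()
library(sandwich)    # vcovHC()
library(arm)         # bayesglm()
library(stargazer)   # stargazer()
df <- read.csv("~/Desktop/York Econ/ECON5280/Pedestrians.csv")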
df.ped <- df %>% dplyr::filter(INVTYPE== "Pedestrian")
df.driver <- df %>% dplyr::filter(INVTYPE== "Driver") %>% 
  dplyr::select(ACCNUM, INVAGE) %>% 
  rename(driver.age=INVAGE)
df.driver.age.ped <- na.omit(left_join(df.ped, df.driver, by = "ACCNUM"))
df.working.not.incl.fatal <- df.driver.age.ped %>% 
  dplyr::filter(INVAGE!="unknown",
                INVAGE!="Over 95",
                driver.age!="unknown",
                driver.age!="Over 95",
                driver.age!="0 to 4",
                driver.age!="10 to 14",
                driver.age!="5 to 9",
                INJURY!="Fatal",
                PEDCOND %in% c("Ability Impaired, Alcohol", 
                               "Ability Impaired, Alcohol Over .80", 
                               "Ability Impaired, Drugs", 
                               "Had Been Drinking" ,"Normal"), 
                Hood_Name!="Other",
                TRAFFCTL %in% c("No Control", "Stop Sign", "Traffic Signal"),
                VISIBILITY %in% c("Clear", "Snow", "Rain"),
                RDSFCOND!= "Loose Sand or Gravel", 
                RDSFCOND!= "Other") %>% 
  dplyr::select(YEAR, DATE, driver.age, TRAFFCTL, VISIBILITY, LIGHT,
         RDSFCOND, INVAGE, INJURY, PEDCOND, PEDACT, 
         SPEEDING, AG_DRIV, ALCOHOL, Hood_Name)
df1 <- df.working.not.incl.fatal %>% 
  separate(DATE, "Date", sep=10) %>% 
  mutate(Major.inj = if_else(INJURY=="Major", 1, 0),
         Traff.ctl = fct_collapse(TRAFFCTL, 
                                  "Stop" = c("Stop Sign"),
                                  "Uncontrolled" = c("No Control"),
                                  "Signal" = c("Traffic Signal")),
         Ped.sub = fct_collapse(PEDCOND,
                                      "1"= c("Ability Impaired, Alcohol", 
                                             "Ability Impaired, Alcohol Over .80", 
                                             "Ability Impaired, Drugs", "Had Been Drinking"),
                                      "0"= c("Normal")),
         Alcohol = if_else(ALCOHOL=="Yes", 1, 0),
         Agg.drive = if_else(AG_DRIV=="Yes", 1, 0),
         Speeding = if_else(SPEEDING=="Yes", 1, 0),
         Road.cond = fct_collapse(RDSFCOND,
                                   "Dry"= c("Dry"),
                                   "Wet"=c("Wet"),
                                   "Icy"=c("Ice", "Slush", "Packed Snow", "Loose Snow")),
         Year = as_factor(YEAR),
         vis = as_factor(VISIBILITY), 
         Light = if_else(LIGHT=="Daylight" | LIGHT== "Daylight artifical", 1, 0)) %>% 
  rename(Ped.age=INVAGE,
         Driver.age=driver.age,
         Month=Date)
df1 <- dplyr::select(df1, Ped.age, Driver.age, Month, Year, Major.inj, Traff.ctl,
                  Ped.sub, Alcohol, Agg.drive, Speeding, 
                  Road.cond, vis, Light, Hood_Name)
df1$Month <- month(ymd(df1$Month))
df1$Month <- factor(df1$Month)

Data

The data is taken from the Toronto Police Service Public Safety Data Portal. The data set includes pedestrians who were involved in a vehicle collision. Each record is observed at the time of the accident and includes location variables such as longitude, latitude, and neighbourhood. Moreover, the data includes variables such as age, street, district, traffic control type, fatality, injury, visibility, whether the motorist had any alcohol in their system, speeding, aggressive driving, road condition, and whether the pedestrian was under the influence of alcohol or drugs at the time of the accident. First, I filter the data to remove values that are uninformative or unsuitable for the empirical model. It is important to note that fatalities are also filtered out at this stage, as I only want to look at injuries.

The data used for the summary statistics and the empirical model is named ‘df1’. The code above cleans the data and generates the treatment variable, the type of traffic control. This variable takes on three values: stop sign, uncontrolled, and traffic signal. The other traffic control types are not relevant to my specific research question. The dependent variable is ‘Major.inj’, a dummy variable that takes on 1 if the pedestrian received a major injury and 0 if the pedestrian received a less severe injury or none. A related problem with this data set and research question is that pedestrians who do not receive any type of injury are less likely to report the collision to the police. In the data frame including pedestrian and driver information, ‘df.driver.age.ped’, the injury variable had 319 fatalities, 1649 major, 37 minimal, and 86 minor injuries, and 15 people with no injuries. Therefore, to maximize the variation in the data set, the dependent variable takes on 0 for pedestrians who had minimal, minor, or no injuries. The importance of the research question is not diminished, as the increased wait times and cost of major injuries are still of concern, especially given the relative cheapness of installing a stop sign. Furthermore, the definitions of major, minimal, and minor are not specified anywhere, and I will have to make do with this structure.

Continuing, a factor collapse generates the variable for the condition of the pedestrian, which takes on 1 if the pedestrian is impaired in any way by drugs or alcohol and 0 if the condition is normal. Speeding, aggressive driving, and alcohol are turned into dummy variables. A factor collapse for road conditions also groups all snow- and ice-related road conditions, as weather effects are needed as controls. Lastly, a dummy variable for daylight is created to control for driving at night.

table.traff.ctl <- df1 %>% 
  group_by("Type of Traffic Control" =Traff.ctl) %>% 
  summarize(
    "Number of Major Injuries" = sum(Major.inj),
    "Average Major Injuries" =round(mean(Major.inj),4),
    "Number of Alcohol Related Incidences" = sum(as.numeric(Alcohol)),
    "Number of Accidents involving Aggressive Driving" = sum(Agg.drive))
table.traff.ctl <- t(table.traff.ctl)
kable(table.traff.ctl, caption="Traffic Control Statistics") #%>% kable_styling(latex_options = c("hold_position"))
Table 1: Traffic Control Statistics

Type of Traffic Control                            Uncontrolled     Stop   Signal
Number of Major Injuries                                    393       74      516
Average Major Injuries                                   0.9357   0.9487   0.8990
Number of Alcohol Related Incidences                         27        4        8
Number of Accidents involving Aggressive Driving            121       62      412


Table 1 includes summary statistics grouped by traffic control type. Most collisions are recorded at uncontrolled intersections and at traffic signals. Only 74 major injuries are recorded at stop signs, while uncontrolled intersections and traffic signals have 393 and 516, respectively. The proportion of injuries that are major is highest at stop signs, slightly lower at uncontrolled intersections, and lowest at traffic signals. However, these differences are most likely influenced by the number of observations in each group. Furthermore, the number of alcohol-related incidents is higher at uncontrolled intersections. Intuitively, this could be explained by the combination of poor decision making under alcohol and the degree of freedom the motorist has in deciding whether to stop at an uncontrolled intersection. Lastly, the number of collisions involving aggressive driving is highest at intersections controlled by traffic signals.
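
As a rough cross-check, the proportions in Table 1 can be restated as odds. This is only a sketch based on the rounded table values; the subscripted labels are introduced here for illustration and do not appear in the original analysis.

\[\begin{align}
\text{odds}_{\text{Uncontrolled}} &= \frac{0.9357}{1-0.9357} \approx 14.6, \\
\text{odds}_{\text{Stop}} &= \frac{0.9487}{1-0.9487} \approx 18.5, \\
\text{odds}_{\text{Signal}} &= \frac{0.8990}{1-0.8990} \approx 8.9, \\
\frac{\text{odds}_{\text{Uncontrolled}}}{\text{odds}_{\text{Signal}}} &\approx 1.6.
\end{align}\]

Because all three proportions are close to 1, small differences in proportions translate into large differences in odds.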

table.hood <- df1 %>% 
  mutate(Hood = fct_lump(Hood_Name, n=30)) %>% 
  group_by(Hood) %>% 
  summarize(
    Total.Injuries= length(Major.inj),
    prob.inj = mean(Major.inj)) %>% 
  filter(Hood!="Other")

p1 <- ggplot(table.hood) +
  geom_point(aes(x= Hood, y = prob.inj , size = Total.Injuries, color = Total.Injuries)) +
  coord_flip() +
  labs(x = "Neighbourhood", y = "Average Probability of Major Injury", 
       title = "Neighbourhood by Average Major Injury", col = "Total # of Incidents") + 
  theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position = "bottom",
          legend.box = "vertical", plot.title = element_text(size = 12, hjust = 0.5, 
                                                             color = "black", face = "bold"))
ggplotly(p1)

The figure above (Figure 1) shows the heterogeneity in the average proportion of major injuries across the 30 most frequently observed neighbourhoods. This is important because pedestrian traffic is related to population density, so similar outcomes will more often be observed in neighbourhoods with similar characteristics. The outliers here are Milliken, Niagara, Junction Area, and Banbury-Don Mills, as they have the lowest proportions of major injuries. The size of the points represents the total number of incidents observed. Unfortunately for us, in this data set, York University Heights has never had a non-major injury.

hood.by.traff.long <- df1 %>%
  mutate(Hood = fct_lump(Hood_Name, n=15)) %>% 
  group_by(Hood) %>% 
  count(Traff.ctl) %>% 
  rename(Traffic.Control = Traff.ctl,
         Number = n) %>% 
  filter(Hood!="Other")

p2 <- ggplot(hood.by.traff.long) +
  geom_col(aes(x=Hood, y=Number, fill= Traffic.Control), position="dodge") + 
  coord_flip() +
  labs(x = "Neighbourhood", y = "Frequency", 
title = "Collisions of Each Traffic Control Type by Neighbourhood",
fill = "Traffic Control") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position= "bottom", plot.title = 
          element_text(size = 14, hjust = 0.5, color = "black", face = "bold")) +
  scale_fill_brewer(palette = "Pastel1")
ggplotly(p2)

Figure 2 displays the number of observed collisions at each type of traffic control, by neighbourhood in Toronto. I use only the 15 most frequently observed neighbourhoods for this figure. Waterfront Communities-The Island has the most collisions observed at a traffic signal. As expected, the 15 most observed neighbourhoods have similar collision statistics. Uncontrolled intersections and traffic signals have similar frequencies within neighbourhoods, suggesting that the two might have similar collision numbers within neighbourhoods of similar traffic characteristics. The importance of this figure is to demonstrate that, while the distribution of traffic controls follows a similar pattern in some neighbourhoods, it is still not uniform. This provides strong evidence for controlling for neighbourhoods to address unwanted endogeneity.

Methods

The empirical model is formally written as such,

\[\begin{align} \log\left(\frac{\Pr(Y_{iht}=1)}{1-\Pr(Y_{iht}=1)}\right) = \alpha_{0} + \gamma_{t} + \gamma_{h} + \beta D_{i} + X'_{i}~\delta. \end{align}\]

The above equation is a logistic model and will be estimated using maximum likelihood estimation. \(Y_{iht}\) is the dummy which represents the observed major injury of pedestrian \(i\), at time \(t\), in neighbourhood \(h\). \(\gamma_{t}\) denotes the year and month fixed effects for time \(t\), and \(\gamma_{h}\) the fixed effect for the neighbourhood \(h\) in which each collision is observed. \(\beta\) is the coefficient on \(D\), the treatment variable, traffic control type. In this regression, \(D\) takes on values for uncontrolled and stop sign, as I want to find a difference in effect between these types. The constant captures the effect of a traffic signal. Furthermore, the vector \(X'\) contains collision-level characteristics for each pedestrian \(i\), at time \(t\), in neighbourhood \(h\), to control for endogeneity. I use the standard errors from the logistic model, as I do not correct for heteroskedasticity in the non-linear models; however, I use robust standard errors for the OLS model. Lastly, the logistic regression model has problems with quasi-separation; to rectify this, I use Bayes GLM.
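
To make the interpretation of the coefficients concrete, the following sketch shows how a log-odds coefficient maps to a probability and to an odds ratio. The symbols \(\eta\) and \(p\) are notation introduced here for illustration, with \(D_{i}\) treated as a single dummy for simplicity.

\[\begin{align}
\eta &= \alpha_{0} + \gamma_{t} + \gamma_{h} + \beta D_{i} + X'_{i}~\delta, \\
p &= \Pr(Y_{iht}=1) = \frac{e^{\eta}}{1+e^{\eta}}, \\
\frac{p}{1-p} &= e^{\eta}, \qquad \frac{\text{odds}(D_{i}=1)}{\text{odds}(D_{i}=0)} = e^{\beta}.
\end{align}\]

A coefficient \(\beta\) therefore multiplies the odds by \(e^{\beta}\), which corresponds to a percentage change in the odds of \(100\,(e^{\beta}-1)\).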

As stated above, the treatment is the type of intersection and the outcome is whether the pedestrian injury was major. Endogeneity is of most concern, as the type of intersection is not randomly assigned. Intuitively, traffic signals are installed at busy intersections, where the larger number of motorists and pedestrians would otherwise lead to an inefficient flow of traffic and more accidents. Stop signs could be thought of as a weaker form of a traffic signal, which preserves some of the efficient flow of traffic while decreasing time stopped. Therefore, the neighbourhoods in which collisions occur must be included to control for the allocation of traffic control types based on neighbourhood characteristics. Furthermore, construction is ongoing in Toronto; thus, year and month dummies are included to capture the changes made to intersections over time.

Unfortunately, these major controls alone are not enough to find a causal effect for the outcomes at each traffic control type. I make use of the other variables as controls in the hope of removing the remaining endogeneity between traffic control and observed major injury. Age, daylight, alcohol, aggressive driving, speeding, pedestrian substance use, road conditions, and visibility are all used to control for collision-level characteristics, that is, characteristics of each accident that could affect the driver or pedestrian at the moment of collision. Therefore, I rely on the conditional independence assumption in the hope of finding a causal effect of uncontrolled intersections on an observed major injury.

The conditional independence assumption (CIA) could be violated if I instead used the district dummies, which are only broken down into four categories and therefore do not capture characteristics at the smallest level. Moreover, years do not capture the relatively quick implementation of traffic signals or stop signs in Toronto, which is why I include month dummies to control for more immediate traffic changes that occurred during this period. Excluding months could have violated the conditional independence assumption, as some variation would be left over in the unobservable term. Lastly, the driver’s ability would violate CIA if not accurately controlled for: if the driver’s age is not a good proxy for the driver’s ability, CIA would fail.

Results

df1$Traff.ctl <- factor(df1$Traff.ctl) %>% relevel(ref = "Signal")
df1$Ped.sub <-  factor(df1$Ped.sub) %>% relevel(ref="0")
df1$Speeding <- factor(df1$Speeding) %>% relevel(ref="0")
df1$Alcohol <- factor(df1$Alcohol) %>% relevel(ref="0")
df1$Agg.drive <- factor(df1$Agg.drive) %>% relevel(ref="0")
df1$Road.cond <- factor(df1$Road.cond) %>% relevel(ref="Dry")
df1$vis <- factor(df1$vis) %>% relevel(ref="Clear")
df1$Light <- factor(df1$Light) %>% relevel(ref="1")
df1$Year <- factor(df1$Year) %>% relevel(ref="2009")
df1$Month <- factor(df1$Month) %>% relevel(ref="10")
df1$Driver.age <- factor(df1$Driver.age) %>% relevel(ref="35 to 39")
df1$Ped.age <- factor(df1$Ped.age) %>% relevel(ref="25 to 29")
df1$Hood_Name <- factor(df1$Hood_Name) %>% relevel(ref="Waterfront Communities-The Island (77)")
RHS <- list(
  "Traff.ctl",
  "Traff.ctl + Ped.age + Driver.age + Alcohol + Agg.drive + 
  Speeding + Road.cond + Ped.sub + vis + Light",
  "Traff.ctl + Hood_Name + Month + Year",
  "Traff.ctl + Ped.age + Driver.age + Hood_Name + Month + Year  + Alcohol + 
  Agg.drive + Speeding + Road.cond + Ped.sub + vis + Light "
)
reg1 <- lm(Major.inj ~ Traff.ctl, df1)
reg2 <- lm(Major.inj ~ Traff.ctl + Ped.age + Driver.age + 
             Hood_Name + Month + Year  + Alcohol + 
             Agg.drive + Speeding +  Road.cond + vis + Light + Ped.sub, df1)
se1 <- sqrt(diag(vcovHC(reg1, type="HC1")))
se2 <- sqrt(diag(vcovHC(reg2, type="HC1")))
mod2 <- lapply(RHS, function(x)bayesglm(paste0("Major.inj","~",x), data=df1, family = "binomial"))
se.glm.1 <- sqrt(diag(vcov(mod2[[1]])))
se.glm.2 <- sqrt(diag(vcov(mod2[[2]])))
se.glm.3 <- sqrt(diag(vcov(mod2[[3]])))
se.glm.4 <- sqrt(diag(vcov(mod2[[4]])))
se.glm.list <- lapply(1:4, function(i)sqrt(diag(vcov(mod2[[i]])))) %>% unlist()
for(i in 1:4){
  class(mod2[[i]]) <- c("glm","lm")
  mod2[[i]]$call[1] <- quote(lm())
}
stargazer(reg1, reg2, mod2,
          type="text",
          se=list(se1, se2, se.glm.1,se.glm.2,se.glm.3,se.glm.4),
          keep = c("Traff.ctl", "Alcohol1", 
                   "Speeding1", "Agg.drive1", "Ped.sub1", "Constant"), 
          keep.stat = c("aic", "rsq", "adj.rsq", "n"), 
          add.lines = list(c("Controls", "None", 
                             "All", "None", "Collision", 
                             "Hood and Time", "All"),
                           c("Type", "OLS", "OLS", "Logistic", 
                             "Logistic", "Logistic", "Logistic")),
          covariate.labels = c("Uncontrolled", "Stop Sign",
"Alcohol", "Aggressive Drive", "Speeding", "Pedestrian Substance Use"),
          dep.var.labels   = "Major Injury",
column.sep.width = "-5pt",
          header=FALSE)
## 
## ====================================================================================
##                                              Dependent variable:                    
##                          -----------------------------------------------------------
##                                                 Major Injury                        
##                            (1)      (2)      (3)       (4)         (5)        (6)   
## ------------------------------------------------------------------------------------
## Uncontrolled             0.037**  0.052**  0.480**   0.634**     0.505*     0.946** 
##                          (0.017)  (0.023)  (0.240)   (0.284)     (0.281)    (0.377) 
##                                                                                     
## Stop Sign                 0.050*   0.019    0.675     0.543       0.589      0.471  
##                          (0.028)  (0.032)  (0.501)   (0.530)     (0.559)    (0.607) 
##                                                                                     
## Alcohol                            -0.049            -0.842                  -0.876 
##                                   (0.056)            (0.541)                (0.678) 
##                                                                                     
## Aggressive Drive                  0.071***          0.894***                1.273***
##                                   (0.024)            (0.283)                (0.379) 
##                                                                                     
## Speeding                           -0.030            -0.739*                 -0.262 
##                                   (0.038)            (0.386)                (0.533) 
##                                                                                     
## Pedestrian Substance Use          0.083***          1.446***                1.358** 
##                                   (0.028)            (0.524)                (0.579) 
##                                                                                     
## Constant                 0.899*** 0.814*** 2.192***  1.020**    3.107***    2.096***
##                          (0.013)  (0.086)  (0.138)   (0.416)     (0.526)    (0.742) 
##                                                                                     
## ------------------------------------------------------------------------------------
## Controls                   None     All      None   Collision Hood and Time   All   
## Type                       OLS      OLS    Logistic Logistic    Logistic    Logistic
## Observations              1,072    1,072    1,072     1,072       1,072      1,072  
## R2                        0.005    0.255                                            
## Adjusted R2               0.003    0.085                                            
## Akaike Inf. Crit.                          613.817   605.854     767.888    765.760 
## ====================================================================================
## Note:                                                    *p<0.1; **p<0.05; ***p<0.01

The results from the regressions are shown above. The first two models are estimated using OLS, while the remaining four are logit models. The two approaches should provide different estimates, as the dependent variable has a mean above 0.9. Theoretically, the logistic model should be more accurate; however, as a benchmark, I have included the linear probability model. The result is that uncontrolled intersections, on average, lead to an increase in major pedestrian injuries that is statistically significant at the 5% level in all models except the logistic model controlling only for neighbourhood and time fixed effects, where it is significant at the 10% level. Moreover, after controlling for time, neighbourhood, and collision-level characteristics, the conclusion does not change, and the estimated effect of uncontrolled intersections on major pedestrian injuries only grows.

The different controls are listed below the regression table. In OLS model (1), there is a positive and statistically significant effect of uncontrolled intersections. This can be interpreted as follows: at uncontrolled intersections, the probability that a pedestrian injury is major is, on average, 3.7 percentage points higher than at intersections with traffic signals, significant at the 5% level. Interestingly, the estimate for intersections with stop signs is also positive, at 5.0 percentage points, but only significant at the 10% level. Model (2) includes all the covariates and the time and neighbourhood fixed effects. Its estimate for uncontrolled intersections implies that, on average, the probability of a major pedestrian injury is 5.2 percentage points higher than at traffic signal intersections, relative to the reference group of best possible collision-level conditions and the most commonly observed time, neighbourhood, and age categories. Other interesting results are the positive and highly statistically significant estimates for aggressive driving and for pedestrians who had, at minimum, been drinking, at 7.1 and 8.3 percentage points, respectively.
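
As a quick consistency check, using only the rounded estimates of model (1) in the regression table, the fitted probabilities can be written out by traffic control type. A linear probability model with only group dummies fits the group means, so these values should agree with the summary statistics.

\[\begin{align}
\widehat{\Pr}(\text{Major} \mid \text{Signal}) &= 0.899, \\
\widehat{\Pr}(\text{Major} \mid \text{Uncontrolled}) &= 0.899 + 0.037 = 0.936, \\
\widehat{\Pr}(\text{Major} \mid \text{Stop Sign}) &= 0.899 + 0.050 = 0.949.
\end{align}\]

These match the proportions 0.8990, 0.9357, and 0.9487 in the traffic control summary statistics up to rounding.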

The logistic regression models are models (3) to (6). Model (3) has a straightforward interpretation, as the constant is the log odds of a major pedestrian injury at a traffic signal. On average, the odds that a pedestrian injury at a traffic signal is major rather than non-major are 8.95 (\(e^{2.192}\)), and the probability is 0.8995 (\(\frac{e^{2.192}}{1+e^{2.192}}\)). Furthermore, the coefficient on uncontrolled intersections is 0.480, which means the odds that a pedestrian injury is major at an uncontrolled intersection are 1.62 times (\(e^{0.480}\)) the odds at a traffic signal intersection, that is, 62% higher. The coefficient for stop sign intersections is statistically insignificant, thus nothing can be said about this effect. Including collision-level controls and neighbourhood and time controls separately alleviates some of the negative bias on the uncontrolled intersection coefficient in the simple logistic model. After including all controls, and thus holding all other variables constant, the odds that a pedestrian receives a major injury at an uncontrolled intersection are 2.58 times the odds at a traffic signal, or 158% higher, significant at the 5% level. The reference group represents optimal collision-level conditions (clear driving conditions, no alcohol, no aggressive driving, no pedestrian substance use, and so on) at the most frequently observed driver age group, pedestrian age group, year, month, and neighbourhood. Note that this is a ratio of odds, not of injury counts or probabilities: because the probability of a major injury is already close to 0.9 at traffic signals, a 2.58-fold increase in the odds corresponds to a much smaller proportional increase in the probability. The log odds ratio for stop signs is, however, still statistically insignificant. It is important to note that the logistic model with only the collision-level controls has the lowest AIC and is thus the preferred model for prediction.
However, in this paper, I am only interested in the parameter estimates for the treatment variable. It seems that controlling for neighbourhood and time effects reduces the negative bias that remains when only collision-level controls are included.
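
The conversion from the reported coefficients to odds ratios and probabilities can be worked through as follows. This is a sketch using the rounded coefficients from the regression table; the probabilities for model (6) apply only to the reference categories of all controls.

\[\begin{align}
\text{Model (3):}\quad & e^{0.480} \approx 1.62, \qquad \frac{e^{2.192+0.480}}{1+e^{2.192+0.480}} \approx 0.935, \\
\text{Model (6):}\quad & e^{0.946} \approx 2.58, \\
\text{Model (6), Signal:}\quad & \frac{e^{2.096}}{1+e^{2.096}} \approx 0.89, \\
\text{Model (6), Uncontrolled:}\quad & \frac{e^{2.096+0.946}}{1+e^{2.096+0.946}} \approx 0.95.
\end{align}\]

In the reference group, a 2.58-fold increase in the odds therefore moves the predicted probability of a major injury from roughly 0.89 to roughly 0.95.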

Furthermore, aggressive driving multiplies the odds of a major injury by 3.57, an increase of 257%, which is statistically significant at the 1% level holding all other variables constant. Similarly, on the pedestrian’s side, the odds of a major injury when the pedestrian had, at minimum, been drinking are on average 3.9 times the odds had they not been drinking, or 289% higher; this, however, is only significant at the 5% level. Interestingly, alcohol in the driver’s system shows no statistical significance. Intuitively, the results thus far are plausible; however, either there is no causal effect for stop sign intersections, or CIA failed.
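
For reference, the odds ratios for these two covariates follow from the rounded model (6) coefficients in the regression table:

\[\begin{align}
\text{Aggressive driving:}\quad & e^{1.273} \approx 3.57, \qquad 100\,(3.57 - 1) = 257\%, \\
\text{Pedestrian substance use:}\quad & e^{1.358} \approx 3.89, \qquad 100\,(3.89 - 1) = 289\%.
\end{align}\]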

Conclusion

Unfortunately, the log odds of major pedestrian injuries occurring at stop signs were statistically insignificant in most of the models. This does not help answer the economic question of whether major pedestrian injuries at uncontrolled intersections can be reduced to improve public safety and lower healthcare expenditure. However, it is worth noting that the number of observations for collisions at stop signs was very low, so this is likely due to the data and not the model itself. The logistic regression model did show statistically significant results for uncontrolled intersections leading to more major injuries to pedestrians. This paper demonstrates that, controlling for collision-level, neighbourhood, and time effects, uncontrolled intersections put pedestrians’ safety at risk. It is, however, important to note that the economic question of increasing pedestrian safety through implementing traffic controls depends on other factors as well. It would be wrong to say uncontrolled intersections do not have a place in the city of Toronto. The benefit of saving a statistical life could be lower than the overall construction cost plus the costs of increased wait times for drivers, traffic congestion, and increased pollution. Toronto is currently plagued with traffic congestion, and adding to it by implementing traffic signals at uncontrolled intersections must be carefully thought out. Furthermore, one takeaway from this paper is to increase the penalties for aggressive driving offences, since aggressive driving substantially increases the likelihood of major pedestrian injuries; higher penalties would reduce aggressive driving and the likelihood of a pedestrian being struck by an aggressive driver.
Secondly, on the pedestrian’s side, this could be a motive for restricting open alcohol outdoors, as allowing the consumption of alcohol outside could put the pedestrian at an increased risk of being in a collision that results in a major injury. Furthermore, with the recent legalization of cannabis, the laws surrounding cannabis use are relatively new. Seemingly, legislation for cannabis use has been treated similarly to that for alcohol. Whether this is correct is outside the scope of this paper. It is important to note that this data set does not include any collisions that occurred after the legalization date of cannabis in Canada; however, if we assume pedestrian decision making under the influence of cannabis is similar to that under alcohol, then there could be evidence to suggest limiting the use of cannabis in public, consistent with public-drinking laws, as it would have a similar impact on pedestrian safety. However, more research should be done before concluding on these policy recommendations.