Source Dataset:

https://vincentarelbundock.github.io/Rdatasets/csv/AER/USSeatBelts.csv (I have also added an extra column, lawenforcedfrom, that holds the year in which a seatbelt law first came into effect in each state. Source: https://en.wikipedia.org/wiki/Seat_belt_laws_in_the_United_States)
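
As a rough sketch of how the extra column could be attached (the author's actual steps are not shown), the snippet below joins a small lookup table of law years onto the state-year rows. The names `belts` and `law_years` are made up for illustration, the download is left commented out so the snippet runs offline, and only the two law years visible in the listings of this report are filled in.

```r
# URL taken from the source line above; reading it needs network access.
url <- "https://vincentarelbundock.github.io/Rdatasets/csv/AER/USSeatBelts.csv"
# belts <- read.csv(url, stringsAsFactors = FALSE)

# Stand-in for the downloaded data: three rows copied from the listings in this report.
belts <- data.frame(
  state = c("AK", "AK", "WY"),
  year  = c(1983, 1984, 1997),
  miles = c(3358, 3589, 7576),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per state with the year its seatbelt law took effect.
law_years <- data.frame(
  state           = c("AK", "WY"),
  lawenforcedfrom = c(1991, 1989),
  stringsAsFactors = FALSE
)

# Left join: every state-year row picks up the law year of its state.
belts <- merge(belts, law_years, by = "state", all.x = TRUE)
belts
```

Using `all.x = TRUE` keeps state-year rows even if a state is missing from the lookup table; such rows would get `NA` in `lawenforcedfrom`.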

Source Description:

Balanced panel data for the years 1983–1997 from 50 US States, plus the District of Columbia, for assessing traffic fatalities and seat belt usage.

Question for Analysis:

This dataset contains information about various measures implemented by US states to reduce fatalities caused by road accidents, including traffic safety laws such as a minimum drinking age of 21, a maximum permissible blood alcohol level, mandatory seatbelt use, and speed limits of 65 mph and 70 mph.

Has the enforcement of any of these laws, especially those about seatbelts, actually resulted in a decrease in fatalities?

Data Exploration & Wrangling:

Here is a look at the first few rows of the data:

##   X state year miles fatalities seatbelt speed65 speed70 drinkage alcohol
## 1 1    AK 1983  3358 0.04466945       NA      no      no      yes      no
## 2 2    AK 1984  3589 0.03733630       NA      no      no      yes      no
## 3 3    AK 1985  3840 0.03307291       NA      no      no      yes      no
## 4 4    AK 1986  4008 0.02519960       NA      no      no      yes      no
## 5 5    AK 1987  3900 0.01948718       NA      no      no      yes      no
## 6 6    AK 1988  3841 0.02525384       NA      no      no      yes      no
##   income      age enforce lawenforcedfrom
## 1  17973 28.23497      no            1991
## 2  18093 28.34354      no            1991
## 3  18925 28.37282      no            1991
## 4  18466 28.39665      no            1991
## 5  18021 28.45325      no            1991
## 6  18447 28.85142      no            1991
Rename four data columns:

(“state” to “stateabbr”, “miles” to “mtmperyear”, “fatalities” to “fatalitiesperyear”, and “drinkage” to “drinkingagelaw”)

##       X stateabbr year mtmperyear fatalitiesperyear seatbelt speed65 speed70
## 760 760        WY 1992       6217        0.01898022     0.66     yes      no
## 761 761        WY 1993       6770        0.01772526     0.67     yes      no
## 762 762        WY 1994       6689        0.02152788     0.70     yes      no
## 763 763        WY 1995       7044        0.02413402     0.71     yes     yes
## 764 764        WY 1996       7360        0.01942935     0.72     yes     yes
## 765 765        WY 1997       7576        0.01808342     0.75     yes     yes
##     drinkingagelaw alcohol income      age   enforce lawenforcedfrom
## 760            yes      no  18704 33.96467 secondary            1989
## 761            yes      no  19535 34.21438 secondary            1989
## 762            yes      no  19865 34.45578 secondary            1989
## 763            yes      no  20685 34.76661 secondary            1989
## 764            yes      no  21524 35.07435 secondary            1989
## 765            yes      no  22596 35.38646 secondary            1989
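
The renaming step above can be sketched in base R as follows. This is one possible approach, not necessarily the code used for this report; `demo` and `new_names` are illustrative names, and the single row is copied from row 765 of the listing.

```r
# Toy frame that still has the four original column names.
demo <- data.frame(state = "WY", year = 1997, miles = 7576,
                   fatalities = 0.01808342, drinkage = "yes",
                   stringsAsFactors = FALSE)

# Old name -> new name, as listed in the text.
new_names <- c(state      = "stateabbr",
               miles      = "mtmperyear",
               fatalities = "fatalitiesperyear",
               drinkage   = "drinkingagelaw")

# Rename only the columns that appear in the mapping; the rest keep their names.
hit <- names(demo) %in% names(new_names)
names(demo)[hit] <- unname(new_names[names(demo)[hit]])
names(demo)
```

Looking columns up by name rather than by position means the rename still works if the column order of the CSV changes.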
Add a column (absfreq) to the dataset, calculated as the product of two other columns:

(Absolute frequency of fatalities per year) = (million traffic miles per year) x (fatalities per million traffic miles per year)

mydata$absfreq <- mydata$mtmperyear * mydata$fatalitiesperyear
head(mydata)
##   X stateabbr year mtmperyear fatalitiesperyear seatbelt speed65 speed70
## 1 1        AK 1983       3358        0.04466945       NA      no      no
## 2 2        AK 1984       3589        0.03733630       NA      no      no
## 3 3        AK 1985       3840        0.03307291       NA      no      no
## 4 4        AK 1986       4008        0.02519960       NA      no      no
## 5 5        AK 1987       3900        0.01948718       NA      no      no
## 6 6        AK 1988       3841        0.02525384       NA      no      no
##   drinkingagelaw alcohol income      age enforce lawenforcedfrom absfreq
## 1            yes      no  17973 28.23497      no            1991     150
## 2            yes      no  18093 28.34354      no            1991     134
## 3            yes      no  18925 28.37282      no            1991     127
## 4            yes      no  18466 28.39665      no            1991     101
## 5            yes      no  18021 28.45325      no            1991      76
## 6            yes      no  18447 28.85142      no            1991      97
Summarise the US Seat Belts data:
##        X        stateabbr              year        mtmperyear    
##  Min.   :  1   Length:765         Min.   :1983   Min.   :  3099  
##  1st Qu.:192   Class :character   1st Qu.:1986   1st Qu.: 11401  
##  Median :383   Mode  :character   Median :1990   Median : 30319  
##  Mean   :383                      Mean   :1990   Mean   : 41448  
##  3rd Qu.:574                      3rd Qu.:1994   3rd Qu.: 52312  
##  Max.   :765                      Max.   :1997   Max.   :285612  
##                                                                  
##  fatalitiesperyear     seatbelt        speed65            speed70         
##  Min.   :0.008327   Min.   :0.0600   Length:765         Length:765        
##  1st Qu.:0.017341   1st Qu.:0.4200   Class :character   Class :character  
##  Median :0.021199   Median :0.5500   Mode  :character   Mode  :character  
##  Mean   :0.021490   Mean   :0.5289                                        
##  3rd Qu.:0.024774   3rd Qu.:0.6500                                        
##  Max.   :0.045470   Max.   :0.8700                                        
##                     NA's   :209                                           
##  drinkingagelaw       alcohol              income           age       
##  Length:765         Length:765         Min.   : 8372   Min.   :28.23  
##  Class :character   Class :character   1st Qu.:14266   1st Qu.:34.39  
##  Mode  :character   Mode  :character   Median :17624   Median :35.39  
##                                        Mean   :17993   Mean   :35.14  
##                                        3rd Qu.:21080   3rd Qu.:36.13  
##                                        Max.   :35863   Max.   :39.17  
##                                                                       
##    enforce          lawenforcedfrom    absfreq    
##  Length:765         Min.   :1984    Min.   :  44  
##  Class :character   1st Qu.:1986    1st Qu.: 255  
##  Mode  :character   Median :1987    Median : 640  
##                     Mean   :1988    Mean   : 847  
##                     3rd Qu.:1991    3rd Qu.:1039  
##                     Max.   :1997    Max.   :5504  
## 

Display Aggregates: Average Seatbelt Usage Rates and Average Fatalities grouped by State:
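
A minimal sketch of this kind of per-state aggregate, assuming base R's `aggregate` is an acceptable stand-in for whatever the report originally used. The frame `demo` holds six rows copied from the listings in this report, so the averages it produces cover only those rows, not the full 1983–1997 panel.

```r
# Six state-year rows copied from the listings in this report.
demo <- data.frame(
  stateabbr  = c("AK", "AK", "AK", "WY", "WY", "WY"),
  year       = c(1983, 1984, 1985, 1995, 1996, 1997),
  mtmperyear = c(3358, 3589, 3840, 7044, 7360, 7576),
  fatalitiesperyear = c(0.04466945, 0.03733630, 0.03307291,
                        0.02413402, 0.01942935, 0.01808342),
  seatbelt   = c(NA, NA, NA, 0.71, 0.72, 0.75),
  stringsAsFactors = FALSE
)
demo$absfreq <- demo$mtmperyear * demo$fatalitiesperyear

# Mean seatbelt usage rate and mean fatality count per state.
# na.rm = TRUE skips missing usage rates; a state with no usage data at all yields NaN.
by_state <- aggregate(
  demo[c("seatbelt", "absfreq")],
  by  = list(stateabbr = demo$stateabbr),
  FUN = mean, na.rm = TRUE
)
by_state
```

The data-frame form of `aggregate` is used here instead of the formula form because the formula form drops every row that has an `NA` in any column, which would discard the fatality figures for years without a recorded usage rate.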



Analysis:

The boxplot above indicates that seatbelt usage rates in the US rose steadily from 1983 to 1997, with very few outliers.


Analysis:

The boxplot above indicates that traffic miles covered in the US increased slowly but steadily from 1983 to 1997. The histogram shows most of the values concentrated in a narrow range, which is consistent with a small overall rate of change in annual traffic miles covered.


Scatter Plot & Line Plots:

Analysis:

The scatter plot above suggests that annual fatalities followed a downward trend from 1983 to 1997. Since most states started enforcing seatbelt laws after 1986, the growing enforcement of these laws could have been a factor in the decrease in fatalities. However, since not all states implemented these laws at the same time, the correlation is not very clear.
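
Because the laws took effect in different years, one way to line the states up is to compare each state's fatality rate before and after its own law year. The sketch below shows the mechanics only: the state "XX" and all of its numbers are made up for illustration and are not values from the dataset, and `period` and `rates` are illustrative names.

```r
# Made-up rows for a hypothetical state "XX"; NOT values from USSeatBelts.
demo <- data.frame(
  stateabbr = "XX",
  year      = 1984:1989,
  fatalitiesperyear = c(0.030, 0.029, 0.028, 0.025, 0.024, 0.023),
  lawenforcedfrom   = 1987,
  stringsAsFactors  = FALSE
)

# Flag each state-year as falling before or after that state's own law year.
demo$period <- ifelse(demo$year >= demo$lawenforcedfrom, "after", "before")

# Mean fatality rate per state and period.
rates <- aggregate(fatalitiesperyear ~ stateabbr + period, data = demo, FUN = mean)
rates
```

A lower "after" mean on its own would not separate the effect of the law from the general downward trend over the same years, so this comparison is best read alongside the plots rather than as a test of causation.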

Analysis:

In the line plots above, which use only the subsets of the data for California and Mississippi, the correlation between the enforcement of seatbelt laws and the downward trend in fatalities from 1983 to 1997 is visible.
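
A per-state line plot of this kind could be drawn roughly as follows. The helper `plot_state_trend` is a hypothetical name, not a function from the original analysis. Since the California and Mississippi rows are not listed in this report, the demonstration uses the six Wyoming rows shown earlier.

```r
# Wyoming rows copied from the listing earlier in this report.
wy <- data.frame(
  stateabbr = "WY",
  year      = 1992:1997,
  fatalitiesperyear = c(0.01898022, 0.01772526, 0.02152788,
                        0.02413402, 0.01942935, 0.01808342),
  lawenforcedfrom   = 1989,
  stringsAsFactors  = FALSE
)

# Hypothetical helper: fatality rate over time for one state,
# with a dashed vertical line at the year its seatbelt law took effect.
plot_state_trend <- function(df, state) {
  d <- df[df$stateabbr == state, ]
  d <- d[order(d$year), ]
  law_year <- d$lawenforcedfrom[1]
  plot(d$year, d$fatalitiesperyear, type = "b",
       xlim = range(c(d$year, law_year)),  # keep the law year inside the axis
       xlab = "Year", ylab = "Fatalities per million traffic miles",
       main = state)
  abline(v = law_year, lty = 2)
  invisible(d)  # return the plotted subset for inspection
}

trend <- plot_state_trend(wy, "WY")
```

With the full dataset loaded as `mydata`, the same helper could be called once for "CA" and once for "MS".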

Analysis:

In the line plot above, the correlation between a maximum speed limit of 65 mph and the downward trend in fatalities from 1983 to 1997 is visible.


Conclusion:

From the graphs seen above:
1. Annual fatalities per million traffic miles kept decreasing from 1983 to 1997.
2. The seatbelt usage rate increased slowly from 1983 to 1997.
3. Enforcement of seatbelt laws in California and Mississippi seems to have been a significant factor in the decrease in fatalities in both states.
4. Enforcement of a maximum speed limit of 65 mph in California seems to have been a significant factor in the decrease in fatalities in that state.

Thus we can conclude that enforcing traffic safety laws, such as mandatory seatbelt use and specific maximum speed limits, was effective in decreasing the annual traffic fatality rate from 1983 to 1997.