Table of contents:

1. Introduction
2. Question for analysis
3. Description of data used
  3.1 Load data and summary of the data
  3.2 Format
  3.3 Applying mean(), median() functions
4. Transforming data
  4.1 New dataset grouped by state
  4.2 New dataset grouped by year
5. Visualize the data
  5.1 Scatter Plot for main dataset
  5.2 Pie charts
  5.3 Box plot
  5.4 Histograms and scatterplots for subsets
    5.4.1  Data per Year
    5.4.2  Data per State
6. Conclusion

1. Introduction

This is the final project for the Bridge course, demonstrating the skills learned during the 3-week course. The dataset “Drunk Driving Laws and Traffic Deaths” is used for the analysis. The dataset is analyzed with basic functions such as mean(), median(), and summary(), a table()-based calculation of the mode, and graphics tools (the ggplot2 library).

2. Question for analysis

Does the fatality rate due to drunk driving depend on a state's well-being?

3. Description of data used

The dataset “Drunk Driving Laws and Traffic Deaths” is a panel of 336 observations: 48 US states observed yearly from 1982 to 1988 (7 years per state).
Source: https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/Fatality.csv
File location: https://raw.githubusercontent.com/ex-pr/Data-set-week-3/main/Fatality.csv

3.1 Load data and summary of the data

The dataset was loaded from the GitHub repository. Only columns 2 to 11 are used; the first column is removed because it carries no information needed for the analysis. The columns are renamed to make their contents clearer, and summary() is then used to get an overview of the values in the dataset.

# dplyr provides %>%, group_by(), summarize(), arrange(), count(), mutate(); ggplot2 provides the plots
library(dplyr)
library(ggplot2)

fatality <- read.csv("https://raw.githubusercontent.com/ex-pr/Data-set-week-3/main/Fatality.csv", header=TRUE, sep=",")
fatality <- fatality[, 2:11]  # drop the first column
colnames(fatality) <- c("State", "Year", "Fatality_rate", "Beer_tax", "Legal_drink_age", "Jail_sentence", "Community_service", "Miles_per_driver", "Unemployment_rate", "Income")
summary(fatality)
##     State                Year      Fatality_rate       Beer_tax      
##  Length:336         Min.   :1982   Min.   :0.8212   Min.   :0.04331  
##  Class :character   1st Qu.:1983   1st Qu.:1.6237   1st Qu.:0.20885  
##  Mode  :character   Median :1985   Median :1.9560   Median :0.35259  
##                     Mean   :1985   Mean   :2.0404   Mean   :0.51326  
##                     3rd Qu.:1987   3rd Qu.:2.4179   3rd Qu.:0.65157  
##                     Max.   :1988   Max.   :4.2178   Max.   :2.72076  
##  Legal_drink_age Jail_sentence      Community_service  Miles_per_driver
##  Min.   :18.00   Length:336         Length:336         Min.   : 4.576  
##  1st Qu.:20.00   Class :character   Class :character   1st Qu.: 7.183  
##  Median :21.00   Mode  :character   Mode  :character   Median : 7.796  
##  Mean   :20.46                                         Mean   : 7.891  
##  3rd Qu.:21.00                                         3rd Qu.: 8.504  
##  Max.   :21.00                                         Max.   :26.148  
##  Unemployment_rate     Income     
##  Min.   : 2.400    Min.   : 9514  
##  1st Qu.: 5.475    1st Qu.:12086  
##  Median : 7.000    Median :13763  
##  Mean   : 7.347    Mean   :13880  
##  3rd Qu.: 8.900    3rd Qu.:15175  
##  Max.   :18.000    Max.   :22193
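
Before computing statistics, it can help to confirm that the panel is balanced, i.e. that every state appears exactly once per year. The sketch below uses a small made-up data frame (`toy_panel`, with three hypothetical states) rather than the real file; with the real data one would pass `fatality` instead and expect 48 states, 7 years, and 336 rows.

```r
# Hypothetical toy panel: 3 states x 2 years (not the real data)
toy_panel <- data.frame(
  State = rep(c("AA", "BB", "CC"), each = 2),
  Year  = rep(c(1982, 1983), times = 3)
)

# A panel is balanced when every State/Year pair occurs exactly once
is_balanced <- function(df) {
  counts <- table(df$State, df$Year)
  all(counts == 1)
}

n_states <- length(unique(toy_panel$State))
n_years  <- length(unique(toy_panel$Year))

is_balanced(toy_panel)                  # TRUE for the toy panel
nrow(toy_panel) == n_states * n_years   # rows = states x years
```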

3.2 Format

The data frame contains:
State - state postal code
Year - year, from 1982 to 1988
Fatality_rate - traffic fatality rate (deaths per 10,000)
Beer_tax - tax on a case of beer
Legal_drink_age - minimum legal drinking age
Jail_sentence - mandatory jail sentence? (yes/no)
Community_service - mandatory community service? (yes/no)
Miles_per_driver - average miles per driver
Unemployment_rate - unemployment rate
Income - per capita personal income

3.3 Applying mean(), median() functions

The goal is to calculate the mean and median of the fatality rate, beer tax, miles per driver, unemployment rate, and income across the 48 states and 7 years, and to find the most common value (mode) of the minimum legal drinking age and of the jail sentence and community service indicators.

mean_fatality <- mean(fatality[,3])
median_fatality <- median(fatality[,3])
mean_beertax <- mean(fatality[,4])
median_beertax <- median(fatality[,4])
mean_miles <- mean(fatality[,8])
median_miles <- median(fatality[,8])
mean_unempl <- mean(fatality[,9])
median_unempl <- median(fatality[,9])
mean_income <- mean(fatality[,10])
median_income <- median(fatality[,10])
mode_jail <- names(which.max(table(fatality[,6])))
mode_community <- names(which.max(table(fatality[,7])))
mode_drinkage <- names(which.max(table(fatality[,5])))

print(sprintf("Mean fatality rate is %f, median fatality rate is %f, mean beer tax is %f, median beer tax is %f, mean miles per driver is %f, median miles per driver is %f", mean_fatality, median_fatality, mean_beertax, median_beertax,mean_miles, median_miles))
## [1] "Mean fatality rate is 2.040444, median fatality rate is 1.955955, mean beer tax is 0.513256, median beer tax is 0.352589, mean miles per driver is 7.890754, median miles per driver is 7.796219"
print(sprintf("Mean unemployment rate is %f, median unemployment rate is %f, mean income per family is %f, median income per family is %f", mean_unempl, median_unempl, mean_income, median_income))
## [1] "Mean unemployment rate is 7.346726, median unemployment rate is 7.000000, mean income per family is 13880.184533, median income per family is 13763.128906"
print(sprintf("Most common is %s to jail sentence for drunk driving. Most common is %s to community service for drunk driving. The most common drink age is %s", mode_jail, mode_community, mode_drinkage))
## [1] "Most common is no to jail sentence for drunk driving. Most common is no to community service for drunk driving. The most common drink age is 21"
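
The same statistics can be computed by column name instead of column position, which keeps the code correct if the column order ever changes. Note also that base R's `mode()` returns the storage type of an object (for example "numeric"), not the most frequent value, which is why a `table()`-based lookup is used above. The sketch below is a minimal illustration on made-up values; `toy` and `stat_mode` are hypothetical names, not part of the original analysis.

```r
# Hypothetical toy data with a few of the report's column names
toy <- data.frame(
  Fatality_rate   = c(1.5, 2.0, 2.5, 3.0),
  Beer_tax        = c(0.2, 0.4, 0.4, 1.0),
  Legal_drink_age = c(21, 21, 19, 21),
  Jail_sentence   = c("no", "no", "yes", "no")
)

# Mean and median for selected numeric columns, addressed by name
num_cols <- c("Fatality_rate", "Beer_tax")
stats <- sapply(toy[num_cols], function(x) c(mean = mean(x), median = median(x)))
print(stats)

# Statistical mode: the most frequent value, returned as a character string
# (ties resolve to the first value in table() order)
stat_mode <- function(x) names(which.max(table(x)))

stat_mode(toy$Jail_sentence)    # "no"
stat_mode(toy$Legal_drink_age)  # "21"
mode(toy$Legal_drink_age)       # "numeric": the storage mode, not the statistical mode
```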

4. Transforming data

4.1 New dataset grouped by state

The new dataset collapses the 7 yearly observations for each state into 1 row. Each row includes the state code, the mean fatality rate over the 7 years, mean income, mean beer tax, mean miles per driver, mean unemployment rate, and the sum of the fatality rate over the 7 years. The rows are sorted in descending order, with the state with the largest summed fatality rate at the top and the smallest at the bottom.

fatality_state <- fatality %>%
  group_by(State) %>%
  summarize(fatal_state = mean(Fatality_rate),
            income_state = mean(Income),
            beertax_state = mean(Beer_tax),
            miles_state = mean(Miles_per_driver),
            unempl_state = mean(Unemployment_rate),
            sum_fatal = sum(Fatality_rate)) %>%
  arrange(desc(sum_fatal)) %>%
  as.data.frame()

summary(fatality_state)
##     State            fatal_state     income_state   beertax_state    
##  Length:48          Min.   :1.110   Min.   : 9951   Min.   :0.04817  
##  Class :character   1st Qu.:1.661   1st Qu.:12054   1st Qu.:0.21161  
##  Mode  :character   Median :1.974   Median :13616   Median :0.37183  
##                     Mean   :2.040   Mean   :13880   Mean   :0.51326  
##                     3rd Qu.:2.402   3rd Qu.:15144   3rd Qu.:0.64710  
##                     Max.   :3.653   Max.   :19516   Max.   :2.44051  
##   miles_state      unempl_state      sum_fatal     
##  Min.   : 5.130   Min.   : 4.100   Min.   : 7.771  
##  1st Qu.: 7.361   1st Qu.: 5.861   1st Qu.:11.630  
##  Median : 7.904   Median : 7.271   Median :13.815  
##  Mean   : 7.891   Mean   : 7.347   Mean   :14.283  
##  3rd Qu.: 8.263   3rd Qu.: 8.639   3rd Qu.:16.814  
##  Max.   :10.593   Max.   :13.200   Max.   :25.572
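
The `%>%`, `group_by()`, `summarize()`, and `arrange()` calls above come from the dplyr package. For readers without dplyr, the following is a rough base R equivalent of the same group-then-sort idea, shown on a made-up data frame (`toy` is hypothetical and far smaller than the real data).

```r
# Hypothetical toy data: 2 states, 3 years each (not the real values)
toy <- data.frame(
  State         = c("AA", "AA", "AA", "BB", "BB", "BB"),
  Fatality_rate = c(1.0, 2.0, 3.0, 3.0, 4.0, 5.0),
  Income        = c(10000, 11000, 12000, 9000, 9500, 10000)
)

# Per-state means, one row per state
by_state <- aggregate(cbind(Fatality_rate, Income) ~ State, data = toy, FUN = mean)
names(by_state) <- c("State", "fatal_state", "income_state")

# Per-state sum of the yearly fatality rates, matched to the rows by state code
sums <- tapply(toy$Fatality_rate, toy$State, sum)
by_state$sum_fatal <- as.numeric(sums[as.character(by_state$State)])

# Sort so the state with the largest summed rate comes first
by_state <- by_state[order(by_state$sum_fatal, decreasing = TRUE), ]
rownames(by_state) <- NULL
print(by_state)
```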

To find the states with the maximum mean fatality rate, summed fatality rate, income, beer tax, and unemployment rate, the code below is used:

fatality_state[which.max(fatality_state$fatal_state),]
##   State fatal_state income_state beertax_state miles_state unempl_state
## 1    NM    3.653197     11682.84     0.3824027    9.239428     8.785714
##   sum_fatal
## 1  25.57238
fatality_state[which.max(fatality_state$sum_fatal),]
##   State fatal_state income_state beertax_state miles_state unempl_state
## 1    NM    3.653197     11682.84     0.3824027    9.239428     8.785714
##   sum_fatal
## 1  25.57238
fatality_state[which.max(fatality_state$income_state),]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 42    CT    1.463509     19515.82      0.231545    7.247288     4.642857
##    sum_fatal
## 42  10.24456
fatality_state[which.max(fatality_state$beertax_state),]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 13    GA    2.401569     13316.89      2.440507    9.089214     6.428571
##    sum_fatal
## 13  16.81098
fatality_state[which.max(fatality_state$unempl_state),]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 16    WV    2.300624     10812.34     0.4272953    6.585874         13.2
##    sum_fatal
## 16  16.10437

To find the states with the minimum mean fatality rate, income, beer tax, and unemployment rate, the code below is used:

fatality_state[which.min(fatality_state$fatal_state),]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 48    RI    1.110077     14713.41     0.1562781    6.007946     5.657143
##    sum_fatal
## 48   7.77054
fatality_state[which.min(fatality_state$income_state),]
##   State fatal_state income_state beertax_state miles_state unempl_state
## 5    MS    2.761846      9950.87      1.047007    7.368382     10.71429
##   sum_fatal
## 5  19.33292
fatality_state[which.min(fatality_state$beertax_state),]
##   State fatal_state income_state beertax_state miles_state unempl_state
## 2    WY    3.217534      13452.9    0.04816791    10.59269     7.357143
##   sum_fatal
## 2  22.52274
fatality_state[which.min(fatality_state$unempl_state),]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 31    NH    1.798824     16281.71     0.6470018    7.917257          4.1
##    sum_fatal
## 31  12.59177
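
The repeated `which.max()` / `which.min()` lookups can also be collected in one pass. The sketch below is one possible way to do this, using a made-up three-state table (`toy_state` and `extremes` are hypothetical names). Note that `which.max()` and `which.min()` return only the first matching row, so ties are not reported.

```r
# Hypothetical per-state table (values invented for illustration)
toy_state <- data.frame(
  State        = c("AA", "BB", "CC"),
  fatal_state  = c(3.1, 1.2, 2.0),
  income_state = c(11000, 15000, 9000),
  unempl_state = c(8.5, 4.0, 12.0)
)

# Every column except the state code is treated as a numeric measure
num_cols <- setdiff(names(toy_state), "State")

# Look up the state code at the row picked by which.max() or which.min()
state_at <- function(col, pick) as.character(toy_state$State[pick(toy_state[[col]])])

# One row per measure: which state has the maximum and which has the minimum
extremes <- data.frame(
  variable  = num_cols,
  max_state = vapply(num_cols, state_at, character(1), pick = which.max, USE.NAMES = FALSE),
  min_state = vapply(num_cols, state_at, character(1), pick = which.min, USE.NAMES = FALSE)
)
print(extremes)
```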

The 10 states with the highest summed fatality rate:

fatality_state[1:10, ]
##    State fatal_state income_state beertax_state miles_state unempl_state
## 1     NM    3.653197     11682.84    0.38240269    9.239428     8.785714
## 2     WY    3.217534     13452.90    0.04816791   10.592690     7.357143
## 3     MT    2.903021     12044.86    0.32661333    9.270159     7.828572
## 4     SC    2.821669     11394.35    1.84964757    8.200333     7.285714
## 5     MS    2.761846      9950.87    1.04700737    7.368382    10.714286
## 6     NV    2.745260     15685.50    0.20064095    8.023373     7.600000
## 7     AZ    2.705900     13535.99    0.31104035    7.742038     7.128571
## 8     ID    2.571667     11551.72    0.36125929    8.000849     8.171429
## 9     FL    2.477799     14737.28    1.11561312    7.824188     6.442857
## 10    AR    2.435336     11066.19    0.59057525    7.411801     8.857143
##    sum_fatal
## 1   25.57238
## 2   22.52274
## 3   20.32115
## 4   19.75168
## 5   19.33292
## 6   19.21682
## 7   18.94130
## 8   18.00167
## 9   17.34459
## 10  17.04735
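
Because every state contributes exactly 7 yearly observations, `sum_fatal` is simply 7 times `fatal_state`, so ranking the states by either column gives the same order. This can be checked against the first three rows printed above (values copied from the table; `top3` is a hypothetical name).

```r
# First three rows of the table above, copied from the printed output
top3 <- data.frame(
  State       = c("NM", "WY", "MT"),
  fatal_state = c(3.653197, 3.217534, 2.903021),
  sum_fatal   = c(25.57238, 22.52274, 20.32115)
)

# With 7 years per state, the sum should equal 7 x the mean (up to rounding)
n_years <- 7
max_gap <- max(abs(top3$sum_fatal - n_years * top3$fatal_state))
max_gap < 1e-4

# Both columns therefore rank the states identically
same_order <- identical(order(top3$fatal_state, decreasing = TRUE),
                        order(top3$sum_fatal,   decreasing = TRUE))
same_order
```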

4.2 New dataset grouped by year

The new dataset collapses the 48 state observations for each year into 1 row. Each row includes the year, the mean fatality rate across the 48 states in that year, mean income, mean beer tax, mean miles per driver, mean unemployment rate, and the sum of the fatality rate across the 48 states in that year. The rows are sorted in descending order, with the year with the largest summed fatality rate at the top and the smallest at the bottom.

fatality_year <- fatality %>%
  group_by(Year) %>%
  summarize(fatal_year = mean(Fatality_rate),
            income_year = mean(Income),
            beertax_year = mean(Beer_tax),
            miles_year = mean(Miles_per_driver),
            unempl_year = mean(Unemployment_rate),
            sum_fatality = sum(Fatality_rate)) %>%
  arrange(desc(sum_fatality)) %>%
  as.data.frame()

summary(fatality_year)
##       Year        fatal_year     income_year     beertax_year   
##  Min.   :1982   Min.   :1.974   Min.   :12998   Min.   :0.4798  
##  1st Qu.:1984   1st Qu.:2.012   1st Qu.:13345   1st Qu.:0.5019  
##  Median :1985   Median :2.061   Median :13843   Median :0.5169  
##  Mean   :1985   Mean   :2.040   Mean   :13880   Mean   :0.5133  
##  3rd Qu.:1986   3rd Qu.:2.067   3rd Qu.:14368   3rd Qu.:0.5299  
##  Max.   :1988   Max.   :2.089   Max.   :14894   Max.   :0.5324  
##    miles_year     unempl_year     sum_fatality   
##  Min.   :7.227   Min.   :5.456   Min.   : 94.74  
##  1st Qu.:7.563   1st Qu.:6.570   1st Qu.: 96.60  
##  Median :7.960   Median :7.060   Median : 98.91  
##  Mean   :7.891   Mean   :7.347   Mean   : 97.94  
##  3rd Qu.:8.153   3rd Qu.:8.250   3rd Qu.: 99.23  
##  Max.   :8.616   Max.   :9.271   Max.   :100.28

To find the years with the maximum mean fatality rate, income, beer tax, and unemployment rate, the code below is used:

fatality_year[which.max(fatality_year$fatal_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 1 1982   2.089106    12998.26    0.5302734   7.227225    9.266667     100.2771
fatality_year[which.max(fatality_year$income_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 2 1988   2.069594    14893.53    0.4798154    8.61583     5.45625     99.34052
fatality_year[which.max(fatality_year$beertax_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 6 1983   2.007846    13108.08     0.532393   7.384729    9.270833     96.37663
fatality_year[which.max(fatality_year$unempl_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 6 1983   2.007846    13108.08     0.532393   7.384729    9.270833     96.37663

To find the years with the minimum mean fatality rate, income, beer tax, and unemployment rate, the code below is used:

fatality_year[which.min(fatality_year$fatal_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 7 1985   1.973671    13842.82    0.5169272   7.740698    7.060417      94.7362
fatality_year[which.min(fatality_year$income_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 1 1982   2.089106    12998.26    0.5302734   7.227225    9.266667     100.2771
fatality_year[which.min(fatality_year$beertax_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 2 1988   2.069594    14893.53    0.4798154    8.61583     5.45625     99.34052
fatality_year[which.min(fatality_year$unempl_year),]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 2 1988   2.069594    14893.53    0.4798154    8.61583     5.45625     99.34052

Years ordered by summed fatality rate, with the highest at the top:

fatality_year[,]
##   Year fatal_year income_year beertax_year miles_year unempl_year sum_fatality
## 1 1982   2.089106    12998.26    0.5302734   7.227225    9.266667    100.27708
## 2 1988   2.069594    14893.53    0.4798154   8.615830    5.456250     99.34052
## 3 1986   2.065071    14186.30    0.5086639   8.016382    6.918750     99.12341
## 4 1987   2.060696    14549.79    0.4951288   8.290278    6.220833     98.91339
## 5 1984   2.017122    13582.51    0.5295902   7.960133    7.233333     96.82188
## 6 1983   2.007846    13108.08    0.5323930   7.384729    9.270833     96.37663
## 7 1985   1.973671    13842.82    0.5169272   7.740698    7.060417     94.73620
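
Since the table above is sorted by summed fatality rather than by time, trends are easier to read after re-sorting by year. The sketch below re-enters the `Year` and `fatal_year` columns from the printed table (`yearly` is a hypothetical name) and computes the year-over-year change with `diff()`.

```r
# Year and mean fatality rate, copied from the printed table above
yearly <- data.frame(
  Year       = c(1982, 1988, 1986, 1987, 1984, 1983, 1985),
  fatal_year = c(2.089106, 2.069594, 2.065071, 2.060696,
                 2.017122, 2.007846, 1.973671)
)

# Put the rows in chronological order
yearly <- yearly[order(yearly$Year), ]

# Change relative to the previous year (NA for the first year)
yearly$change <- c(NA, diff(yearly$fatal_year))
print(yearly, row.names = FALSE)

# Lowest and highest yearly mean over the seven years
range(yearly$fatal_year)
```

The gap between the lowest and highest yearly mean is about 0.12 deaths per 10,000, which is consistent with reading the rate as roughly flat over the period.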

5. Visualize the data

5.1 Scatter Plot for main dataset

The scatter plots were built to see whether the fatality rate is related to the other columns in the dataset.
The first plot shows the relationship between income and the fatality rate: as income increases, the fatality rate tends to go down.
In the second plot, there is no clear linear relationship between the unemployment rate and the fatality rate, but a general tendency is visible: the fatality rate tends to be higher where the unemployment rate is higher.
In the third plot, no clear relationship between the beer tax and the fatality rate is visible. The beer tax stays below 1 for most observations.
In the fourth plot, the relationship looks roughly logarithmic: the more miles per driver, the higher the fatality rate, with the increase flattening out.

ggplot(fatality, aes(x=Fatality_rate, y=Income)) + geom_point(color="cornflowerblue", size = 2, alpha=.8) + scale_x_continuous("Fatality rate") + scale_y_continuous("Income") + theme_minimal()

ggplot(fatality, aes(x=Fatality_rate, y=Unemployment_rate)) + geom_point(color="red", size = 3, alpha=.7) + scale_x_continuous("Fatality rate") + scale_y_continuous("Unemployment rate")

ggplot(fatality, aes(x=Fatality_rate, y=Beer_tax)) + geom_point(color="green", size = 2, alpha=.8) +scale_x_continuous("Fatality rate") + scale_y_continuous("Beer tax") + theme_minimal()

ggplot(fatality, aes(x=Fatality_rate, y=Miles_per_driver)) + geom_point(color="blue", size = 1, alpha=.8) + scale_x_continuous("Fatality rate") + scale_y_continuous("Miles per driver")
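
Visual impressions from scatter plots can be backed up with a number. One common option is the correlation coefficient from `cor()`, and a logarithmic relationship can be probed by regressing on `log(x)` with `lm()`. The sketch below uses synthetic data generated with a known logarithmic relationship (all names and values are made up), so it only illustrates the mechanics and says nothing about the real dataset.

```r
# Synthetic data: y grows with log(x) plus a little noise (not the real data)
set.seed(1)
miles        <- seq(5, 12, length.out = 50)
fatality_toy <- 0.5 + 1.2 * log(miles) + rnorm(50, sd = 0.05)

# Pearson measures linear association; Spearman only assumes monotonicity
pearson  <- cor(miles, fatality_toy)
spearman <- cor(miles, fatality_toy, method = "spearman")

# Fit y = a + b * log(x); the slope estimate should be close to 1.2 here
fit   <- lm(fatality_toy ~ log(miles))
slope <- unname(coef(fit)[2])

c(pearson = pearson, spearman = spearman, slope = slope)
```

With the real data, the same calls could be applied to `fatality$Miles_per_driver` and `fatality$Fatality_rate`.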

5.2 Pie charts

The pie charts show that most of the observations (one per state and year) had no mandatory jail sentence or community service for drunk driving between 1982 and 1988, so in many states drunk drivers may have faced few mandatory consequences unless they were involved in a fatal crash.

plotdata <- fatality %>%
  count(Jail_sentence) %>%
  arrange(desc(Jail_sentence)) %>%
  mutate(prop = round(n*100/sum(n), 1),
         lab.ypos = cumsum(prop) - 0.5*prop)

plotdata$label <- paste0(plotdata$Jail_sentence, "\n",
                         round(plotdata$prop), "%")

ggplot(plotdata, 
       aes(x = "", 
           y = prop, 
           fill = Jail_sentence)) +
  geom_bar(width = 1, 
           stat = "identity", 
           color = "black") +
  geom_text(aes(y = lab.ypos, label = label), 
            color = "black") +
  coord_polar("y", 
              start = 0, 
              direction = -1) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "Jail sentence?")

plotdata <- fatality %>%
  count(Community_service) %>%
  arrange(desc(Community_service)) %>%
  mutate(prop = round(n*100/sum(n), 1),
         lab.ypos = cumsum(prop) - 0.5*prop)

plotdata$label <- paste0(plotdata$Community_service, "\n",
                         round(plotdata$prop), "%")

ggplot(plotdata, 
       aes(x = "", 
           y = prop, 
           fill = Community_service)) +
  geom_bar(width = 1, 
           stat = "identity", 
           color = "black") +
  geom_text(aes(y = lab.ypos, label = label), 
            color = "black") +
  coord_polar("y", 
              start = 0, 
              direction = -1) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "Community service?")
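
The percentages shown in the pie charts can also be read directly from a frequency table. A minimal sketch with a made-up yes/no vector (`jail_toy` is hypothetical; with the real data one would use `fatality$Jail_sentence`):

```r
# Hypothetical yes/no indicator (not the real column)
jail_toy <- c("no", "no", "yes", "no", "yes", "no", "no", "no")

counts <- table(jail_toy)                      # absolute frequencies
shares <- round(100 * prop.table(counts), 1)   # percentages, one decimal

print(counts)
print(shares)
```

Because each row of the real dataset is one state in one year, such shares describe state-year observations rather than states.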

5.3 Box plot

In the box plot, we see that the median fatality rate doesn’t change much from year to year.

boxplot(Fatality_rate~Year,data=fatality, main="Fatality per year Data",
   xlab="Year", ylab="Fatality per year") 
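
The medians drawn in the box plot can be printed as numbers with `tapply()`. The data below is invented for illustration (`toy` is hypothetical); with the real data the call would be `tapply(fatality$Fatality_rate, fatality$Year, median)`.

```r
# Hypothetical toy data: 3 observations in each of 2 years
toy <- data.frame(
  Year          = rep(c(1982, 1983), each = 3),
  Fatality_rate = c(1.0, 2.0, 4.0, 1.5, 2.1, 3.0)
)

# Median and interquartile range per year
med_by_year <- tapply(toy$Fatality_rate, toy$Year, median)
iqr_by_year <- tapply(toy$Fatality_rate, toy$Year, IQR)

print(med_by_year)
print(iqr_by_year)
```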

5.4 Histograms and scatterplots for subsets

5.4.1 Data per Year

The goal is to check how the data changed from 1982 to 1988.
The first graph shows that the total fatality rate across the 48 states stays at almost the same level from year to year.
The second graph shows some increase in income over the 7 years, and the third graph shows the beer tax going down. Similarly, the unemployment rate goes down in the fourth graph. These positive changes likely reflect an improving economy.

ggplot(fatality_year, aes(factor(Year), sum_fatality)) + geom_bar(stat = "identity", fill = "cornflowerblue") + scale_x_discrete("Year") + scale_y_continuous("Fatality per year") + coord_flip() + theme_minimal()

ggplot(fatality_year, aes(factor(Year), income_year)) + geom_bar(stat = "identity", fill = "pink") + scale_x_discrete("Year") + scale_y_continuous("Income") + theme_minimal()

ggplot(fatality_year, aes(factor(Year), beertax_year)) + geom_bar(stat = "identity", fill = "lightgreen") + scale_x_discrete("Year") + scale_y_continuous("Beer Tax")

ggplot(fatality_year, aes(factor(Year), unempl_year)) + geom_bar(stat = "identity", fill = "blue") + scale_x_discrete("Year") + scale_y_continuous("Unemployment rate")

5.4.2 Data per State

As discussed in 4.1, the 10 states with the highest fatality rate (mean over 7 years) are NM, WY, MT, SC, MS, NV, AZ, ID, FL, AR; they stand out in the first graph.
Graph 2 lets us check whether the states with the highest fatality rate also have the lowest income. The states with the lowest mean income are MS, WV, AR, UT, AL, SC, KY, ID, NM, SD. States with below-average income tend to have a higher fatality rate.
The third graph shows how the beer tax varies from state to state. The states with the lowest beer tax are WY, NJ, CA, NY, WI, DE, RI, IL, CO, KY, so alcohol is cheaper to buy there.
The fourth graph shows the unemployment rate by state. The states with the highest unemployment rate are WV, LA, MI, MS, AL, KY, OH, IL, WA, AR. Several of these states also appear above among those with the highest fatality rate or the lowest income.

ggplot(fatality_state, aes(State, fatal_state)) + geom_bar(stat = "identity", fill = "cornflowerblue") + scale_x_discrete("State") + scale_y_continuous("Fatality per state") + theme(axis.text.x = element_text(angle = 90))

ggplot(fatality_state, aes(State, income_state)) + geom_bar(stat = "identity", fill = "darkblue") + scale_x_discrete("State") + scale_y_continuous("Income") + theme(axis.text.x = element_text(angle = 90))

ggplot(fatality_state, aes(State, beertax_state)) + geom_bar(stat = "identity", fill = "lightgreen") + scale_x_discrete("State") + scale_y_continuous("Beer Tax") + theme(axis.text.x = element_text(angle = 90)) 

ggplot(fatality_state, aes(State, unempl_state)) + geom_bar(stat = "identity", fill = "lightblue") + scale_x_discrete("State") + scale_y_continuous("Unemployment rate") + theme(axis.text.x = element_text(angle = 90))


We can also check how the per-state fatality rate relates to the other columns.
In the first graph, we see that as income decreases, the fatality rate goes up.
In the second graph, the higher the unemployment rate, the higher the fatality rate.
In the third graph, there is only a slight relationship between the fatality rate and the beer tax: the fatality rate varies widely while the beer tax stays between 0 and 1, and most of the states with high fatality rates have a low beer tax.
In the fourth graph, the more miles per driver, the higher the fatality rate.

ggplot(fatality_state, aes(x=fatal_state, y=income_state)) + geom_point(color="cornflowerblue", size = 2, alpha=.8) + scale_x_continuous("Fatality rate per state") + scale_y_continuous("Income") + theme_minimal()

ggplot(fatality_state, aes(x=fatal_state, y=unempl_state)) + geom_point(color="cornflowerblue", size = 2, alpha=.8) + scale_x_continuous("Fatality rate per state") + scale_y_continuous("Unemployment rate") + theme_minimal()

ggplot(fatality_state, aes(x=fatal_state, y=beertax_state)) + geom_point(color="cornflowerblue", size = 2, alpha=.8) + scale_x_continuous("Fatality rate per state") + scale_y_continuous("Beer tax") + theme_minimal()

ggplot(fatality_state, aes(x=fatal_state, y=miles_state)) + geom_point(color="cornflowerblue", size = 2, alpha=.8) + scale_x_continuous("Fatality rate per state") + scale_y_continuous("Miles per driver") + theme_minimal()
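
Separate scatter plots cannot show whether income and unemployment each matter on their own, because the two are themselves related. One way to look at this, which is not part of the original analysis, is a correlation matrix followed by a multiple regression. The sketch below runs on synthetic values only (`toy_state` and every column in it are invented), so the numbers it prints carry no meaning for the real states.

```r
# Synthetic state-level table (invented values, not the real data)
set.seed(42)
n <- 48
unempl_toy <- runif(n, 4, 13)
income_toy <- 16000 - 400 * unempl_toy + rnorm(n, sd = 800)
fatal_toy  <- 3.5 - 0.0001 * income_toy + 0.05 * unempl_toy + rnorm(n, sd = 0.2)
toy_state  <- data.frame(fatal_toy, income_toy, unempl_toy)

# Pairwise correlations between all columns
cor_mat <- cor(toy_state)
round(cor_mat, 2)

# Multiple regression: each coefficient is the association with the fatality
# rate while holding the other predictor fixed
fit <- lm(fatal_toy ~ income_toy + unempl_toy, data = toy_state)
round(coef(fit), 5)
```

With the real data, the same calls could be applied to the numeric columns of `fatality_state`.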

6. Conclusion

After analyzing the columns of the dataset, we found that the fatality rate is related to the unemployment rate, income, and miles per driver, and only slightly to the beer tax. Columns such as the unemployment rate and income reflect the well-being of a state. In the states with a worse economic situation, the fatality rate was higher. In addition, most states had no mandatory jail sentence or community service for drunk driving, which may have weakened deterrence. Even though the data is more than 30 years old, the results may still be applicable today with minor adjustments.