[Dataset] (https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/boot/amis.csv)
In a study into the effect that warning signs have on speeding patterns, Cambridgeshire County Council considered 14 pairs of locations. The locations were paired to account for factors such as traffic volume and type of road. One site in each pair had a sign erected warning of the dangers of speeding and asking drivers to slow down. No action was taken at the second site. Three sets of measurements were taken at each site. Each set of measurements was nominally of the speeds of 100 cars but not all sites have exactly 100 measurements. These speed measurements were taken before the erection of the sign, shortly after the erection of the sign, and again after the sign had been in place for some time.
amis
This data frame contains the following columns:
speed
Speeds of cars (in miles per hour).
period
A numeric column indicating the time that the reading was taken. A value of 1 indicates a reading taken before the sign was erected, a 2 indicates a reading taken shortly after erection of the sign and a 3 indicates a reading taken after the sign had been in place for some time.
warning
A numeric column indicating whether the location of the reading was chosen to have a warning sign erected. A value of 1 indicates presence of a sign and a value of 2 indicates that no sign was erected.
pair
A numeric column giving the pair number at which the reading was taken. Pairs were numbered from 1 to 14.
#install.packages("RCurl")
library(RCurl)
gitHubUrl <- getURL("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/boot/amis.csv")
speedingPattern <- read.csv(text = gitHubUrl)
head(speedingPattern)
## X speed period warning pair
## 1 1 26 1 1 1
## 2 2 26 1 1 1
## 3 3 26 1 1 1
## 4 4 26 1 1 1
## 5 5 27 1 1 1
## 6 6 28 1 1 1
str(speedingPattern)
## 'data.frame': 8437 obs. of 5 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ speed : int 26 26 26 26 27 28 28 28 28 29 ...
## $ period : int 1 1 1 1 1 1 1 1 1 1 ...
## $ warning: int 1 1 1 1 1 1 1 1 1 1 ...
## $ pair : int 1 1 1 1 1 1 1 1 1 1 ...
summary(speedingPattern)
## X speed period warning
## Min. : 1 Min. :19.00 Min. :1.000 Min. :1.000
## 1st Qu.:2110 1st Qu.:33.00 1st Qu.:1.000 1st Qu.:1.000
## Median :4219 Median :37.00 Median :2.000 Median :2.000
## Mean :4219 Mean :37.82 Mean :2.004 Mean :1.507
## 3rd Qu.:6328 3rd Qu.:42.00 3rd Qu.:3.000 3rd Qu.:2.000
## Max. :8437 Max. :67.00 Max. :3.000 Max. :2.000
## pair
## Min. : 1.000
## 1st Qu.: 4.000
## Median : 8.000
## Mean : 7.559
## 3rd Qu.:11.000
## Max. :14.000
The dataset contains speeding measurements taken at different locations (pairs) which are numbered from 1 to 14. For each pair some measurements don’t have any warning sign and some have warning sign. Now let’s analyze the speed for various situations for location 1.
Dataset for pair 1 with no warning sign
no_warning_pair_1 <- speedingPattern[which(speedingPattern$warning == 2 & speedingPattern$pair == 1), ]
summary(no_warning_pair_1)
## X speed period warning pair
## Min. : 201.0 Min. :21.00 Min. :1 Min. :2 Min. :1
## 1st Qu.: 275.8 1st Qu.:32.00 1st Qu.:1 1st Qu.:2 1st Qu.:1
## Median : 350.5 Median :36.00 Median :2 Median :2 Median :1
## Mean :2117.2 Mean :36.36 Mean :2 Mean :2 Mean :1
## 3rd Qu.:5725.2 3rd Qu.:40.00 3rd Qu.:3 3rd Qu.:2 3rd Qu.:1
## Max. :5800.0 Max. :56.00 Max. :3 Max. :2 Max. :1
Dataset for pair 1 before warning sign
before_warning_pair_1 <- speedingPattern[which(speedingPattern$period == 1 & speedingPattern$warning == 1 & speedingPattern$pair == 1), ]
summary(before_warning_pair_1)
## X speed period warning pair
## Min. : 1.00 Min. :26.00 Min. :1 Min. :1 Min. :1
## 1st Qu.: 25.75 1st Qu.:31.00 1st Qu.:1 1st Qu.:1 1st Qu.:1
## Median : 50.50 Median :34.00 Median :1 Median :1 Median :1
## Mean : 50.50 Mean :34.53 Mean :1 Mean :1 Mean :1
## 3rd Qu.: 75.25 3rd Qu.:37.25 3rd Qu.:1 3rd Qu.:1 3rd Qu.:1
## Max. :100.00 Max. :49.00 Max. :1 Max. :1 Max. :1
Dataset for pair 1 after warning sign. Here the data set contains speed measurements right after the warning sign is erected and reading taken after the sign had been in place for some time
after_warning_pair_1 <- speedingPattern[which((speedingPattern$period == 2 | speedingPattern$period == 3) & speedingPattern$warning == 1 & speedingPattern$pair == 1), ]
summary(after_warning_pair_1)
## X speed period warning pair
## Min. : 101.0 Min. :21.00 Min. :2.0 Min. :1 Min. :1
## 1st Qu.: 150.8 1st Qu.:32.00 1st Qu.:2.0 1st Qu.:1 1st Qu.:1
## Median :2900.5 Median :35.00 Median :2.5 Median :1 Median :1
## Mean :2900.5 Mean :35.37 Mean :2.5 Mean :1 Mean :1
## 3rd Qu.:5650.2 3rd Qu.:38.00 3rd Qu.:3.0 3rd Qu.:1 3rd Qu.:1
## Max. :5700.0 Max. :51.00 Max. :3.0 Max. :1 Max. :1
Let’s plot the speed for various situations for pair 1
Box plot for data set with no warning
require(ggplot2)
ggplot(no_warning_pair_1,aes(y=speed, x=1)) + geom_boxplot()
Box plot for data set with before warning
ggplot(before_warning_pair_1,aes(y=speed, x=1)) + geom_boxplot()
Box plot for data set with after warning
ggplot(after_warning_pair_1,aes(y=speed, x=1)) + geom_boxplot()
Histogram for speed of all pairs without warning sign
speedingPatternNoWarning <- speedingPattern[which(speedingPattern$warning == 2), ]
ggplot(data=speedingPatternNoWarning) + geom_histogram(aes(x=speed))
Histogram for speed of all pairs before warning sign
speedingPatternBeforeWarning <- speedingPattern[which(speedingPattern$warning == 1 & speedingPattern$period == 1), ]
ggplot(data=speedingPatternBeforeWarning) + geom_histogram(aes(x=speed))
Histogram for speed of all pairs after warning sign
speedingPatternAfterWarning <- speedingPattern[which(speedingPattern$warning == 1 & (speedingPattern$period == 2 | speedingPattern$period == 3)), ]
ggplot(data=speedingPatternAfterWarning) + geom_histogram(aes(x=speed))
Scatterplot for pair vs speed for different periods
ggplot(speedingPattern, aes(x=pair, y=speed)) + geom_point(aes(color=period))
Barplot of speed vs count
ggplot(speedingPattern, aes(x=factor(speed))) + geom_bar(stat="count", width=0.7, fill="steelblue") + theme_minimal()
We need to analyze if erecting warning signs has any effect on the speed of the vehicles. To do that we need to create a new data frame for mean speeds.
pairs <- unique(speedingPattern$pair)
no_warning <- rep(0, length(pairs))
before_warning <- rep(0, length(pairs))
after_warning <- rep(0, length(pairs))
for (i in pairs) {
no_warning[i] <- mean(speedingPattern[which(speedingPattern$warning == 2 & speedingPattern$pair == i), "speed"])
before_warning[i] <- mean(speedingPattern[which(speedingPattern$period == 1 & speedingPattern$warning == 1 & speedingPattern$pair == i), "speed"])
after_warning[i] <- mean(speedingPattern[which((speedingPattern$period == 2 | speedingPattern$period == 3) & speedingPattern$warning == 1 & speedingPattern$pair == i), "speed"])
}
mean_speeds <- data.frame(pair = pairs, no_warning = no_warning, before_warning = before_warning, after_warning = after_warning)
print(mean_speeds)
## pair no_warning before_warning after_warning
## 1 1 36.36000 34.53 35.37000
## 2 2 33.69000 38.95 38.93000
## 3 3 37.66667 32.22 32.69543
## 4 4 36.85614 36.38 35.73077
## 5 5 39.35000 36.61 36.61749
## 6 6 39.63333 40.13 37.31500
## 7 7 41.92667 29.86 28.78000
## 8 8 38.35667 38.35 40.15500
## 9 9 38.81000 38.05 36.46500
## 10 10 35.83103 33.95 35.47000
## 11 11 34.37750 34.62 34.87500
## 12 12 42.48667 38.57 37.97500
## 13 14 46.36000 39.62 41.84500
## 14 13 45.32667 39.30 41.46000
We add a new column to the data frame for percentage change in speed before and after signs are erected.
mean_speeds$perc_change <- ((mean_speeds$after_warning - mean_speeds$before_warning) / mean_speeds$before_warning) * 100
print(mean_speeds)
## pair no_warning before_warning after_warning perc_change
## 1 1 36.36000 34.53 35.37000 2.43266725
## 2 2 33.69000 38.95 38.93000 -0.05134788
## 3 3 37.66667 32.22 32.69543 1.47557875
## 4 4 36.85614 36.38 35.73077 -1.78458155
## 5 5 39.35000 36.61 36.61749 0.02044889
## 6 6 39.63333 40.13 37.31500 -7.01470222
## 7 7 41.92667 29.86 28.78000 -3.61687877
## 8 8 38.35667 38.35 40.15500 4.70664928
## 9 9 38.81000 38.05 36.46500 -4.16557162
## 10 10 35.83103 33.95 35.47000 4.47717231
## 11 11 34.37750 34.62 34.87500 0.73656846
## 12 12 42.48667 38.57 37.97500 -1.54264973
## 13 14 46.36000 39.62 41.84500 5.61585058
## 14 13 45.32667 39.30 41.46000 5.49618321
pairs_with_reduced_speed <- nrow(mean_speeds[which(mean_speeds$perc_change < 0), ])
print(pairs_with_reduced_speed)
## [1] 6
print(nrow(mean_speeds[which(mean_speeds$perc_change < 0), ]) / length(pairs))
## [1] 0.4285714
print(mean_speeds[which.min(mean_speeds$perc_change), ])
## pair no_warning before_warning after_warning perc_change
## 6 6 39.63333 40.13 37.315 -7.014702
The experiment shows less than half of the locations have shown effect on speed with warning sign. Location 6 has the maximum reduction (7.015 %) in speed after warning sign is erected.