Importing the Bridges Dataset

Data Source: https://archive.ics.uci.edu/ml/datasets/Pittsburgh+Bridges

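# stringsAsFactors = TRUE keeps text columns as factors (no longer the default since R 4.0); the per-level counts in the summary further down rely on it
bridges <- read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/bridges/bridges.data.version1", header = FALSE, sep = ",", stringsAsFactors = TRUE)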
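
The raw file uses "?" as a placeholder for missing values. One possible way to have R treat it as NA is the `na.strings` argument of `read.csv`. The sketch below is only an illustration: the three rows are invented (same 13-column, header-less layout) and passed through `text =` so it runs without a network connection, and `sample_bridges` is a hypothetical name.

```r
# Hypothetical sample rows in the same comma-separated, header-less layout
# (values are invented for illustration; they are not taken from the data set).
sample_lines <- c(
  "X1,M,3,1850,HIGHWAY,?,2,N,THROUGH,WOOD,SHORT,S,WOOD",
  "X2,A,25,1890,HIGHWAY,1000,2,N,THROUGH,IRON,MEDIUM,S,SIMPLE-T",
  "X3,O,33,1930,RR,?,2,G,THROUGH,?,MEDIUM,F,ARCH"
)

# na.strings = "?" turns the placeholder into NA while reading
sample_bridges <- read.csv(text = sample_lines, header = FALSE, sep = ",",
                           na.strings = "?", stringsAsFactors = TRUE)

colSums(is.na(sample_bridges))  # number of missing values per column
```

Reading the real file this way would change later output: "?" would no longer appear as a factor level and would be reported as NA instead.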

Renaming the Columns

colnames(bridges) <- c("IDENTIF","RIVER","LOCATION","ERECTED","PURPOSE","LENGTH","LANES",
                       "CLEAR-G","T-OR-D","MATERIAL","SPAN","REL-L","TYPE")
head(bridges)
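
Three of the new names ("CLEAR-G", "T-OR-D", "REL-L") contain hyphens, so they are not syntactic R names. A minimal sketch of how such columns can be accessed, using a small hypothetical data frame `df` rather than the real data:

```r
# Hypothetical two-row data frame using two of the column names from above
df <- data.frame(a = c(1, 2), b = c("N", "G"), stringsAsFactors = FALSE)
colnames(df) <- c("LANES", "CLEAR-G")

df$`CLEAR-G`        # backticks are needed because of the hyphen
df[["CLEAR-G"]]     # string indexing works without backticks

# Optional: convert to syntactic names (hyphens become dots)
colnames(df) <- make.names(colnames(df))
colnames(df)
```

Whether to keep the original hyphenated names or convert them is a matter of taste; the rest of this analysis only uses columns whose names are already syntactic.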

Subsetting the Data

bridges_subset <- subset(bridges, PURPOSE=="HIGHWAY", select=c(ERECTED,PURPOSE,MATERIAL))
bridges_subset

Summarizing the Data

summary(bridges_subset)
##     ERECTED         PURPOSE    MATERIAL 
##  Min.   :1818   AQUEDUCT: 0   ?    : 2  
##  1st Qu.:1890   HIGHWAY :71   IRON : 7  
##  Median :1923   RR      : 0   STEEL:51  
##  Mean   :1912   WALK    : 0   WOOD :11  
##  3rd Qu.:1945                           
##  Max.   :1986
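
The zero counts for AQUEDUCT, RR and WALK appear because subsetting a factor keeps its unused levels. If those rows are unwanted, `droplevels` is one option. A small sketch with a hypothetical factor `purpose`, not the real column:

```r
# Hypothetical factor with the same four PURPOSE levels, only one of them in use
purpose <- factor(c("HIGHWAY", "HIGHWAY", "HIGHWAY"),
                  levels = c("AQUEDUCT", "HIGHWAY", "RR", "WALK"))

table(purpose)                        # still lists the three empty levels

purpose_used <- droplevels(purpose)   # keep only levels that actually occur
table(purpose_used)
```

`droplevels` also accepts a whole data frame, in which case it is applied to every factor column.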

Histogram of Erected Bridges - Timeline

This histogram shows the number of highway bridges built in each 10-year interval. Overall, the distribution is left-skewed, with a tail of early bridges reaching back to 1818 (the mean year, 1912, falls below the median, 1923); the majority of bridges were built before 1930. The histogram is also bimodal, showing bursts of highway bridge construction in 1890-1900 and 1920-1930.

hist(bridges_subset$ERECTED, breaks = 20)
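
Note that a single number passed to `breaks` is only a suggestion; R picks "pretty" boundaries near that count. To guarantee decade-wide bins, the boundaries can be given explicitly. The sketch below uses a made-up vector `years` (not the real data) and returns the counts as a table instead of drawing the plot:

```r
# Hypothetical construction years, for illustration only
years <- c(1818, 1876, 1892, 1895, 1898, 1923, 1925, 1927, 1945, 1986)

# Explicit decade boundaries covering the full range of the data
decade_breaks <- seq(floor(min(years) / 10) * 10,
                     ceiling(max(years) / 10) * 10, by = 10)

# right = FALSE makes each bin [start, start + 10), so 1890 falls in the 1890s
h <- hist(years, breaks = decade_breaks, right = FALSE, plot = FALSE)
data.frame(decade_start = head(h$breaks, -1), count = h$counts)
```

Dropping `plot = FALSE` draws the histogram with the same bins.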

Year vs Highway Material Boxplot

The boxplot below shows the distribution of construction years for each highway bridge material, with the boxes positioned in order of median year. Based on the plot, wood appears to have been the preferred material until about 1875, followed by iron until about 1900 and steel from 1900 onwards.

# Position each box by the rank of its material's median construction year
median_year <- tapply(bridges_subset$ERECTED, bridges_subset$MATERIAL, median)
boxplot(ERECTED ~ MATERIAL, data = bridges_subset, at = rank(median_year, ties.method = "first"))
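
An alternative to positioning boxes with `at` is to reorder the factor levels themselves by median year, which keeps the axis labels and boxes aligned automatically. A minimal sketch, assuming made-up `years` and `material` vectors rather than the real columns:

```r
# Hypothetical years and materials, for illustration only
years    <- c(1820, 1840, 1860, 1880, 1890, 1895, 1910, 1930, 1950)
material <- factor(c("WOOD", "WOOD", "WOOD", "IRON", "IRON", "IRON",
                     "STEEL", "STEEL", "STEEL"))

# Median construction year per material
medians <- tapply(years, material, median)

# Reorder the factor levels by median year, then plot in that order
material_by_median <- reorder(material, years, FUN = median)
boxplot(years ~ material_by_median, xlab = "Material", ylab = "Year erected")
```

With these invented values the levels come out as WOOD, IRON, STEEL, so the boxes run from the earliest to the latest median.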