Charlie Rosemond - 082919

Read in comma-delimited file as a data frame

After downloading the bridges.data.version2 file (https://archive.ics.uci.edu/ml/machine-learning-databases/bridges/bridges.data.version2) and uploading it to GitHub, I read the raw file into R as a data frame called “bridges”. The file has no header row, so I set header = FALSE and R assigns the default column names V1 through V13. I then check the first six rows.

bridges <- read.csv("https://raw.githubusercontent.com/chrosemo/data607_fall19_week1/master/bridges.data.version2", header = FALSE)
head(bridges)
##   V1 V2 V3     V4       V5     V6 V7 V8      V9  V10    V11 V12  V13
## 1 E1  M  3 CRAFTS  HIGHWAY      ?  2  N THROUGH WOOD  SHORT   S WOOD
## 2 E2  A 25 CRAFTS  HIGHWAY MEDIUM  2  N THROUGH WOOD  SHORT   S WOOD
## 3 E3  A 39 CRAFTS AQUEDUCT      ?  1  N THROUGH WOOD      ?   S WOOD
## 4 E5  A 29 CRAFTS  HIGHWAY MEDIUM  2  N THROUGH WOOD  SHORT   S WOOD
## 5 E6  M 23 CRAFTS  HIGHWAY      ?  2  N THROUGH WOOD      ?   S WOOD
## 6 E7  A 27 CRAFTS  HIGHWAY  SHORT  2  N THROUGH WOOD MEDIUM   S WOOD
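The “?” entries in the output above are how the raw file codes missing values. As a minimal sketch, not part of the original workflow, the na.strings argument of read.csv can convert them to NA at read time. The three rows below are retyped from the head() output above so the example runs without a network connection; sample_text and bridges_sample are illustrative names only.

```r
# Three rows retyped from the head() output above (illustrative sample, not the full file).
sample_text <- "E1,M,3,CRAFTS,HIGHWAY,?,2,N,THROUGH,WOOD,SHORT,S,WOOD
E2,A,25,CRAFTS,HIGHWAY,MEDIUM,2,N,THROUGH,WOOD,SHORT,S,WOOD
E3,A,39,CRAFTS,AQUEDUCT,?,1,N,THROUGH,WOOD,?,S,WOOD"

# na.strings = "?" tells read.csv to treat a bare "?" field as NA.
# stringsAsFactors = FALSE keeps text columns as character in any R version.
bridges_sample <- read.csv(text = sample_text, header = FALSE,
                           na.strings = "?", stringsAsFactors = FALSE)

# Count the missing values in each column.
colSums(is.na(bridges_sample))
```

The same arguments should carry over to the full read.csv call on the GitHub URL, assuming “?” is the only missing-value code used in the file.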

Update the data frame with appropriate column names

I replace the default column names of “bridges” (V1 through V13) with names based on the attribute names listed in the bridges.names file (https://archive.ics.uci.edu/ml/machine-learning-databases/bridges/bridges.names) and again check the first six rows.

colnames(bridges) <- c("identifier", "river", "location", "erected", "purpose", "length", "lanes", "clear_g", "t_or_d", "material", "span", "rel_l", "type")
head(bridges)
##   identifier river location erected  purpose length lanes clear_g  t_or_d
## 1         E1     M        3  CRAFTS  HIGHWAY      ?     2       N THROUGH
## 2         E2     A       25  CRAFTS  HIGHWAY MEDIUM     2       N THROUGH
## 3         E3     A       39  CRAFTS AQUEDUCT      ?     1       N THROUGH
## 4         E5     A       29  CRAFTS  HIGHWAY MEDIUM     2       N THROUGH
## 5         E6     M       23  CRAFTS  HIGHWAY      ?     2       N THROUGH
## 6         E7     A       27  CRAFTS  HIGHWAY  SHORT     2       N THROUGH
##   material   span rel_l type
## 1     WOOD  SHORT     S WOOD
## 2     WOOD  SHORT     S WOOD
## 3     WOOD      ?     S WOOD
## 4     WOOD  SHORT     S WOOD
## 5     WOOD      ?     S WOOD
## 6     WOOD MEDIUM     S WOOD
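As an alternative sketch, the names can be supplied while the file is being read, through the col.names argument of read.csv, which skips the intermediate V1 through V13 names. The two sample rows are retyped from the output above so the example runs offline; bridge_names, sample_text, and bridges_sample are illustrative names, not objects from the original analysis.

```r
# Column names as used in the colnames() call above.
bridge_names <- c("identifier", "river", "location", "erected", "purpose",
                  "length", "lanes", "clear_g", "t_or_d", "material",
                  "span", "rel_l", "type")

# Two rows retyped from the head() output above (illustrative sample).
sample_text <- "E1,M,3,CRAFTS,HIGHWAY,?,2,N,THROUGH,WOOD,SHORT,S,WOOD
E2,A,25,CRAFTS,HIGHWAY,MEDIUM,2,N,THROUGH,WOOD,SHORT,S,WOOD"

# col.names assigns the names at read time; header = FALSE is still needed
# because the first line of the file is data, not a header.
bridges_sample <- read.csv(text = sample_text, header = FALSE,
                           col.names = bridge_names,
                           stringsAsFactors = FALSE)

# Guard against a mismatch between the name vector and the column count.
stopifnot(length(bridge_names) == ncol(bridges_sample))
names(bridges_sample)
```

Either approach should produce the same column names; assigning them at read time simply keeps the import and the naming in one step.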

Subset the data frame

Lastly, I subset the data frame, creating a new one called “bridges_working” that keeps four columns: identifier, river, material, and span. I then check the new data frame’s last six rows.

bridges_working <- bridges[c("identifier", "river", "material", "span")]
tail(bridges_working)
##     identifier river material   span
## 103        E85     M    STEEL   LONG
## 104        E84     A    STEEL MEDIUM
## 105        E91     O    STEEL   LONG
## 106        E90     M    STEEL   LONG
## 107       E100     O        ?      ?
## 108       E109     A        ?      ?
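The subset above selects columns by passing a character vector of names to single brackets. As a hedged sketch, the base R subset() function with its select argument is another way to write the same selection. The small data frame below is retyped from the first two rows of the earlier head() output, restricted to five columns; bridges_sample, by_name, and by_subset are illustrative names.

```r
# Small stand-in for "bridges", retyped from the first two rows shown earlier.
bridges_sample <- data.frame(
  identifier = c("E1", "E2"),
  river      = c("M", "A"),
  purpose    = c("HIGHWAY", "HIGHWAY"),
  material   = c("WOOD", "WOOD"),
  span       = c("SHORT", "SHORT"),
  stringsAsFactors = FALSE
)

# Approach used above: single brackets with a character vector of column names.
by_name <- bridges_sample[c("identifier", "river", "material", "span")]

# Alternative: subset() with unquoted column names in select.
by_subset <- subset(bridges_sample,
                    select = c(identifier, river, material, span))

# Both results keep the same four columns in the same order.
names(by_name)
names(by_subset)
```

Single brackets with quoted names tend to be the safer choice inside functions, since subset() evaluates select in a non-standard way; for interactive work the two are largely interchangeable.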