1. For this assignment, I downloaded the Pittsburgh Bridges dataset from https://archive.ics.uci.edu/ml/datasets/Pittsburgh+Bridges. I then uploaded the file to my GitHub repository and used the curl package to retrieve the dataset from there.

library(curl)
## Warning: package 'curl' was built under R version 3.2.2
bridges <- read.csv(curl("https://raw.githubusercontent.com/isrini/SI_IS607/master/bridges.data.version1"), header = FALSE)
head(bridges)
##   V1 V2 V3   V4       V5   V6 V7 V8      V9  V10    V11 V12  V13
## 1 E1  M  3 1818  HIGHWAY    ?  2  N THROUGH WOOD  SHORT   S WOOD
## 2 E2  A 25 1819  HIGHWAY 1037  2  N THROUGH WOOD  SHORT   S WOOD
## 3 E3  A 39 1829 AQUEDUCT    ?  1  N THROUGH WOOD      ?   S WOOD
## 4 E5  A 29 1837  HIGHWAY 1000  2  N THROUGH WOOD  SHORT   S WOOD
## 5 E6  M 23 1838  HIGHWAY    ?  2  N THROUGH WOOD      ?   S WOOD
## 6 E7  A 27 1840  HIGHWAY  990  2  N THROUGH WOOD MEDIUM   S WOOD

2. The data set has 108 rows and 13 columns. It has no header row to describe the columns.
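The row and column counts can be checked with dim(). The sketch below is only an illustration: it embeds the first three rows shown by head() as inline text, assuming the file is comma-separated as the read.csv call implies, so the counts it reports are for this sample and not for the full file. The names `sample_text` and `sample_bridges` are hypothetical.

```r
# First three rows as shown by head(), embedded inline so the example
# runs without network access. The real file has 108 rows.
sample_text <- "E1,M,3,1818,HIGHWAY,?,2,N,THROUGH,WOOD,SHORT,S,WOOD
E2,A,25,1819,HIGHWAY,1037,2,N,THROUGH,WOOD,SHORT,S,WOOD
E3,A,39,1829,AQUEDUCT,?,1,N,THROUGH,WOOD,?,S,WOOD"

# header = FALSE because the file has no header row; R assigns V1..V13.
sample_bridges <- read.csv(text = sample_text, header = FALSE,
                           stringsAsFactors = FALSE)

dim(sample_bridges)    # number of rows, then number of columns
names(sample_bridges)  # default column names assigned by read.csv
```

Running dim(bridges) on the full data frame in the same way should report the 108 rows and 13 columns mentioned above.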

Let's add column names to the data set using the following code, and then display the data using the kable function from the knitr package.

colnames(bridges) <- c("identifier", "river", "location", "erected","purpose", "length", "lanes", "clear-g", "T_or_D", "material", "span", "rel_l", "type")

rownames(bridges) <- NULL
library(knitr)
## Warning: package 'knitr' was built under R version 3.2.2
kable(head(bridges, 20))
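| identifier | river | location | erected | purpose | length | lanes | clear-g | T_or_D | material | span | rel_l | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E1 | M | 3 | 1818 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E2 | A | 25 | 1819 | HIGHWAY | 1037 | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E3 | A | 39 | 1829 | AQUEDUCT | ? | 1 | N | THROUGH | WOOD | ? | S | WOOD |
| E5 | A | 29 | 1837 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E6 | M | 23 | 1838 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E7 | A | 27 | 1840 | HIGHWAY | 990 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E8 | A | 28 | 1844 | AQUEDUCT | 1000 | 1 | N | THROUGH | IRON | SHORT | S | SUSPEN |
| E9 | M | 3 | 1846 | HIGHWAY | 1500 | 2 | N | THROUGH | IRON | SHORT | S | SUSPEN |
| E10 | A | 39 | 1848 | AQUEDUCT | ? | 1 | N | DECK | WOOD | ? | S | WOOD |
| E11 | A | 29 | 1851 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E12 | A | 39 | 1853 | RR | ? | 2 | N | DECK | WOOD | ? | S | WOOD |
| E14 | M | 6 | 1856 | HIGHWAY | 1200 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E13 | A | 33 | 1856 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E15 | A | 28 | 1857 | RR | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E16 | A | 25 | 1859 | HIGHWAY | 1030 | 2 | N | THROUGH | IRON | MEDIUM | S-F | SUSPEN |
| E17 | M | 4 | 1863 | RR | 1000 | 2 | N | THROUGH | IRON | MEDIUM | ? | SIMPLE-T |
| E18 | A | 28 | 1864 | RR | 1200 | 2 | N | THROUGH | IRON | SHORT | S | SIMPLE-T |
| E19 | A | 29 | 1866 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E20 | A | 32 | 1870 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E21 | M | 16 | 1874 | RR | ? | 2 | ? | THROUGH | IRON | ? | ? | SIMPLE-T |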

3. Data transformation:

To expand the single-letter codes in the ‘river’ column into full river names, the nested ifelse call below is used.

# Map each single-letter river code to its full name; any other code becomes "N/A".
bridges$river <- ifelse(bridges$river == "A", "Allegheny",
                 ifelse(bridges$river == "M", "Monongahela",
                 ifelse(bridges$river == "O", "Ohio", "N/A")))
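As an alternative to nesting ifelse calls, the same mapping can be sketched with a named lookup vector. This is not the method used in this assignment, only an illustration. The names `river_names`, `river_codes`, and `river_full` are hypothetical, and the code "X" is made up to show the fallback case.

```r
# Hypothetical lookup table: codes are the names, full river names the values.
river_names <- c(A = "Allegheny", M = "Monongahela", O = "Ohio")

# Small stand-in for bridges$river; "X" is a made-up code with no match.
river_codes <- c("M", "A", "A", "O", "X")

# Indexing by name returns NA for unknown codes; unname() drops the names.
river_full <- unname(river_names[river_codes])

# Mirror the nested ifelse() fallback by replacing NA with "N/A".
river_full[is.na(river_full)] <- "N/A"

river_full
```

If the column was read as a factor (the read.csv default before R 4.0), wrap it in as.character() before indexing, because indexing by a factor uses its integer codes rather than its labels.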

4. Data transformation:

To expand the ‘RR’ abbreviation in the ‘purpose’ column to its full name, the code below is used.

bridges$purpose <- gsub("RR", "RAILROAD", bridges$purpose)
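Note that gsub replaces "RR" wherever it occurs inside a value, not only when the whole value is "RR". The sketch below contrasts this with an anchored pattern that matches the exact value only. The vector `purpose` is a hypothetical stand-in for bridges$purpose, and "RR-WALK" is a made-up value included only to show the difference.

```r
# Stand-in for bridges$purpose; "RR-WALK" is a made-up value used only to
# show the difference between the two patterns.
purpose <- c("HIGHWAY", "RR", "AQUEDUCT", "RR-WALK")

# Unanchored: every occurrence of "RR" is replaced, even inside longer values.
purpose_unanchored <- gsub("RR", "RAILROAD", purpose)
purpose_unanchored

# Anchored with ^ and $: only values that are exactly "RR" are replaced.
purpose_full <- sub("^RR$", "RAILROAD", purpose)
purpose_full
```

For a column where "RR" only ever appears as a complete value, both patterns give the same result; the anchored form simply guards against accidental partial matches.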

5. View the transformed data

kable(head(bridges, 20))
| identifier | river | location | erected | purpose | length | lanes | clear-g | T_or_D | material | span | rel_l | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E1 | Monongahela | 3 | 1818 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E2 | Allegheny | 25 | 1819 | HIGHWAY | 1037 | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E3 | Allegheny | 39 | 1829 | AQUEDUCT | ? | 1 | N | THROUGH | WOOD | ? | S | WOOD |
| E5 | Allegheny | 29 | 1837 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | SHORT | S | WOOD |
| E6 | Monongahela | 23 | 1838 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E7 | Allegheny | 27 | 1840 | HIGHWAY | 990 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E8 | Allegheny | 28 | 1844 | AQUEDUCT | 1000 | 1 | N | THROUGH | IRON | SHORT | S | SUSPEN |
| E9 | Monongahela | 3 | 1846 | HIGHWAY | 1500 | 2 | N | THROUGH | IRON | SHORT | S | SUSPEN |
| E10 | Allegheny | 39 | 1848 | AQUEDUCT | ? | 1 | N | DECK | WOOD | ? | S | WOOD |
| E11 | Allegheny | 29 | 1851 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E12 | Allegheny | 39 | 1853 | RAILROAD | ? | 2 | N | DECK | WOOD | ? | S | WOOD |
| E14 | Monongahela | 6 | 1856 | HIGHWAY | 1200 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E13 | Allegheny | 33 | 1856 | HIGHWAY | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E15 | Allegheny | 28 | 1857 | RAILROAD | ? | 2 | N | THROUGH | WOOD | ? | S | WOOD |
| E16 | Allegheny | 25 | 1859 | HIGHWAY | 1030 | 2 | N | THROUGH | IRON | MEDIUM | S-F | SUSPEN |
| E17 | Monongahela | 4 | 1863 | RAILROAD | 1000 | 2 | N | THROUGH | IRON | MEDIUM | ? | SIMPLE-T |
| E18 | Allegheny | 28 | 1864 | RAILROAD | 1200 | 2 | N | THROUGH | IRON | SHORT | S | SIMPLE-T |
| E19 | Allegheny | 29 | 1866 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E20 | Allegheny | 32 | 1870 | HIGHWAY | 1000 | 2 | N | THROUGH | WOOD | MEDIUM | S | WOOD |
| E21 | Monongahela | 16 | 1874 | RAILROAD | ? | 2 | ? | THROUGH | IRON | ? | ? | SIMPLE-T |