607_Week2Assignment
Your task is to choose one dataset, then study the data and its associated description (i.e. the “data dictionary”). Take the data and create an R data frame with a subset of the columns (and, if you like, of the rows) in the dataset. Your deliverable is the R code that performs these transformation tasks.
Reading bridge data
bridgev1file <- read.csv("https://archive.ics.uci.edu/ml/machine-learning-databases/bridges/bridges.data.version2", header = FALSE, sep = ",", na.strings = "?")
head(bridgev1file)
##   V1 V2 V3     V4       V5     V6 V7 V8      V9  V10    V11 V12  V13
## 1 E1  M  3 CRAFTS  HIGHWAY   <NA>  2  N THROUGH WOOD  SHORT   S WOOD
## 2 E2  A 25 CRAFTS  HIGHWAY MEDIUM  2  N THROUGH WOOD  SHORT   S WOOD
## 3 E3  A 39 CRAFTS AQUEDUCT   <NA>  1  N THROUGH WOOD   <NA>   S WOOD
## 4 E5  A 29 CRAFTS  HIGHWAY MEDIUM  2  N THROUGH WOOD  SHORT   S WOOD
## 5 E6  M 23 CRAFTS  HIGHWAY   <NA>  2  N THROUGH WOOD   <NA>   S WOOD
## 6 E7  A 27 CRAFTS  HIGHWAY  SHORT  2  N THROUGH WOOD MEDIUM   S WOOD
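To see what `na.strings = "?"` does without downloading the file, here is a minimal sketch that reads three rows through the `text` argument of `read.csv`. The raw lines are reconstructed from the `head()` output above, assuming each `<NA>` cell was a `?` in the source file; the names `sample_text`, `sample_df` and `na_counts` are illustrative only.

```r
# Three rows in the raw comma-separated layout, reconstructed from the head() output;
# "?" is assumed to mark a missing value in the source file.
sample_text <- "E1,M,3,CRAFTS,HIGHWAY,?,2,N,THROUGH,WOOD,SHORT,S,WOOD
E2,A,25,CRAFTS,HIGHWAY,MEDIUM,2,N,THROUGH,WOOD,SHORT,S,WOOD
E3,A,39,CRAFTS,AQUEDUCT,?,1,N,THROUGH,WOOD,?,S,WOOD"

# Same arguments as the call above, but reading from a string instead of a URL
sample_df <- read.csv(text = sample_text, header = FALSE, sep = ",", na.strings = "?")

# Count missing values per column to see where "?" was converted to NA
na_counts <- colSums(is.na(sample_df))
print(na_counts)
```

In this sample only V6 and V11 contain missing values, which matches the `<NA>` cells in the first three rows of the output.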
Read data dictionary (simplified)
bridge_dictionay <- read.table("./data_dictionary.txt",row.names = 1, sep=":")
print(bridge_dictionay)
## V2
## IDENTIFIER
## RIVER A, M ,O
## LOCATION -
## ERECTED CRAFTS ,EMERGING ,MATURE ,MODERN
## PURPOSE WALK, AQUEDUCT, RR, HIGHWAY
## LENGTH SHORT, MEDIUM, LONG
## LANES 1, 2, 4, 6
## CLEAR-G N, G
## T-OR-D THROUGH, DECK
## MATERIAL WOOD, IRON, STEEL
## SPAN SHORT, MEDUIM, LONG
## REL-L S,S-F, F
## TYPE WOOD, SUSPEN, SIMPLE-T, ARCH
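The layout of `data_dictionary.txt` is not shown, so the following is only a sketch of how a colon-separated "NAME: values" file could be parsed the same way. The four lines in `dict_text` are assumed contents based on the printed output, and `strip.white = TRUE` is an addition here to trim the space after each colon.

```r
# Assumed layout of the dictionary file: one "NAME: allowed values" entry per line
dict_text <- "IDENTIFIER:
RIVER: A, M, O
LOCATION: -
ERECTED: CRAFTS, EMERGING, MATURE, MODERN"

# row.names = 1 turns the first field into row names; the values land in column V2.
# strip.white = TRUE removes leading and trailing blanks around each field.
dict_df <- read.table(text = dict_text, row.names = 1, sep = ":", strip.white = TRUE)
print(dict_df)

# The row names are what later become the column headers of the data
print(row.names(dict_df))
```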
Apply headers to the data by reading them from the data dictionary
names(bridgev1file) <- row.names(bridge_dictionay)
head(bridgev1file)
##   IDENTIFIER RIVER LOCATION ERECTED  PURPOSE LENGTH LANES CLEAR-G
## 1         E1     M        3  CRAFTS  HIGHWAY   <NA>     2       N
## 2         E2     A       25  CRAFTS  HIGHWAY MEDIUM     2       N
## 3         E3     A       39  CRAFTS AQUEDUCT   <NA>     1       N
## 4         E5     A       29  CRAFTS  HIGHWAY MEDIUM     2       N
## 5         E6     M       23  CRAFTS  HIGHWAY   <NA>     2       N
## 6         E7     A       27  CRAFTS  HIGHWAY  SHORT     2       N
##    T-OR-D MATERIAL   SPAN REL-L TYPE
## 1 THROUGH     WOOD  SHORT     S WOOD
## 2 THROUGH     WOOD  SHORT     S WOOD
## 3 THROUGH     WOOD   <NA>     S WOOD
## 4 THROUGH     WOOD  SHORT     S WOOD
## 5 THROUGH     WOOD   <NA>     S WOOD
## 6 THROUGH     WOOD MEDIUM     S WOOD
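Headers such as `CLEAR-G`, `T-OR-D` and `REL-L` contain hyphens, so they are not syntactic R names. The sketch below, on a small stand-in data frame (`demo_df` is hypothetical, not part of the assignment data), shows two ways to work with such columns: quoting the name, or converting the names with `make.names()`.

```r
# Small stand-in data frame using two of the hyphenated column names from above
demo_df <- data.frame(V1 = c("N", "G"), V2 = c("THROUGH", "DECK"))
names(demo_df) <- c("CLEAR-G", "T-OR-D")

# Non-syntactic names need backticks with $ or a quoted string with [[ ]]
clear_g <- demo_df$`CLEAR-G`
t_or_d  <- demo_df[["T-OR-D"]]

# make.names() converts them to syntactic names (each hyphen becomes a dot)
names(demo_df) <- make.names(names(demo_df))
print(names(demo_df))
```

Selecting columns by position, as the next step does, sidesteps the quoting issue entirely; converting the names is only worthwhile if the columns will be referenced by name later.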
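I expected every ‘HIGHWAY’ bridge of ‘LONG’ length to be made of ‘IRON’ or ‘STEEL’. In the subset below, every ‘LONG’ highway is indeed ‘STEEL’, but a few ‘HIGHWAY’ bridges of ‘SHORT’, ‘MEDIUM’ or unknown length are made of ‘WOOD’.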
bridge_sub_df <- subset(bridgev1file, select=c(5,6,7,10))
print(bridge_sub_df)
## PURPOSE LENGTH LANES MATERIAL
## 1 HIGHWAY <NA> 2 WOOD
## 2 HIGHWAY MEDIUM 2 WOOD
## 3 AQUEDUCT <NA> 1 WOOD
## 4 HIGHWAY MEDIUM 2 WOOD
## 5 HIGHWAY <NA> 2 WOOD
## 6 HIGHWAY SHORT 2 WOOD
## 7 AQUEDUCT MEDIUM 1 IRON
## 8 HIGHWAY MEDIUM 2 IRON
## 9 AQUEDUCT <NA> 1 WOOD
## 10 HIGHWAY MEDIUM 2 WOOD
## 11 RR <NA> 2 WOOD
## 12 HIGHWAY MEDIUM 2 WOOD
## 13 HIGHWAY <NA> 2 WOOD
## 14 RR <NA> 2 WOOD
## 15 HIGHWAY MEDIUM 2 IRON
## 16 RR MEDIUM 2 IRON
## 17 RR MEDIUM 2 IRON
## 18 HIGHWAY MEDIUM 2 WOOD
## 19 HIGHWAY MEDIUM 2 WOOD
## 20 RR <NA> 2 IRON
## 21 HIGHWAY MEDIUM NA STEEL
## 22 HIGHWAY MEDIUM 4 WOOD
## 23 RR <NA> 2 STEEL
## 24 RR <NA> 2 STEEL
## 25 RR <NA> 2 STEEL
## 26 RR MEDIUM 2 STEEL
## 27 RR <NA> 2 STEEL
## 28 HIGHWAY MEDIUM 2 STEEL
## 29 HIGHWAY MEDIUM 2 STEEL
## 30 HIGHWAY <NA> 2 IRON
## 31 RR MEDIUM 2 STEEL
## 32 RR LONG 2 STEEL
## 33 HIGHWAY MEDIUM NA IRON
## 34 HIGHWAY <NA> 2 IRON
## 35 HIGHWAY MEDIUM 2 STEEL
## 36 HIGHWAY <NA> 2 IRON
## 37 RR MEDIUM 2 STEEL
## 38 HIGHWAY <NA> 2 STEEL
## 39 AQUEDUCT MEDIUM 1 WOOD
## 40 HIGHWAY <NA> 2 STEEL
## 41 HIGHWAY <NA> 2 IRON
## 42 HIGHWAY LONG 2 STEEL
## 43 HIGHWAY <NA> 2 STEEL
## 44 HIGHWAY MEDIUM 2 STEEL
## 45 RR LONG 2 STEEL
## 46 RR LONG NA STEEL
## 47 RR LONG 2 STEEL
## 48 HIGHWAY MEDIUM 2 STEEL
## 49 HIGHWAY LONG 2 STEEL
## 50 RR <NA> 2 STEEL
## 51 HIGHWAY MEDIUM 2 STEEL
## 52 RR MEDIUM 2 STEEL
## 53 RR LONG 2 STEEL
## 54 RR MEDIUM 2 STEEL
## 55 RR MEDIUM NA STEEL
## 56 RR MEDIUM 2 STEEL
## 57 RR SHORT 4 STEEL
## 58 RR MEDIUM NA STEEL
## 59 HIGHWAY MEDIUM NA STEEL
## 60 HIGHWAY <NA> NA STEEL
## 61 HIGHWAY MEDIUM 2 STEEL
## 62 RR MEDIUM 2 STEEL
## 63 HIGHWAY MEDIUM 2 STEEL
## 64 RR <NA> NA STEEL
## 65 RR LONG NA STEEL
## 66 RR LONG 2 STEEL
## 67 HIGHWAY MEDIUM 4 STEEL
## 68 RR LONG 2 STEEL
## 69 RR LONG 2 STEEL
## 70 WALK <NA> NA STEEL
## 71 HIGHWAY SHORT 4 STEEL
## 72 HIGHWAY LONG 4 STEEL
## 73 HIGHWAY SHORT 4 STEEL
## 74 HIGHWAY SHORT 4 STEEL
## 75 HIGHWAY MEDIUM 2 STEEL
## 76 HIGHWAY MEDIUM NA STEEL
## 77 HIGHWAY LONG 4 STEEL
## 78 HIGHWAY MEDIUM 4 STEEL
## 79 HIGHWAY LONG 4 STEEL
## 80 HIGHWAY LONG 2 STEEL
## 81 HIGHWAY SHORT 4 STEEL
## 82 HIGHWAY LONG 2 STEEL
## 83 HIGHWAY MEDIUM 4 STEEL
## 84 HIGHWAY MEDIUM 4 STEEL
## 85 HIGHWAY MEDIUM 4 STEEL
## 86 HIGHWAY MEDIUM 4 STEEL
## 87 HIGHWAY MEDIUM 4 STEEL
## 88 HIGHWAY MEDIUM 4 STEEL
## 89 RR SHORT 2 STEEL
## 90 HIGHWAY MEDIUM 2 STEEL
## 91 HIGHWAY LONG 2 STEEL
## 92 HIGHWAY <NA> NA STEEL
## 93 RR <NA> NA STEEL
## 94 HIGHWAY MEDIUM 2 STEEL
## 95 HIGHWAY SHORT 4 STEEL
## 96 HIGHWAY LONG 4 STEEL
## 97 HIGHWAY MEDIUM 4 STEEL
## 98 HIGHWAY LONG 4 STEEL
## 99 HIGHWAY SHORT NA STEEL
## 100 HIGHWAY MEDIUM 2 STEEL
## 101 HIGHWAY MEDIUM 6 STEEL
## 102 HIGHWAY SHORT 4 STEEL
## 103 HIGHWAY LONG 4 STEEL
## 104 HIGHWAY SHORT 6 STEEL
## 105 HIGHWAY LONG 6 STEEL
## 106 HIGHWAY SHORT 6 STEEL
## 107 HIGHWAY <NA> NA <NA>
## 108 HIGHWAY <NA> NA <NA>
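Scanning 108 printed rows by eye is error-prone, so the expectation about long highway bridges can also be checked by filtering rows. This is a minimal sketch on six rows copied from the printed subset (rows 2, 6, 42, 72, 82 and 32); `check_df`, `long_highways` and `highways` are illustrative names, and the same filter could be applied to `bridge_sub_df`.

```r
# Stand-in rows copied from the printed subset (rows 2, 6, 42, 72, 82 and 32)
check_df <- data.frame(
  PURPOSE  = c("HIGHWAY", "HIGHWAY", "HIGHWAY", "HIGHWAY", "HIGHWAY", "RR"),
  LENGTH   = c("MEDIUM", "SHORT", "LONG", "LONG", "LONG", "LONG"),
  LANES    = c(2, 2, 2, 4, 2, 2),
  MATERIAL = c("WOOD", "WOOD", "STEEL", "STEEL", "STEEL", "STEEL"),
  stringsAsFactors = FALSE
)

# Keep only long highway bridges; which() drops rows where the condition is NA,
# which matters on the full data because LENGTH has missing values
long_highways <- check_df[which(check_df$PURPOSE == "HIGHWAY" & check_df$LENGTH == "LONG"), ]

# Tabulate the materials used for long highways
print(table(long_highways$MATERIAL))

# Cross-tabulate length by material for all highway bridges
highways <- check_df[which(check_df$PURPOSE == "HIGHWAY"), ]
print(table(highways$LENGTH, highways$MATERIAL))
```

The cross-tabulation puts the wooden highways and the long highways in separate cells, which makes it easy to see whether any bridge falls into both groups.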