The notation, women{datasets}, indicates that a data object by the name women is in the datasets package. This package is preloaded when R is invoked. Explain the difference between c(women) and c(as.matrix(women)) using the women{datasets}.
women # data set built into R's "datasets" package
## height weight
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
## 7 64 132
## 8 65 135
## 9 66 139
## 10 67 142
## 11 68 146
## 12 69 150
## 13 70 154
## 14 71 159
## 15 72 164
names(women) # list the column names of women
## [1] "height" "weight"
head(women) # show the first six observations
## height weight
## 1 58 115
## 2 59 117
## 3 60 120
## 4 61 123
## 5 62 126
## 6 63 129
class(women) # check the data structure of women
## [1] "data.frame"
Conclusion: women is a data.frame.
# str()
str(women)
## 'data.frame': 15 obs. of 2 variables:
## $ height: num 58 59 60 61 62 63 64 65 66 67 ...
## $ weight: num 115 117 120 123 126 129 132 135 139 142 ...
str(women$height)
## num [1:15] 58 59 60 61 62 63 64 65 66 67 ...
str(women$weight)
## num [1:15] 115 117 120 123 126 129 132 135 139 142 ...
summary(women)
## height weight
## Min. :58.0 Min. :115.0
## 1st Qu.:61.5 1st Qu.:124.5
## Median :65.0 Median :135.0
## Mean :65.0 Mean :136.7
## 3rd Qu.:68.5 3rd Qu.:148.0
## Max. :72.0 Max. :164.0
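Conclusion: 1. When you are unfamiliar with a data structure, you can use str() or summary(). 2. The two differ as shown above: str() describes the type and structure of the data, while summary() gives a statistical description of it.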
c(women)
## $height
## [1] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
##
## $weight
## [1] 115 117 120 123 126 129 132 135 139 142 146 150 154 159 164
str(c(women))
## List of 2
## $ height: num [1:15] 58 59 60 61 62 63 64 65 66 67 ...
## $ weight: num [1:15] 115 117 120 123 126 129 132 135 139 142 ...
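c(women) lists the data in women column by column. 1. There are two variables, height and weight. 2. There are 15 observations, and neither variable has missing data.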
class(c(women))
## [1] "list"
1. c(women) is a single list with two components, height and weight, each a numeric vector of length 15.
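As a minimal sketch of what this list structure means in practice, the snippet below checks that c(women) is a plain list of length 2 whose components are the two columns. The name w_list is hypothetical and introduced here only for illustration.

```r
# w_list is a hypothetical name used only for this illustration
w_list <- c(women)

is.list(w_list)          # the result is a list ...
is.data.frame(w_list)    # ... but it is no longer a data.frame
length(w_list)           # number of components (one per column)
names(w_list)            # component names are the column names
length(w_list$height)    # each component holds the 15 observations
```

Because the result is a list, each component has to be extracted with $ or [[ ]] before numeric functions can be applied to it.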
c(as.matrix(women))
## [1] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 115 117 120 123
## [20] 126 129 132 135 139 142 146 150 154 159 164
str(as.matrix(women))
## num [1:15, 1:2] 58 59 60 61 62 63 64 65 66 67 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:2] "height" "weight"
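1. as.matrix(women) has a two-dimensional matrix structure. 2. It is a set of numbers (num) arranged as a 15 x 2 matrix [1:15, 1:2]; str() prints the first few values as a sample, and the row and column names are attached to the matrix as the dimnames attribute, stored as a list (here the row names are NULL and the column names are height and weight). 3. According to ?attr, the special attributes include class, comment, dim, dimnames, names, row.names, tsp…; the one shown here is dimnames.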
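The attributes that str() reports can also be inspected one at a time. The following is a small sketch of that, using base R accessor functions; the name m is hypothetical and used only for this illustration.

```r
# m is a hypothetical name used only for this illustration
m <- as.matrix(women)

attributes(m)       # every attribute attached to the matrix
dim(m)              # number of rows and columns
dimnames(m)         # list holding the row names and the column names
colnames(m)         # shortcut for the column part of dimnames
m[1:3, "weight"]    # column names from dimnames can be used for indexing
```

The dim attribute is what makes R treat the 30 numbers as a 15 x 2 matrix rather than as a plain vector.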
class(as.matrix(women))
## [1] "matrix" "array"
1. as.matrix(women) is a matrix, and therefore also an array (a matrix is simply an array with two dimensions).
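c(women) and as.matrix(women) hold the same data, but R represents them differently, and this affects how the data are processed and interpreted later on. Applying c() to the matrix drops its dim and dimnames attributes, so c(as.matrix(women)) is a single numeric vector of 30 values (the 15 heights followed by the 15 weights), whereas c(women) remains a list of two vectors. (How far-reaching the effect is, I am not yet sure…)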
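One way to see the practical difference is to compare the two results directly. This is a minimal sketch using only base R; the names v_list and v_num are hypothetical and introduced here only for illustration.

```r
# v_list and v_num are hypothetical names used only for this illustration
v_list <- c(women)              # list with one numeric vector per column
v_num  <- c(as.matrix(women))   # one flat numeric vector, column by column

length(v_list)        # 2
length(v_num)         # 30
is.numeric(v_list)    # a list is not a numeric vector
is.numeric(v_num)     # the flattened matrix is
dim(v_num)            # c() has dropped the matrix dimensions

sapply(v_list, mean)  # the list keeps the columns apart: one mean per variable

# Flattening the list with unlist() yields the same 30 values in the same order
identical(unname(unlist(v_list)), v_num)
```

The list form keeps the variables separate, which suits per-column operations such as sapply(); the flattened vector pools height and weight together, which is rarely meaningful for summary statistics.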