Use help to examine the coding scheme for the mother's race variable in the birthwt{MASS} dataset. The MASS package ships with the standard R distribution but is not loaded automatically when R starts. How many Black mothers are there in this data frame? What does the following R command do? c("White", "Black", "Other")[birthwt$race]
## birthwt{MASS}
library(MASS)
data(birthwt, package = "MASS")
dta <- birthwt
head(dta)
## low age lwt race smoke ptl ht ui ftv bwt
## 85 0 19 182 2 0 0 0 1 0 2523
## 86 0 33 155 3 0 0 0 0 3 2551
## 87 0 20 105 1 1 0 0 0 1 2557
## 88 0 21 108 1 1 0 0 1 2 2594
## 89 0 18 107 1 1 0 0 1 0 2600
## 91 0 21 124 3 0 0 0 0 0 2622
str(dta)
## 'data.frame': 189 obs. of 10 variables:
## $ low : int 0 0 0 0 0 0 0 0 0 0 ...
## $ age : int 19 33 20 21 18 21 22 17 29 26 ...
## $ lwt : int 182 155 105 108 107 124 118 103 123 113 ...
## $ race : int 2 3 1 1 1 3 1 3 1 1 ...
## $ smoke: int 0 0 1 1 1 0 0 0 1 1 ...
## $ ptl : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ht : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ui : int 1 0 0 1 1 0 0 0 0 0 ...
## $ ftv : int 0 3 1 2 0 0 1 1 1 0 ...
## $ bwt : int 2523 2551 2557 2594 2600 2622 2637 2637 2663 2665 ...
Conclusion: 1. birthwt is a data.frame. 2. It has 189 observations and 10 variables.
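The exercise asks for the help page, which is where the coding of race is documented (1 = white, 2 = black, 3 = other). A minimal sketch of opening it and double-checking the dimensions reported by str(), assuming MASS is installed; `dims` is a name introduced here only for illustration:

```r
library(MASS)

# Open the dataset's documentation; only attempted in an interactive session.
if (interactive()) help("birthwt", package = "MASS")

# dims is a hypothetical helper name for this sketch.
dims <- dim(birthwt)     # c(number of rows, number of columns)
dims
is.data.frame(birthwt)   # TRUE if birthwt is a data frame
```

The two values in `dims` should match the "189 obs. of 10 variables" line printed by str().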
class(dta$race)
## [1] "integer"
typeof(dta$race)
## [1] "integer"
factor(dta$race)
## [1] 2 3 1 1 1 3 1 3 1 1 3 3 3 3 1 1 2 1 3 1 3 1 1 3 3 1 1 1 2 2 2 1 2 1 2 1 1
## [38] 1 1 1 2 1 2 1 1 1 1 3 1 3 1 3 1 1 3 3 3 3 3 3 3 3 3 1 3 3 3 3 1 2 1 3 3 2
## [75] 1 2 1 1 2 1 1 1 3 3 3 3 3 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 3 1 3 2 1 1 1 2 1
## [112] 3 1 1 1 3 1 3 1 3 1 3 1 1 1 1 1 1 1 1 3 1 2 3 3 3 3 2 3 1 1 1 3 3 1 1 2 1
## [149] 3 3 3 1 1 1 1 3 2 1 2 3 1 3 3 3 2 1 3 3 1 1 2 2 2 3 3 1 1 1 1 2 3 3 1 3 1
## [186] 3 3 2 1
## Levels: 1 2 3
Conclusion: 1. race is stored as an integer. 2. Converted to a factor (categorical variable), it has three levels: 1, 2, and 3.
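Since factor() on the raw codes gives levels named 1, 2, and 3, it may be more readable to attach labels in the same step. A sketch under the coding stated on the help page (1 = white, 2 = black, 3 = other); `race_f` is a name made up for this example:

```r
library(MASS)

# race_f is a hypothetical name used only in this sketch.
race_f <- factor(birthwt$race,
                 levels = 1:3,
                 labels = c("White", "Black", "Other"))

levels(race_f)   # levels follow the order given in `labels`
table(race_f)    # counts per labelled level
```

Setting `levels` and `labels` explicitly keeps the levels in coding order rather than alphabetical order.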
dta$n_race <- c("White", "Black", "Other")[dta$race]
class(dta$n_race)
## [1] "character"
typeof(dta$n_race)
## [1] "character"
factor(dta$n_race)
## [1] Black Other White White White Other White Other White White Other Other
## [13] Other Other White White Black White Other White Other White White Other
## [25] Other White White White Black Black Black White Black White Black White
## [37] White White White White Black White Black White White White White Other
## [49] White Other White Other White White Other Other Other Other Other Other
## [61] Other Other Other White Other Other Other Other White Black White Other
## [73] Other Black White Black White White Black White White White Other Other
## [85] Other Other Other White White White White Other White White White White
## [97] White White White White White White Other White Other Black White White
## [109] White Black White Other White White White Other White Other White Other
## [121] White Other White White White White White White White White Other White
## [133] Black Other Other Other Other Black Other White White White Other Other
## [145] White White Black White Other Other Other White White White White Other
## [157] Black White Black Other White Other Other Other Black White Other Other
## [169] White White Black Black Black Other Other White White White White Black
## [181] Other Other White Other White Other Other Black White
## Levels: Black Other White
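Conclusion: 1. The command creates a new variable, dta$n_race, by mapping the race codes to labels: 1 = "White", 2 = "Black", 3 = "Other". 2. n_race is a character vector. 3. Converted to a factor (categorical variable), it has three levels, Black, Other, and White, which are sorted alphabetically.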
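The mapping works because an integer vector used inside [ ] selects elements by position. A small self-contained sketch using the first four race codes shown by str() (2, 3, 1, 1); the names `labels`, `codes`, and `recoded` are illustrative only:

```r
labels <- c("White", "Black", "Other")  # positions 1, 2, 3
codes  <- c(2L, 3L, 1L, 1L)             # first four race codes from str()

# Each code picks the label stored at that position.
recoded <- labels[codes]
recoded
## [1] "Black" "Other" "White" "White"
```

The result has the same length as `codes`, so applying this to birthwt$race gives one label per mother.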
table(dta$n_race)
##
## Black Other White
## 26 67 96
Conclusion: 1. There are 26 Black mothers.
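The same count can be obtained straight from the integer codes without building the label variable. A sketch, assuming the help-page coding in which 2 means black; `n_black` is a name introduced for this example:

```r
library(MASS)

# birthwt$race == 2 is a logical vector; sum() counts its TRUE values.
n_black <- sum(birthwt$race == 2)
n_black
```

This should agree with the Black entry in the table() output above.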