Loading Data
library("dslabs")
data("murders")
df <- murders
Q1: Use the function str to examine the structure of the murders
object. We can see that this object is a data frame with 51 rows and five
columns. Which of the following best describes the variables represented
in this data frame?
str(df)
## 'data.frame': 51 obs. of 5 variables:
## $ state : chr "Alabama" "Alaska" "Arizona" "Arkansas" ...
## $ abb : chr "AL" "AK" "AZ" "AR" ...
## $ region : Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...
## $ population: num 4779736 710231 6392017 2915918 37253956 ...
## $ total : num 135 19 232 93 1257 ...
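The same information that str reports can also be queried piece by piece. The sketch below is only an illustration: toy is a hypothetical stand-in built by hand from the first four rows shown in the output above (the region column is left out), so it runs without dslabs installed.

```r
# Hypothetical stand-in for murders: the first four rows from the str() output.
toy <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas"),
  abb = c("AL", "AK", "AZ", "AR"),
  population = c(4779736, 710231, 6392017, 2915918),
  total = c(135, 19, 232, 93),
  stringsAsFactors = FALSE
)

# Class of every column, returned as a named character vector.
col_classes <- sapply(toy, class)
print(col_classes)

# Number of rows and number of columns.
print(dim(toy))
```

Running sapply(murders, class) on the real data would additionally report the region column as a factor.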
Q2: What are the column names used by the data frame for these five
variables?
column_name <- names(murders)
column_name
## [1] "state" "abb" "region" "population" "total"
Q3: Use the accessor $ to extract the state abbreviations and assign
them to the object a. What is the class of this object?
a <- murders$abb
a
## [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN"
## [16] "IA" "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH"
## [31] "NJ" "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT"
## [46] "VT" "VA" "WA" "WV" "WI" "WY"
class(a)
## [1] "character"
Q4: Now use the square brackets to extract the state abbreviations
and assign them to the object b. Use the identical function to determine
if a and b are the same.
# Double square brackets return the column as a vector;
# single brackets, murders["abb"], would return a one-column data frame.
b <- murders[["abb"]]
b
## [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN"
## [16] "IA" "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH"
## [31] "NJ" "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT"
## [46] "VT" "VA" "WA" "WV" "WI" "WY"
identical(a, b)
## [1] TRUE
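The three ways of pulling a column out of a data frame differ in what they return, and that difference decides what identical reports. A minimal sketch, assuming a hypothetical toy data frame (toy, holding only the first four states) in place of murders:

```r
# Hypothetical stand-in for murders, first four states only.
toy <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas"),
  abb = c("AL", "AK", "AZ", "AR"),
  stringsAsFactors = FALSE
)

by_dollar <- toy$abb       # character vector
by_double <- toy[["abb"]]  # character vector, same result as $
by_single <- toy["abb"]    # one-column data frame

print(class(by_dollar))
print(class(by_double))
print(class(by_single))

# identical() returns a single TRUE or FALSE for the whole object,
# whereas == compares element by element.
print(identical(by_dollar, by_double))
print(identical(by_dollar, by_single))
```

The last comparison is FALSE even though the values match, because a data frame and a character vector are different kinds of object.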
Q5: We saw that the region column stores a factor. You can
corroborate this by typing:
class(murders$region)
## [1] "factor"
The number of levels then gives the number of regions:
length(levels(murders$region))
## [1] 4
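For a self-contained illustration of how levels relate to the stored values, the sketch below builds a small factor by hand. The object region is a hypothetical stand-in for murders$region covering the first four states; the str output above truncates the level list after "South", so the last two level names are an assumption here.

```r
# Hypothetical stand-in for murders$region (first four states).
# Level names 3 and 4 are assumed; the str() output above truncates them.
region <- factor(
  c("South", "West", "West", "South"),
  levels = c("Northeast", "South", "North Central", "West")
)

print(class(region))

# Number of levels, computed two equivalent ways.
n_regions <- length(levels(region))
print(n_regions)
print(nlevels(region))

# Counts per level; levels with no observations are still listed.
print(table(region))
```

The levels are part of the factor's definition, so the count stays at four even though only two regions occur in this small sample.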