library(dslabs)
data(murders)
attach(murders)
library(dplyr)
select(murders, population)
## population
## 1 4779736
## 2 710231
## 3 6392017
## 4 2915918
## 5 37253956
## 6 5029196
## 7 3574097
## 8 897934
## 9 601723
## 10 19687653
## 11 9920000
## 12 1360301
## 13 1567582
## 14 12830632
## 15 6483802
## 16 3046355
## 17 2853118
## 18 4339367
## 19 4533372
## 20 1328361
## 21 5773552
## 22 6547629
## 23 9883640
## 24 5303925
## 25 2967297
## 26 5988927
## 27 989415
## 28 1826341
## 29 2700551
## 30 1316470
## 31 8791894
## 32 2059179
## 33 19378102
## 34 9535483
## 35 672591
## 36 11536504
## 37 3751351
## 38 3831074
## 39 12702379
## 40 1052567
## 41 4625364
## 42 814180
## 43 6346105
## 44 25145561
## 45 2763885
## 46 625741
## 47 8001024
## 48 6724540
## 49 1852994
## 50 5686986
## 51 563626
filter(murders, total < 135)
## state abb region population total
## 1 Alaska AK West 710231 19
## 2 Arkansas AR South 2915918 93
## 3 Colorado CO West 5029196 65
## 4 Connecticut CT Northeast 3574097 97
## 5 Delaware DE South 897934 38
## 6 District of Columbia DC South 601723 99
## 7 Hawaii HI West 1360301 7
## 8 Idaho ID West 1567582 12
## 9 Iowa IA North Central 3046355 21
## 10 Kansas KS North Central 2853118 63
## 11 Kentucky KY South 4339367 116
## 12 Maine ME Northeast 1328361 11
## 13 Massachusetts MA Northeast 6547629 118
## 14 Minnesota MN North Central 5303925 53
## 15 Mississippi MS South 2967297 120
## 16 Montana MT West 989415 12
## 17 Nebraska NE North Central 1826341 32
## 18 Nevada NV West 2700551 84
## 19 New Hampshire NH Northeast 1316470 5
## 20 New Mexico NM West 2059179 67
## 21 North Dakota ND North Central 672591 4
## 22 Oklahoma OK South 3751351 111
## 23 Oregon OR West 3831074 36
## 24 Rhode Island RI Northeast 1052567 16
## 25 South Dakota SD North Central 814180 8
## 26 Utah UT West 2763885 22
## 27 Vermont VT Northeast 625741 2
## 28 Washington WA West 6724540 93
## 29 West Virginia WV South 1852994 27
## 30 Wisconsin WI North Central 5686986 97
## 31 Wyoming WY West 563626 5
filter(murders, total < 135, region == "South")
## state abb region population total
## 1 Arkansas AR South 2915918 93
## 2 Delaware DE South 897934 38
## 3 District of Columbia DC South 601723 99
## 4 Kentucky KY South 4339367 116
## 5 Mississippi MS South 2967297 120
## 6 Oklahoma OK South 3751351 111
## 7 West Virginia WV South 1852994 27
sort(total)
## [1] 2 4 5 5 7 8 11 12 12 16 19 21 22 27 32
## [16] 36 38 53 63 65 67 84 93 93 97 97 99 111 116 118
## [31] 120 135 142 207 219 232 246 250 286 293 310 321 351 364 376
## [46] 413 457 517 669 805 1257
index <- order(total)
index
## [1] 46 35 30 51 12 42 20 13 27 40 2 16 45 49 28 38 8 24 17 6 32 29 4 48 7
## [26] 50 9 37 18 22 25 1 15 41 43 3 31 47 34 21 36 26 19 14 11 23 39 33 10 44
## [51] 5
####The 46th entry of total is the smallest, so order(total) starts with 46. The next smallest is the 35th entry, so the second entry is 35, and so on.
total[index] == sort(total)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE
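The relationship between sort and order is easier to see on a short vector. The sketch below uses a hypothetical toy vector x_small (not part of the murders data) and only base R; the variable names are illustrative assumptions.

```r
# Hypothetical toy vector, chosen only for illustration
x_small <- c(31, 4, 15, 92, 65)

sorted_small <- sort(x_small)    # the values in increasing order
index_small  <- order(x_small)   # the positions that put x_small in increasing order

sorted_small
index_small

# Indexing a vector by its own order() reproduces sort()
identical(x_small[index_small], sorted_small)
```

Because order returns positions rather than values, the same index can reorder any parallel vector, which is what abb[index] relies on.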
state[1:6]
## [1] "Alabama" "Alaska" "Arizona" "Arkansas" "California"
## [6] "Colorado"
abb[1:6]
## [1] "AL" "AK" "AZ" "AR" "CA" "CO"
index <- order(total)
abb[index]
## [1] "VT" "ND" "NH" "WY" "HI" "SD" "ME" "ID" "MT" "RI" "AK" "IA" "UT" "WV" "NE"
## [16] "OR" "DE" "MN" "KS" "CO" "NM" "NV" "AR" "WA" "CT" "WI" "DC" "OK" "KY" "MA"
## [31] "MS" "AL" "IN" "SC" "TN" "AZ" "NJ" "VA" "NC" "MD" "OH" "MO" "LA" "IL" "GA"
## [46] "MI" "PA" "NY" "FL" "TX" "CA"
index <- order(total, decreasing = TRUE)
abb[which.max(total)]
## [1] "CA"
In R, arithmetic operations on vectors occur element-wise. For a quick example, suppose we have height in inches:
inches <- c(69, 62, 66, 70, 70, 73, 67, 73, 67, 70)
and want to convert to centimeters. Notice what happens when we multiply inches by 2.54:
inches * 2.54
## [1] 175.26 157.48 167.64 177.80 177.80 185.42 170.18 185.42 170.18 177.80
In the line above, we multiplied each element by 2.54. Similarly, to compute how many inches taller or shorter each entry is than 69 inches, the average height for males, we can subtract 69 from every entry like this:
inches - 69
## [1] 0 -7 -3 1 1 4 -2 4 -2 1
The same element-wise behavior applies to two vectors of the same length: the operation pairs up entries by position.
murder_rate <- murders$total / murders$population
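As a small illustration of dividing two same-length vectors, the sketch below copies the totals and populations for Alaska, Hawaii, and Vermont from the output shown earlier. The variable names are hypothetical, and scaling to murders per 100,000 people is an assumed convention for readability, not something the code above does.

```r
# Values copied from the murders output above (Alaska, Hawaii, Vermont)
totals_small <- c(19, 7, 2)
pops_small   <- c(710231, 1360301, 625741)

# Division pairs entries by position: totals_small[1] / pops_small[1], and so on
rate_small <- totals_small / pops_small

# Assumed convention: express the rate per 100,000 people
rate_per_100k <- rate_small * 100000
names(rate_per_100k) <- c("AK", "HI", "VT")
rate_per_100k
```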
Convert the following city temperatures from Fahrenheit to Celsius using \(C=\frac{5\times(F-32)}{9}\):
temp_F <- c(35, 88, 42, 84, 81, 30)   # temperatures in Fahrenheit
temp_C <- 5 * (temp_F - 32) / 9       # element-wise conversion to Celsius
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
city_temps <- data.frame(name = city, temperature = temp_C)
Suppose we want to look up California’s murder rate. The function which() returns the indices of the entries of a logical vector that are TRUE.
ind <- which(murders$state == "California")
murder_rate[ind]
## [1] 3.374138e-05
If instead of just one state we want to find out the murder rates for several states, say New York, Florida, and Texas, we can use the function match.
ind <- match(c("New York", "Florida", "Texas"), murders$state)
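To show what match returns, here is a minimal sketch on the first six states and abbreviations printed earlier in this section. The names states6, abbs6, and pos are hypothetical, and "Texas" is included deliberately as a name that is not among the six.

```r
# First six states and abbreviations, copied from the output above
states6 <- c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado")
abbs6   <- c("AL", "AK", "AZ", "AR", "CA", "CO")

# match() gives, for each name in the first argument, its position in the second
pos <- match(c("California", "Alaska", "Texas"), states6)
pos          # Texas is not among the six, so its position is NA

# The positions can index any parallel vector
abbs6[pos]
```

Unlike which, match preserves the order of the names being looked up and reports NA for names that are absent.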
If, rather than an index, we want a logical vector that tells us whether each element of a first vector is in a second, we can use %in%.
c("Boston", "Dakota", "Washington") %in% murders$state
## [1] FALSE FALSE TRUE
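The three lookups can be compared side by side on a short vector. This is a sketch under stated assumptions: states6 holds the first six state names printed earlier, and query is a hypothetical set of names to look up.

```r
# First six state names, copied from the output earlier in this section
states6 <- c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado")

# Hypothetical names to look up; "Boston" is not a state
query <- c("Boston", "Alaska", "Colorado")

# %in% answers "is each element of query present?" with one logical per element
present <- query %in% states6
present
query[present]                          # keep only the names that were found

# which() turns a logical vector into positions
found_pos <- which(states6 %in% query)  # positions within states6
found_pos

# match() gives positions in the order of query, with NA for misses
match(query, states6)
```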