Load Library and Data

library(dslabs)
data(murders)

Q1: Compute the per 100,000 murder rate for each state and store it in an object called murder_rate. Then use logical operators to create a logical vector named low that tells us which entries of murder_rate are lower than 1.

murder_rate <- murders$total / murders$population * 100000
murder_rate
##  [1]  2.8244238  2.6751860  3.6295273  3.1893901  3.3741383  1.2924531
##  [7]  2.7139722  4.2319369 16.4527532  3.3980688  3.7903226  0.5145920
## [13]  0.7655102  2.8369608  2.1900730  0.6893484  2.2081106  2.6732010
## [19]  7.7425810  0.8280881  5.0748655  1.8021791  4.1786225  0.9992600
## [25]  4.0440846  5.3598917  1.2128379  1.7521372  3.1104763  0.3798036
## [31]  2.7980319  3.2537239  2.6679599  2.9993237  0.5947151  2.6871225
## [37]  2.9589340  0.9396843  3.5977513  1.5200933  4.4753235  0.9825837
## [43]  3.4509357  3.2013603  0.7959810  0.3196211  3.1246001  1.3829942
## [49]  1.4571013  1.7056487  0.8871131
low <- murder_rate < 1
low
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [13]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
## [25] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
## [37] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE
## [49] FALSE FALSE  TRUE

Q2: Now use the results from the previous exercise and the function which to determine the indices of murder_rate associated with values lower than 1.

murder_rate <- murders$total / murders$population * 100000

low <- murder_rate < 1

which(low)
##  [1] 12 13 16 20 24 30 35 38 42 45 46 51
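As a side note, indexing with which(low) and indexing with low directly give the same result here, because murder_rate has no missing values. The sketch below uses a small made-up vector (toy_rates is hypothetical, not part of the murders dataset) to show where the two approaches differ: which drops NA entries, while logical indexing keeps them as NA.

```r
# Hypothetical rates with one missing value (not from the murders dataset)
toy_rates <- c(2.5, NA, 0.4, 0.9)

# The comparison propagates the missing value: FALSE NA TRUE TRUE
toy_low <- toy_rates < 1

# which() keeps only the positions that are TRUE, so the NA is dropped
toy_idx <- which(toy_low)
toy_idx

# Indexing by position returns only the matching values
toy_rates[toy_idx]

# Indexing by the logical vector carries the NA through to the result
toy_rates[toy_low]
```

When a vector may contain missing values, which is the safer choice if only the confirmed matches are wanted.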

Q3: Use the results from the previous exercise to report the names of the states with murder rates lower than 1.

murder_rate <- murders$total / murders$population * 100000

low <- murder_rate < 1

murders$state[low]
##  [1] "Hawaii"        "Idaho"         "Iowa"          "Maine"        
##  [5] "Minnesota"     "New Hampshire" "North Dakota"  "Oregon"       
##  [9] "South Dakota"  "Utah"          "Vermont"       "Wyoming"

Q4: Now extend the code from exercises 2 and 3 to report the states in the Northeast with murder rates lower than 1. Hint: use the previously defined logical vector low and the logical operator &.

murder_rate <- murders$total / murders$population * 100000

low <- murder_rate < 1

ind <- low & murders$region == "Northeast"

murders$state[ind]
## [1] "Maine"         "New Hampshire" "Vermont"
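The & operator works element by element, so it combines two logical vectors of the same length into one. The sketch below illustrates this on a small made-up data frame (toy and its values are hypothetical stand-ins for murders). Note that && is meant for single TRUE/FALSE values, for example inside if, and is not a substitute for & here.

```r
# Hypothetical data frame standing in for the murders dataset
toy <- data.frame(
  state  = c("A", "B", "C", "D"),
  region = c("Northeast", "South", "Northeast", "West"),
  rate   = c(0.5, 0.8, 2.1, 0.3),
  stringsAsFactors = FALSE
)

# Two logical vectors, one entry per row
toy_low   <- toy$rate < 1
toy_in_ne <- toy$region == "Northeast"

# Element-wise AND: TRUE only where both conditions hold
toy_both <- toy_low & toy_in_ne
toy_both

# Only state "A" is both low-rate and in the Northeast
toy$state[toy_both]
```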

Q5: In a previous exercise we computed the murder rate for each state and the average of these numbers. How many states are below the average?

murder_rate <- murders$total / murders$population * 100000

avg <- mean(murder_rate)

sum(murder_rate < avg)
## [1] 27
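The count above works because sum coerces TRUE to 1 and FALSE to 0, so summing a logical vector counts its TRUE entries. By the same logic, mean gives the proportion of TRUE entries. A minimal sketch on made-up numbers (toy_vals is hypothetical):

```r
# Hypothetical values whose mean is exactly 3
toy_vals <- c(0.5, 2.0, 3.5, 1.0, 8.0)
toy_avg  <- mean(toy_vals)

# Logical vector: which entries fall below the average
toy_below <- toy_vals < toy_avg

# sum() counts the TRUE entries
toy_n_below <- sum(toy_below)
toy_n_below

# mean() gives the share of entries that are TRUE
toy_prop_below <- mean(toy_below)
toy_prop_below
```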

Q6: Use the match function to identify the states with abbreviations AK, MI, and IA. Hint: start by defining an index of the entries of murders$abb that match the three abbreviations, then use the [ operator to extract the states.

abbs <- c('AK','MI','IA')

ind <- match(abbs, murders$abb)

murders$state[ind]
## [1] "Alaska"   "Michigan" "Iowa"
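Two properties of match are worth keeping in mind: the result follows the order of the first argument, not the order of the table, and entries with no match come back as NA. The sketch below shows both on made-up lookup vectors (toy_abb, toy_name, and the unmatched query "XX" are hypothetical).

```r
# Hypothetical lookup vectors standing in for murders$abb and murders$state
toy_abb  <- c("AL", "AK", "AZ", "IA", "MI")
toy_name <- c("Alabama", "Alaska", "Arizona", "Iowa", "Michigan")

# Query in a different order than the table, with one unknown entry
toy_query <- c("MI", "XX", "AK")

# Positions of each query entry in toy_abb; NA where nothing matches
toy_pos <- match(toy_query, toy_abb)
toy_pos

# Indexing with an NA position yields NA in the result
toy_name[toy_pos]
```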

Q7: Use the %in% operator to create a logical vector that answers the question: which of the following are actual abbreviations: MA, ME, MI, MO, MU?

abbs <- c('MA', 'ME', 'MI', 'MO', 'MU')

abbs %in% murders$abb
## [1]  TRUE  TRUE  TRUE  TRUE FALSE
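The %in% operator is closely related to match: it reports whether a match exists rather than where it is. The sketch below checks this equivalence on made-up vectors (toy_valid and toy_codes are hypothetical).

```r
# Hypothetical table of valid codes and a set of codes to check
toy_valid <- c("MA", "ME", "MI", "MO")
toy_codes <- c("MA", "MU", "MI")

# Membership test: one TRUE/FALSE per entry of toy_codes
toy_via_in <- toy_codes %in% toy_valid
toy_via_in

# Same test written with match(): unmatched entries map to 0
toy_via_match <- match(toy_codes, toy_valid, nomatch = 0L) > 0L

# Both approaches agree
identical(toy_via_in, toy_via_match)
```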

Q8: Extend the code you used in exercise 7 to report the one entry that is not an actual abbreviation. Hint: use the ! operator, which turns FALSE into TRUE and vice versa, then which to obtain an index.

abbs <- c("MA", "ME", "MI", "MO", "MU")

ind <- which(!abbs %in% murders$abb)

abbs[ind]
## [1] "MU"
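In the expression above, %in% is evaluated before !, so the negation applies to the result of the membership test. The same answer can be reached without which, or with setdiff. The sketch below compares the three on made-up vectors (toy_known and toy_cand are hypothetical); it is one possible set of alternatives, not the only way to solve the exercise.

```r
# Hypothetical known codes and candidate codes
toy_known <- c("MA", "ME", "MI", "MO")
toy_cand  <- c("MA", "ME", "MI", "MO", "MU")

# Approach from the exercise: negate, then convert to positions
toy_via_which <- toy_cand[which(!toy_cand %in% toy_known)]

# Index directly with the negated logical vector
toy_via_logical <- toy_cand[!(toy_cand %in% toy_known)]

# Set difference: entries of toy_cand that are not in toy_known
toy_via_setdiff <- setdiff(toy_cand, toy_known)

toy_via_which
```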