Load Library and Data
library(dslabs)
data(murders)
Question 01: Compute the murder rate per 100,000 people for each state
and store it in an object called murder_rate. Then use logical operators
to create a logical vector named low that tells us which entries of
murder_rate are lower than 1.
murder_rate <- murders$total / murders$population * 100000
murder_rate
## [1] 2.8244238 2.6751860 3.6295273 3.1893901 3.3741383 1.2924531
## [7] 2.7139722 4.2319369 16.4527532 3.3980688 3.7903226 0.5145920
## [13] 0.7655102 2.8369608 2.1900730 0.6893484 2.2081106 2.6732010
## [19] 7.7425810 0.8280881 5.0748655 1.8021791 4.1786225 0.9992600
## [25] 4.0440846 5.3598917 1.2128379 1.7521372 3.1104763 0.3798036
## [31] 2.7980319 3.2537239 2.6679599 2.9993237 0.5947151 2.6871225
## [37] 2.9589340 0.9396843 3.5977513 1.5200933 4.4753235 0.9825837
## [43] 3.4509357 3.2013603 0.7959810 0.3196211 3.1246001 1.3829942
## [49] 1.4571013 1.7056487 0.8871131
low <- murder_rate < 1
low
## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## [13] TRUE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
## [25] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
## [37] FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE
## [49] FALSE FALSE TRUE
Question 02: Now use the results from the previous exercise and the
function which to determine the indices of murder_rate associated with
values lower than 1.
murder_rate <- murders$total/murders$population*100000
low <- murder_rate < 1
which(low)
## [1] 12 13 16 20 24 30 35 38 42 45 46 51
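To see how which relates to the logical vector it is given, here is a minimal sketch on toy data. The names rates, low_toy, and idx and the values are hypothetical and are not taken from the murders dataset.

```r
# Toy rates (hypothetical values, not from the murders dataset)
rates <- c(2.5, 0.4, 1.7, 0.9)

low_toy <- rates < 1      # logical vector, one entry per element of rates
idx <- which(low_toy)     # integer positions of the TRUE entries

print(idx)

# Indexing by position and indexing by the logical vector
# select the same elements
print(identical(rates[idx], rates[low_toy]))
```

Either form works for subsetting. which is mainly useful when the positions themselves are needed.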
Question 03: Use the results from the previous exercise to report
the names of the states with murder rates lower than 1.
murder_rate <- murders$total/murders$population*100000
low <- murder_rate < 1
murders$state[low]
## [1] "Hawaii" "Idaho" "Iowa" "Maine"
## [5] "Minnesota" "New Hampshire" "North Dakota" "Oregon"
## [9] "South Dakota" "Utah" "Vermont" "Wyoming"
Question 04: Now extend the code from exercises 2 and 3 to report the
states in the Northeast with murder rates lower than 1. Hint: use the
previously defined logical vector low and the logical operator &.
murder_rate <- murders$total/murders$population*100000
low <- murder_rate < 1
ind <- low & murders$region == 'Northeast'
murders$state[ind]
## [1] "Maine" "New Hampshire" "Vermont"
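The & operator works element by element: an entry of the result is TRUE only where both conditions are TRUE at that position. A small sketch, assuming a made-up data frame toy whose column names mirror murders but whose values are invented:

```r
# Hypothetical data frame; the states, regions and rates are invented
toy <- data.frame(
  state  = c("A", "B", "C", "D"),
  region = c("Northeast", "South", "Northeast", "West"),
  rate   = c(0.8, 0.5, 2.1, 0.9),
  stringsAsFactors = FALSE
)

low_toy <- toy$rate < 1               # TRUE for A, B, D
in_ne   <- toy$region == "Northeast"  # TRUE for A, C
both    <- low_toy & in_ne            # TRUE only where both hold

print(toy$state[both])
```

Only state A satisfies both conditions, so it is the only one returned.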
Question 05: In a previous exercise we computed the murder rate for
each state and the average of these numbers. How many states are below
the average?
murder_rate <- murders$total/murders$population*100000
avg <- mean(murder_rate)
sum(murder_rate < avg)
## [1] 27
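The count works because sum coerces TRUE to 1 and FALSE to 0. The same coercion means mean gives the proportion of TRUE entries. A sketch on a hypothetical vector x (the values are chosen only for illustration):

```r
# Hypothetical values; the single large entry pulls the average up
x <- c(1, 2, 3, 10)

avg_x <- mean(x)          # (1 + 2 + 3 + 10) / 4 = 4
below <- x < avg_x        # TRUE TRUE TRUE FALSE

n_below    <- sum(below)  # number of entries below the average
prop_below <- mean(below) # share of entries below the average

print(n_below)
print(prop_below)
```

Here three of the four values fall below the mean. This is the same effect that puts more than half of the states below the average murder rate.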
Question 06: Use the match function to identify the states with
abbreviations AK, MI, and IA. Hint: start by defining an index of the
entries of murders$abb that match the three abbreviations, then use the
[ operator to extract the states.
abbs <- c('AK','MI','IA')
ind <- match(abbs, murders$abb)
murders$state[ind]
## [1] "Alaska" "Michigan" "Iowa"
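match returns, for each element of its first argument, the position of the first match in the second argument. The result therefore follows the order of the query, and an element with no match yields NA. A minimal sketch with a hypothetical lookup vector abb_toy:

```r
# Hypothetical lookup vector, standing in for murders$abb
abb_toy <- c("AL", "AK", "AZ", "IA", "MI")

# "ZZ" is deliberately absent from the lookup vector
query <- c("MI", "AK", "ZZ")

pos <- match(query, abb_toy)
print(pos)   # positions follow the order of query; no match gives NA
```

Indexing with pos then returns NA for the unmatched entry, which is worth checking for before the result is used further.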
Question 07: Use the %in% operator to create a logical vector that
answers the question: which of the following are actual abbreviations:
MA, ME, MI, MO, MU ?
abbs <- c('MA', 'ME', 'MI', 'MO', 'MU')
abbs %in% murders$abb
## [1] TRUE TRUE TRUE TRUE FALSE
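Unlike match, %in% never returns NA: it answers TRUE or FALSE for each element on its left-hand side. The sketch below compares it with the equivalent match call on a hypothetical lookup vector known; the variable names are illustrative only.

```r
# Hypothetical lookup vector and candidates
known <- c("MA", "ME", "MI", "MO")
cand  <- c("MA", "MU", "MO")

via_in    <- cand %in% known
# Same test written with match: unmatched entries become 0, so "> 0" is FALSE
via_match <- match(cand, known, nomatch = 0) > 0

print(via_in)
print(identical(via_in, via_match))
```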
Question 08: Extend the code you used in exercise 7 to report the
one entry that is not an actual abbreviation. Hint: use the ! operator,
which turns FALSE into TRUE and vice versa, then which to obtain an
index.
abbs <- c("MA", "ME", "MI", "MO", "MU")
ind <- which(!abbs %in% murders$abb)
abbs[ind]
## [1] "MU"
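The expression !abbs %in% murders$abb relies on operator precedence: %in% binds more tightly than !, so the negation applies to the whole membership test. A short sketch with a hypothetical lookup vector known makes the grouping explicit:

```r
# Hypothetical lookup vector and candidates
known <- c("MA", "ME", "MI", "MO")
cand  <- c("MA", "ME", "MI", "MO", "MU")

not_known <- !(cand %in% known)   # explicit parentheses

# Without parentheses the expression parses the same way
print(identical(!cand %in% known, not_known))

# The logical vector can index directly; which() is only needed
# when the position itself is wanted
print(cand[not_known])
print(which(not_known))
```

Writing the parentheses costs nothing and makes the intent easier to read.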