Questions 1 - 4 in Section 3.3 Exercises on page 52
3.3.1 What is the sum of the first 100 positive integers? The formula for the sum of integers 1 through n is n(n + 1)/2. Define n = 100 and then use R to compute the sum of 1 through 100 using the formula.
What is the sum?
sum_n <- function(n){
n*(n+1)/2
}
sum_n(100)
## [1] 5050
3.3.2 Now use the same formula to compute the sum of the integers from 1 through 1,000.
# Question 2
sum_n(1000)
## [1] 500500
3.3.3
Look at the result of typing the following code into R:
n <- 1000
x <- seq(1, n)
sum(x)
Based on the result, what do you think the functions seq and sum do? You can use the help system:
A. sum creates a list of numbers and seq adds them up.
*B. seq creates a list of numbers and sum adds them up.
C. seq computes the difference between two arguments and sum computes the sum of 1 through 1000.
D. sum always returns the same number.
n <- 1000
x <- seq(1, n)
sum(x)
## [1] 500500
# Answer: B. seq creates a list of numbers and sum adds them up.
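As a cross-check on 3.3.1 through 3.3.3, the closed-form formula and the seq/sum pair should agree for any positive integer n. The following is a minimal sketch: it redefines sum_n so that it runs on its own, and the sizes in ns are arbitrary values chosen for illustration.

```r
# Closed-form sum of 1..n, as defined in 3.3.1
sum_n <- function(n) {
  n * (n + 1) / 2
}

# Arbitrary test sizes (an assumption made for illustration)
ns <- c(1, 10, 100, 1000)

# Brute-force sums built with seq() and sum()
brute <- sapply(ns, function(n) sum(seq(1, n)))

# TRUE when the formula matches the brute-force sum for every n
formula_matches <- all(sum_n(ns) == brute)
formula_matches
```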
3.3.4
In math and programming, we say that we evaluate a function when we replace the argument with a given number. So if we type sqrt(4), we evaluate the sqrt function. In R, you can evaluate a function inside another function. The evaluations happen from the inside out. Use one line of code to compute the log, in base 10, of the square root of 100.
log(sqrt(100), 10)
## [1] 1
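The inside-out order of evaluation can be made explicit by storing the intermediate result. This is a sketch only: the names inner, outer and nested are introduced here for illustration, and log10() is the base R shorthand for a base-10 log.

```r
# Evaluate from the inside out, one step at a time
inner <- sqrt(100)        # the innermost call runs first
outer <- log(inner, 10)   # its result is passed to log() with base 10

# The same computation as a single nested call, using log10()
nested <- log10(sqrt(100))

# Both routes should give the same value
isTRUE(all.equal(outer, nested))
```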
Questions 1 & 3 in Section 3.5 Exercises on page 58
library(dslabs)
data(murders)
str(murders)
## 'data.frame': 51 obs. of 5 variables:
## $ state : chr "Alabama" "Alaska" "Arizona" "Arkansas" ...
## $ abb : chr "AL" "AK" "AZ" "AR" ...
## $ region : Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...
## $ population: num 4779736 710231 6392017 2915918 37253956 ...
## $ total : num 135 19 232 93 1257 ...
3.5.1
Which of the following best describes the variables represented in this data frame?
C. The state name, the abbreviation of the state name, the state’s region, and the state’s population and total number of murders for 2010.
3.5.3 Use the accessor $ to extract the state abbreviations and assign them to the object a. What is the class of this object?
a <- murders$abb
class(a)
## [1] "character"
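For comparison, a column can also be extracted with double square brackets. The sketch below uses a small stand-in data frame, hypothetically named mini, built from the first four rows shown in the str() output above, so that it runs without dslabs.

```r
# Stand-in for murders: first four states from the str() output above
# (the object name `mini` is hypothetical)
mini <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas"),
  abb   = c("AL", "AK", "AZ", "AR"),
  stringsAsFactors = FALSE
)

a <- mini$abb        # accessor $
b <- mini[["abb"]]   # double square brackets

identical(a, b)      # compares the two extracted vectors
class(mini["abb"])   # single brackets keep the data frame wrapper
```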
Questions 1 - 6, 9 & 12 in Section 3.8 Exercises on page 63
temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro", "San Juan", "Toronto")
names(temp) <- city
temp
##        Beijing          Lagos          Paris Rio de Janeiro       San Juan
##             35             88             42             84             81
##        Toronto
##             30
temp[1:3]
## Beijing Lagos Paris
## 35 88 42
temp[c("Paris", "San Juan")]
## Paris San Juan
## 42 81
12:73
## [1] 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## [26] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61
## [51] 62 63 64 65 66 67 68 69 70 71 72 73
a <- seq(1, 10, 0.5)
class(a)
## [1] "numeric"
x <- c("1", "3", "5")
as.numeric(x)
## [1] 1 3 5
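Note that as.numeric returns double-precision values. If integer storage is wanted instead, as.integer is the base R alternative. This is a minimal sketch, with num and int as illustrative names:

```r
x <- c("1", "3", "5")

num <- as.numeric(x)   # coerces to double-precision numbers
int <- as.integer(x)   # coerces to integers

class(num)
class(int)

# The values compare equal even though the storage types differ
all(num == int)
```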
Questions 1, 2, 5 & 7 in Section 3.10 Exercises on page 66
library(dslabs)
data("murders")
pop <- murders$population
sort(pop)[1]
## [1] 563626
order(pop)[1]
## [1] 51
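The relationship between sort and order can be checked on a smaller vector. The sketch below uses five populations copied from the murders table in this document, stored under the hypothetical name pop_small, and also shows which.min as a one-call way to get the position of the minimum.

```r
# Five state populations copied from the murders table
# (the object name `pop_small` is hypothetical)
pop_small <- c(Alabama = 4779736, Alaska = 710231, Vermont = 625741,
               Wyoming = 563626, California = 37253956)

smallest_value <- sort(pop_small)[1]    # the smallest population
smallest_index <- order(pop_small)[1]   # the position of that value

# Indexing by order() rearranges the vector the same way sort() does
all(pop_small[order(pop_small)] == sort(pop_small))

# which.min() gives the position of the minimum directly
which.min(pop_small)
```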
Questions 1 - 6 in Section 3.14 Exercises on page 71
Start by loading the library and data.
murders$murder_rate <- murders$total / murders$population * 100000
murders$low <- murders$murder_rate < 1
murders
## state abb region population total murder_rate low
## 1 Alabama AL South 4779736 135 2.8244238 FALSE
## 2 Alaska AK West 710231 19 2.6751860 FALSE
## 3 Arizona AZ West 6392017 232 3.6295273 FALSE
## 4 Arkansas AR South 2915918 93 3.1893901 FALSE
## 5 California CA West 37253956 1257 3.3741383 FALSE
## 6 Colorado CO West 5029196 65 1.2924531 FALSE
## 7 Connecticut CT Northeast 3574097 97 2.7139722 FALSE
## 8 Delaware DE South 897934 38 4.2319369 FALSE
## 9 District of Columbia DC South 601723 99 16.4527532 FALSE
## 10 Florida FL South 19687653 669 3.3980688 FALSE
## 11 Georgia GA South 9920000 376 3.7903226 FALSE
## 12 Hawaii HI West 1360301 7 0.5145920 TRUE
## 13 Idaho ID West 1567582 12 0.7655102 TRUE
## 14 Illinois IL North Central 12830632 364 2.8369608 FALSE
## 15 Indiana IN North Central 6483802 142 2.1900730 FALSE
## 16 Iowa IA North Central 3046355 21 0.6893484 TRUE
## 17 Kansas KS North Central 2853118 63 2.2081106 FALSE
## 18 Kentucky KY South 4339367 116 2.6732010 FALSE
## 19 Louisiana LA South 4533372 351 7.7425810 FALSE
## 20 Maine ME Northeast 1328361 11 0.8280881 TRUE
## 21 Maryland MD South 5773552 293 5.0748655 FALSE
## 22 Massachusetts MA Northeast 6547629 118 1.8021791 FALSE
## 23 Michigan MI North Central 9883640 413 4.1786225 FALSE
## 24 Minnesota MN North Central 5303925 53 0.9992600 TRUE
## 25 Mississippi MS South 2967297 120 4.0440846 FALSE
## 26 Missouri MO North Central 5988927 321 5.3598917 FALSE
## 27 Montana MT West 989415 12 1.2128379 FALSE
## 28 Nebraska NE North Central 1826341 32 1.7521372 FALSE
## 29 Nevada NV West 2700551 84 3.1104763 FALSE
## 30 New Hampshire NH Northeast 1316470 5 0.3798036 TRUE
## 31 New Jersey NJ Northeast 8791894 246 2.7980319 FALSE
## 32 New Mexico NM West 2059179 67 3.2537239 FALSE
## 33 New York NY Northeast 19378102 517 2.6679599 FALSE
## 34 North Carolina NC South 9535483 286 2.9993237 FALSE
## 35 North Dakota ND North Central 672591 4 0.5947151 TRUE
## 36 Ohio OH North Central 11536504 310 2.6871225 FALSE
## 37 Oklahoma OK South 3751351 111 2.9589340 FALSE
## 38 Oregon OR West 3831074 36 0.9396843 TRUE
## 39 Pennsylvania PA Northeast 12702379 457 3.5977513 FALSE
## 40 Rhode Island RI Northeast 1052567 16 1.5200933 FALSE
## 41 South Carolina SC South 4625364 207 4.4753235 FALSE
## 42 South Dakota SD North Central 814180 8 0.9825837 TRUE
## 43 Tennessee TN South 6346105 219 3.4509357 FALSE
## 44 Texas TX South 25145561 805 3.2013603 FALSE
## 45 Utah UT West 2763885 22 0.7959810 TRUE
## 46 Vermont VT Northeast 625741 2 0.3196211 TRUE
## 47 Virginia VA South 8001024 250 3.1246001 FALSE
## 48 Washington WA West 6724540 93 1.3829942 FALSE
## 49 West Virginia WV South 1852994 27 1.4571013 FALSE
## 50 Wisconsin WI North Central 5686986 97 1.7056487 FALSE
## 51 Wyoming WY West 563626 5 0.8871131 TRUE
which(murders$low)
## [1] 12 13 16 20 24 30 35 38 42 45 46 51
murders$state[which(murders$low)]
## [1] "Hawaii" "Idaho" "Iowa" "Maine"
## [5] "Minnesota" "New Hampshire" "North Dakota" "Oregon"
## [9] "South Dakota" "Utah" "Vermont" "Wyoming"
murders$state[which(murders$low & murders$region == "Northeast")]
## [1] "Maine" "New Hampshire" "Vermont"
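A logical vector can also be used as an index directly, without which. The sketch below assumes a three-row stand-in, hypothetically named mini, with values copied from the table above. It compares the two forms of indexing.

```r
# Three rows copied from the murders table above
# (the object name `mini` is hypothetical)
mini <- data.frame(
  state       = c("Hawaii", "Maine", "Texas"),
  region      = c("West", "Northeast", "South"),
  murder_rate = c(0.5145920, 0.8280881, 3.2013603),
  stringsAsFactors = FALSE
)

low <- mini$murder_rate < 1               # TRUE where the rate is below 1
ind <- low & mini$region == "Northeast"   # both conditions must hold

mini$state[ind]          # logical indexing
mini$state[which(ind)]   # integer indexing via which()
```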
Questions 1 - 3 in Section 3.16 Exercises on page 74
library(dslabs)
data(murders)
population_in_millions <- murders$population/10^6
total_gun_murders <- murders$total
plot(population_in_millions, total_gun_murders)
Keep in mind that many states have populations below 5 million and are bunched up. We may gain further insights from making this plot in the log scale. Transform the variables using the log10 transformation and then plot them.
plot(log10(population_in_millions), log10(total_gun_murders))
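To see numerically why the log scale spreads out the smaller states, compare the range of a few populations on both scales. This sketch uses four populations copied from the murders table (Wyoming, Alaska, Alabama, California); the object names are chosen for illustration.

```r
# Populations for Wyoming, Alaska, Alabama and California,
# copied from the murders table
population <- c(563626, 710231, 4779736, 37253956)
population_in_millions <- population / 10^6

# Raw scale: ratio of the largest state to the smallest
raw_ratio <- max(population_in_millions) / min(population_in_millions)
raw_ratio

# log10 scale: distance between the smallest and largest values
log_span <- diff(range(log10(population_in_millions)))
log_span
```

On the raw scale the largest state dominates the axis, while on the log10 scale the same four values fall within a span of less than two units.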
2. Create a histogram of the state populations.
hist(murders$population)
hist(with(murders, log10(population)))
boxplot(population / 10^6 ~ region, data = murders)