Swirl

Lesson 1

| You can type the first two letters of the variable name,
| then hit the Tab key (possibly more than once). Most
| programming environments will provide a list of variables
| that you've created that begin with 'my'. This is called
| auto-completion and can be quite handy when you have many
| variables in your workspace. Give it a try. (If
| auto-completion doesn't work for you, just type my_div and
| press Enter.)

> my_div
[1] 3.478505 3.181981 2.146460

Lesson 2

| Delete the 'testdir' directory that you just left (and everything
| in it)

> unlink("testdir", recursive = TRUE)
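The same call can be tried safely on a throwaway directory. This is a minimal sketch: the directory name testdir_demo and the file example.txt are made up for illustration, and everything lives under tempdir() so nothing real is deleted.

```r
# Hypothetical scratch directory (name chosen for illustration only).
scratch <- file.path(tempdir(), "testdir_demo")
dir.create(scratch)
file.create(file.path(scratch, "example.txt"))

# recursive = TRUE is needed for unlink() to remove a directory and its contents.
# unlink() returns 0 on success and 1 on failure.
status <- unlink(scratch, recursive = TRUE)

removed <- !dir.exists(scratch)
print(removed)
```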

Lesson 3

| Finally, let's say that rather than repeating the vector (0, 1, 2)
| over and over again, we want our vector to contain 10 zeros, then
| 10 ones, then 10 twos. We can do this with the `each` argument. Try
| rep(c(0, 1, 2), each = 10).

> rep(c(0, 1, 2), each = 10)
 [1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
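The `each` and `times` arguments can also be combined. A small sketch using the same (0, 1, 2) vector; the variable name combined is only illustrative:

```r
# each is applied first (0 0 1 1 2 2), then that result is repeated `times` times.
combined <- rep(c(0, 1, 2), each = 2, times = 2)
print(combined)
```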

Q1

Q1.a

what is the output of v1 + v2?

[1]  7 24  6
Warning message:
In v1 + v2 :
  longer object length is not a multiple of shorter object length

Q1.b

what is the output of v1 + v3?

[1]  3  6  5  8  7 10

Q1.c

what is the output of v2 + v3?

[1]  3  6  5  8  7 10

Q1.d

Explain the differences in these answers in relation to errors, warnings, and unexpected behavior in R

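v1 + v2 gave a warning because the longer object's length was not a multiple of the shorter object's length, so R could not recycle the shorter vector a whole number of times. This is a warning, not an error: R still returned a result. In the other two sums the longer length is an exact multiple of the shorter one, so the shorter vector was recycled evenly and no message appeared.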
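The recycling behavior can be reproduced with small vectors. The vectors below are hypothetical stand-ins (they are not the v1, v2, and v3 from the question), chosen only so the lengths are 2, 3, and 6:

```r
# Hypothetical vectors for illustration; not the assignment's v1, v2, v3.
short_vec <- c(1, 2)                 # length 2
mid_vec   <- c(10, 20, 30)           # length 3
long_vec  <- c(1, 2, 3, 4, 5, 6)     # length 6

# 6 is a multiple of 2: short_vec is recycled exactly three times, no warning.
even_sum <- short_vec + long_vec

# 3 is not a multiple of 2: R still computes a result but raises a warning.
# tryCatch() captures the warning text; suppressWarnings() keeps the result.
warning_text <- tryCatch(short_vec + mid_vec,
                         warning = function(w) conditionMessage(w))
uneven_sum <- suppressWarnings(short_vec + mid_vec)

print(even_sum)
print(uneven_sum)
print(warning_text)
```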

Q2.

Make a vector ‘v4’ of: eight, five, nine, one, negative three, zero, and twelve
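The answers that follow assume v4 already exists. A minimal sketch of creating it with c(), taking the values in the order listed:

```r
# eight, five, nine, one, negative three, zero, twelve
v4 <- c(8, 5, 9, 1, -3, 0, 12)
print(v4)
```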

a.

use indexing and the sum() function to add the second, fourth, and seventh entries in v4

> sum(v4[2], v4[4], v4[7])
[1] 18

b.

combine indexing with the sum() function to add the second, fourth, and seventh entries in v4

> sum(v4[2], v4[4], v4[7])
[1] 18
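An alternative that may be closer to "combining" the two is to index with a vector of positions and pass the result to sum() in one step. A sketch, assuming v4 holds the values from the question:

```r
v4 <- c(8, 5, 9, 1, -3, 0, 12)

# Index with a vector of positions, then sum the selected entries.
picked <- v4[c(2, 4, 7)]
total <- sum(picked)
print(total)
```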

c.

sample two elements of v4 at random [hint: use the sample() function]

> sample(v4, 2)
[1] 9 1
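Because sample() is random, the two values change from run to run. If a repeatable draw is wanted, set.seed() can be called first. In this sketch the seed value 42 is arbitrary, and v4 is assumed to hold the values from the question:

```r
v4 <- c(8, 5, 9, 1, -3, 0, 12)

# Setting the same seed before each call makes the draws identical.
set.seed(42)   # arbitrary seed, chosen only for illustration
draw1 <- sample(v4, 2)
set.seed(42)
draw2 <- sample(v4, 2)

print(identical(draw1, draw2))
```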

Q3.

Use R to generate a vector of all integers between 6 and 10

a.

using the colon operator

> v5 <- 6:10

b.

using the seq function

> v6 <- seq(6, 10)
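A quick check that the two approaches give the same values, plus one thing seq() can do that the colon operator cannot (a step other than 1). The name v7 is not part of the assignment:

```r
v5 <- 6:10
v6 <- seq(6, 10)

# Both vectors contain 6, 7, 8, 9, 10.
same_values <- all(v5 == v6)

# seq() also accepts a step size through the `by` argument.
v7 <- seq(6, 10, by = 2)   # hypothetical extra example

print(same_values)
print(v7)
```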

Q4.

Use the rep function in R to generate a vector with all integers between 6 and 10 repeated four times

a.

Cycling through 6, 7, 8, 9, and 10

> rep(v5, 4)
 [1]  6  7  8  9 10  6  7  8  9 10  6  7  8  9 10  6  7  8  9 10

b.

First repeating 6 four times, then repeating 7 four times, etc.

> rep(v5, each = 4)
 [1]  6  6  6  6  7  7  7  7  8  8  8  8  9  9  9  9 10 10 10 10
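The two calls differ only in ordering. A short sketch comparing them, with v5 redefined so the block runs on its own (the names cycled and blocked are illustrative):

```r
v5 <- 6:10

cycled  <- rep(v5, times = 4)   # 6 7 8 9 10 repeated as a whole, four times
blocked <- rep(v5, each = 4)    # each value repeated four times in a row

# Same 20 values in both; sorting the cycled version gives the blocked one.
same_after_sort <- identical(sort(cycled), blocked)
print(length(cycled))
print(same_after_sort)
```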

Q5. Chapter 1, Question 14

‘Identify whether the following variables are numerical or categorical. If numerical, state whether the variable is discrete or continuous. If categorical, state whether the categories have a natural order (ordinal) or not (nominal).’ Answer with reference to the variable [e.g. The height of this building is a continuous, numerical variable].

a. Number of sexual partners in a year

The number of sexual partners in a year is a discrete, numerical variable

b. Petal area of rose flowers

Petal area of rose flowers is a continuous, numerical variable

c. Heartbeats per minute of a Tour de France cyclist, averaged over the duration of the race

Heartbeats per minute of a Tour de France cyclist, averaged over the duration of the race, is a continuous, numerical variable: individual beats are counted, but an average over the race can take non-integer values

d. Birth weight

Birth weight is a continuous, numerical variable

e. Stage of fruit ripeness (e.g. underripe, ripe, or overripe)

Stage of fruit ripeness is a categorical variable with ordinal categories

f. Angle of flower orientation relative to the position of the sun

Angle of flower orientation relative to the position of the sun is a continuous, numerical variable

g. Tree species

Tree species is a categorical variable with nominal categories

h. Year of birth

Year of birth is a discrete, numerical variable

i. Gender

Gender is a categorical variable with nominal categories
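These variable types map loosely onto R data types. The sketch below is only an illustration: all of the values are invented, and the object names are not from the textbook.

```r
# Ordinal categorical: an ordered factor with explicitly ordered levels.
ripeness <- factor(c("ripe", "underripe", "overripe", "ripe"),
                   levels = c("underripe", "ripe", "overripe"),
                   ordered = TRUE)

# Nominal categorical: a plain factor, levels have no order.
species <- factor(c("oak", "maple", "oak"))

# Discrete numerical: whole-number counts stored as integers.
partners <- c(0L, 2L, 1L)

# Continuous numerical: measurements stored as doubles.
petal_area <- c(4.2, 3.75, 5.1)

# Comparisons such as > are meaningful only for the ordered factor.
ripe_beats_underripe <- ripeness[1] > ripeness[2]
print(is.ordered(ripeness))
print(is.ordered(species))
print(ripe_beats_underripe)
```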

Q6. Chapter 1, Questions 23 and 24

23.

The main problem here is sampling bias. They should have reinforced the cockpit.

24.

a.

The two variables are leaf type and number of leaves removed. Leaf type is categorical and is the explanatory variable; the number of leaves removed is numerical and is the response variable.

b.

These are only the leaves that were removed, not a random sample of the total population of leaves.

c.

This would affect the accuracy of the estimate.

d.

The individual ants were randomly sampled in this experiment.