Vectors are used for performing calculations on multiple entities in datasets. The vector is the essential building block for handling multiple items in R. A vector is a collection of observations concerning a single variable. c() is the function for creating a vector:
myvec <- c(1,3,1,42)
Vector entries can be calculations or previously stored items (including vectors themselves):
foo <- 32.1
foo
## [1] 32.1
myvec2 <- c (3, -3 , 2, 3.45 , 1e+03 , 64^0.5 ,2 + (3-1.1)/9.44 ,foo)
myvec2
## [1] 3.000000 -3.000000 2.000000 3.450000 1000.000000 8.000000
## [7] 2.201271 32.100000
myvec3 <- c(myvec,myvec2)
myvec3
## [1] 1.000000 3.000000 1.000000 42.000000 3.000000
## [6] -3.000000 2.000000 3.450000 1000.000000 8.000000
## [11] 2.201271 32.100000
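As a small sketch of the point about combining vectors (the object names here are illustrative and not part of the notes above), c() flattens its arguments into one vector rather than nesting them, and all entries end up with a single type:

```r
# Illustrative objects; the names are assumptions made for this sketch
part1 <- c(1, 3)
part2 <- c(part1, 42)         # part1 is flattened into the new vector
combined <- c(part2, part1)   # still one flat vector, not a vector of vectors

length(combined)              # 3 elements from part2 plus 2 from part1

mixed <- c(1L, 2.5)           # an integer combined with a double
typeof(mixed)                 # the integer is promoted to double
```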
Built-in functions can be used to summarize and inspect vectors, as shown:
mean (myvec3)
## [1] 91.22927
max (myvec3)
## [1] 1000
min(myvec3)
## [1] -3
typeof(myvec3)
## [1] "double"
length(myvec3)
## [1] 12
Important functions associated with R vectors: seq(), rep(), sort(), length()
The colon operator creates a sequence of values in steps of 1:
3:27
## [1] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [24] 26 27
You can also provide either a previously stored value or a (strictly parenthesized) calculation when using the colon operator:
foo1 <- 5.3
foo1
## [1] 5.3
bar <- foo1 : (-47+1.5)
bar
## [1] 5.3 4.3 3.3 2.3 1.3 0.3 -0.7 -1.7 -2.7 -3.7 -4.7
## [12] -5.7 -6.7 -7.7 -8.7 -9.7 -10.7 -11.7 -12.7 -13.7 -14.7 -15.7
## [23] -16.7 -17.7 -18.7 -19.7 -20.7 -21.7 -22.7 -23.7 -24.7 -25.7 -26.7
## [34] -27.7 -28.7 -29.7 -30.7 -31.7 -32.7 -33.7 -34.7 -35.7 -36.7 -37.7
## [45] -38.7 -39.7 -40.7 -41.7 -42.7 -43.7 -44.7
summary(bar)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -44.7 -32.2 -19.7 -19.7 -7.2 5.3
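The output above ends at -44.7 rather than at -45.5. A minimal sketch of why, using illustrative names (start, end, steps) that are not part of the notes: the colon operator always moves in steps of 1 from its left-hand value and stops before it would pass the right-hand value.

```r
# Illustrative names; the values mirror the example above
start <- 5.3
end <- -47 + 1.5              # evaluates to -45.5
steps <- start:end            # moves down in steps of 1 from start

last_value <- steps[length(steps)]
last_value                    # the last value that does not pass end
length(steps)                 # number of values generated
```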
seq()
Creating sequences with seq() is more flexible than using the colon operator, as shown below:
seq ( from = 3 , to = 27, by=3 )
## [1] 3 6 9 12 15 18 21 24 27
seq ( from = 3 , to = -27 , by = -3)
## [1] 3 0 -3 -6 -9 -12 -15 -18 -21 -24 -27
Note that if the to value cannot be reached from the from value in whole steps of by (for example, stepping by 3 from 3 toward 28), the to value won’t be included; the sequence stops at the last value that does not pass it. Instead of providing a by value, however, you can specify a length.out value to produce a vector with that many numbers, evenly spaced between the from and to values:
seq ( from = 3 , to = 28 , length.out = 40)
## [1] 3.000000 3.641026 4.282051 4.923077 5.564103 6.205128 6.846154
## [8] 7.487179 8.128205 8.769231 9.410256 10.051282 10.692308 11.333333
## [15] 11.974359 12.615385 13.256410 13.897436 14.538462 15.179487 15.820513
## [22] 16.461538 17.102564 17.743590 18.384615 19.025641 19.666667 20.307692
## [29] 20.948718 21.589744 22.230769 22.871795 23.512821 24.153846 24.794872
## [36] 25.435897 26.076923 26.717949 27.358974 28.000000
By setting length.out to 40, you make the program print exactly 40 evenly spaced numbers from 3 to 28.
For decreasing sequences, the by value must be negative. Here’s an example:
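The difference between by and length.out can be checked with a short sketch (the object names by_three and evenly are illustrative): with by, the to value is only included if the steps land on it exactly, whereas with length.out both endpoints are always included.

```r
# Illustrative names for this sketch
by_three <- seq(from = 3, to = 28, by = 3)
by_three                      # 28 cannot be reached in steps of 3
max(by_three)                 # the sequence stops at the last value below 28

evenly <- seq(from = 3, to = 28, length.out = 40)
range(evenly)                 # both endpoints are part of the sequence
```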
foo1
## [1] 5.3
myseq <- seq (from = foo1, to =(-47+1.5), by = -2.4)
myseq
## [1] 5.3 2.9 0.5 -1.9 -4.3 -6.7 -9.1 -11.5 -13.9 -16.3 -18.7
## [12] -21.1 -23.5 -25.9 -28.3 -30.7 -33.1 -35.5 -37.9 -40.3 -42.7 -45.1
This code uses the previously stored object foo1 as the value for from and uses the parenthesized calculation (-47+1.5) as the to value.
Given those values (that is, with foo1 being greater than (-47+1.5)), the sequence can progress only in negative steps; directly above, we set by to be -2.4. The sequence stops at -45.1 because one more step would pass the to value of -45.5.
The use of length.out to create decreasing sequences, however, remains the same (it would make no sense to specify a “negative length”).
For the same from and to values, you can create a decreasing sequence of length 5 easily, as shown here:
myseq2 <- seq (from = foo1 , to =(-47+1.5), length.out = 5)
myseq2
## [1] 5.3 -7.4 -20.1 -32.8 -45.5
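As a rough check of how length.out determines the spacing, the sketch below (with illustrative names from_val, to_val, n) computes the implied step as (to - from) / (length.out - 1) and compares it with the differences between neighbouring values:

```r
# Illustrative names; the values mirror the example above
from_val <- 5.3
to_val <- -47 + 1.5
n <- 5

implied_step <- (to_val - from_val) / (n - 1)   # negative, so the sequence decreases
by_length <- seq(from = from_val, to = to_val, length.out = n)

steps <- diff(by_length)      # differences between neighbouring values
all.equal(steps, rep(implied_step, n - 1))
```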
rep()
Sequences are extremely useful, but sometimes you may simply want to repeat a certain value. You do this using rep():
rep (x = 1 , times =4)
## [1] 1 1 1 1
rep ( c(3, 62,8.3), times = 3)
## [1] 3.0 62.0 8.3 3.0 62.0 8.3 3.0 62.0 8.3
rep ( c(3, 62,8.3), each = 2)
## [1] 3.0 3.0 62.0 62.0 8.3 8.3
rep (c(3 , 62, 8.3), times =3 , each =2)
## [1] 3.0 3.0 62.0 62.0 8.3 8.3 3.0 3.0 62.0 62.0 8.3 8.3 3.0 3.0
## [15] 62.0 62.0 8.3 8.3
The rep function is given a single value or a vector of values as its argument x, as well as a value for the arguments times and each.
The value for times provides the number of times to repeat x, and each provides the number of times to repeat each element of x. In the first line directly above, you simply repeat a single value four times.
The other examples first use rep and times on a vector to repeat the entire vector, then use each to repeat each member of the vector, and finally use both times and each to do both at once.
rep() with no specification
If neither times nor each is specified, R’s default is to treat the values of times and each as 1, so that a call of rep(x=c(3,62,8.3)) will just return the originally supplied x with no changes.
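A brief sketch of these defaults and of how times and each combine (object names x, both, and nested are illustrative): when both arguments are given, each is applied first and the result is then repeated times times.

```r
# Illustrative names for this sketch
x <- c(3, 62, 8.3)

identical(rep(x), x)                         # defaults leave x unchanged

both <- rep(x, times = 3, each = 2)
nested <- rep(rep(x, each = 2), times = 3)   # each first, then times
identical(both, nested)

length(both) == length(x) * 2 * 3            # total length is a simple product
```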
As with seq(), you can include the result of rep() in a vector of the same data type, as shown in the following example:
foo2 <- 4
foo2
## [1] 4
c(3 , 8.3 , rep (x=32, times = foo2), seq (from = -2 , to = 1 , length.out = foo2 +1))
## [1] 3.00 8.30 32.00 32.00 32.00 32.00 -2.00 -1.25 -0.50 0.25 1.00
Here, I’ve constructed a vector where the third to sixth entries (inclusive) are governed by the evaluation of a rep() command: the single value 32 repeated foo2 times (where foo2 is stored as 4). The last five entries are the result of an evaluation of seq(), namely a sequence from −2 to 1 of length foo2+1 (5).
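sort()
sort() arranges the elements of a vector in increasing or decreasing order, controlled by the decreasing argument:
sort (x=c(2.5,-1,-10,3.44), decreasing = FALSE)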
## [1] -10.00 -1.00 2.50 3.44
sort (x=c(2.5,-1,-10,3.44), decreasing = TRUE)
## [1] 3.44 2.50 -1.00 -10.00
foo3 <- seq ( from = 4.3 , to = 5.5 , length.out = 8)
foo3
## [1] 4.300000 4.471429 4.642857 4.814286 4.985714 5.157143 5.328571 5.500000
bar1 <- sort(x = foo3, decreasing = TRUE )
bar1
## [1] 5.500000 5.328571 5.157143 4.985714 4.814286 4.642857 4.471429 4.300000
sort(x=c(foo3,bar1 ), decreasing = FALSE)
## [1] 4.300000 4.300000 4.471429 4.471429 4.642857 4.642857 4.814286
## [8] 4.814286 4.985714 4.985714 5.157143 5.157143 5.328571 5.328571
## [15] 5.500000 5.500000
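One point worth checking in a small sketch (names values, ascending, descending are illustrative): sort() returns a new vector and leaves its input untouched, and for numeric values like these a decreasing sort matches the reversed increasing sort.

```r
# Illustrative names for this sketch
values <- c(2.5, -1, -10, 3.44)

ascending <- sort(values)                      # decreasing = FALSE is the default
descending <- sort(values, decreasing = TRUE)

identical(descending, rev(ascending))          # reversed increasing order
values                                         # the original order is unchanged
```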
length()
# Finding a vector's length with length()
length ( x = c(3,2,8,1))
## [1] 4
length(5:13)
## [1] 9
foo4 <- 4
bar <- c (3,8.3,rep(x=32,times=foo4),seq(from = -2 ,to = 1, length.out = foo4 +1 ))
length(bar)
## [1] 11
The index corresponds to the position of a value within a vector, and that’s precisely why the first value in printed output always has a [1] next to it. These indexes allow you to retrieve specific elements from a vector, which is known as subsetting. You can access an individual element by entering the name of the vector followed by the position in square brackets, which asks R to return the value of myvec at that location.
myvec <- c(5,-2.3, 4,4,4,6,8,10,40221,-8)
length(x=myvec)
## [1] 10
myvec[1]
## [1] 5
myvec[6]
## [1] 6
foo <- myvec[2]
foo
## [1] -2.3
# Calling the last element
myvec[length(x=myvec)]
## [1] -8
Similarly, you could extract the second-to-last element by subtracting 1 from the length:
myvec.len <- length(x=myvec)
myvec.len
## [1] 10
bar <- myvec[myvec.len-1]
bar
## [1] 40221
Using your knowledge of sequences, you can use the colon notation with the length of the specific vector to obtain all possible indexes for extracting a particular element in the vector:
1:myvec.len
## [1] 1 2 3 4 5 6 7 8 9 10
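As a side note, base R also provides seq_along() for generating all indexes of a vector. The sketch below (with illustrative names v and empty) suggests why it can be the safer choice: for an empty vector, 1:length(x) counts down from 1 to 0 instead of producing no indexes.

```r
# Illustrative names for this sketch
v <- c(5, -2.3, 4)
identical(seq_along(v), 1:length(v))   # same indexes for a non-empty vector

empty <- numeric(0)
colon_idx <- 1:length(empty)           # counts down: 1, then 0
along_idx <- seq_along(empty)          # no indexes at all
colon_idx
along_idx
```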
A negative index returns the vector without the element at that position. Removing the first element:
a <- myvec[-1]
a
## [1] -2.3 4.0 4.0 4.0 6.0 8.0 10.0 40221.0 -8.0
Removing the second element:
baz <- myvec[-2]
baz
## [1] 5 4 4 4 6 8 10 40221 -8
Removing multiple elements (here, the third to sixth elements of baz):
baz[-c(3:6)]
## [1] 5 4 10 40221 -8
Again, the index in the square brackets can be the result of an appropriate calculation, like so:
qux <- myvec[-(myvec.len-1)]
qux
## [1] 5.0 -2.3 4.0 4.0 4.0 6.0 8.0 10.0 -8.0
Using the square-bracket operator to extract or delete values from a vector does not change the original vector you are subsetting unless you explicitly overwrite the vector with the subsetted version.
For instance, in this example, qux is a new vector defined as myvec without its second-to-last entry, but in your workspace, myvec itself remains unchanged. In other words, subsetting vectors in this way simply returns the requested elements, which can be assigned to a new object if you want, but doesn’t alter the original object in the workspace.
Now, suppose you want to piece myvec back together from qux and bar. You can call something like this:
# Reconstructing myvec
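A minimal sketch of this behaviour, using illustrative objects (original, trimmed) rather than the vectors from the notes: subsetting returns a copy, and the original only changes when it is explicitly overwritten.

```r
# Illustrative names for this sketch
original <- c(10, 20, 30, 40)

trimmed <- original[-2]              # a new vector without the second element
len_before <- length(original)       # original still has all four elements

original <- original[-2]             # explicit overwrite changes the original
len_after <- length(original)

len_before
len_after
```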
c(qux[-length(x=qux)],bar,qux[length(x=qux)])
## [1] 5.0 -2.3 4.0 4.0 4.0 6.0 8.0 10.0
## [9] 40221.0 -8.0
R is not limited to extracting elements one at a time; several can be extracted at once by supplying a vector of indexes. The colon operator is a convenient way to create a sequence of indexes:
myvec[c(1,3,5)]
## [1] 5 4 4
foo1 <- myvec[1:4]
foo1
## [1] 5.0 -2.3 4.0 4.0
The order of the returned elements depends entirely upon the index vector supplied in the square brackets:
length(x=foo1):2
## [1] 4 3 2
foo1[length(x=foo1):2]
## [1] 4.0 4.0 -2.3
You can also use rep() to repeat an index:
indexes <- c(4,rep(x=2,times=3),1,1,2,3:1)
indexes
## [1] 4 2 2 2 1 1 2 3 2 1
foo1[indexes]
## [1] 4.0 -2.3 -2.3 -2.3 5.0 5.0 -2.3 4.0 -2.3 5.0
This is now something a little more general than strictly “subsetting”: by using an index vector, you can create an entirely new vector of any length consisting of some or all of the elements in the original vector. This index vector can contain the desired element positions in any order and can repeat indexes.
You can also return the elements of a vector after deleting more than one element:
foo1[-c(1,3)]
## [1] -2.3 4.0
Note that it is not possible to mix positive and negative indexes in a single index vector.
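The restriction can be seen in a short sketch (names v, result, two_step are illustrative). Here tryCatch() is used only to capture the fact that R signals an error; the exact error message is not shown because it may differ between R versions. Applying the negative and positive indexes in two separate steps is one possible workaround.

```r
# Illustrative names for this sketch
v <- c(5, -2.3, 4, 4)

# Mixing a negative and a positive index signals an error
result <- tryCatch(v[c(-1, 2)], error = function(e) "error")
result

# Workaround: drop the first element, then take the second of what remains
two_step <- v[-1][2]
two_step
```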