2.6 Vectorized Operations
2.6.1 Vector In, Vector Out
First, create two vectors u and v using c(), the concatenate function.
Then, check whether each entry of u is greater than the corresponding
entry of v.
u <- c(5,2,8)
v <- c(1,3,9)
u>v
[1] TRUE FALSE FALSE
Below create two vectors y and z and compare them, this time using
the <= operator.
y <- c(5,1,4,9)
z <- c(25,3,4,1)
y <= z
[1] TRUE TRUE TRUE FALSE
In the first example, the > operator was applied to u[1] and v[1],
resulting in TRUE, then to u[2] and v[2], resulting in FALSE, and so on.
The <= comparison of y and z works the same way, element by element.
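To make the element-by-element behaviour concrete, the sketch below spells out the same comparison as an explicit loop. The name result is introduced here only for illustration; the vectorized u > v is what one would normally write.

```r
u <- c(5, 2, 8)
v <- c(1, 3, 9)

# Compare the vectors one position at a time
result <- logical(length(u))
for (i in seq_along(u)) {
  result[i] <- u[i] > v[i]
}

result                     # TRUE FALSE FALSE
identical(result, u > v)   # TRUE
```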
A key point is that if an R function uses vectorized operations, it,
too, is vectorized, thus enabling a potential speedup. Here is an
example:
w <- function(x) return(x+1)
w(u)
[1] 6 3 9
We created a function w which takes an input and increases it by 1.
If the input is a single number x, the output is just x + 1. If the
input is a vector, each entry is increased by 1. Just as we applied our
function to the vector u above, we now do the same with v.
w(v)
[1] 2 4 10
Here, w() uses +, which is vectorized, so w() is vectorized as well.
As you can see, there is an unlimited number of vectorized functions, as
complex ones are built up from simpler ones.
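As a further illustration of building vectorized functions from vectorized pieces, here is a small sketch. The names h and clip are hypothetical and chosen only for this example; sqrt(), abs() and ifelse() are standard base R functions.

```r
# h() combines abs(), sqrt() and +, all of which work element by element
h <- function(x) sqrt(abs(x)) + 1
h(c(-4, 0, 9))      # 3 1 4

# ifelse() is the vectorized counterpart of if/else
clip <- function(x) ifelse(x < 0, 0, x)
clip(c(-4, 0, 9))   # 0 0 9
```

A plain if (x < 0) inside clip would not work element by element, because if expects a single TRUE or FALSE; that is why ifelse() is used here.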
Our next function m takes an input value, squares it and adds 1 to
the result.
m <- function(x) return(x^2+1)
We apply the new function to the vector u.
m(u)
[1] 26 5 65
We apply m to the vector v as well.
m(v)
[1] 2 10 82
Now create a function that returns x^2 - 1:
n <- function(x) return(x^2 - 1)
We apply the function to the vector y:
n(y)
[1] 24 0 15 80
Next, let’s apply the function for rounding to the nearest integer to
a new example vector y. To create the new y, we overwrite the old y that
we created above.
y <- c(1.2,3.9,0.4)
z <- round(y)
z
[1] 1 4 0
The round function rounds each entry to the nearest integer: 1.2 is
rounded to 1, 3.9 to 4, and 0.4 to 0.
Let’s apply the function for rounding to the nearest integer to an
example vector r<-c(0.25,0.9,2.1,4.7,5.1):
r <- c(0.25, 0.9, 2.1, 4.7, 5.1)
round(r)
[1] 0 1 2 5 5
We get the vector (0, 1, 2, 5, 5). Its entries are whole numbers,
although R still stores them as doubles rather than as integers.
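A few details of round() are easy to miss, so the following sketch probes them on the same vector r. It uses only base R; the digits argument and the half-to-even behaviour are documented features of round(), while the choice of test values is ours.

```r
r <- c(0.25, 0.9, 2.1, 4.7, 5.1)

typeof(round(r))          # "double": whole numbers, but not integer storage
round(r, digits = 1)      # round to one decimal place instead of to an integer
round(c(0.5, 1.5, 2.5))   # 0 2 2: exact halves go to the even neighbour
as.integer(round(r))      # 0 1 2 5 5, now stored as integers
```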
Since we know that R has no scalars, let’s consider vectorized
functions that appear to have scalar arguments.
f<-function(x,c) return((x+c)^2)
f(1:3,0)
[1] 1 4 9
Recall that for a three-element vector x and a scalar c (really a
one-element vector that R recycles), x + c = (x[1] + c, x[2] + c,
x[3] + c).
f(1:3,1)
[1] 4 9 16
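The following sketch shows what "appears to have scalar arguments" means in practice: the second argument is itself a vector, and a one-element vector is simply recycled. The function f is repeated so the block runs on its own; the extra calls are our own illustration.

```r
f <- function(x, c) return((x + c)^2)

length(0)            # 1: the "scalar" 0 is a vector of length one
f(1:3, c(0, 0, 0))   # 1 4 9, same as f(1:3, 0) after recycling
f(1:3, c(0, 1, 2))   # 1 9 25: a full-length c is paired element by element
```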
Create a function f2 that returns (x-c)^2 and calculate f2(1:3,0) and
f2(1:3,1):
f2 <- function(x,c) return((x - c)^2)
f2(1:3,0)
[1] 1 4 9
f2(1:3,1)
[1] 0 1 4
2.6.2 Vector In, Matrix Out
The vectorized functions we’ve been working with so far have scalar
return values. Calling sqrt() on a number gives us a number. If we apply
this function to an eight-element vector, we get eight numbers, thus
another eight-element vector, as output. But what if our function itself
is vector-valued, as z12() is here?
z12 <- function(z) return(c(z,z^2))
Applying z12() to 5, say, gives us the two-element vector (5,25). If
we apply this function to an eight-element vector, it produces 16
numbers:
x <- 1:8
z12(x)
[1] 1 2 3 4 5 6 7 8 1 4 9 16 25 36 49 64
Compute z12(x2) where x2<-1:5. Explain the output.
x2 <- c(1:5)
z12(x2)
[1] 1 2 3 4 5 1 4 9 16 25
z12 returns a single vector of length 10: the first five entries are
x2 itself (1:5), and the last five, c(1, 4, 9, 16, 25), are the squares
of the entries of x2.
It might be more natural to have these arranged as an 8-by-2 matrix,
which we can do with the matrix function:
matrix(z12(x),ncol=2)
     [,1] [,2]
[1,]    1    1
[2,]    2    4
[3,]    3    9
[4,]    4   16
[5,]    5   25
[6,]    6   36
[7,]    7   49
[8,]    8   64
Create a similar matrix using z12(x2):
matrix(z12(x2),ncol=2)
     [,1] [,2]
[1,]    1    1
[2,]    2    4
[3,]    3    9
[4,]    4   16
[5,]    5   25
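An alternative that avoids reshaping by hand is to apply z12() to each element separately with sapply(), a base R function. This is offered as a sketch rather than as the method used above: sapply() returns one column per input element here, so the result is transposed with t() to get the same layout. The name by_element is chosen only for this example.

```r
z12 <- function(z) return(c(z, z^2))
x <- 1:8

# One call of z12() per element; each call yields a column (value, square)
by_element <- t(sapply(x, z12))

dim(by_element)                                   # 8 2
all.equal(by_element, matrix(z12(x), ncol = 2))   # TRUE
```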
2.7 NA and NULL Values
2.7.1 Using NA
In many of R’s statistical functions, we can instruct the function to
skip over any missing values, or NAs. Here is an example:
x <- c(88,NA,12,168,13)
x
[1] 88 NA 12 168 13
When calculating the mean without further specification, the NA value
is taken into account. Since NA + x = NA for all real numbers x, the
mean is also evaluated as NA.
mean(x)
[1] NA
To get a meaningful mean, we tell mean() to drop the NA values by
setting its na.rm argument to TRUE.
mean(x,na.rm=TRUE)
[1] 70.25
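For functions that have no na.rm argument, the missing values can be located and filtered out by hand. The sketch below uses only base R (is.na() and logical indexing) and reproduces the mean computed above.

```r
x <- c(88, NA, 12, 168, 13)

is.na(x)             # FALSE TRUE FALSE FALSE FALSE
sum(is.na(x))        # 1 missing value
mean(x[!is.na(x)])   # 70.25, same as mean(x, na.rm = TRUE)
```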
If we write NULL instead of NA, no special handling is needed. NULL
stands for a value that does not exist at all, so c() simply drops it
and the vector has only four elements, whereas NA marks a value that
exists but is unknown.
x <- c(88,NULL,12,168,13)
mean(x)
[1] 70.25
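The difference between the two is easiest to see by comparing lengths. In this sketch the names x_null and x_na are chosen for illustration only.

```r
x_null <- c(88, NULL, 12, 168, 13)
x_na   <- c(88, NA, 12, 168, 13)

length(x_null)   # 4: NULL contributes nothing to the vector
length(x_na)     # 5: NA occupies a position
length(NULL)     # 0
is.null(NULL)    # TRUE
```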
Let us now work with the mode:
x <- c(5,NA,12)
mode(x[1])
[1] "numeric"
5 is a numeric value.
mode(x[2])
[1] "numeric"
In this case, NA is also seen as a numeric value.
Now create a vector with characters and NA:
y <- c("abc","def",NA)
mode(y[2])
[1] "character"
“def” is interpreted as a character, exactly as we would expect.
mode(y[3])
[1] "character"
This time NA is reported as character. All entries of a vector share
one mode, so an NA inside a character vector is a missing character
value, just as the NA in x above was a missing numeric value.
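The same point can be checked directly. A bare NA is a logical value, and base R also provides typed constants such as NA_real_ and NA_character_; the sketch below, using only base R, shows how NA takes on the type of the vector it sits in.

```r
mode(NA)              # "logical": a bare NA is a logical value
mode(NA_real_)        # "numeric"
mode(NA_character_)   # "character"

# Inside a vector, NA is coerced to the common type of the other elements
typeof(c(5, NA, 12))          # "double"
typeof(c("abc", "def", NA))   # "character"
```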