is.na()

A basic, handy function for quickly checking for missing values. It tests each element of its argument for NA and returns a logical vector of the same length, which you can then use in your summaries or analyses.
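To see why a dedicated function is useful, here is a minimal sketch on a small made-up vector (the name v is only for illustration and is not the vector used in the rest of this post). Comparing against NA with == does not give a usable answer, while is.na() does:

```r
# Hypothetical example vector, not the x used in this post
v <- c(2.5, NA, 7)

# Any comparison with NA yields NA, so == cannot detect missing values
v == NA
## [1] NA NA NA

# is.na() returns a proper TRUE/FALSE for each element
is.na(v)
## [1] FALSE  TRUE FALSE
```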

Let’s create a vector ‘x’ to play with: 30 values sampled from a mix of NAs and random numbers drawn from a normal distribution with mean = 1 and sd = 0.5. As always, call set.seed() first to get the same results as here.

set.seed(10)
x <- sample(c(rnorm(50, mean = 1, sd = 0.5), rep(NA, 100)), 30)

Here is your vector:

##  [1]        NA 0.8717608        NA        NA        NA        NA        NA
##  [8]        NA        NA        NA        NA        NA 0.8131692        NA
## [15]        NA        NA 0.3100282        NA        NA        NA 1.3706951
## [22] 1.4172370 0.6187276        NA        NA 0.5639206        NA        NA
## [29]        NA        NA

Are there any NA values in the vector? Of course there are. You already know that, but only because you looked at the vector and it is short enough to read. Imagine, however, that you have a very long vector that you want to check for NA values. That is where is.na() comes in handy. All you need to type is:

is.na(x)
##  [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [12]  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
## [23] FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

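Each TRUE marks a position that holds an NA, and each FALSE marks a position with an actual value. There are plenty of TRUEs here, so the vector clearly contains missing values.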
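If all you want is a single yes/no answer rather than one value per element, the result of is.na() can be collapsed with any(), or you can use base R's anyNA(). A small sketch on a made-up vector (v is a hypothetical name, not the x from above):

```r
# Hypothetical short vector for illustration
v <- c(1.2, NA, 0.7, NA)

# TRUE if at least one element is NA
any(is.na(v))
## [1] TRUE

# Base R shortcut for the same check
anyNA(v)
## [1] TRUE

# A vector without missing values gives FALSE
anyNA(c(1, 2, 3))
## [1] FALSE
```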

You can then easily find the positions of the NA values in the vector by combining is.na() with which():

which(is.na(x))
##  [1]  1  3  4  5  6  7  8  9 10 11 12 14 15 16 18 19 20 24 25 27 28 29 30
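The same logical vector can be negated with ! to go the other way, for example to find or keep only the non-missing values. A minimal sketch, again on a made-up vector v rather than the x from this post:

```r
# Hypothetical short vector for illustration
v <- c(1.2, NA, 0.7, NA)

# Positions of the values that are NOT missing
which(!is.na(v))
## [1] 1 3

# Keep only the non-missing values
v[!is.na(v)]
## [1] 1.2 0.7
```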

You can also count how many NA values are in your vector, using the fact that logical values are converted to numbers in arithmetic: TRUE becomes 1 and FALSE becomes 0. So if you simply sum the result of is.na(), you get the number of NA values in your vector. Take a look:

sum(is.na(x))
## [1] 23
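The same TRUE = 1, FALSE = 0 trick works with mean(), which gives the share of missing values instead of their count. A small sketch on a made-up vector (v is hypothetical); for the x above, 23 NAs out of 30 values would come to roughly 0.77:

```r
# Hypothetical short vector for illustration
v <- c(1.2, NA, 0.7, NA)

# Proportion of missing values: 2 NAs out of 4 elements
mean(is.na(v))
## [1] 0.5

# Number of non-missing values
sum(!is.na(v))
## [1] 2
```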

Enjoy!