is.na() is a basic, handy function for quickly checking for missing values. It tests each element of its argument for NA and returns a logical vector of the same length, which you can then use in your summary or analysis.
Let’s create a vector x to play with: 30 values drawn at random from a pool of 50 normally distributed numbers (mean = 1, sd = 0.5) and 100 NAs. As always, call set.seed() first so the result is reproducible. Exact values may still differ between R versions, because the default algorithm behind sample() changed in R 3.6.0.
set.seed(10)
x <- sample(c(rnorm(50, mean = 1, sd = 0.5), rep(NA, 100)), 30)
Here is your vector:
## [1] NA 0.8717608 NA NA NA NA NA
## [8] NA NA NA NA NA 0.8131692 NA
## [15] NA NA 0.3100282 NA NA NA 1.3706951
## [22] 1.4172370 0.6187276 NA NA 0.5639206 NA NA
## [29] NA NA
Are there any NA values in the vector? Of course there are. You already know that, because the vector is short enough to inspect by eye. Imagine, however, a very long vector that you want to check for NA values. That is where is.na() comes in handy. All you need to type is:
is.na(x)
## [1] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [12] TRUE FALSE TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE
## [23] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
Each element that is NA gives TRUE at the corresponding position, and every other element gives FALSE. Most positions here are TRUE, so the vector clearly contains NA values.
You can also easily find the positions of the NA values in the vector by combining is.na() with which():
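If all you need is a single yes/no answer rather than one value per element, you can collapse the result of is.na(). Here is a minimal sketch; it uses a small hand-made vector `y` (a hypothetical example, not the `x` above) so the output is easy to verify by eye:

```r
# y is a hypothetical toy vector, used only for illustration
y <- c(NA, 0.87, NA, 1.37, NA)

is.na(y)        # one logical value per element: TRUE FALSE TRUE FALSE TRUE
any(is.na(y))   # TRUE if at least one element is NA
anyNA(y)        # base R shortcut for the same check
```

Both any(is.na(y)) and anyNA(y) return a single TRUE here, which is handy inside an if() condition.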
which(is.na(x))
## [1] 1 3 4 5 6 7 8 9 10 11 12 14 15 16 18 19 20 24 25 27 28 29 30
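The same logical vector can also be used to pull out the values that are not missing. A short sketch, again on a hypothetical toy vector `y` rather than the `x` above (the names `na_pos` and `observed` are made up for this example):

```r
# y is a hypothetical toy vector, used only for illustration
y <- c(NA, 0.87, NA, 1.37, NA)

na_pos <- which(is.na(y))   # positions of the missing values: 1 3 5
observed <- y[!is.na(y)]    # keep only the non-missing values: 0.87 1.37
```

Negating is.na() with ! flips TRUE and FALSE, so the subset keeps exactly the elements that are not NA.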
You can also count the NA values in your vector, using the fact that logical values are coerced to numbers in arithmetic: TRUE becomes 1 and FALSE becomes 0. So, if you sum the result of is.na(), you get the number of NA values in your vector. Look at that:
sum(is.na(x))
## [1] 23
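The same coercion trick works with mean(), which then gives the share of missing values instead of the count. A minimal sketch on a hypothetical toy vector `y` (not the `x` above; the variable names are made up for this example):

```r
# y is a hypothetical toy vector, used only for illustration
y <- c(NA, 0.87, NA, 1.37, NA)

n_missing    <- sum(is.na(y))    # number of NA values: 3
n_observed   <- sum(!is.na(y))   # number of non-missing values: 2
prop_missing <- mean(is.na(y))   # proportion of NA values: 0.6
```

The proportion is often easier to compare across vectors of different lengths than the raw count.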
Enjoy!