2.2.1 Data Cleaning
Raw data might contain typos or wrong characters. In the following example, the last element of the input vector contains the letter 'o' instead of the digit zero. Such data can only be read as character, so after reading it into R you would have something like the vector x below. Using this vector in a calculation, e.g. x + 1, would throw an error, so it has to be cleaned and converted to numeric first.
x <- c('20', '30', '4o') # raw data: a character vector, with a typo in the last element
x <- gsub('o', '0', x) # clean the data: replace every letter 'o' with the digit '0'
mode(x) <- 'numeric' # convert from character to numeric
x + 1 # x can now be used in calculations
## [1] 21 31 41
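Before cleaning, it can help to find out which entries are actually malformed. The following is a minimal sketch, not part of the original example: it relies on as.numeric() returning NA for strings it cannot parse, and the names converted and bad are chosen here purely for illustration.

```r
x <- c('20', '30', '4o')                     # same raw data as above
converted <- suppressWarnings(as.numeric(x)) # unparseable strings become NA
bad <- is.na(converted) & !is.na(x)          # TRUE where the conversion failed
x[bad]                                       # entries that still need cleaning
## [1] "4o"
```

Inspecting the failing entries first makes it easier to decide on a targeted replacement, since gsub('o', '0', x) would also alter any legitimate letter 'o' elsewhere in the data.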
2.2.2 Checking input values
Another practical use of mode is checking input values in functions. Consider the sum function, which expects numeric, complex, or logical arguments. It throws an error, and thereby stops your script, if it receives a character, e.g. sum(10, 'a'). Suppose this is not what you want. Instead, you want a sum function that simply returns NA when it receives a character, without stopping the script. In that case you can use is.numeric() to check the input values:
mysum <- function(a, b) {
  # return the sum only if both inputs are numeric; otherwise return NA
  if (is.numeric(a) && is.numeric(b)) a + b
  else NA
}
mysum(10, 20) # Output is 30
## [1] 30
mysum(10, 'a') # Output is NA
## [1] NA
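The same check can be extended to any number of arguments. The sketch below is an illustration only: safe_sum is a hypothetical name, and accepting logical values alongside numeric ones is an assumption made here to mirror what sum itself accepts.

```r
# Hypothetical helper: sums its arguments, or returns NA if any of them
# is neither numeric nor logical (assumption: logicals are acceptable).
safe_sum <- function(...) {
  args <- list(...)
  ok <- vapply(args, function(a) is.numeric(a) || is.logical(a), logical(1))
  if (all(ok)) sum(...) else NA
}

safe_sum(10, 20, 30)
## [1] 60
safe_sum(10, 'a', 30)
## [1] NA
```

Using vapply with logical(1) guarantees one TRUE/FALSE per argument, so all(ok) is a single value that is safe to use in if.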
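2.2.3 Subsetting
Finally, an example that involves subsetting a table. The column names of a table are character strings, so a numeric range such as 2005:2010 would be interpreted as column positions rather than as names. Here we convert the numeric range to character in order to select a few columns (representing years) by name: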
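df <- data.frame(years = 1991:2010, v = sample(1:10, 20000, replace = TRUE)) # years is recycled to 20000 rows
mytable <- with(df, table(v, years)) # a cross-table with counts
mytable[, as.character(2005:2010)] # select columns by name; the counts vary between runs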
##     years
## v    2005 2006 2007 2008 2009 2010
##   1   102  100  107  102   99  100
##   2    94   96   91  114  103   89
##   3    86  114   93  101  102   93
##   4   109  109  108   87  100  109
##   5    98   87   97   95   87  106
##   6   105   94   96  108   93  101
##   7    94   88   95   91   97   96
##   8   115  113  112   94   95   97
##   9   101   97  104  120  116  120
##   10   96  102   97   88  108   89
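To see why the conversion to character matters, the following sketch uses a small, deterministic table (the data and the names tab and err are made up for illustration). Indexing with the numeric range is treated as a request for column positions 2002 and 2003, which do not exist, whereas the character version matches the column names.

```r
# Small illustrative data set: every combination of v and years occurs once
df <- data.frame(years = rep(2001:2004, times = 3), v = rep(1:3, each = 4))
tab <- with(df, table(v, years))

colnames(tab)                       # column names are stored as character
## [1] "2001" "2002" "2003" "2004"

# Numeric index: interpreted as positions, so this fails
err <- tryCatch(tab[, 2002:2003], error = function(e) e)
inherits(err, "error")
## [1] TRUE

# Character index: matched against the column names, so this works
sub <- tab[, as.character(2002:2003)]
dim(sub)
## [1] 3 2
```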