Objectives


At the end of this tutorial, learners should be able to:

  • Understand the various data types in R.

  • Use appropriate R functions to create each of the data types.

  • Use the typeof and class functions to check the data type.

  • Understand missing data and other special values in R.

  • Convert between data types.

Introduction


To get the most out of the R programming language, a programmer needs a strong understanding of the basic data types and data structures, and of how to operate on them. Each variable in R has an associated data type, and different data types require different amounts of memory. Each data type supports certain operations and disallows others. This tutorial covers the seven data types listed below. Six of them (double, integer, character, logical, complex and raw) are the basic types reported by typeof(); Date is a class that R stores internally as a double.

  1. Double

  2. Integer

  3. Character

  4. Logical

  5. Date

  6. Complex

  7. Raw
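As a rough check of the point about memory, the sketch below compares the size of equally long vectors of several types using object.size(). The vector length n and the name sizes are illustrative choices, not part of the tutorial. Character is left out because its size depends on the text stored, and Date is left out because it is stored as a double.

```r
# Hypothetical vectors of 1000 elements each, used only to compare sizes
n = 1000
sizes = c(
  raw     = as.numeric(object.size(raw(n))),
  logical = as.numeric(object.size(logical(n))),
  integer = as.numeric(object.size(integer(n))),
  double  = as.numeric(object.size(double(n))),
  complex = as.numeric(object.size(complex(n)))
)
sizes # size in bytes; exact figures depend on the platform and R version
```

On a typical installation, raw is the smallest, double takes roughly twice the space of integer, and complex takes roughly twice the space of double.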

The functions class() and typeof() will be used to examine the kind of data type or data structure. Data sets in R are mostly a combination of the above data types.
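The two functions do not always give the same answer: typeof() reports how R stores a value internally, while class() reports how R treats it. A minimal sketch, using illustrative values that are not from the tutorial:

```r
x = 3.7            # illustrative value
typeof(x)
## [1] "double"
class(x)
## [1] "numeric"

d = Sys.Date()     # today's date, used only as an example of a Date object
typeof(d)
## [1] "double"
class(d)
## [1] "Date"
```

This is why class() is the more informative check for objects such as dates.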

Double


This is probably the most common data type in the R programming language. A variable or vector is stored as double if its values are numeric (that is, any values on the real number line). Below are three examples to illustrate this.

a = 56
a
## [1] 56
typeof(a) # this is stored as double even though we know it is an integer
## [1] "double"
b = 23.8503
b
## [1] 23.8503
typeof(b)
## [1] "double"
c = 12/11
c
## [1] 1.090909
typeof(c)
## [1] "double"

It is important to note that by default, numeric values are stored in R as double unless otherwise specified.

Integer


An integer is a special case of numeric data, where the values are whole numbers (i.e. without decimals). Examples include the number of family members in a household, the number of tested cases, the number of deaths, etc. Such a discrete variable never has decimal points. As mentioned earlier, numbers in R are stored as double by default (all possible values on the real number line, including integers). There are two ways to declare a value as an integer in R:

  • add letter L after the integer (no space).
  • use the function as.integer().

Both approaches are demonstrated in the code below.

Adding the letter L at the end

b = 5L
b
## [1] 5
typeof(b)
## [1] "integer"

Using the function as.integer()

a = as.integer(5)
a
## [1] 5
typeof(a) 
## [1] "integer"

Remark


If you apply the as.integer() function to a decimal value (double), R returns only the integer part of the value; the decimal part is dropped, not rounded. The code below demonstrates this.

a = 4.85276
a
## [1] 4.85276
b = as.integer(a) # this will truncate the given value
b
## [1] 4
typeof(b)
## [1] "integer"
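Truncation is not the same as rounding, and the difference is easiest to see with a negative value. The sketch below compares as.integer() with the base R functions trunc(), round() and floor() on an illustrative pair of values.

```r
x = c(4.85276, -4.85276)  # illustrative values

as.integer(x)  # drops the decimal part:        4 and -4 (type integer)
trunc(x)       # same values, but still double: 4 and -4
round(x)       # nearest whole number:          5 and -5
floor(x)       # largest whole number below:    4 and -5

# as.integer() can only represent values up to this limit
.Machine$integer.max
## [1] 2147483647
```

If rounding is what you want, call round() first and then as.integer().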

Character


This data type is used to store text (string) information. A string is any value enclosed in quotation marks; it may consist of letters only or be alphanumeric (a combination of letters and digits). Examples include name (James Korir), gender (male/female), socioeconomic status (low/middle/high) and id (A2173A4EK). A character value is declared by enclosing it in double quotation marks (single quotation marks also work, but double quotation marks are recommended in R programming).

a = "G4HE27YB5"
typeof(a) 
## [1] "character"
b = "12.842" # this will be stored as a string in R
typeof(b)
## [1] "character"
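Because "12.842" is stored as text, it cannot be used in arithmetic until it is converted. A minimal sketch, where the name b_num is an illustrative choice:

```r
b = "12.842"

# b + 1 would stop with an error, because b is text and not a number
res = try(b + 1, silent = TRUE)
class(res)
## [1] "try-error"

b_num = as.numeric(b)  # convert the text to a double first
b_num + 1
## [1] 13.842

nchar(b)               # number of characters in the string
## [1] 6
```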

Logical


A logical variable is a variable with only two values: TRUE or FALSE.

a = TRUE
a
## [1] TRUE
typeof(a)
## [1] "logical"
b = pi > 22/7
b
## [1] FALSE
typeof(b)
## [1] "logical"
c = 9/12 == 15/20
c
## [1] TRUE
typeof(c)
## [1] "logical"
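The comparison 9/12 == 15/20 works because both sides equal 0.75, which a double can store exactly. Many decimal fractions cannot be stored exactly, so == may give a surprising result; all.equal() compares within a small tolerance instead. The sketch below uses illustrative values and also shows that logical values count as 1 and 0 in arithmetic.

```r
x = 0.1 + 0.2
x == 0.3                   # exact comparison fails due to rounding error
## [1] FALSE
isTRUE(all.equal(x, 0.3))  # comparison within a small tolerance
## [1] TRUE

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
sum(c(TRUE, FALSE, TRUE))
## [1] 2
```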

Date


The Date data type is used to store calendar dates. In the code block below, we first define the object dt as a character, then use the function as.Date() to convert it to a date. The second argument ("%d-%m-%Y") gives the format in which the date was entered.

dt = "01-06-2021"
dt1 = as.Date(dt, "%d-%m-%Y")
dt1
## [1] "2021-06-01"
class(dt1) # using typeof() function will return double
## [1] "Date"
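Since a Date is stored as a number (the count of days since 1970-01-01), it supports arithmetic, and format() controls how it is displayed. A short sketch that continues with the same date:

```r
dt1 = as.Date("01-06-2021", "%d-%m-%Y")

unclass(dt1)              # the underlying number of days since 1970-01-01
## [1] 18779
dt1 + 30                  # adding a number moves the date forward by that many days
## [1] "2021-07-01"
format(dt1, "%d/%m/%Y")   # returns a character string in the requested layout
## [1] "01/06/2021"
```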

Dates and times are a very wide topic, with many functions for manipulating different date/time formats. For more information on dates, see https://epirhandbook.com/en/working-with-dates.html and https://mgimond.github.io/ES218/Week02c.html

Complex


A complex value in R consists of a real part and an imaginary part, where the imaginary part is written with the suffix i.

z = 5 + 4i
z
## [1] 5+4i
typeof(z)
## [1] "complex"
a = sqrt(-5) # gives a NaN warning (NaN - Not a Number)
## Warning in sqrt(-5): NaNs produced
a
## [1] NaN
b = -5 + 0i # this is the correct way, i.e. write the value in complex form
c = sqrt(b)
c
## [1] 0+2.236068i
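The parts of a complex value can be extracted with the base R functions Re(), Im(), Mod() and Conj(). A minimal sketch using the same value of z as above:

```r
z = 5 + 4i

Re(z)    # real part
## [1] 5
Im(z)    # imaginary part
## [1] 4
Mod(z)   # modulus, sqrt(5^2 + 4^2)
## [1] 6.403124
Conj(z)  # complex conjugate
## [1] 5-4i
```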

Raw


The raw data type is used to hold raw bytes. Each byte is printed as a two-digit hexadecimal number.

a = charToRaw("I love R programming.")  
a
##  [1] 49 20 6c 6f 76 65 20 52 20 70 72 6f 67 72 61 6d 6d 69 6e 67 2e
typeof(a)
## [1] "raw"
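The conversion can be reversed with rawToChar(), and as.integer() shows the decimal value of each byte. A short sketch using the same string:

```r
a = charToRaw("I love R programming.")

length(a)         # one byte per character for this plain ASCII string
## [1] 21
as.integer(a[1])  # hexadecimal 49 is decimal 73, the code for "I"
## [1] 73
rawToChar(a)      # convert the bytes back to text
## [1] "I love R programming."
```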

Additional data types


R also has the following special values. They are not separate data types: NA is a logical value by default, while NaN and Inf are doubles.

  1. NA - Missing values

  2. NaN - Not a Number e.g. sqrt(-4)

  3. Inf - Infinite values e.g. 5/0
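These special values can be detected with is.na(), is.nan() and is.infinite(). The sketch below uses an illustrative vector x; note that is.na() is TRUE for both NA and NaN.

```r
x = c(4, NA, 0/0, 5/0)  # illustrative vector: a number, NA, NaN and Inf

is.na(x)        # FALSE TRUE TRUE FALSE  (NaN also counts as missing)
is.nan(x)       # FALSE FALSE TRUE FALSE
is.infinite(x)  # FALSE FALSE FALSE TRUE

typeof(NA)
## [1] "logical"
typeof(NaN)
## [1] "double"

# Missing values propagate through calculations unless they are removed
mean(c(1, NA, 3))
## [1] NA
mean(c(1, NA, 3), na.rm = TRUE)
## [1] 2
```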

Convert data types


Data types can be converted from one to another using the following functions.

Function         Description            Example                 Result
as.double()      Convert to double      as.double("5.8236")     [1] 5.8236
as.integer()     Convert to integer     as.integer(30/4)        [1] 7
as.numeric()     Convert to numeric     as.numeric("4.4224")    [1] 4.4224
as.character()   Convert to character   as.character(58)        [1] "58"
as.logical()     Convert to logical     as.logical(12)          [1] TRUE
as.complex()     Convert to complex     as.complex(4)           [1] 4+0i
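Not every conversion succeeds. When R cannot interpret a value, the result is NA, usually with a warning. The sketch below uses illustrative inputs and wraps the failing calls in suppressWarnings() only to keep the output short.

```r
# Text that does not look like a number becomes NA (with a coercion warning)
bad_int = suppressWarnings(as.integer("abc"))
bad_int
## [1] NA

# as.logical() understands strings such as "TRUE" and "false", but not "yes"
as.logical("TRUE")
## [1] TRUE
as.logical("yes")
## [1] NA

# For numbers, zero becomes FALSE and any other value becomes TRUE
as.logical(0)
## [1] FALSE
```

Checking the result with is.na() after a conversion is a simple way to catch values that failed to convert.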

STEM Research: Technology for Innovation

https://stemrecloud.com

Last edited on: 2022-06-25