We create an atomic vector die that stores six elements.
die <- c(1, 2, 3, 4, 5, 6)
die
## [1] 1 2 3 4 5 6
The is.vector function returns TRUE, which confirms that die is a vector.
is.vector(die)
## [1] TRUE
We create another atomic vector, five, that stores a single element: the number 5.
five <- 5
five
## [1] 5
The object five is also a vector, even though it holds only one value.
is.vector(five)
## [1] TRUE
The length function gets or sets the length of vectors (including lists) and factors, and of any other R object for which a method has been defined. In simple terms, length returns the number of elements in an atomic vector.
length(five)
## [1] 1
length(die)
## [1] 6
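The "sets" half of that description is easy to overlook. As a minimal sketch, using a hypothetical vector rolls that is not part of the original example, assigning to length pads a vector with NA or truncates it:

```r
# Hypothetical vector, used only for this illustration.
rolls <- c(4, 2, 6)

# Getting the length: the number of elements.
n_before <- length(rolls)
print(n_before)

# Setting a larger length pads the vector with NA.
length(rolls) <- 5
padded <- rolls
print(padded)

# Setting a smaller length drops elements from the end.
length(rolls) <- 2
truncated <- rolls
print(truncated)
```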
Each atomic vector stores its values as a one-dimensional vector, and each atomic vector can only store one type of data. R recognizes six basic types of atomic vectors: doubles, integers, characters, logicals, complex, and raw.
int <- 1L
text <- "ace"
do_uble <- 30  # a double uses 64 bits of storage
logic <- TRUE
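To check which type R actually chose for each object, base R's typeof can be applied to the same values. This is only a quick sketch; the collecting vector types is a name introduced here for illustration.

```r
# Same four objects as above, recreated so the block runs on its own.
int <- 1L
text <- "ace"
do_uble <- 30
logic <- TRUE

# typeof() reports the atomic type of each object.
types <- c(
  int = typeof(int),
  text = typeof(text),
  do_uble = typeof(do_uble),
  logic = typeof(logic)
)
print(types)

# Without the L suffix, a whole number is still stored as a double.
print(typeof(1))
print(typeof(1L))
```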
Floating-point errors arise because each double is accurate to only about 16 significant digits. This introduces a small rounding error, which in most cases goes unnoticed.
sqrt(2)^2 - 2
## [1] 4.440892e-16
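One practical consequence is that exact equality tests on doubles can fail. A minimal sketch of the usual workaround, assuming base R's all.equal (which compares within a small tolerance) is acceptable for the comparison at hand:

```r
x <- sqrt(2)^2

# Exact comparison is affected by the rounding error.
exact <- x == 2
print(exact)

# all.equal() compares within a tolerance; wrap it in isTRUE()
# because it returns a description string when the values differ.
close_enough <- isTRUE(all.equal(x, 2))
print(close_enough)
```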
Other types
comp <- c(1 + 1i, 1 + 2i, 1 + 3i)
comp
## [1] 1+1i 1+2i 1+3i
r_raw <- raw(3)
r_raw
## [1] 00 00 00
#Attributes
The most common attributes to give an atomic vector are names, dimensions (dim), and classes. Notice that die has no names right after we create it.
names(die)
## NULL
We assign names to the elements.
names(die) <- c("one", "two", "three", "four", "five", "six")
names(die)
## [1] "one" "two" "three" "four" "five" "six"
We check the attributes again with the attributes function, which now lists the names.
attributes(die)
## $names
## [1] "one" "two" "three" "four" "five" "six"
Names do not affect the values.
names(die) <- c("uno", "dos", "tres", "quatro", "cinco", "seis")
die
## uno dos tres quatro cinco seis
## 1 2 3 4 5 6
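A short sketch of what "names do not affect the values" means in practice. It uses a separate hypothetical vector, rolls, so that die is left untouched: arithmetic works on the values as before, and the names simply come along.

```r
# Hypothetical copy of the die values, named in English.
rolls <- c(1, 2, 3, 4, 5, 6)
names(rolls) <- c("one", "two", "three", "four", "five", "six")

# Arithmetic ignores the names and uses only the values.
total <- sum(rolls)
print(total)

# Element-wise operations keep the names attached to the results.
shifted <- rolls + 1
print(shifted)

# Names also offer a second way to look up an element.
by_name <- rolls[["three"]]
print(by_name)
```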
We can also remove names.
names(die) <- NULL
#Creating n dimensional Structures
A vector is a one-dimensional array, and a matrix is a two-dimensional array, so a two-dimensional array is the same thing as a matrix. Modifying the dim attribute of an atomic vector turns it into either a matrix or an array with three or more dimensions.
For example, we can reorganize die into a 2 × 3 matrix.
dim(die) <- c(2, 3)
R will always use the first value in dim for the number of rows and the second value for the number of columns. In general, rows always come first in R operations that deal with both rows and columns.
dim(die) <- c(3, 2)
Notice that, by default, R fills each matrix column by column.
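The column-wise fill is easiest to see by indexing individual cells. The sketch below uses a hypothetical integer vector x rather than die, so the walkthrough above is not disturbed.

```r
# Hypothetical vector reshaped into 2 rows and 3 columns.
x <- 1:6
dim(x) <- c(2, 3)
print(x)

# The first column is filled first (1, 2), then the second (3, 4),
# then the third (5, 6).
first_col <- x[, 1]
row1_col2 <- x[1, 2]
print(first_col)
print(row1_col2)
```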
#hypercube
dim(die) <- c(1, 2, 3)
class(die)
## [1] "array"
If you’d like more control over how the data is stored, you can use one of R’s helper functions, matrix or array. They do the same thing as changing the dim attribute, but they provide extra arguments to customize the process.
#Matrix Function
m <- matrix(die, nrow = 2)
m <- matrix(die, nrow = 2, byrow = TRUE)
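To make the effect of byrow visible, the two calls can be compared side by side. This sketch uses a hypothetical vector values and the illustrative names by_col and by_row instead of reusing m.

```r
values <- 1:6

# Default: fill column by column.
by_col <- matrix(values, nrow = 2)
print(by_col)

# byrow = TRUE: fill row by row instead.
by_row <- matrix(values, nrow = 2, byrow = TRUE)
print(by_row)

# The first row differs between the two layouts.
print(by_col[1, ])
print(by_row[1, ])
```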
#Array Function
The array function creates an n-dimensional array.
ar <- array(c(11:14, 21:24, 31:34), dim = c(2, 2, 3))
ar
## , , 1
##
## [,1] [,2]
## [1,] 11 13
## [2,] 12 14
##
## , , 2
##
## [,1] [,2]
## [1,] 21 23
## [2,] 22 24
##
## , , 3
##
## [,1] [,2]
## [1,] 31 33
## [2,] 32 34
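Elements of the array are addressed with one index per dimension, in the order row, column, slice. A brief sketch on the same ar object; the names cell and slice2 are introduced here only for illustration.

```r
ar <- array(c(11:14, 21:24, 31:34), dim = c(2, 2, 3))

# Row 1, column 2 of the third slice.
cell <- ar[1, 2, 3]
print(cell)

# Leaving the first two indices empty extracts a whole slice,
# which comes back as an ordinary 2 x 2 matrix.
slice2 <- ar[, , 2]
print(slice2)
print(dim(slice2))
```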
Notice that changing the dimensions of your object will not change the type of the object, but it will change the object’s class attribute:
dim(die) <- c(2, 3)
typeof(die)
## [1] "double"
class(die)
## [1] "matrix" "array"
Note that an object’s class attribute will not always appear when you run attributes; you may need to specifically search for it with class:
attributes(die)
## $dim
## [1] 2 3
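The reason is that a matrix has no stored class attribute; class derives the answer from dim. A small sketch with a hypothetical matrix m2, assuming R 4.0 or later, where matrices report both "matrix" and "array":

```r
# Hypothetical matrix, separate from die.
m2 <- matrix(1:6, nrow = 2)

# No class attribute is stored on the object...
stored <- attr(m2, "class")
print(is.null(stored))

# ...yet class() still reports an implicit class based on dim.
implicit <- class(m2)
print(implicit)
print("matrix" %in% implicit)
```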
You can apply class to objects that do not have a class attribute. class will return a value based on the object’s atomic type. Notice that the “class” of a double is “numeric,” an odd deviation.
class("Hello")
## [1] "character"
class(5)
## [1] "numeric"
now <- Sys.time()
now
## [1] "2022-11-22 02:13:52 UTC"
typeof(now)
## [1] "double"
class(now)
## [1] "POSIXct" "POSIXt"
POSIXct is a framework for representing dates and times. In this framework, a time is represented by the number of seconds that have passed between that time and 12:00 AM on January 1st, 1970, in the Coordinated Universal Time (UTC) zone. You can see this number by removing the class attribute of now, or by using the unclass function, which does the same thing.
unclass(now)
## [1] 1669083232
R then gives the double vector a class attribute that contains two classes, “POSIXct” and “POSIXt”. This attribute alerts R functions that they are dealing with a POSIXct time, so they can treat it in a special way. For example, R functions will use the POSIXct standard to convert the time into a user-friendly character string before displaying it. You can take advantage of this system by giving the POSIXct class to arbitrary R objects.
mil <- 1000000
mil
## [1] 1e+06
class(mil) <- c("POSIXct", "POSIXt")
mil
## [1] "1970-01-12 13:46:40 UTC"
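Setting the class by hand shows how the mechanism works, but the printed time zone then depends on the session. A more conventional route, sketched here under the assumption that UTC is the zone of interest, is base R's as.POSIXct with an explicit origin and tz; the names secs, stamp, and formatted are illustrative.

```r
secs <- 1000000

# Build a POSIXct time from a count of seconds since the 1970 origin.
stamp <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
print(class(stamp))

# format() turns it into the user-friendly string.
formatted <- format(stamp, "%Y-%m-%d %H:%M:%S")
print(formatted)

# The underlying double is still the same number of seconds.
back <- as.numeric(stamp)
print(back)
```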
#Factors
gender <- factor(c("male", "female", "female", "male"))
typeof(gender)
## [1] "integer"
attributes(gender)
## $levels
## [1] "female" "male"
##
## $class
## [1] "factor"
unclass(gender)
## [1] 2 1 1 2
## attr(,"levels")
## [1] "female" "male"
gender
## [1] male female female male
## Levels: female male
as.character(gender)
## [1] "male" "female" "female" "male"
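The output above suggests how a factor is stored: integer codes plus a levels attribute that maps each code to a label. The sketch below reconstructs the labels from the codes by hand, and shows that passing levels explicitly changes the codes; gender2 is a hypothetical variant added for illustration.

```r
gender <- factor(c("male", "female", "female", "male"))

# Integer codes and the lookup table of labels.
codes <- as.integer(gender)
labels <- levels(gender)
print(codes)
print(labels)

# Indexing the labels by the codes recovers the original strings.
rebuilt <- labels[codes]
print(identical(rebuilt, as.character(gender)))

# By default levels are sorted; supplying them fixes the order
# and therefore the codes.
gender2 <- factor(c("male", "female", "female", "male"),
                  levels = c("male", "female"))
print(as.integer(gender2))
```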
#Coercion
sum(c(TRUE, TRUE, FALSE, FALSE))
## [1] 2
#will become:
sum(c(1, 1, 0, 0))
## [1] 2
as.character(1)
## [1] "1"
as.logical(1)
## [1] TRUE
as.numeric(FALSE)
## [1] 0
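The same conversions happen implicitly when values of different types are combined in one atomic vector, because a vector can hold only one type. A brief sketch of the usual direction of coercion (logical to number to character); all object names here are illustrative.

```r
# Logical mixed with numbers: TRUE becomes 1, FALSE becomes 0.
mixed_num <- c(1, TRUE, FALSE)
print(typeof(mixed_num))
print(mixed_num)

# Anything mixed with a character string becomes character.
mixed_chr <- c(1, TRUE, "ace")
print(mixed_chr)

# Because logicals count as 0 and 1, mean() gives the share of TRUE values.
share_true <- mean(c(TRUE, TRUE, FALSE, FALSE))
print(share_true)

# A string that is not a number cannot be converted and yields NA
# (R also emits a warning, suppressed here).
not_a_number <- suppressWarnings(as.numeric("ace"))
print(not_a_number)
```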
#Lists
Lists do not group together individual values; they group together R objects. This makes them building blocks for many more sophisticated types of R objects.
list1 <- list(100:130, "R", list(TRUE, FALSE))
list1
## [[1]]
## [1] 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
## [20] 119 120 121 122 123 124 125 126 127 128 129 130
##
## [[2]]
## [1] "R"
##
## [[3]]
## [[3]][[1]]
## [1] TRUE
##
## [[3]][[2]]
## [1] FALSE
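The double-bracket labels in that printout are also how elements are retrieved. A minimal sketch on the same list1, showing that single brackets return a smaller list while double brackets return the stored object itself:

```r
list1 <- list(100:130, "R", list(TRUE, FALSE))

# The list has three elements, whatever their individual sizes.
print(length(list1))
print(length(list1[[1]]))

# Single brackets keep the list wrapper; double brackets unwrap it.
wrapped <- list1[2]
unwrapped <- list1[[2]]
print(class(wrapped))
print(class(unwrapped))

# Nested lists are reached by chaining double brackets.
nested <- list1[[3]][[1]]
print(nested)
```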
#Data Frames
Data frames are the two-dimensional version of a list. They are far and away the most useful storage structure for data analysis, and they provide an ideal way to store an entire deck of cards. You can think of a data frame as R’s equivalent to the Excel spreadsheet because it stores data in a similar format.
df <- data.frame(face = c("ace", "two", "six"),
suit = c("clubs", "clubs", "clubs"), value = c(1, 2, 3))
df
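Every column of a data frame must end up with the same length. The data.frame function recycles a shorter vector only when its length divides the longest column evenly; otherwise it raises an error.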
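The length rule has one wrinkle worth seeing: data.frame recycles a shorter column when the lengths divide evenly, and fails otherwise. The sketch below uses hypothetical columns id and flag and captures the error with tryCatch so the script keeps running.

```r
# Lengths 4 and 2: the shorter column is recycled to fill 4 rows.
recycled <- data.frame(id = 1:4, flag = c(TRUE, FALSE))
print(recycled)

# Lengths 3 and 2: 3 is not a multiple of 2, so data.frame() stops.
msg <- tryCatch(
  {
    data.frame(id = 1:3, flag = c(TRUE, FALSE))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```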
df <- data.frame(face = c("ace", "two", "six"),
suit = c("clubs", "clubs", "clubs"), value = c(1, 2, 3),
stringsAsFactors = FALSE)
typeof(df)
## [1] "list"
class(df)
## [1] "data.frame"
str(df)
## 'data.frame': 3 obs. of 3 variables:
## $ face : chr "ace" "two" "six"
## $ suit : chr "clubs" "clubs" "clubs"
## $ value: num 1 2 3
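Since typeof reports "list", each column behaves like a list element and can be pulled out with the usual list tools. A short sketch on the same df; col_types and second_face are names introduced here for illustration.

```r
df <- data.frame(face = c("ace", "two", "six"),
                 suit = c("clubs", "clubs", "clubs"),
                 value = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# A data frame is a list whose elements are the columns.
print(is.list(df))
print(length(df))   # number of columns
print(dim(df))      # rows, then columns

# Each column is an atomic vector with its own type.
col_types <- sapply(df, typeof)
print(col_types)

# Columns can be extracted with [[ ]] or $, like list elements.
values <- df[["value"]]
second_face <- df$face[2]
print(values)
print(second_face)
```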