Vectors are generally created using c() -
> c(1, 2, 3, 4, 5)
[1] 1 2 3 4 5
A vector can be assigned to a name using <- or = -
> x <- c(1, 2, 3, 4, 5)
> print(x)
[1] 1 2 3 4 5
Advice: After a while, do some research on your own to learn the difference between <- and =.
The data types can be logical, integer, double, character, complex or raw. Data types in a vector can be found using the function typeof() -
> logi <- c(TRUE, FALSE, TRUE)
> typeof(logi)
[1] "logical"
> int <- c(1L, 2L, 5L, 9L)
> typeof(int)
[1] "integer"
> doub <- c(1, 2, 5, 9)
> typeof(doub)
[1] "double"
> char <- c("a", "cat", "Dhaka")
> typeof(char)
[1] "character"
> complx <- c(1+2i, -3i, 2-1i, 10 + 9i)
> typeof(complx)
[1] "complex"
There are functions called mode() and class(). Explore the vectors using these by yourself.
If multiple data types are mixed in one vector, every element is coerced to the type with the highest importance. The order of importance is: character > complex > double > integer > logical -
> typeof(c(1,"a"))
[1] "character"
> typeof(c(1,"a",3i))
[1] "character"
> typeof(c(1,3i))
[1] "complex"
> typeof(c(TRUE, 4))
[1] "double"
> typeof(c(10L, 4))
[1] "double"
> typeof(c(10L, FALSE))
[1] "integer"
> typeof(c(10L, 1 + 3i))
[1] "complex"
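One practical consequence of this ordering is sketched below, using illustrative variable names that are not from the original text. Because logical sits below integer and double, TRUE and FALSE behave as 1 and 0 in arithmetic, so sum() counts the TRUE values and mean() gives their proportion.

```r
# Hypothetical example vector; the names are for illustration only
flags <- c(TRUE, FALSE, TRUE, TRUE)

n_true <- sum(flags)      # logical is coerced to integer: TRUE -> 1, FALSE -> 0
prop_true <- mean(flags)  # proportion of TRUE values

# Adding a single character element coerces every element to character
mixed <- c(flags, "x")

print(n_true)
print(prop_true)
print(typeof(mixed))
```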
The data type can be changed using the as.___() functions -
> int
[1] 1 2 5 9
> as.character(int) # Changing to character
[1] "1" "2" "5" "9"
> as.double(int) # Changing to double
[1] 1 2 5 9
> logi <- c(1,0,0,1)
> typeof(logi)
[1] "double"
> as.logical(logi) # 0 becomes FALSE and any non-zero number becomes TRUE
[1]  TRUE FALSE FALSE  TRUE
> doub <- c(1.8, 1, 5, pi)
> doub
[1] 1.800000 1.000000 5.000000 3.141593
> typeof(doub)
[1] "double"
> as.integer(doub) # The fractional part is dropped, not rounded
[1] 1 1 5 3
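As a small sketch of how as.integer() treats fractions and values it cannot convert (the vectors below are made up for illustration): the fractional part is cut off toward zero, which differs from round(), and a string that is not a number becomes NA with a warning.

```r
# Illustrative values, not from the original text
vals <- c(1.8, -1.8, 3.2)

truncated <- as.integer(vals)  # drops the fractional part (truncates toward zero)
rounded <- round(vals)         # rounds to the nearest whole number, still a double

# "x" cannot be read as a number, so it becomes NA (the warning is silenced here)
parsed <- suppressWarnings(as.integer(c("7", "x")))

print(truncated)
print(rounded)
print(parsed)
```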
Look how integers require less memory than doubles -
> int <- c(1L, 2L, 5L, 9L)
> doub <- c(1, 2, 5, 9)
> object.size(int) # Returns memory allocation of int
64 bytes
> object.size(doub) # Returns memory allocation of doub
80 bytes
Vectors can be combined using c() -
> a <- c(1, 2, 3, 4, 5)
> b <- c(10, 11, 12, 13, 14)
> comb <- c(a,b)
> comb
 [1]  1  2  3  4  5 10 11 12 13 14
If vectors of different data types are combined, the result is assigned the data type with the highest importance -
> a <- c(1, 2, 3, 4, 5)
> d <- c("a", "b", "cat")
> comb <- c(a, d)
> comb
[1] "1"   "2"   "3"   "4"   "5"   "a"   "b"   "cat"
> typeof(comb)
[1] "character"
The number of elements in a vector can be found using the function length() -
> y <- c(10, 99, 12, 19, 100)
> length(y) # Returns the number of elements in y
[1] 5
In R, vector indexes start at 1. Elements are selected by passing the index in [] -
> y <- c(10, 99, 12, 19, 100)
> y[1] # Selects the 1st element in y
[1] 10
> y[4] # Selects the 4th element in y
[1] 19
> y[length(y)] # Selects the last element in y
[1] 100
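The index inside [] does not have to be a single number. The sketch below goes slightly beyond the examples above and is meant only as an illustration: it passes a vector of indexes, a range, and a logical condition to select several elements at once.

```r
y <- c(10, 99, 12, 19, 100)

first_and_third <- y[c(1, 3)]  # a vector of indexes selects several elements
middle <- y[2:4]               # a range of indexes
above_15 <- y[y > 15]          # a logical vector keeps elements where it is TRUE

print(first_and_third)
print(middle)
print(above_15)
```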
Putting - before the indexes excludes those elements -
> y[-1] # Removes the 1st element
[1]  99  12  19 100
> y[-4] # Removes the 4th element
[1]  10  99  12 100
> y[-c(1,4)] # Removes the 1st and 4th elements
[1]  99  12 100
Notice the original vector y is not changed -
> y
[1]  10  99  12  19 100
To change the original vector just assign the new vector to the old one -
> y <- y[-c(1,4)]
> y
[1]  99  12 100
: method
Using : a sequence of integers can be created -
> 1:10 # Sequence of 1 to 10
 [1]  1  2  3  4  5  6  7  8  9 10
> 10:1 # Sequence of 10 to 1
 [1] 10  9  8  7  6  5  4  3  2  1
> -1:-10 # Sequence of -1 to -10
 [1]  -1  -2  -3  -4  -5  -6  -7  -8  -9 -10
The seq() function creates the same kind of sequence -
> seq(from=4, to=10) # Sequence of integers from 4 to 10
[1]  4  5  6  7  8  9 10
Using the argument names is not always necessary -
> seq(4,10)
[1]  4  5  6  7  8  9 10
Passing only an integer will return the integers from 1 to that specific integer -
> seq(7) # Returns integers from 1 to 7
[1] 1 2 3 4 5 6 7
> seq(-5) # Returns integers from 1 to -5
[1]  1  0 -1 -2 -3 -4 -5
by
To get a sequence of integers with an increment of 3 use the by argument -
> seq(1, 10, by = 3)
[1]  1  4  7 10
Another example -
> seq(1, 2, by = 0.1) # Sequence of doubles incremented by 0.1
 [1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
length.out
If we want a specific number of values in an interval then the length.out argument can be used -
> seq(0, 20, length.out = 5) # Returns exactly 5 evenly spaced numbers from 0 to 20
[1]  0  5 10 15 20
> seq(0, 1, length.out = 10) # Returns exactly 10 evenly spaced numbers from 0 to 1
 [1] 0.0000000 0.1111111 0.2222222 0.3333333 0.4444444 0.5555556 0.6666667
 [8] 0.7777778 0.8888889 1.0000000
Argument names can be abbreviated as long as the short form matches only one argument -
> seq(0, 1, len = 10)
 [1] 0.0000000 0.1111111 0.2222222 0.3333333 0.4444444 0.5555556 0.6666667
 [8] 0.7777778 0.8888889 1.0000000
> start_date <- as.Date("2020-01-01")
> end_date <- as.Date("2021-12-31")
> months_seq <- seq(start_date, end_date, by="+1 months")
> format(months_seq, "%d %B %Y")
 [1] "01 January 2020"   "01 February 2020"  "01 March 2020"
 [4] "01 April 2020"     "01 May 2020"       "01 June 2020"
 [7] "01 July 2020"      "01 August 2020"    "01 September 2020"
[10] "01 October 2020"   "01 November 2020"  "01 December 2020"
[13] "01 January 2021"   "01 February 2021"  "01 March 2021"
[16] "01 April 2021"     "01 May 2021"       "01 June 2021"
[19] "01 July 2021"      "01 August 2021"    "01 September 2021"
[22] "01 October 2021"   "01 November 2021"  "01 December 2021"
Visit this link to learn in detail about dates in R.
To get a vector with repeated values use the rep() function -
> rep(2, times = 10) # Returns ten 2s
 [1] 2 2 2 2 2 2 2 2 2 2
When passing a vector with length more than 1, the whole vector will repeat -
> rep(c(1,2, 99), times = 5)
 [1]  1  2 99  1  2 99  1  2 99  1  2 99  1  2 99
Using the argument each, every element is repeated that many times before moving on to the next element -
> rep(c(1, 2, 99), each = 3)
[1]  1  1  1  2  2  2 99 99 99
Using both each and times -
> rep(c(1, 2, 99), each = 3, times = 2)
 [1]  1  1  1  2  2  2 99 99 99  1  1  1  2  2  2 99 99 99
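Two further variations of rep() may be useful; the sketch below uses made-up values. When times is a vector of the same length as the input, each element gets its own repeat count, and length.out recycles the input until the requested length is reached.

```r
# times as a vector: "a" is repeated twice, "b" three times
pattern <- rep(c("a", "b"), times = c(2, 3))

# length.out: keep cycling through the input until 7 values are produced
cycled <- rep(c(1, 2, 99), length.out = 7)

print(pattern)
print(cycled)
```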
To create a sequence of factor type data the gl() function can be used -
> fac_seq <- gl(n = 2, k = 3)
> fac_seq
[1] 1 1 1 2 2 2
Levels: 1 2
> class(fac_seq)
[1] "factor"
We can specify the levels using the argument labels -
> fac_seq <- gl(n = 2, k = 3, labels = c("Control", "Treatment"))
> fac_seq
[1] Control   Control   Control   Treatment Treatment Treatment
Levels: Control Treatment
If we want the levels to be ordered then the ordered argument can be used -
> gl(n = 2, k = 3, labels = c("January", "May"), ordered = TRUE)
[1] January January January May     May     May
Levels: January < May
To know more about factors visit here.
While creating a vector we can specify names for the elements -
> x <- c("1st"=98, "2nd"=93, "3rd"=89, "4th"=88)
> x
1st 2nd 3rd 4th
 98  93  89  88
We can access the names using the names() function -
> names(x)
[1] "1st" "2nd" "3rd" "4th"
So we can replace the names too!
> names(x) <- c("First", "Second", "Third", "Fourth")
> x
 First Second  Third Fourth
    98     93     89     88
To select elements in vectors like these we can use either indexes or names -
> x[1]
First
   98
> x["First"]
First
   98
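Names also work when several elements are needed at once. The following is a minimal sketch with a hypothetical vector called scores, so it does not depend on the x defined above; it selects two elements by name, checks whether a name exists, and uses unname() to get a copy of the values without names.

```r
# Hypothetical named vector for illustration
scores <- c(First = 98, Second = 93, Third = 89, Fourth = 88)

picked <- scores[c("First", "Third")]    # select several elements by name
has_fifth <- "Fifth" %in% names(scores)  # check whether a name is present
plain <- unname(picked)                  # copy of the values with names dropped

print(picked)
print(has_fifth)
print(plain)
```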
To delete the names simply assign NULL to the names of the vector -
> names(x) <- NULL
> x
[1] 98 93 89 88
The results of class(), mode(), storage.mode() and typeof() for different kinds of objects -
> x <- 1L
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "integer" "numeric" "integer" "integer"
> x <- 1
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "numeric" "numeric" "double"  "double"
> x <- letters
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "character" "character" "character" "character"
> x <- TRUE
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "logical" "logical" "logical" "logical"
> x <- cars
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "data.frame" "list"       "list"       "list"
> x <- cars[1]
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "data.frame" "list"       "list"       "list"
> x <- cars[[1]]
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "numeric" "numeric" "double"  "double"
> x <- matrix(cars)
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "matrix" "array"  "list"   "list"   "list"
> x <- new.env()
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "environment" "environment" "environment" "environment"
> x <- expression(1 + 1)
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "expression" "expression" "expression" "expression"
> x <- quote(y <- 1 + 1)
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "<-"       "call"     "language" "language"
> x <- ls
> print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "function" "function" "function" "closure"
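The same comparison can be collected into one table instead of repeating the print() call. This is only a sketch: describe() is a hypothetical helper written for this example, not a built-in function, and it keeps only the first class when an object has more than one.

```r
# Hypothetical helper: returns the four descriptions as a named character vector
describe <- function(x) {
  c(class = class(x)[1],            # first class only, e.g. "matrix" for a matrix
    mode = mode(x),
    storage.mode = storage.mode(x),
    typeof = typeof(x))
}

# A few example objects; the list names become the column names of the result
objects <- list(int = 1L, dbl = 1, chr = "a", lgl = TRUE, fun = ls)

# sapply() applies describe() to each object and binds the results into a matrix
comparison <- sapply(objects, describe)
print(comparison)
```

Each column of comparison describes one object and each row is one of the four functions, which makes differences such as "numeric" versus "double" easier to spot.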