When programming in any language, you use variables to store information. A variable is a named, reserved memory location that holds a value, so creating a variable reserves some space in memory.
You may want to store data of various types, such as character, wide character, integer, floating point, double floating point or Boolean. Based on the data type of a variable, the operating system allocates memory and decides what can be stored in the reserved memory.
In contrast to programming languages such as C and Java, variables in R are not declared with a data type. Instead, a variable is assigned an R-object, and the data type of that R-object becomes the data type of the variable. There are many types of R-objects. The most frequently used ones are −
1.Vectors
2.Lists
3.Matrices
4.Arrays
5.Factors
6.Data Frames
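As a minimal sketch of how a variable takes its type from the object assigned to it, the snippet below reassigns one variable several times and inspects it with the base R function class(). The names x, type_int, type_chr and type_lgl are chosen only for illustration.

```r
# The same variable can hold objects of different types over time.
x <- 42L                 # an integer vector of length 1
type_int <- class(x)
print(type_int)
## [1] "integer"
x <- "forty-two"         # now a character vector
type_chr <- class(x)
print(type_chr)
## [1] "character"
x <- c(TRUE, FALSE)      # now a logical vector
type_lgl <- class(x)
print(type_lgl)
## [1] "logical"
```

No declaration is needed at any point; each assignment simply replaces the object that x refers to.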
Any value written within a pair of single quotes or double quotes in R is treated as a string. Internally, R stores every string within double quotes, even when you create it with single quotes.
1.The quotes at the beginning and end of a string must both be double quotes or both be single quotes. They cannot be mixed.
2.Double quotes can be inserted into a string that starts and ends with single quotes.
3.Single quotes can be inserted into a string that starts and ends with double quotes.
4.Double quotes cannot be inserted into a string that starts and ends with double quotes, unless each one is escaped with a backslash.
5.Single quotes cannot be inserted into a string that starts and ends with single quotes, unless each one is escaped with a backslash.
The following examples clarify the rules for creating a string in R.
a <- 'Start and end with single quote'
print(a)
## [1] "Start and end with single quote"
b <- "Start and end with double quotes"
print(b)
## [1] "Start and end with double quotes"
c <- "single quote ' in between double quotes"
print(c)
## [1] "single quote ' in between double quotes"
d <- 'Double quotes " in between single quote'
print(d)
## [1] "Double quotes \" in between single quote"
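# The remaining examples break the rules above. They are commented out
# because each one is a syntax error.
# e <- 'Mixed quotes"
# print(e)
# f <- 'Single quote ' inside single quote'
# print(f)
# g <- "Double quotes " inside double quotes"
# print(g)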
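Rules 4 and 5 concern unescaped quotes. A quote that matches the string's delimiters can still be included by escaping it with a backslash, as this short sketch shows. The variable names h and i simply continue the sequence used above.

```r
# Escape a quote that matches the string's delimiters with a backslash.
h <- "Double quotes \" inside double quotes"
print(h)
## [1] "Double quotes \" inside double quotes"
i <- 'Single quote \' inside single quote'
print(i)
## [1] "Single quote ' inside single quote"
# print() shows the escape; cat() writes the raw characters instead.
cat(h, "\n", sep = "")
## Double quotes " inside double quotes
```

The backslash is part of the source notation only. It is not stored in the string, which is why cat() displays a bare double quote.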
Concatenating Strings - paste() function
Strings in R are combined using the paste() function, which accepts any number of arguments.
The basic syntax for the paste() function is −
# paste(..., sep = " ", collapse = NULL)
Following is the description of the parameters used −
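1.The ... argument represents any number of arguments to be combined.
2.sep is the separator placed between the arguments. It is optional and defaults to a single space.
3.collapse is an optional string used to join the elements of the resulting vector into a single string. It does not affect the spaces within an individual string.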
a <- "Hello"
b <- 'How'
c <- "are you? "
print(paste(a,b,c))
## [1] "Hello How are you? "
print(paste(a,b,c, sep = "-"))
## [1] "Hello-How-are you? "
print(paste(a,b,c, sep = "", collapse = ""))
## [1] "HelloHoware you? "
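The difference between sep and collapse is easier to see when paste() receives a vector with more than one element. The following is a small sketch; the names ids and joined are illustrative only.

```r
# sep joins the arguments element by element; the result is still a vector.
ids <- paste("item", 1:3, sep = "_")
print(ids)
## [1] "item_1" "item_2" "item_3"
# collapse then joins the elements of that vector into one string.
joined <- paste("item", 1:3, sep = "_", collapse = ", ")
print(joined)
## [1] "item_1, item_2, item_3"
# paste0() is shorthand for paste(..., sep = "").
print(paste0("item", 1:3))
## [1] "item1" "item2" "item3"
```

In the earlier example every argument has length one, so the result is already a single string and collapse has no visible effect.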
Numbers and strings can be formatted to a specific style using the format() function.
The basic syntax for the format() function is −
# format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none"))
Following is the description of the parameters used −
1.x is the vector input.
2.digits is the number of significant digits displayed.
3.nsmall is the minimum number of digits to the right of the decimal point.
4.scientific is set to TRUE to display scientific notation.
5.width is the minimum width of the result. Shorter values are padded with blanks; numbers are padded at the beginning.
6.justify sets the alignment of strings to left, right or centre.
# Total number of digits displayed. Last digit rounded off.
result <- format(23.123456789, digits = 9)
print(result)
## [1] "23.1234568"
# Display numbers in scientific notation.
result <- format(c(6, 13.14521), scientific = TRUE)
print(result)
## [1] "6.000000e+00" "1.314521e+01"
# The minimum number of digits to the right of the decimal point.
result <- format(23.47, nsmall = 5)
print(result)
## [1] "23.47000"
# Format treats everything as a string.
result <- format(6)
print(result)
## [1] "6"
# Numbers are padded with blanks at the beginning to reach the width.
result <- format(13.7, width = 6)
print(result)
## [1] "  13.7"
# Left justify strings.
result <- format("Hello", width = 8, justify = "l")
print(result)
## [1] "Hello   "
# Centre justify strings.
result <- format("Hello", width = 8, justify = "c")
print(result)
## [1] " Hello  "
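When format() is given a vector, it formats all elements to a common width, which is useful for lining up a column of numbers. The sketch below assumes an illustrative vector named prices.

```r
# All elements are formatted to a common width and number of decimals.
prices <- c(1.5, 10.25, 100)
formatted <- format(prices, nsmall = 2)
print(formatted)
## [1] "  1.50" " 10.25" "100.00"
# Every element has the same number of characters.
print(nchar(formatted))
## [1] 6 6 6
```

Because the result is a character vector, it is meant for display and can no longer be used in arithmetic without converting it back.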
The nchar() function counts the number of characters, including spaces, in a string.
The basic syntax for nchar() function is −
# nchar(x)
Following is the description of the parameters used −
1.x is the vector input.
result <- nchar("Count the number of characters")
print(result)
## [1] 30
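Since x is a vector, nchar() returns one count per element. A brief sketch follows; the names counts and digit_count are illustrative.

```r
# nchar() is vectorised: one count per element, and "" has length 0.
counts <- nchar(c("R", "strings", ""))
print(counts)
## [1] 1 7 0
# A number is converted to a string before it is counted.
digit_count <- nchar(12345)
print(digit_count)
## [1] 5
```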
The toupper() and tolower() functions change the case of the characters in a string.
The basic syntax for the toupper() and tolower() functions is −
# toupper(x)
# tolower(x)
Following is the description of the parameters used −
1.x is the vector input.
# Changing to Upper case.
result <- toupper("Changing To Upper")
print(result)
## [1] "CHANGING TO UPPER"
# Changing to lower case.
result <- tolower("Changing To Lower")
print(result)
## [1] "changing to lower"
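One common use of these functions is to compare strings without regard to case. The following is a minimal sketch, assuming an illustrative vector named answers.

```r
# Normalise the case before comparing, so "Yes", "YES" and "yes" all match.
answers <- c("Yes", "YES", "yes")
normalised <- tolower(answers)
print(normalised)
## [1] "yes" "yes" "yes"
print(all(normalised == "yes"))
## [1] TRUE
```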
The substring() function extracts parts of a string.
The basic syntax for substring() function is −
# substring(x,first,last)
Following is the description of the parameters used −
1.x is the character vector input.
2.first is the position of the first character to be extracted.
3.last is the position of the last character to be extracted.
# Extract characters from 5th to 7th position.
result <- substring("Extract", 5, 7)
print(result)
## [1] "act"
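The first and last arguments can also be vectors, and last may be omitted to extract up to the end of the string. The sketch below shows both and combines substring() with toupper() and paste0(). The helper capitalise() is hypothetical and is defined here only for illustration.

```r
# Vector positions give one piece per (first, last) pair.
pieces <- substring("Extract", 1:3, 3:5)
print(pieces)
## [1] "Ext" "xtr" "tra"
# capitalise() is a hypothetical helper: upper-case the first character
# and append the rest of the string unchanged.
capitalise <- function(x) {
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}
result <- capitalise(c("apple", "mango"))
print(result)
## [1] "Apple" "Mango"
```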