A few basic keyboard shortcuts to make life simpler in R (these work in RStudio):
# To insert the "<-" symbol use Alt + "-" (Alt + minus)
# To insert the pipe operator %>% use Ctrl + Shift + M
# To insert a new code chunk in R Markdown use Ctrl + Alt + I
R can perform basic math calculations:
2+2
## [1] 4
1/6
## [1] 0.1666667
2^5
## [1] 32
To assign a value to a variable use <- (keyboard shortcut: Alt + “-”).
Your variable will be visible in the Environment tab (upper right in RStudio) and can be recalled any time you need it.
a <- 2
b <- "Apples"
# You can perform math on numeric variables:
a*2
## [1] 4
A vector is the most basic data structure in R. In the previous example, “a” is a vector of length 1. We use the c() (combine) function to create a vector.
vector1 <- c(1,2,4,-7,9,-12,6,-3, 2, 5, 16)
# a numeric vector
vector2 <- c("apples", "oranges", "pears")
# a character vector
vector3 <- c(1,4,"apples", 5)
# Vectors can only contain one data type, so here R will coerce (convert) the numbers to strings to create a vector of only one data type.
vector1
## [1] 1 2 4 -7 9 -12 6 -3 2 5 16
vector2
## [1] "apples" "oranges" "pears"
vector3
## [1] "1" "4" "apples" "5"
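To see the coercion for yourself, a quick check with class() can help. The sketch below re-creates the vectors so it runs on its own; the conversion back with as.numeric() and the name vector3.num are additions for illustration, not part of the example above.

```r
# Re-create the vectors so this block runs on its own
vector1 <- c(1, 2, 4, -7, 9, -12, 6, -3, 2, 5, 16)
vector3 <- c(1, 4, "apples", 5)

class(vector1)  # "numeric"
class(vector3)  # "character": the numbers were coerced to strings

# Converting back to numbers: entries that are not numbers become NA.
# R also issues a warning, which is silenced here with suppressWarnings().
vector3.num <- suppressWarnings(as.numeric(vector3))
vector3.num     # 1 4 NA 5
```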
Using square brackets [] we can access items from a vector based on their position. Unlike many other languages, R starts counting at position 1 (not position 0).
vector1[2]
## [1] 2
vector1[2:4]
## [1] 2 4 -7
vector2[1]
## [1] "apples"
# If you choose a position outside the length of the vector you will get "NA" as a response
vector3[10]
## [1] NA
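Because positions start at 1, the last item sits at position length(vector). A minimal sketch, with variable names (n, first.item, last.item, beyond) chosen only for this illustration:

```r
vector1 <- c(1, 2, 4, -7, 9, -12, 6, -3, 2, 5, 16)

n <- length(vector1)      # 11 items
first.item <- vector1[1]  # 1, the first item is at position 1
last.item <- vector1[n]   # 16, the last item is at position length(vector1)
beyond <- vector1[n + 1]  # NA, one past the end
```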
# We can modify vector contents by using subsetting:
vector2[3] <- "bananas"
vector2
## [1] "apples" "oranges" "bananas"
# Or by using the ifelse function (this returns a copy of vector1 with all negative values replaced by 0; vector1 itself is not changed)
ifelse(vector1 < 0, 0, vector1)
## [1] 1 2 4 0 9 0 6 0 2 5 16
# We can select vector items using logical operators
vector1[vector1<0]
## [1] -7 -12 -3
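Note that ifelse() only returns a new vector; to keep the result it has to be assigned to a name. The sketch below (the names vector1.clipped and vector1.copy are made up for this illustration) saves the result and shows that combining logical selection with assignment gives the same values.

```r
vector1 <- c(1, 2, 4, -7, 9, -12, 6, -3, 2, 5, 16)

# Save the result of ifelse() under a new name
vector1.clipped <- ifelse(vector1 < 0, 0, vector1)
vector1          # still contains the negative values
vector1.clipped  # 1 2 4 0 9 0 6 0 2 5 16

# Same idea with logical selection plus assignment, done on a copy
vector1.copy <- vector1
vector1.copy[vector1.copy < 0] <- 0
all(vector1.clipped == vector1.copy)  # TRUE
```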
# Performing math on a vector
vector1 + 4
## [1] 5 6 8 -3 13 -8 10 1 6 9 20
mean(vector1)
## [1] 2.090909
median(vector1)
## [1] 2
sum(vector1)
## [1] 23
vector1 > 0
## [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE
# returns logical TRUE/FALSE values
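Logical vectors are useful beyond selection: R treats TRUE as 1 and FALSE as 0 in arithmetic, so sum() counts the TRUE values. A small sketch (is.positive is a name introduced here for illustration):

```r
vector1 <- c(1, 2, 4, -7, 9, -12, 6, -3, 2, 5, 16)

is.positive <- vector1 > 0
sum(is.positive)   # 8, the number of positive values
mean(is.positive)  # about 0.73, the proportion of positive values
any(vector1 > 10)  # TRUE, at least one value is above 10
all(vector1 > 0)   # FALSE, not every value is positive
```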
You can create your own functions in R using the following structure:
function.name <- function(argument1, argument2, …)
{
some functionality
}
# example #1
add.numbers <- function(num1, num2)
{
  new_num <- num1 + num2
  new_num
}
add.numbers(3,6)
## [1] 9
# example #2
is.even <- function(a.number)
{
  remainder <- a.number %% 2
  if (remainder == 0) {
    return(TRUE)
  }
  return(FALSE)
}
# testing
is.even(10)
## [1] TRUE
is.even(7)
## [1] FALSE
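The is.even function above is written for a single number, because if expects a single TRUE or FALSE (recent versions of R raise an error when the condition has more than one element). A possible variant that works on a whole vector at once is sketched below; the name is.even.vec is hypothetical and not part of the original examples.

```r
# Returns the comparison directly, so it works element by element
is.even.vec <- function(numbers)
{
  numbers %% 2 == 0
}

is.even.vec(10)           # TRUE
is.even.vec(c(10, 7, 4))  # TRUE FALSE TRUE
```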
A matrix is a rectangular array of values arranged in rows and columns. The values can be of any type, but, as with vectors, all values in one matrix must be of the same type.
# This uses the matrix function to create a matrix with all values in a single column
a.matrix <- matrix(c(1,2,3,4,5,6))
a.matrix
## [,1]
## [1,] 1
## [2,] 2
## [3,] 3
## [4,] 4
## [5,] 5
## [6,] 6
# This uses the matrix function to create a matrix with values in two columns
a.matrix <- matrix(c(1,2,3,4,5,6), ncol=2)
a.matrix
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
# we could also use the cbind function to combine two vectors to achieve the same result
a2.matrix <- cbind(c(1,2,3), c(4,5,6))
a2.matrix
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
# Use the rbind function to combine the vectors as rows instead; the result is the transpose (rows and columns switched) of the cbind result
a3.matrix <- rbind(c(1,2,3),c(4,5,6))
a3.matrix
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
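The relationship between the two results can be checked with dim() and the transpose function t(). This is a minimal sketch; the names by.columns and by.rows are introduced only for this illustration.

```r
by.columns <- cbind(c(1, 2, 3), c(4, 5, 6))  # each vector becomes a column
by.rows <- rbind(c(1, 2, 3), c(4, 5, 6))     # each vector becomes a row

dim(by.columns)  # 3 2 (rows, then columns)
dim(by.rows)     # 2 3

# Transposing one gives the other
all(t(by.columns) == by.rows)  # TRUE
```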
To perform math on a matrix use functions such as rowSums(), colSums(), rowMeans(), and colMeans()
a2.matrix
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
colSums(a2.matrix)
## [1] 6 15
rowMeans(a2.matrix)
## [1] 2.5 3.5 4.5
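For completeness, here is a sketch of the two functions not shown above, plus apply(), which is one general way to run another function over rows (1) or columns (2). The use of apply() with max is an added illustration, not part of the original example.

```r
a2.matrix <- cbind(c(1, 2, 3), c(4, 5, 6))

rowSums(a2.matrix)   # 5 7 9
colMeans(a2.matrix)  # 2 5

# apply(matrix, 1, f) runs f on every row; use 2 for columns
row.max <- apply(a2.matrix, 1, max)  # 4 5 6
col.max <- apply(a2.matrix, 2, max)  # 3 6
```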
To index an element of a matrix, give the row number and the column number (in that order) inside square brackets:
a2.matrix[2,1]
## [1] 2
# if you leave one of the spaces empty R will assume you want the whole row or column
a2.matrix[,2]
## [1] 4 5 6
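A few more indexing patterns follow the same row-then-column rule. The sketch below re-creates a2.matrix so it runs on its own; the drop = FALSE argument is an addition here, shown because selecting a single column otherwise returns a plain vector rather than a matrix.

```r
a2.matrix <- cbind(c(1, 2, 3), c(4, 5, 6))

second.row <- a2.matrix[2, ]   # 2 5, the whole second row
last.two <- a2.matrix[2:3, 2]  # 5 6, rows 2 to 3 of column 2

# A single column comes back as a vector unless drop = FALSE is given
col.as.vector <- a2.matrix[, 2]
col.as.matrix <- a2.matrix[, 2, drop = FALSE]
dim(col.as.vector)  # NULL, a plain vector has no dimensions
dim(col.as.matrix)  # 3 1
```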
# data1 <- read.table("filename.txt", sep = ",", header=TRUE)
# data2 <- read.csv("filename.csv")
# If you get an error message that your file cannot be found, specify the full file path between the quotation marks, or choose the file interactively like this:
# data1 <- read.table(file.choose(), sep = ",", header=TRUE)
# We can see the names R has assigned to the columns of the data frame using the names() function, e.g. names(data1)
# We can get the basic structure of the data frame using the str() function, e.g. str(data1)
Example:
Create a text file with the following contents:
flavour,number
pistachio,6
vanilla,12
chocolate,8
strawberry,14
mint chocolate chip,3
rocky road,9
caramel fudge swirl,5
flav <- read.table(file.choose(), sep=",", header=TRUE)
# or you could use read.csv
# flav <- read.csv(file.choose())
flav
## flavour number
## 1 pistachio 6
## 2 vanilla 12
## 3 chocolate 8
## 4 strawberry 14
## 5 mint chocolate chip 3
## 6 rocky road 9
## 7 caramel fudge swirl 5
class(flav$flavour)
## [1] "character"
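If you would rather not create the file by hand, the example can be reproduced in one go. The sketch below writes the same lines to a temporary file (flavour.file is a name made up for this illustration) and reads it back with an explicit path instead of file.choose(), so it runs without any interaction.

```r
# Write the example data to a temporary text file
flavour.file <- tempfile(fileext = ".txt")
writeLines(c("flavour,number",
             "pistachio,6",
             "vanilla,12",
             "chocolate,8",
             "strawberry,14",
             "mint chocolate chip,3",
             "rocky road,9",
             "caramel fudge swirl,5"),
           flavour.file)

# Read it back and inspect the data frame
flav <- read.table(flavour.file, sep = ",", header = TRUE)
names(flav)       # "flavour" "number"
str(flav)         # structure: 7 observations of 2 variables
nrow(flav)        # 7
sum(flav$number)  # 57
```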
Why does R sometimes label my string variable as “Factor” and not “Character”?
In versions of R before 4.0.0, the read.* commands by default converted any column containing character strings to a factor when reading data from disk. Since R 4.0.0 the default is stringsAsFactors = FALSE, which is why flav$flavour above is reported as "character". On older versions we can prevent the conversion manually by adding the stringsAsFactors optional argument to the read.* commands:
# data3 <- read.csv("filename.csv", stringsAsFactors = FALSE)
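The effect of the argument can be seen without a file by passing the data through the text argument of read.csv(). This is a small sketch using three rows of the flavour data; the names csv.text, as.char and as.fact are hypothetical.

```r
csv.text <- "flavour,number\npistachio,6\nvanilla,12\nchocolate,8"

# Same data, read once with strings kept as characters and once as factors
as.char <- read.csv(text = csv.text, stringsAsFactors = FALSE)
as.fact <- read.csv(text = csv.text, stringsAsFactors = TRUE)

class(as.char$flavour)   # "character"
class(as.fact$flavour)   # "factor"
levels(as.fact$flavour)  # "chocolate" "pistachio" "vanilla"
```

Setting the argument explicitly, whichever value you want, keeps the result the same across R versions.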