Getting started with R

What you should already know (from Week 0):

  • How to access the RStudio environment through Datahub

What is included in this lecture:

  • Key common skills
    • Assigning values
    • Naming objects
    • Running Code
    • Commenting Code
  • Data types
  • Data Structures/Objects
    • Vectors
    • Matrices
    • Lists
    • Dataframes (briefly)
    • Creating objects using functions
  • Calculations and Comparisons

Key Common Skills

Assigning a value to an object

To assign an object a value (or values), use “<-”: Object <- value(s)

#use "<-" as the operator to assign the value of 5 to the object named "x"
x <- 5
#return the value of x
x
[1] 5
#wrapping an assignment in parentheses assigns the value and returns it in one step
(x <- 5)
[1] 5
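
As a small sketch of what assignment does when an object already exists: assigning again replaces the old value. The object name `count` is hypothetical and used only for this illustration.

```r
# assign a value to a hypothetical object named "count"
count <- 5

# assigning again overwrites the old value; the right-hand side is evaluated first
count <- count + 1

count
# [1] 6
```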

Naming objects

  • Names must start with a letter (or a dot that is not followed by a number); they cannot start with a number or an underscore
  • R is case sensitive!!
  • Best practices:
    • Use lower case
    • Use underscores to separate words in names
#R is case sensitive, so these would be three different object names:
# thisValUE 
# thisvalue  
# THISVALUE

#some users prefer what is called 'camelCase', which uses a capital letter
#to indicate a new word
camelCaseHere <- "camel"
#many packages in R use camelCase for function names

#we recommend using all lowercase and underscores (snake_case) to separate
#words in object names
snake_case_1 <- "snake"

#kebab-case does not work for object names, because R reads "-" as subtraction
# kebab-case-example
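
One way to see case sensitivity in action is to ask R whether an object with a given name exists. This is a minimal sketch; `my_value` is a hypothetical name, and it assumes no object called `My_Value` has been created in your session.

```r
# create an object with an all-lowercase name
my_value <- 10

# exists() takes the name as a character string and returns TRUE or FALSE
lower_found <- exists("my_value")
mixed_found <- exists("My_Value")

lower_found
# [1] TRUE
mixed_found
# [1] FALSE
```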

Running Code

  • To run one line:
    • Click within the line or highlight the line
    • Click “Run” (shortcut: Ctrl + Enter, or Command + Enter on macOS)
  • To run several lines:
    • Highlight the lines
    • Click “Run” (shortcut: Ctrl + Enter, or Command + Enter on macOS)
assign_this <- "this"

assign_that <- "that"
paste0(assign_this, " and ", assign_that)
[1] "this and that"
  • To run directly in console:
    • Enter code directly into the console
    • Press Enter
    • Good option when there is no need to save the code

Commenting Code

Use “#” to comment out code

Shortcut: Ctrl + Shift + C (Command + Shift + C on macOS)

To comment multiple lines at once, highlight them and use the keyboard shortcut.

# This is very simple commenting on a very simple example

assign_this <- "this"

assign_that <- "that"
paste0(assign_this, " and ", assign_that)
[1] "this and that"
assign_other <- ", or the other"
put_all_together <- paste0(assign_this, ", ", assign_that, assign_other)
put_all_together
[1] "this, that, or the other"

Commenting best practices:

  • Add comments to describe purpose of code or explain specific or complicated processes

  • Err on the side of over-commenting

  • While developing code, comment out (rather than delete) sections that you may need to revisit
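
The practices above might look like this in a short script. This is only an illustrative sketch; the object names are hypothetical.

```r
# goal: build a short label from two pieces of text
first_word <- "public"
second_word <- "health"

# paste0() joins text with no separator, so the space is added explicitly
label <- paste0(first_word, " ", second_word)

# earlier attempt, commented out (not deleted) in case it is needed again:
# label <- paste0(first_word, second_word)

label
# [1] "public health"
```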

Five Main Data types in R

Type        Examples
character   “ph”, “ucb”, “ucb ph”
numeric     290, 290.5
integer     290L (the L tells R to store this as an integer)
logical     TRUE (or T), FALSE (or F)
complex     1+4i (complex numbers with real and imaginary parts)

Character values

  • Indicated by quotation marks (“” or ’’); best practice is to use double quotes
  • Can contain spaces, characters, symbols, and numbers
#create a character value using double quotes
ch_double <- "dog"
#try using single quotes - this works, too
ch_single <- 'dog'

ch_double2 <- "turtle"

#the code below will not work, as R is looking for an object named "dog" 
#(since there are no quotations around it)

#ch_no <- dog

#if we create an object named "dog"
dog <- "puppy"

#then re-run the line with no quotes, the value of "ch_no" will take on the
#value of the object "dog" which makes ch_no = "puppy"
ch_no <- dog
ch_no
[1] "puppy"
#character values can be as long as you want
ch_long <- "this is a really really long string that i want to save"
ch_long
[1] "this is a really really long string that i want to save"

Numbers can be stored in three ways.

  1. Numeric - whole numbers or decimals
  2. Integer - whole numbers only, indicated with an “L” suffix
  3. Complex - numbers with real and imaginary parts
#numeric objects can be whole or decimal
num_whole <- 290
num_dec <- 290.9

#integers are whole numbers, indicated by adding an "L"
#(a decimal such as 290.5L gives a warning and is stored as numeric instead)
int <- 290L

#complex
complex <- 2+4i
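
To check how a number is actually stored, class() can be applied to each kind. A minimal sketch with hypothetical object names:

```r
# the same whole number stored two different ways, plus a complex value
num_value <- 290
int_value <- 290L
cplx_value <- 2+4i

class(num_value)
# [1] "numeric"
class(int_value)
# [1] "integer"
class(cplx_value)
# [1] "complex"

# a whole number written without the "L" is not an integer
is.integer(num_value)
# [1] FALSE
```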

Logical: Use all caps - TRUE or T, FALSE or F

#two options for assigning a logical value to an object
logical <- TRUE
logical_1 <- T

logical_2 <- TRUE
#this does not save as a logical value; because of the quotes, it saves the
#string "true" as a character
logical_lower <- "true"
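
The difference between a logical value and a quoted string can be checked with class(). In this sketch the object names are hypothetical; it also shows that R treats TRUE as 1 in arithmetic.

```r
# an unquoted TRUE is a logical value; a quoted one is just text
is_logical <- TRUE
is_text <- "TRUE"

class(is_logical)
# [1] "logical"
class(is_text)
# [1] "character"

# in calculations, TRUE counts as 1 and FALSE counts as 0
one_more <- is_logical + 1
one_more
# [1] 2
```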

Extra details:

  • Dates are stored as numbers: the number of days (for dates) or seconds (for date-times) since January 1, 1970

  • There are several built-in “constants” available in base R (e.g. pi, letters, month.name); today’s date comes from the function Sys.Date()

  • Missing values are stored as NA
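
A brief sketch of these three details, using base R only. The object names are hypothetical, and the date was picked so that the day count is easy to verify by hand.

```r
# a date is stored as the number of days since January 1, 1970
example_date <- as.Date("1970-01-10")
days_since <- as.numeric(example_date)
days_since
# [1] 9

# built-in constants are available without creating them first
first_month <- month.name[1]
first_month
# [1] "January"

# missing values are NA; is.na() finds them
values <- c(1, NA, 3)
is.na(values)
# [1] FALSE  TRUE FALSE

# many functions return NA unless told to remove missing values
mean(values)
# [1] NA
mean(values, na.rm = TRUE)
# [1] 2
```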

Data Structures in R

Data in R are stored in objects, and the main data structures are vectors, matrices, lists, data frames, and factors.

We’ll go into detail on these below, except for factors.

Factors - These are a little complicated, but the idea is that there are fixed categories that display in a fixed order; this is especially useful for modeling or for forcing output to display in a fixed order. These will be covered in detail later in the course.
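
As a preview only, here is a minimal sketch of a factor with a fixed category order. The data and the object name `sizes` are made up for illustration.

```r
# the levels argument fixes both the allowed categories and their order
sizes <- factor(c("medium", "low", "high", "low"),
                levels = c("low", "medium", "high"))

levels(sizes)
# [1] "low"    "medium" "high"

# output follows the level order rather than alphabetical order
size_counts <- table(sizes)
size_counts
# sizes
#    low medium   high 
#      2      1      1 
```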

Vectors

(Atomic) Vectors - a one-dimensional object that can contain only a single data type (e.g. character, numeric, or logical). Even a single value is stored as a vector in R (a vector of length 1).

  • Can be created in multiple ways:
    • c() function
    • Using : operator for a vector of consecutive numbers
#create a numeric vector using the c() function
vec_num <- c(1,5,6,94)
vec_num
[1]  1  5  6 94
#create a numeric vector using the : operator
vec_num2 <- 1:10
vec_num2
 [1]  1  2  3  4  5  6  7  8  9 10
#create a character vector using the c() function
vec_char <- c("dog","cat","mouse")

#important note - a single value in R is stored as a vector of length 1

vec_one <- 100
class(vec_one)
[1] "numeric"
#try creating a vector with multiple data types - this will force 290 to be
#stored as "290"
vec_multi <- c(290,"ph")
str(vec_multi)
 chr [1:2] "290" "ph"
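
When types are mixed, R converts everything in the vector to the most flexible type present: logical becomes integer, integer becomes double (decimal numeric), and anything combined with text becomes character. A short sketch with hypothetical object names:

```r
# logical + integer -> integer (TRUE becomes 1)
mix_1 <- c(TRUE, 2L)
typeof(mix_1)
# [1] "integer"

# integer + decimal -> double
mix_2 <- c(2L, 3.5)
typeof(mix_2)
# [1] "double"

# number + text -> character
mix_3 <- c(3.5, "ph")
typeof(mix_3)
# [1] "character"
```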

Matrices

  • Two-dimensional (rows and columns), single data type
  • Created using matrix() function
#create a matrix using the : operator to define the data included
matrix_1 <- matrix(data = 1:12, nrow = 3, ncol = 4, byrow = TRUE, dimnames = NULL)
matrix_1
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
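
For comparison, leaving out byrow (its default is FALSE) fills the matrix column by column instead. This sketch uses a hypothetical object name and also shows dim() and indexing by [row, column].

```r
# same data as above, but filled down the columns (the default)
by_col <- matrix(data = 1:12, nrow = 3, ncol = 4)

# first row: one value from each column
by_col[1, ]
# [1]  1  4  7 10

# number of rows, then number of columns
dim(by_col)
# [1] 3 4

# a single cell: row 2, column 3
by_col[2, 3]
# [1] 8
```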

Lists

  • One-dimensional, multiple data types and/or objects
  • Created using list() function
#use the list function to see what happens if you add items of different types
my_list <- list(290, "290", "ph")
my_list
[[1]]
[1] 290

[[2]]
[1] "290"

[[3]]
[1] "ph"
str(my_list)
List of 3
 $ : num 290
 $ : chr "290"
 $ : chr "ph"
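
List elements can also be given names, which makes them easier to pull out later. A minimal sketch; the element names here are hypothetical.

```r
# each element keeps its own type, and one element can itself be a vector
named_list <- list(course = "ph", number = 290, tags = c("r", "intro"))

# [[ ]] or $ returns the element itself
named_list[["number"]]
# [1] 290
named_list$tags
# [1] "r"     "intro"

# single brackets return a smaller list instead
class(named_list["number"])
# [1] "list"

length(named_list)
# [1] 3
```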

More on data frames in an upcoming week!
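
Until then, here is a very brief sketch of what a data frame looks like: a table in which each column is a vector of one type and all columns have the same length. The column names and values are made up for illustration.

```r
# build a small data frame from two vectors of equal length
pets <- data.frame(
  animal = c("dog", "cat", "mouse"),
  weight_kg = c(20.5, 4.2, 0.03),
  stringsAsFactors = FALSE  # keep text as character in every R version
)

class(pets)
# [1] "data.frame"

# rows, then columns
dim(pets)
# [1] 3 2

# a single column comes back as a vector
pets$animal
# [1] "dog"   "cat"   "mouse"
```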

Using functions to describe objects

A few examples:

  • length() - how long is the object?
  • class() - what type of object is it?
  • typeof() - what data type is the object?
#return information about matrix and vectors created above
length(matrix_1) 
[1] 12
class(matrix_1)
[1] "matrix" "array" 
typeof(matrix_1)
[1] "integer"
length(vec_num2)
[1] 10
typeof(vec_num2)
[1] "integer"
length(vec_char)
[1] 3
typeof(vec_char)
[1] "character"
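
Two details that these functions can reveal, sketched with hypothetical object names: class() and typeof() do not always give the same answer, and length() counts elements, not letters.

```r
# a decimal number has class "numeric" but is stored as type "double"
decimal_value <- 290.5
class(decimal_value)
# [1] "numeric"
typeof(decimal_value)
# [1] "double"

# a single string is a vector of length 1, no matter how long the text is
phrase <- "public health"
length(phrase)
# [1] 1

# nchar() counts the characters (including the space)
nchar(phrase)
# [1] 13
```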

Calculations and comparisons

Calculations

R can be used as a powerful calculator, both in the console and in a script.

Calculations can be made on numbers and objects.

#calculations can be performed on numbers
54*38743252349
[1] 2.092136e+12
#and objects
a <- 4
b <- 75

b/a
[1] 18.75
#calculations can be performed on vectors
vec_num2*10
 [1]  10  20  30  40  50  60  70  80  90 100

New objects can be created as the result of calculations.

matrix_2 <- matrix_1 * 5
matrix_2
     [,1] [,2] [,3] [,4]
[1,]    5   10   15   20
[2,]   25   30   35   40
[3,]   45   50   55   60
c <- b-a
c
[1] 71
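
Beyond +, -, * and /, base R has a few more arithmetic operators, and calculations between two vectors of the same length are done element by element. A short sketch with made-up values and hypothetical object names:

```r
# exponent, remainder, and whole-number division
power <- 2^3
remainder <- 7 %% 3
whole_part <- 7 %/% 3
power
# [1] 8
remainder
# [1] 1
whole_part
# [1] 2

# two vectors of equal length are combined position by position
doses <- c(1, 2, 3)
weights <- c(10, 20, 30)
doses * weights
# [1] 10 40 90
```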

Using functions for calculations

In addition to the operators above, there are many functions that can be used to do calculations.

#example: absolute value
abs(-90)
[1] 90
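
A few other base R functions that are commonly used for calculations, shown as a brief sketch (the object names are hypothetical):

```r
# square root
root <- sqrt(16)
root
# [1] 4

# round to a chosen number of decimal places
rounded <- round(3.14159, digits = 2)
rounded
# [1] 3.14

# sum and mean work on a whole vector at once
total <- sum(1:10)
total
# [1] 55
average <- mean(c(2, 4, 6))
average
# [1] 4
```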

Comparisons

Two values or objects can be compared to assess whether they are equal (==), unequal (!=), less than (<), less than or equal to (<=), greater than (>), or greater than or equal to (>=). Each comparison returns TRUE or FALSE.

5==40
[1] FALSE
"dog"=="cat"
[1] FALSE
a!=b
[1] TRUE
b>c
[1] TRUE
d <- (a+b)/c
d
[1] 1.112676
d2 <- (a+b)>c
d2
[1] TRUE
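
Comparisons also work on whole vectors, returning one TRUE or FALSE per element. The sketch below uses made-up values and hypothetical object names; it also shows %in% for checking membership and a common surprise when comparing decimal numbers with ==.

```r
# each element is compared to 4
scores <- c(3, 8, 5)
above_4 <- scores > 4
above_4
# [1] FALSE  TRUE  TRUE

# because TRUE counts as 1, sum() counts how many comparisons were TRUE
sum(above_4)
# [1] 2

# %in% checks whether a value appears in a vector
"dog" %in% c("dog", "cat")
# [1] TRUE

# decimals are stored approximately, so == can be FALSE unexpectedly
0.1 + 0.2 == 0.3
# [1] FALSE

# all.equal() compares with a small tolerance instead
nearly_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))
nearly_equal
# [1] TRUE
```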