Basics of R

Created for STAT 355: Principles of Data Mining at the University of Southern Indiana

Author

Dr. Heather Cook

Published

August 22, 2024

Getting Started

Downloading R and RStudio

R is the program that runs behind the scenes, and RStudio is the interface you use to work with R. RStudio allows you to view your R script (code), environment (saved objects), console (errors, warnings, and printed output), and plots (or help, packages, files, …). Both are free to download:

  1. First download R: https://cran.r-project.org/

    • Select the download for your operating system (Linux, macOS, or Windows).
  2. Second download RStudio: https://posit.co/download/rstudio-desktop/

    • If you have a Mac, scroll down to find the download for Macs.

After successful installation, to use R, you just have to open RStudio. Here is a view of RStudio and the panels with some basic code that has been run:

Set Your Working Directory

To save your files in, and read data from, a specific folder, you need to set your working directory. Having a separate folder for each project/assignment will help keep your work organized and avoid overwriting previous files.

To set your working directory, you may:

  1. At the top of RStudio, click on Session, Set Working Directory, Choose Directory.

    • Then you may navigate to and select the folder you wish to work out of and click Open.
  2. Write and run the command below specifying the folder path:

setwd("C:/Users/hlcook1/Documents/R Example")

Note: If you use the first option (the menu at the top of RStudio), the code will be automatically populated in the console.

Once your working directory is set, R will default to this folder when saving files, graphs, or data and when importing data.
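
To confirm which folder R is currently using, the base function getwd reports the working directory. Here is a minimal sketch; it uses a temporary folder (tempdir()) as a stand-in for your own project folder so that it runs on any computer:

```r
#remember the current working directory so it can be restored
old_wd <- getwd()
#tempdir() is only a stand-in for your own project folder
setwd(tempdir())
#confirm that the working directory changed
getwd()
#list the files in the working directory
list.files()
#switch back to the original folder
setwd(old_wd)
```

Running getwd() right after setwd() is a quick way to catch a mistyped folder path.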

Writing Basic Code

Using R as a Calculator

R can be a calculator! You may enter calculations either in the R Script panel and run them (see below) or type them directly into the console. If you type in the console, you can execute/run the code by pressing ENTER.

Here are some basic operators built into R:

#addition
2+3
[1] 5
#multiplication
2*3
[1] 6
#division
2/3
[1] 0.6666667
#raising to a power
2^3
[1] 8

Careful! R knows Order of Operations and will follow exactly what you code.

If we wanted to calculate 5 plus 5 then divide by 5, we would code the following:

(5+5)/5
[1] 2

If we wanted to calculate 5 divided by 5 then plus 5, we would code:

5+5/5
[1] 6

See the difference?
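
The same care applies to exponents and negative signs. The short sketch below (object names are chosen only for illustration) shows that R evaluates a power before a leading minus sign or a multiplication unless parentheses say otherwise:

```r
#the power is evaluated before the negative sign
neg_sq <- -2^2
neg_sq     #-4
#parentheses apply the negative sign first
sq_neg <- (-2)^2
sq_neg     #4
#the power is evaluated before the multiplication
pow_first <- 2*3^2
pow_first  #18
```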

Syntax

R is much more than a calculator, though. You can store objects in R, but R is case sensitive, so we must be careful when naming or assigning objects.

To assign/name an object you may use <- or =. However, we must be careful how we use these within functions (more later). Best practice is to use <- to name/assign objects and = inside functions for arguments.

#assign object a value of 1
object<-1
#assign OBJECT a value of 2
OBJECT<-2
#check if object is equal to OBJECT
object==OBJECT
[1] FALSE
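
To illustrate why <- and = are kept separate inside functions, here is a small sketch using the base function mean (the object names are illustrative only). Inside a function call, = only names an argument, while <- also creates an object as a side effect:

```r
#= inside a function only tells mean which argument the values belong to
arg_result <- mean(x = c(2, 4, 6))
arg_result     #4
#<- inside a function also creates the object vals in your environment
assign_result <- mean(vals <- c(2, 4, 6))
assign_result  #4
vals           #2 4 6
```

Both calls return the same mean, but only the second one leaves an extra object behind, which is easy to do by accident.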

Another rule is that an object name cannot start with a number and cannot contain certain special characters such as ^, !, $, @, +, -, /, or *. Spaces within object names will not work; however, you may use periods or underscores within a name.

| Good Examples | Bad Examples |
|---|---|
| x | 2x |
| x2 | !x |
| x_sqr | $x |
| x.sqr | x/2 |
| | x^2 |
| | x sqr |

If you need to enter something with spaces, you can use quotes. R treats single and double quotes the same; however, the opening and closing quotes must match.

#print a sentence with spaces using single quotes
print('Hello, how are you?')
[1] "Hello, how are you?"
#print a sentence with spaces using double quotes
print("Hello, how are you?")
[1] "Hello, how are you?"

While both of the above work, we cannot mix the quotes: print("Hello, how are you?') will lead R to think something is missing and wait for more input.

R will also overwrite objects if you give them the same name.

#assign the object num the value of 3
num<-3
num
[1] 3
#assign the object num the value of 5 (overwrites previous value)
num<-5
num
[1] 5

You can see which object names you have already used with the ls function or by looking at the Environment panel of RStudio.

#check the list of objects
ls()
[1] "num"    "object" "OBJECT"

If you wish to remove an object from your environment, you can use the rm function.

#remove the num object
rm(num)
#check the list of objects
ls()
[1] "object" "OBJECT"

We can see the num object has been removed.

RStudio is very helpful here: it color codes your code, shows popup info when you are writing a function, and marks lines with errors with a red marker on the left.

Object Types

Now that you’ve seen you can store objects, what types of objects can you have? Well, there are many types of objects to fit your needs!

Vector

A vector in R can be a single value or a group of values, but all the values in a vector should be of a single type.

#creating a vector of a sequence repeated twice
vec<-rep(1:3,2)
vec
[1] 1 2 3 1 2 3
#letters is a built in vector of the alphabet
letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
#Create a vector of different types
c(1:3,"a","b","c")
[1] "1" "2" "3" "a" "b" "c"

Notice that R converted the last vector into one type, a character vector (you can tell this based on the quotes around the values in the output).
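
The conversion follows a fixed order: logical values become numeric when mixed with numbers, and everything becomes character when any text is present. A minimal sketch (the object names are illustrative) using the class function to check the result:

```r
#logical and numeric values combine into a numeric vector (TRUE becomes 1)
mix1 <- c(TRUE, 2, 3.5)
class(mix1)
#numeric and character values combine into a character vector
mix2 <- c(1.5, "a")
class(mix2)
#logical, numeric, and character values together also become character
mix3 <- c(TRUE, 2, "b")
mix3
```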

Numeric Vector

Numeric vectors have numbers as the values of the inputs, but can be of any type such as decimals or integers.

#Numeric vector
num_vec<-seq(1,3,by=0.25)
num_vec
[1] 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00

Integer Vector

Integer vectors have numbers as the values of the inputs but are specifically integers.

#Integer vector
int_vec<-1:3
int_vec
[1] 1 2 3

Character Vector

Character vectors have characters/text as their values. The values may be letters, words, sentences, or numbers stored as text.

#Character vector examples
chr_vec1<-c("a","t","z")
chr_vec1
[1] "a" "t" "z"
chr_vec2<-c("Hello,","how","are","you","?") 
chr_vec2
[1] "Hello," "how"    "are"    "you"    "?"     
chr_vec3<-c("Hello, how are you?","Fine, thank you.") 
chr_vec3
[1] "Hello, how are you?" "Fine, thank you."   
chr_vec4<-c("1","2","3")
chr_vec4
[1] "1" "2" "3"

Again, R is case sensitive. So "dog" and "Dog" will be counted as separate entries. If you want to have them be the same case, you can use the function tolower.

#create a character vector with variations of capitalization  
dogs<-c("dog","Dog","DOG") 
#show that R treats them differently
table(dogs) 
dogs
dog Dog DOG 
  1   1   1 
#change to all lower case
dogs_new<-tolower(dogs)
#show that R now treats them the same
table(dogs_new)
dogs_new
dog 
  3 

Factor Vector

A factor vector is for categorical variables where we can control labels for the levels of the factor. Typically we just refer to these as a factor (dropping the vector term).

#Create a factor
factor1<-factor(c("cat","cat","dog","mouse","mouse","dog","dog"))
factor1
[1] cat   cat   dog   mouse mouse dog   dog  
Levels: cat dog mouse

Notice that R automatically creates the levels and will put them in alphabetical (or numerical) order. We can change that either when we initially define the factor or through the relevel function.

#Create the same factor but make mouse the reference level
factor2<-factor(c("cat","cat","dog","mouse","mouse","dog","dog"),
                levels=c("mouse","cat","dog"))
factor2
[1] cat   cat   dog   mouse mouse dog   dog  
Levels: mouse cat dog
#Use the relevel function to change the reference level
relevel(factor1,ref="mouse")
[1] cat   cat   dog   mouse mouse dog   dog  
Levels: mouse cat dog

Date Vector

A date object is specifically for dates and times. Dates can be written in many formats, so it is important to specify the format of your data. Once a variable is converted to a date object, you can easily extract information such as the month, day of the week, hour of the day, minutes of the hour, and so on. You can even calculate differences between dates with the difftime function, specifying whether you want the result in days, hours, etc.

#today's date, without the time information
today<-Sys.Date()
#what day of the week is it?
weekdays(today)
[1] "Friday"
#what month is it?
months(today)
[1] "August"
#today's date and time
now<-Sys.time()
#What hour is it?
format(as.POSIXct(now), format = "%H")
[1] "09"
#What are the minutes of the hour?  
format(as.POSIXct(now), format = "%M")
[1] "41"
#What are the seconds of the minute? 
format(as.POSIXct(now), format = "%S")
[1] "18"
#What are the hour and minutes?
format(as.POSIXct(now), format = "%H:%M")
[1] "09:41"
#What is the time? Hour, Minutes, Seconds
format(as.POSIXct(now), format = "%H:%M:%S")
[1] "09:41:18"
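
The examples above start from the current date and time. As a sketch of the other two points, specifying a format and using difftime, the code below converts text to dates and finds the time between them. The two dates are hypothetical values chosen only for illustration:

```r
#convert text to a date, stating the format the text is written in
start_date <- as.Date("08/22/2024", format = "%m/%d/%Y")
#text already in year-month-day order needs no format
end_date <- as.Date("2024-12-13")
#difference between the two dates in days
days_between <- difftime(end_date, start_date, units = "days")
days_between
#the same difference in weeks
weeks_between <- difftime(end_date, start_date, units = "weeks")
weeks_between
#extract the number of days as a plain number
as.numeric(days_between)
```

If the format does not match the text, as.Date returns NA, so it is worth printing the converted dates before using them.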

Other packages, such as the R package lubridate, also have functions for working with dates; packages are discussed further down.

Switching Between Vector Types

With certain vectors, we may be able to switch the type of object it is stored as. For example:

#This vector was stored as a character vector previously
chr_vec4
[1] "1" "2" "3"
#Now we change the character vector to a numeric vector
chr2num_vec<-as.numeric(chr_vec4)
chr2num_vec
[1] 1 2 3
#Now we change the character vector to a factor vector
chr2fac_vec<-as.factor(chr_vec4)
chr2fac_vec
[1] 1 2 3
Levels: 1 2 3
#We can change back to character vector
as.character(chr2fac_vec)
[1] "1" "2" "3"

However, the transition is not always as we would expect, especially when dealing with factors to numeric or character.

#Changing a factor to different types
factor1
[1] cat   cat   dog   mouse mouse dog   dog  
Levels: cat dog mouse
as.character(factor1)
[1] "cat"   "cat"   "dog"   "mouse" "mouse" "dog"   "dog"  
as.numeric(factor1)
[1] 1 1 2 3 3 2 2

Changing a factor to a character worked as expected, but what happened when we tried to change it to a numeric? Well, since the levels of the factor correspond to categories, it used the order of the levels instead to give values for a numeric vector. Hence, in the numeric vector a 1 corresponds to the first level which is “cat”.
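
This matters most when the labels of a factor are themselves numbers. A minimal sketch, using a made-up factor, of how to recover the original numbers by converting to character first:

```r
#a factor whose labels happen to be numbers
num_factor <- factor(c("10", "20", "10", "30"))
#as.numeric returns the positions of the levels, not the labels
level_codes <- as.numeric(num_factor)
level_codes   #1 2 1 3
#converting to character first recovers the original numbers
orig_values <- as.numeric(as.character(num_factor))
orig_values   #10 20 10 30
```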

When applicable, we have the following functions to switch between these object types:

  • as.numeric(object)

  • as.integer(object)

  • as.character(object)

  • as.factor(object)

Matrix

A matrix object in R is a matrix in the usual mathematical sense: values arranged in rows and columns. In R, all values in a matrix must be of the same type, such as all numeric or all character.

#Create a matrix with 2 columns and 3 rows filled in by row
mat<-matrix(c(1,2,3,4,5,6),ncol=2,nrow=3,byrow=TRUE)
mat
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
#Create a matrix with 2 columns and 3 rows filled in by column
matrix(c(1,2,3,4,5,6),ncol=2,nrow=3,byrow=FALSE)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
#create a matrix with letters
matrix(letters,nrow=2,byrow=TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] "a"  "b"  "c"  "d"  "e"  "f"  "g"  "h"  "i"  "j"   "k"   "l"   "m"  
[2,] "n"  "o"  "p"  "q"  "r"  "s"  "t"  "u"  "v"  "w"   "x"   "y"   "z"  

Note: While these examples have integers and single letters, matrices in R can have continuous numeric values or characters that are strings (such as words or even sentences).

Array

A matrix is a 2 dimensional array; however, an array can have higher dimensions. For example, suppose researchers exploring epilepsy give 25 patients EEGs, which measure electrical activity in the brain. An EEG includes several electrodes, typically 64, that attach to the scalp and measure that activity every few milliseconds. An EEG lasts at least 20 minutes, so there may be 120,000 time points. Thus, for each patient there is a matrix (2 dimensional array) of data: 64 rows, one per electrode, and 120,000 columns, one per time point. So, an array of all patient data could have 3 dimensions: (64, 120000, 25).

#create a small array that will be two 2x3 matrices
array(1:12,dim=c(2,3,2))
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

Notice that the array function fills in the values by column, filling the first matrix completely before filling the second matrix.

Dataframe

df<-data.frame(ABC=letters,Numbers=1:26)

List

Functions to Check Object Type

If at any time you are unsure of what type your object is you can use the class, mode, or str functions. The str function is great for viewing your dataframe and making sure each variable is the correct type and has the correct values.

class(vec)
[1] "integer"
mode(vec)
[1] "numeric"
str(vec)
 int [1:6] 1 2 3 1 2 3
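
As a sketch of using str on a dataframe, the code below rebuilds the small dataframe from the Dataframe section so that it runs on its own. The stringsAsFactors=FALSE argument is added here as an assumption so that the text column is stored as character in any version of R:

```r
#rebuild the example dataframe
df <- data.frame(ABC = letters, Numbers = 1:26, stringsAsFactors = FALSE)
#view the type and first few values of every variable at once
str(df)
#check the class of the whole object and of a single column
class(df)
class(df$Numbers)
#view the first 3 rows
head(df, 3)
```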

If you want to know whether an object is a specific type, you can apply the functions in the first column of the table below. If the object is not the type that is needed, you may be able to use the functions in the second column to convert it. However, remember that a conversion may fail or may not give the result you expect. Always check that the code did what you think it should have.

| Check Object Class | Convert Object Class |
|---|---|
| is.numeric(object) | as.numeric(object) |
| is.integer(object) | as.integer(object) |
| is.character(object) | as.character(object) |
| is.factor(object) | as.factor(object) |
| is.matrix(object) | as.matrix(object) |
| is.data.frame(object) | as.data.frame(object) |
| is.Date(object) (requires the lubridate package; in base R use inherits(object, "Date")) | as.Date(object) |
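
Here is a small sketch of why checking matters, using made-up scores stored as text. The conversion runs, but the entry that is not a number silently becomes a missing value (R prints a warning when this happens):

```r
#numbers stored as text, with one entry that is not a number
scores <- c("90", "85", "eighty")
is.numeric(scores)
#"eighty" cannot be converted, so it becomes NA and R prints a warning
scores_num <- as.numeric(scores)
scores_num
is.numeric(scores_num)
#count how many values were lost in the conversion
sum(is.na(scores_num))
```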

Working with Objects

We can use objects within operations. Here, an object is stored with a single value, and then several operations are completed with that object.

#assign a the value of 2
a<-2
#adding 3
a+3
[1] 5
#dividing by 3
a/3
[1] 0.6666667
#cubing
a^3
[1] 8

Completing an operation per each element of a vector:

#Squaring each element in a vector
vec^2
[1] 1 4 9 1 4 9
#Multiplying a vector by a scalar
2*vec
[1] 2 4 6 2 4 6

We can do matrix operations as well:

#Transpose a matrix
t(mat)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
#Matrix multiplication
mat%*%t(mat)
     [,1] [,2] [,3]
[1,]    5   11   17
[2,]   11   25   39
[3,]   17   39   61
#Inverse of a matrix (matrix must be square)
mat2<-mat[-3,]
solve(mat2)
     [,1] [,2]
[1,] -2.0  1.0
[2,]  1.5 -0.5
#Eigenvalues and Eigenvectors (matrix must be square)
eig<-eigen(mat2)
eig$values
[1]  5.3722813 -0.3722813
eig$vectors
           [,1]       [,2]
[1,] -0.4159736 -0.8245648
[2,] -0.9093767  0.5657675
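
One point that is easy to miss is that * and %*% do different things with matrices. The sketch below rebuilds the 2x2 matrix mat2 so that it runs on its own, compares the two operators, and checks the inverse by multiplying it with the original matrix:

```r
#rebuild the 2x2 matrix used above
mat2 <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
#element-wise multiplication: each entry is multiplied by itself
elem_prod <- mat2 * mat2
elem_prod
#matrix multiplication: rows are combined with columns
mat_prod <- mat2 %*% mat2
mat_prod
#a matrix times its inverse should give the identity matrix
check <- mat2 %*% solve(mat2)
round(check, 10)
```

Rounding the last result hides the tiny decimal errors that can appear in computer arithmetic.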

Using $

table for factors

#create a frequency table for a factor
table(factor1)
factor1
  cat   dog mouse 
    2     3     2 
#make table of percentages
table(factor1)/length(factor1)*100
factor1
     cat      dog    mouse 
28.57143 42.85714 28.57143 
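
As an alternative sketch, the base function prop.table converts a table of counts into proportions, which can then be scaled and rounded. The factor is rebuilt here so the code runs on its own:

```r
#rebuild the example factor
factor1 <- factor(c("cat", "cat", "dog", "mouse", "mouse", "dog", "dog"))
#proportions for each level (these sum to 1)
props <- prop.table(table(factor1))
props
#percentages rounded to 1 decimal place
pct <- round(props * 100, 1)
pct
```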

Useful Built In Functions

| Function | Input Object Type | Usage |
|---|---|---|
| log | numeric vector | Computes the logarithm; the default is the natural logarithm (base e). Use log10 or the base argument for other bases |
| exp | numeric vector | Computes the exponential |
| sqrt | numeric vector | Computes the square root |
| factorial | numeric vector | Computes the factorial |
| abs | numeric vector | Computes the absolute value |
| floor | numeric vector | Returns a numeric vector containing the largest integers not greater than the corresponding elements of the input |
| ceiling | numeric vector | Returns a numeric vector containing the smallest integers not less than the corresponding elements of the input |
| round | numeric vector | Rounds the values in the vector to a specified number of decimal places |
| trunc | numeric vector | Returns a numeric vector containing the integers formed by truncating the values in the vector toward 0 (takes off the decimals) |
| sort | numeric, character, logical vector | Sorts a vector or factor into ascending or descending order |
| order | numeric, character, logical vector or a sequence of those all of the same length | Returns a permutation which rearranges the input into ascending or descending order. Can be used to sort data frames |
| sample | vector | Takes a random sample of a specified size from the input, either with or without replacement |
| names | any | Returns the names of the input object; for vectors it returns any names of the elements, for data frames it returns the column names |
| colnames | matrix-like object (matrix, array, dataframe) | Returns the names of the columns |
| rownames | matrix-like object (matrix, array, dataframe) | Returns the names of the rows |
| dimnames | matrix, array, dataframe | Returns the names of the different dimensions |
| str | any | Displays the internal structure of the R object |
| dim | vector, matrix, array, dataframe | Returns the dimensions (NULL for a plain vector) |
| ncol | vector, matrix, array, dataframe | Returns the number of columns (NULL for a plain vector) |
| nrow | vector, matrix, array, dataframe | Returns the number of rows (NULL for a plain vector) |
| length | vector, matrix, array, dataframe | For a vector, array, or matrix it returns the number of elements. For a dataframe it returns the number of columns |
| class | any | Returns the class of the object |
| mode | any | Returns the storage mode of the object |
| is.na | any | Returns an object of the same size filled with TRUE where an element is missing and FALSE where it is not |
| table | one or more objects that can be interpreted as factors | Returns a contingency table of class table with counts |
| unique | vector, matrix, array, dataframe | Returns a vector, dataframe, or array like the input but with duplicate elements/rows removed |
| summary | any | Produces summaries of the data or of the results of various model fitting functions |
| mean | numeric, logical, date vector | Calculates the arithmetic mean |
| sd | numeric vector | Calculates the standard deviation of the values in the object |
| var | numeric vector | Calculates the variance of the values in the object |
| median | numeric vector | Calculates the median of the values in the object |
| quantile | numeric vector | Calculates specified quantiles |
| sum | numeric, logical vector, matrix, array | Calculates the sum of all the values in the object |
| rowSums, colSums, colMeans, rowMeans | array, matrix, dataframe with numeric values | Compute sums or means of the rows or columns |
| min | numeric vector, matrix, array | Calculates the minimum of the values in the object |
| max | numeric vector, matrix, array | Calculates the maximum of the values in the object |
| rbind | vectors, matrices, dataframes | Takes a sequence of vectors, matrices, arrays, or dataframes and combines them by rows. Use to add on additional rows with the same number of columns/elements |
| cbind | vectors, matrices, dataframes | Takes a sequence of vectors, matrices, arrays, or dataframes and combines them by columns. Use to add on additional columns with the same number of rows/elements |
| merge | two dataframes | Merges two dataframes by common columns or row names |
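
Two entries in the table are easy to mix up in practice: the base used by log, and the difference between sort and order. Here is a short sketch with made-up names and ages (the data are hypothetical):

```r
#log uses the natural logarithm unless told otherwise
log(100)           #about 4.605
log10(100)         #2
log(8, base = 2)   #3
#sort returns the sorted values; order returns their positions
ages <- c(31, 25, 47)
sort(ages)         #25 31 47
order(ages)        #2 1 3
#order can be used to sort a dataframe by one of its columns
people <- data.frame(name = c("Ann", "Bo", "Cy"), age = ages,
                     stringsAsFactors = FALSE)
people_sorted <- people[order(people$age), ]
people_sorted
```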

Getting Help on Functions

help(rnorm)

?rnorm

example(rnorm)

Indexing

Running Code

To execute/run your code, there are a few options.

  1. Highlight the code you wish to run, then click the Run button at the top of the R script panel.

  2. Highlight the code you wish to run, then hold down CTRL and press ENTER on the keyboard.

  3. To run code line by line, you can click anywhere in the line of code and either:

    1. Click the Run button

    2. Use CTRL + ENTER on the keyboard

Note: You may highlight any piece of code that you wish to run, even if it’s within a line of code.

Save Your Work Often!

Saving Your R Script

Saving your R script saves the code you have written. This is helpful if you have multiple projects ongoing, if you need to close R and RStudio (or if your computer automatically updates), or if you need to share code (say for assignments).

To save your R Script file:

  1. Click the save button at the top of the R Script pane

  2. Click the Save or Save As buttons at the top of RStudio

  3. Go to File, Save or Save as…

With each option, you will need to select a folder where the file will be saved. Typically, this folder will be the same as your working directory folder.

See the Save buttons highlighted below:

Note: You can have more than one R Script open at a time. Be aware of what your working directory is so you know where your files are being saved, since the save buttons default to the working directory.

Saving Your Workspace/Environment

Saving your workspace saves all the objects you’ve defined & assigned in R’s environment. That way, when you open R next time or on a new computer, you can load your workspace without having to rerun your code.

To save your Workspace, click the Save button at the top of the Environment pane in RStudio:

Or, you can write the command to save your workspace, giving the file a unique, memorable name inside the quotes after the forward slash (make sure the file name ends with .RData):

save.image("~/Workspace_Aug2024.RData")

To import a workspace, you can use the Open file button at the top of the Environment pane in RStudio:

Or, you can write the command to open your previously saved workspace:

load("~/Workspace_Aug2024.RData")

Note: If you save/load your workspace using the buttons at the top of the environment panel, the code will be automatically populated in the console.
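
If you only want to keep a few objects rather than the whole workspace, the save function accepts a list of object names. A minimal sketch, writing to a temporary folder (tempdir()) as a stand-in for your own path; the file name is made up for illustration:

```r
#a file in a temporary folder stands in for your own file path
ws_file <- file.path(tempdir(), "example_objects.RData")
#create two objects to save
a <- 2
vec <- rep(1:3, 2)
#save only the objects listed
save(a, vec, file = ws_file)
#remove the objects, then load them back from the file
rm(a, vec)
load(ws_file)
#the objects are available again
a
vec
```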

Installing and Loading Packages

Many contributors have created content for R users through packages. These packages usually contain additional (optional) functions and datasets for use in R. We may access these functions and datasets by installing and loading packages. There are many thousands of R packages available (CRAN alone lists over 20,000)! If they were all to come pre-loaded, R would be a very slow program.

Installing an R package downloads it onto your computer, into your R library, so that you can then load it for use.

For example, let’s install and load the dplyr package. This package has many tools for data manipulation. The reference manual for this package is available on its CRAN page.

install.packages("dplyr")

It may take time depending on which package you install and the size of it, but if you receive the message:

package ‘dplyr’ successfully unpacked and MD5 sums checked

at the end of the console, then you have successfully installed the package.

Now, you need to load the package from your library into your current R session by using:

library(dplyr)

Once the package is loaded, you can use the functions built into it. A package only needs to be installed once, but it must be loaded with library in each new R session.
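
Since installing is only needed when a package is missing, one possible pattern is to check first and install only if necessary. The helper below, load_or_install, is a hypothetical name written for this sketch and is not part of R; it relies on the base functions requireNamespace, install.packages, and library:

```r
#hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg){
  #requireNamespace returns FALSE if the package is not installed
  if(!requireNamespace(pkg, quietly = TRUE)){
    install.packages(pkg)
  }
  #character.only=TRUE lets library accept the name stored in pkg
  library(pkg, character.only = TRUE)
}
#demonstrated with stats, which ships with R, so nothing is downloaded
load_or_install("stats")
#confirm the package is loaded
"package:stats" %in% search()
```

Calling load_or_install("dplyr") would work the same way, downloading dplyr only if it is not already in your library.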

Importing Data Files

R can import data in many different ways from many different file types (the list below is not exhaustive). Many of the import functions come from packages that are installed with R, such as base or utils, but others require a separate R package to be installed.

| File Format | Function & Syntax | Package |
|---|---|---|
| RData (R data file that can store multiple R objects) | load("file.RData") | base |
| RDS (R data file that stores one R object) | readRDS("file.RDS") | base |
| Plain Text File (General Purpose Read Function) | read.table("file.txt", sep="", header=FALSE, stringsAsFactors=FALSE) | utils |
| Plain Text File with Fixed Widths | read.fwf("file.txt", widths=c(7,2,5,4,20), header=FALSE, sep="\t", stringsAsFactors=FALSE) | utils |
| Comma Separated Values (CSV) files | read.csv("file.csv", header=TRUE, sep=",", stringsAsFactors=FALSE) | utils |
| CSV Files with European Decimal Format | read.csv2("file.csv", header=TRUE, sep=";", dec=",", stringsAsFactors=FALSE) | utils |
| Tab Delimited Files | read.delim("file.txt", header=TRUE, sep="\t", stringsAsFactors=FALSE) | utils |
| Tab Delimited Files with European Decimal Format | read.delim2("file.txt", header=TRUE, sep="\t", dec=",", stringsAsFactors=FALSE) | utils |
| Excel Spreadsheet | readWorksheetFromFile("file.xlsx", sheet=1, startRow=0, startCol=0, endRow=0, endCol=0, header=TRUE) | XLConnect |
| Matlab | readMat("file.mat") | R.matlab |
| Minitab (portable worksheet) | read.mtp("file.mtp") | foreign |
| SAS | read_sas("file.sas7bdat") | haven |
| SPSS | read.spss("file.sav") | foreign |
| Stata | read.dta("file.dta") | foreign |
| Systat | read.systat("file.sys") | foreign |

Besides using the functions listed in the table, we can use either:

  1. The menu in RStudio

    • Click on File, Import Dataset, then select the type of file you wish to import.
  2. The Import Dataset button.

    • Click on the Import Dataset button, then select the type of file you wish to import.

These two options give the following choices:

  • From Text (base)

  • From Text (readr)

  • From Excel

  • From SPSS

  • From SAS

  • From Stata

Exporting/Saving Data Files

Since we often use R to clean, manipulate, and create data, we would like to save it as well. Just like when importing data, there are many options to save data:

| File Format | Function and Syntax | Object Type |
|---|---|---|
| R Data file | save(object1, object2, object3, file="dataname.RData") | Any |
| RDS | saveRDS(object, file="dataname.RDS") | Any |
| .csv | write.csv(object, file="dataname.csv", row.names=FALSE) | Dataframe, Vector, Matrix |
| .csv with European decimal notation | write.csv2(object, file="dataname.csv", row.names=FALSE) | Dataframe, Vector, Matrix |
| space delimited | write.table(object, file="dataname.txt", row.names=FALSE) | Dataframe, Vector, Matrix |
| tab delimited | write.table(object, file="dataname.txt", sep="\t", row.names=FALSE) | Dataframe, Vector, Matrix |
| comma delimited | write.table(object, file="dataname.txt", sep=",", row.names=FALSE) | Dataframe, Vector, Matrix |
| Excel sheet | Using the XLConnect R package: writeWorksheetToFile("file.xlsx", data=object, sheet="New Sheet", startRow=1, startCol=1) | Dataframe, Vector, Matrix |
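
To see exporting and importing work together, here is a minimal round-trip sketch with a small made-up dataframe. It writes to a temporary folder (tempdir()) as a stand-in for your working directory; the file name and the data are hypothetical:

```r
#a small made-up dataframe
scores <- data.frame(name = c("Ann", "Bo"), score = c(90.5, 85),
                     stringsAsFactors = FALSE)
#a file in a temporary folder stands in for your own file path
csv_file <- file.path(tempdir(), "scores_example.csv")
#export without the row names so they do not become an extra column
write.csv(scores, file = csv_file, row.names = FALSE)
#import the file again
scores_in <- read.csv(csv_file, header = TRUE, stringsAsFactors = FALSE)
str(scores_in)
#check that the imported data match the original
all.equal(scores, scores_in)
```

Leaving out row.names=FALSE would add a column of row numbers to the file, which then shows up as an extra variable when the file is read back in.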

Working with Dataframes

Indexing with Dataframes

Deleting a Row

Deleting a Column

Adding on Rows

Adding on Columns

Merging Dataframes

Missing Data

Imputing Data

Useful Functions

Programming Tools

If Else Functions

ifelse()

if(){}else{}

Loops

Loops go through chunks of code several times until some stopping criterion is fulfilled. This is helpful if you want to apply a function to each row or column in a dataframe or to each element of a list. Loops do tend to be slower than the apply functions; however, they are very helpful when a custom loop is needed.

For Loop

For loops repeatedly run chunks of code, once per each item in a given set of elements. They “Do this for every value of that.” A basic for loop looks like:

for(value in that){
  dothis
}

Example: I wish to print the sentence “My favorite number is” with a new number each time.

for(i in c(3,5,37)){
  print(paste("My favorite number is ",i,".",sep=""))
}
[1] "My favorite number is 3."
[1] "My favorite number is 5."
[1] "My favorite number is 37."

For each value in the given vector of numbers, R pasted the sentence together with that number then printed the result. The value in this example was i which was updated each time the loop ran. The that in this example was the vector of favorite numbers c(3,5,37). The dothis in this example was the printing of the pasted sentence print(paste("My favorite number is ",i,".",sep="")).

Notes:

  • You can have multiple steps within a for loop if needed.

  • Be careful with how you name objects! Keep it simple.
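
As a sketch of a for loop with more than one step, the loop below squares each favorite number, stores the result, and prints a sentence. The last line repeats the calculation with sapply, one of the apply functions, for comparison. The object names are chosen only for illustration:

```r
nums <- c(3, 5, 37)
#create an empty numeric vector to hold one result per number
squares <- numeric(length(nums))
for(i in seq_along(nums)){
  #step 1: square the current number and store it in position i
  squares[i] <- nums[i]^2
  #step 2: print a sentence with the result
  print(paste(nums[i], "squared is", squares[i]))
}
squares   #9 25 1369
#the same calculation with sapply instead of a loop
squares_apply <- sapply(nums, function(n) n^2)
squares_apply
```

Here i takes the values 1, 2, 3 (the positions in nums) rather than the numbers themselves, which makes it possible to store each result in the matching position.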

While Loop

A while loop reruns the inside code chunk while a certain condition remains true. The basic syntax of a while is:

while(condition){
  dothis
}

Example: I wish to print the sentence “My favorite number is” with a new number each time, as long as that number is less than 5.

i<-1
while(i < 5){
  print(paste("My favorite number is ",i,".",sep=""))
  i<-i+1
}
[1] "My favorite number is 1."
[1] "My favorite number is 2."
[1] "My favorite number is 3."
[1] "My favorite number is 4."

Here, the condition was that i stays less than 5. The dothis in this example was the printing of the pasted sentence print(paste("My favorite number is ",i,".",sep="")). The value of i was initialized as 1 and updated in the while loop each time by adding 1: i<-i+1. Hence, the while loop printed the sentence with a number starting at 1 and going through the number 4. The loop stopped after 4 because during the last pass, when i was 4, i was updated to 5, which makes the condition false.
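
A while loop is most useful when you do not know in advance how many passes are needed. In the sketch below (the starting value and object names are made up for illustration), a number is halved until it drops below 1 while a counter keeps track of the number of passes:

```r
value <- 100
steps <- 0
while(value >= 1){
  #update the value so the condition eventually becomes false
  value <- value/2
  #count this pass through the loop
  steps <- steps + 1
}
steps   #7
value   #0.78125
```

If the line that updates value were left out, the condition would never become false and the loop would run until you stop it manually.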

Apply Functions

lapply

sapply

mapply

Creating Your Own Function

function(arglist){expression}

Plotting/Graphics

Base R

plot

Quantitative Variable Graphs

histogram

dotplot

boxplot

Categorical Variable Graphs

pie chart

bar graph

Quantitative vs Quantitative Graphs

Scatterplots

Scatterplot matrix

correlation plot

Quantitative vs Categorical Graphs

boxplot by category

bar graph with means

Updating R & RStudio

R regularly has updates to fix issues (bugs), to improve performance, and to work with new technologies. When there is a new release, you can install it to get these improvements. However, there are usually only minor changes with each release, so using a reasonably recent version is usually sufficient. Like R, RStudio regularly updates as well. Packages also may have updates. Here’s how to update all of that!

Updating R & R Packages

To update R and R packages, you can use the following commands (note that the installr package works on Windows only):

install.packages("installr") 
library(installr) 
updateR()

A new window will open and take you through the following:

  1. It checks for a new version of R.

  2. If a new version of R is available, it will download the most recent version and run its installer.

  3. Once installed, it will ask you if you would like to copy (or move) all of the R packages from the old R library to the new R library.

  4. It will then ask you if you wish to update the moved packages. (If you have many packages, this will take some time.)

  5. Last, it will exit R and RStudio.

Updating RStudio

If your RStudio is outdated, then each time you open RStudio you will receive a message that asks if you would like to update RStudio to the newest version. You may select yes and follow the steps.

If you are unsure or selected no, you may go to Help then Check for Updates using the RStudio menu at the top. If there is a new version, it will let you know and you may select yes and follow the steps to update RStudio.