Welcome to Intro to R! This is part 1 of a two-part lab that will get you up and running in the R programming language. It is aimed toward undergraduate students with little or no experience with R. In this first lesson, you will…
Throughout these labs, you should write and run the code in your own console. Resist the urge to copy and paste. The best way to improve at coding is to learn by doing!
To begin using R, you’ll need to download two pieces of software:
Go ahead and open up RStudio. On a Mac or Windows machine, this means finding where you’ve installed the RStudio app (not the R app) and double-clicking its icon. Then, open a new R script: File > New File > R Script. A new Untitled file will open. There is a lot going on here! Let’s walk through what we are looking at. There should be four windows:
We can use R as a calculator. Let’s say we want to calculate 1 + 3. To run this code…
1 + 3
## [1] 4
Now let’s try a few more calculations!
log10(1000)
## [1] 3
curve(x^3 - 2*x + 1, from=0, to=5)
Note that R is case-sensitive. Try running the following code:
Log10(1000)
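You should see an error, because R treats log10 and Log10 as two different names and only the lowercase one exists. As a small sketch of the same idea, the base R function exists() (not otherwise covered in this lab) reports whether R recognizes a name exactly as typed. The object names lower_found and upper_found are made up for this illustration.

```r
# exists() checks whether R knows something by that exact name
lower_found <- exists("log10")  # the built-in function, all lowercase
upper_found <- exists("Log10")  # same letters, but with a capital L

lower_found
upper_found
```

If the two results differ, that is case sensitivity at work: one capital letter is enough for R to treat the name as something else entirely.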
To analyze data, we need to load and store datasets. R stores
information in the form of objects. Objects are like an
envelope or box that can contain any information. To create an object in
R, we use the assignment operator: <-
Let’s make your first object! Run the following code:
nine <- 1 + 8
What happened here? nine is the name of the object. In
our box analogy, this is like the label on the box. 9 is
the contents of the box.
Objects are useful because we can use them in subsequent calculations. Let’s try this out:
nine*2
## [1] 18
What happened here? 9 multiplied by 2 is 18. We multiplied the contents of our box (9) by 2. Later on, you’ll be able to use objects to write a complex data analysis as a sequence of many simpler steps.
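Here is a minimal sketch of what "a sequence of simpler steps" can look like. The object names (hours_per_day, days_per_week, hours_per_week) are invented for this example; any names would work.

```r
# Store each piece of the calculation in its own object
hours_per_day <- 24
days_per_week <- 7

# Build a new object out of the two existing ones
hours_per_week <- hours_per_day * days_per_week

# Typing an object's name on its own line prints its contents
hours_per_week
```

Each line does one small thing, and the final object holds the combined result. If you change hours_per_day and rerun the lines below it, hours_per_week updates to match.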
Functions are actions that we ask R to perform with a particular object or piece of data, like calculating the mean of a variable or plotting a variable. R comes with some basic functions. Some things to know about functions:
functionname()
functionname(arguments)
Let’s try a few examples:
sqrt(64)
## [1] 8
Here, we asked R to find the square root of 64, which is 8.
sqrt() is a function. 64 is the argument.
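Many functions accept more than one argument. As a short sketch, the base R function round() takes a number and an argument called digits that sets how many decimal places to keep. The object names rounded and root_rounded are just labels chosen for this example.

```r
# Arguments are separated by commas; digits is named explicitly here
rounded <- round(3.14159, digits = 2)
rounded

# Functions can be nested: sqrt(2) runs first, then round() uses its result
root_rounded <- round(sqrt(2), digits = 3)
root_rounded
```

Naming an argument (digits = 2) is optional in many cases, but it makes the code easier to read later.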
Before we dig into some data, let’s talk about a good coding practice
that will make your life easier. This practice is called
commenting. When we use the # symbol in R,
R ignores everything to the right of the symbol on that line. This means
you can write little explanations for yourself (or others) about what your
code is doing.
# Create an object
object <- 4+10
# Multiply object by 10 and store as new object
new_object <- object*10
# Divide new object by 4
new_object/4
## [1] 35
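Comments do not have to sit on their own line. The sketch below, using a made-up object called price, shows a comment placed after code on the same line and a line of code that has been switched off by putting # in front of it.

```r
price <- 20     # a comment can follow code on the same line
# price <- 50   # this entire line is ignored, so price is not changed
price
```

Putting # in front of a line like this is often called "commenting out" code. It is a handy way to disable a line temporarily without deleting it.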
In the next lab, we will load some real-world data. Before we do that, let’s introduce the basic vocabulary of data in R. The core of R is the dataframe. Think of a dataframe as a spreadsheet: it has rows and columns. Usually, each row is a datapoint and each column is a variable.
Let’s take a look at a sample dataset in R. Run the statement below to view the cars dataset (a dataset that comes preloaded with R).
View(cars)
This statement will open up the cars dataset. This dataset is a dataframe. You will see that the rows are datapoints: here, specific distances and speeds. The columns are variables: here, speed and stopping distance (named speed and dist in the dataset). Let’s refresh our memory on some different types of variables:
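Going back to the cars dataframe: besides View(), a few base R functions print information about a dataframe directly in the console. The sketch below is one possible way to take a first look, and it also shows what type each variable is stored as. The names n_rows, n_cols, and column_names are labels invented for this example.

```r
# cars comes with R, so there is nothing to download or import
head(cars)                   # print the first six rows

n_rows <- nrow(cars)         # number of rows (datapoints)
n_cols <- ncol(cars)         # number of columns (variables)
column_names <- names(cars)  # the name of each column

n_rows
n_cols
column_names

str(cars)                    # compact summary, including each column's type
mean(cars$speed)             # $ pulls out one column; mean() averages it
```

The $ operator is how you reach a single variable inside a dataframe: cars$speed means "the speed column of cars".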
A library (more formally called a package in R) is a piece of software that provides
additional functionality to R beyond what is in the basic R
installation. A library is something you need to install once and load
each time you want to use it. Let’s use the
install.packages() function to install a library that we
will use in the second lesson: tidyverse.
install.packages("tidyverse")
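Installing is the "once" step. The "each time" step is loading, which is done with the library() function. The sketch below assumes the install above finished without errors. The check with requireNamespace() is an optional safeguard added for this example so the code does not stop with an error if the install has not happened yet, and tidyverse_installed is a made-up object name.

```r
# requireNamespace() returns TRUE if the package is installed, FALSE if not
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_installed) {
  # Load the package for this R session
  library(tidyverse)
} else {
  message("tidyverse is not installed yet; run install.packages(\"tidyverse\") first")
}
```

In everyday use, most scripts simply start with library(tidyverse) on its own line.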
Even the most proficient coders need help with R at some point or another. I look for R help all the time, and so will you. Here are some tips:
?sqrt
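Two other base R ways of looking things up, shown here as a small sketch: help() does the same job as the question mark, and args() prints the arguments a function accepts. The object name round_args is just a label for this example.

```r
# help("sqrt") opens the same help page as ?sqrt
help("sqrt")

# args() shows which arguments a function accepts
round_args <- args(round)
round_args
```

In RStudio, the help page should appear in the Help tab rather than in the console.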