Overview

Welcome to Intro to R! This is part 1 of a two-part lab that will get you up and running in the R programming language. It is aimed toward undergraduate students with little or no experience with R. In this first lesson, you will…

  1. Get R up and running on your computer
  2. Perform some basic coding in R
  3. Adopt some good coding habits that will make your life easier later on
  4. Learn where to find help when you run into obstacles in R
  5. Learn the basic vocabulary of data in R

Throughout these labs, you should write and run the code in your own console. Resist the urge to copy and paste. The best way to improve at coding is to learn by doing!

1 Download R and RStudio

To begin using R, you’ll need to download two pieces of software:

  1. Head to cloud.r-project.org to download and install the version of R appropriate for your computer, whether Windows, Mac, or Linux. R is a statistical program that performs calculations.
  2. Head to www.rstudio.com and download RStudio. If prompted to choose your version, you’ll want “RStudio Desktop,” which is free and, like R, available for Windows, Mac, and Linux. RStudio is the user-friendly interface that we use to communicate with R.

2 First Steps

Go ahead and open up RStudio. On a Mac or Windows machine, this means finding where you’ve installed the RStudio app (not the R app) and double-clicking its icon. Then, open a new R script: File > New File > R Script. A new Untitled file will open. There is a lot going on here! Let’s walk through what we are looking at. There should be four windows:

  1. Upper-left window: the R script, where we write and run code
  2. Lower-left window: the R console, where R executes the code and provides results
  3. Upper-right window: the environment, which lists all the objects we have created in the current R session.
  4. Lower-right window: where we find the Help and Plots tabs. These will be helpful later.

3 Simple Calculations

We can use R as a calculator. Let’s say we want to calculate 1 + 3. To run this code…

  1. Type the code in the R script (the upper-left window).
  2. Highlight the code you want to run
  3. Either (1) click the Run icon in the top-right corner of the script window or (2) press command+enter on a Mac or ctrl+enter in Windows
  4. The output should show up in the R console (the lower-left window).
1 + 3
## [1] 4

Now let’s try a few more calculations!

log10(1000)
## [1] 3
curve(x^3 - 2*x + 1, from=0, to=5)
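
If you would like more practice, here is a short sketch of the other basic arithmetic operators. The numbers are arbitrary and chosen only for illustration. Each line starting with ## shows the output you should expect in the console.

```r
# Subtraction
7 - 2
## [1] 5

# Multiplication
3 * 4
## [1] 12

# Division
10 / 4
## [1] 2.5

# Exponents: 2 raised to the power of 3
2^3
## [1] 8
```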

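Note that R is case-sensitive: log10 and Log10 are different names. Try running the following code, which fails with an error because R has no function named Log10: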

Log10(1000)
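
One way to see case sensitivity without triggering an error is to ask R whether it recognizes a name. The sketch below uses exists(), a base R function that this lab does not otherwise cover, and assumes a fresh R session with no extra libraries loaded.

```r
# exists() reports whether R knows a function or object by this exact name
exists("log10")
## [1] TRUE

# The same name with a capital L is unknown to R
exists("Log10")
## [1] FALSE
```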

3 Objects

To analyze data, we need to load and store datasets. R stores information in the form of objects. Objects are like an envelope or box that can contain any information. To create an object in R, we use the assignment operator: <-

Let’s make your first object! Run the following code:

nine <- 1 + 8

What happened here? nine is the name of the object. In our box analogy, this is like the label on the box. 9 is the contents of the box.

Objects are useful because we can use objects in subsequent calculations. Let’s try this out:

nine*2
## [1] 18

What happened here? 9 multiplied by 2 is 18. We multiplied the contents of our box (9) by 2. Later on, you’ll be able to use objects to write a complex data analysis as a sequence of many simpler steps.
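
To see how objects let you build a calculation step by step, here is a small sketch. The object names and numbers (price, quantity, shipping) are made up for illustration and are not part of the lab's data.

```r
# Hypothetical example: the cost of a small order, one step at a time
price <- 4        # price of one item
quantity <- 3     # number of items ordered

# Use the two objects above to create a third object
subtotal <- price * quantity

# Add a shipping fee and store the result as another object
shipping <- 5
total <- subtotal + shipping

# Typing an object's name prints its contents
total
## [1] 17
```

Each object holds one intermediate result, so if the price changes you only need to edit one line and rerun the script.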

4 Functions

Functions are actions that we ask R to perform with a particular object or piece of data, like calculating the mean of a variable or plotting a variable. R comes with some basic functions. Some things to know about functions:

  1. The name of a function is always followed by parentheses: functionname()
  2. Inside the parentheses, we write arguments. Arguments are the inputs to be used in the function: functionname(arguments)

Let’s try an example:

sqrt(64)
## [1] 8

Here, we asked R to find the square root of 64, which is 8. sqrt() is a function. 64 is the argument.
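
Here are a few more function calls you might try. This is only a sketch: the object name sixteen and the numbers are chosen for illustration, while sqrt(), round(), and abs() all come with R.

```r
# A function can take an object as its argument
sixteen <- 16
sqrt(sixteen)
## [1] 4

# Some functions take more than one argument, separated by commas
round(3.14159, digits = 2)
## [1] 3.14

# The result of one function can be the argument of another
sqrt(abs(-25))
## [1] 5
```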

5 Commenting

Before we dig into some data, let’s talk about a good coding practice that will make your life easier. This practice is called commenting. When we use the # symbol in R, R will ignore anything on that line to the right of the symbol. This means you can write little explanations for yourself (or others) about what your code is doing.

# Create an object
object <- 4+10

# Multiply object by 10 and store as new object
new_object <- object*10

# Divide new object by 4
new_object/4
## [1] 35
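
Comments do not have to sit on their own line. As a small sketch (the names radius and area are made up for illustration), a comment can also follow code on the same line, because R ignores only what comes after the # symbol:

```r
# A comment on its own line describes the step that follows
radius <- 2

area <- pi * radius^2  # pi is built into R; this comment shares a line with code
area
## [1] 12.56637
```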

6 Data

In the next lab, we will load some real-world data. Before we do that, let’s introduce the basic vocabulary of data in R. The core of R is the dataframe. Think of dataframes like a spreadsheet: they have rows and columns. Usually, each row is a datapoint and each column is a variable.

Let’s take a look at a sample dataset in R. Run the statement below to open the cars dataset (a preloaded dataset in R):

View(cars)

This statement will open up the cars dataset. This dataset is a dataframe. You will see that each row is a datapoint: here, one recorded pair of speed and stopping distance. The columns are variables: here, speed and distance (named speed and dist in the dataset). Let’s refresh our memory on some different types of variables:

  • Character variables contain text
  • Numeric variables contain numbers
    • There are two types of numeric variables: binary (take only two values) and non-binary (all other numeric variables).
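
You can also inspect a dataframe from the console instead of opening the viewer. The sketch below uses a few base R functions that this lab does not otherwise introduce (head(), nrow(), ncol(), names(), and class()), applied to the same cars dataset.

```r
# Print the first six rows of the cars dataset in the console
head(cars)

# Count the rows (datapoints) and the columns (variables)
nrow(cars)
## [1] 50
ncol(cars)
## [1] 2

# List the names of the variables
names(cars)
## [1] "speed" "dist"

# class() reports what type of value something is
class("hello")
## [1] "character"
class(3.5)
## [1] "numeric"
```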

6 Libraries

A library (also called a package) is a piece of software that provides additional functionality to R beyond what is in the basic R installation. A library is something you need to install once and then load in each new R session in which you want to use it. Let’s use the install.packages() function to install a library that we will use in the second lesson: tidyverse.

install.packages("tidyverse")
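
Installing is the one-time step. Loading is done with the library() function. The sketch below goes a little beyond this lesson: it first checks whether tidyverse is installed, so that it runs without an error either way. The object name tidyverse_installed is made up for illustration. In your own scripts, a single library(tidyverse) line near the top is usually all you need.

```r
# Check whether tidyverse has already been installed on this computer
tidyverse_installed <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_installed) {
  # Load the library so its functions are available in this session
  library(tidyverse)
} else {
  # Remind ourselves to run the one-time install step first
  message("tidyverse is not installed yet; run install.packages(\"tidyverse\") first")
}
```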

7 Getting Help in R

Even the most proficient coders need help with R at some point or another. I look for R help all the time, and so will you. Here are some tips:

  1. The web: There are thousands and thousands of R users out there. If you Google a question, you will probably find a helpful answer.
  2. The built-in help: If you want help on a specific function, type a question mark followed by the function’s name into the console. Try running this statement:
?sqrt
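
A few other built-in ways to look things up are sketched below. These functions ship with R, but they are additions to the tips above rather than part of the original list, and the search phrase is just an example.

```r
# Same as ?sqrt, written as an ordinary function call
help("sqrt")

# Search the installed help pages when you do not know a function's name
help.search("square root")

# Show only the arguments that a function accepts
args(round)
```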