Topics
In this session we’ll be covering:
Other sessions we’ll cover today and tomorrow:
Introductions
- Nathan Byers and Kali Frost
What is R
R is many things…
- R is a free, open-source language and environment for statistical analysis
- It’s been very popular in academia for more than a decade
- A statistical software package that can do simple or complex analyses, similar to SAS or S-PLUS, and that makes great-looking graphs

- R is not just a stats software, it’s a programming language too
  + A language allows the user to do a limitless number of tasks
- R is open source, so users can make it available to others
  + This sharing of information is centered around the concept of open source development
  + Open source means that everything is freely available (kind of a big deal if you are working under state budget restrictions)
- Becoming widely used in academia and government, and making headway into industry, especially the biotech and finance sectors. Here is a link describing the increasing popularity of R.
R vs. Excel
Advantages of Excel
- Easy to use
- Familiar
- Easy to sort and scroll through data
Disadvantages of Excel
- Easy to make a mistake
- Limits on data size are less of an issue than they used to be, but large datasets can still bog down your computer
- Difficult to perform a series of steps
- Without very good documentation, it is difficult to describe the steps of an analysis
Disadvantages of R
- Difficult to use at first
- Figuring out errors can be difficult
- Relies on the user finding answers via Google searches; there is no structured support
- Your colleagues often may not be familiar with it, so sharing data analyses with them can be difficult
Advantages of R
- Because you can see all of your steps, it’s much harder to make an error
- If you know the right tricks, you can work with data of almost unlimited size
- Easily reproducible and repeatable
- Active user community means lots of internet sources and rapid releases of new technology
When to Use Excel
- Doing a one-time analysis, small dataset, basic graphics
When to Use R
- Doing repeated analyses, lots of variables, advanced graphics
- You will most likely begin as a hybrid user and migrate to R over time
R and RStudio
- This section covers the two pieces of software you need to download
- R is the core piece
- RStudio is a nice integrated development environment (IDE) that makes it much easier to use R
R
- To download R for Windows, see this page
- If you open R itself, it will look very plain

RStudio
- RStudio makes R a little more user friendly
- It’s free and can be downloaded at rstudio.com
It’s not necessary to open RStudio to use R, but in these sessions we will assume that RStudio is your interface to R
When you first open RStudio, this is what you see:

- The left panel is the console for R
- Type 1 + 1 then hit “Enter” and R will return the answer

- It’s a good idea to use a script so you can save your code
- Open a new script by selecting “File” -> “New File” -> “R Script” and it will appear in the top left panel of RStudio

- This is basically a text document that can be saved (go to “File” -> “Save As”)
- You can type and run more than one line at a time by highlighting and clicking the “Run” button on the script tool bar

- The bottom right panel can be used to find and open files, view plots, load packages, and look at help pages
- The top right panel gives you information about what variables you’re working with during your R session
- We’ll explain more about what to look for in those panels later
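The help pages and package loading found in the bottom right panel can also be reached by typing commands in the console. Here is a minimal sketch, assuming a standard R installation; the choice of mean() and of the stats package (which ships with R) is only for illustration.

```r
# Open the help page for the mean() function
help("mean")

# Load a package; stats comes with R, so nothing needs to be installed
library(stats)

# Show which packages are currently attached to your session
search()
```

Either route works, so use whichever feels more natural while you are getting started.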
R basics
Doing math
- Open up a script if you haven’t already (“File” -> “New File” -> “R Script”)
- Try some math by either typing the lines below or copying and pasting the lines into your script
10 + 5
10 - 5
10 * 5
10 / 5
10 ^ 5
- Remember, to run the lines, highlight your code and click the “Run” button on the toolbar of the script panel
Reference table of arithmetic operators
| Operator | Description | Example |
|---|---|---|
| + | addition | 2 + 2 |
| - | subtraction | 2 - 2 |
| * | multiplication | 2 * 2 |
| / | division | 2 / 2 |
| ^ or ** | exponentiation | 2 ^ 2 |
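As a quick check of the operators in the table, the sketch below shows that ^ and ** give the same result and that R follows the usual order of operations. The numbers are arbitrary and chosen only for illustration.

```r
# Both exponentiation operators give the same answer
2 ^ 3
## [1] 8
2 ** 3
## [1] 8

# Multiplication happens before addition
2 + 3 * 4
## [1] 14

# Parentheses change the order
(2 + 3) * 4
## [1] 20

# Exponentiation happens before multiplication
2 * 3 ^ 2
## [1] 18
```

When in doubt about the order, adding parentheses makes your intent explicit.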
Creating objects
- An object is used to store information in R
- To create an object or variable in R we use the assignment operator <-, an arrow symbol pointing left
- Below we’ve created the variables x and y by assigning some numbers to them
x <- 10
y <- 5
x + y
## [1] 15
In the example above, the lines you run in your script come first, and the output is the line beginning with ##
- In RStudio, you will see the variables we created in the top right panel

- If you’ve already created a variable, you can replace the value with another value
x
## [1] 10
x <- 20
x
## [1] 20
Creating a variable
- In the top right panel you can see that the number stored in the variable x has changed
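One point worth checking for yourself: unlike a spreadsheet formula, an object computed from x keeps its old value when x is later replaced. The sketch below uses z, a variable name introduced here only for illustration, and ls(), which lists the objects in your session from the console.

```r
x <- 10
y <- 5
z <- x + y   # z stores the result, not the formula

x <- 20      # replace the value stored in x
z            # z is not recalculated
## [1] 15

z <- x + y   # rerun the line to update z
z
## [1] 25

ls()         # names of the objects currently in your session
```

This is one reason to keep your steps in a script: after changing a value, you can rerun the lines that depend on it.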
