Topics

In this session we’ll be covering:

Other sessions we’ll cover today and tomorrow:


Introductions


What is R

R is many things…


Really cool graph made in R


  • R is not just statistics software; it’s a programming language too
      ◦ A language lets the user do a nearly limitless range of tasks
  • R is open source, so users can write their own tools and make them available to others
      ◦ This sharing of information is centered around the concept of open source development
      ◦ Open source means that everything is freely available (kind of a big deal if you are working under state budget restrictions)
  • R is becoming widely used in academia and government, and is making headway into industry, especially the biotech and finance sectors. Here is a link describing the increasing popularity of R.


R vs. Excel


Advantages of Excel

  • Easy to use
  • Familiar
  • Easy to sort and scroll through data


Disadvantages of Excel

  • Easy to make a mistake
  • Limits on data size (1,048,576 rows per worksheet) are less of an issue than they used to be, but large datasets can still bog down your computer
  • Difficult to perform a series of steps
  • Without very good documentation, it is difficult to describe the steps of an analysis


Disadvantages of R

  • Difficult to use at first
  • Figuring out errors can be difficult
  • Relies on the user finding answers via Google searches; there is no structured support
  • Often your colleagues may not be familiar with it, so sharing data analyses with them can be difficult


Advantages of R

  • Because all of your steps are written down in code, errors are much easier to spot and fix
  • If you know the right tricks, the size of data you can work with is almost unlimited
  • Easily reproducible and repeatable
  • Active user community means lots of internet sources and rapid releases of new technology


When to Use Excel

  • Doing a one-time analysis, small dataset, basic graphics


When to Use R

  • Doing repeated analyses, lots of variables, advanced graphics
  • You will most likely begin as a hybrid user and migrate to R over time


R and RStudio


R

  • To download R for Windows, see this page
  • If you open R itself, it will look very plain

plain R console


RStudio

  • RStudio makes R a little more user friendly
  • It’s free and can be downloaded at rstudio.com
  • It’s not necessary to open RStudio to use R, but in these sessions we will assume that RStudio is your interface to R

  • When you first open RStudio, this is what you see:

first opening RStudio


  • The left panel is the console for R
  • Type 1 + 1 then hit “Enter” and R will return the answer
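
As a minimal sketch of what this looks like, the first line below is what you type at the console prompt and the line starting with ## is what R prints back. The [1] simply labels the first element of the result.

```r
# Typed at the console prompt, followed by Enter
1 + 1
## [1] 2
```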


RStudio 1 + 1


  • It’s a good idea to use a script so you can save your code
  • Open a new script by selecting “File” -> “New File” -> “R Script” and it will appear in the top left panel of RStudio


RStudio open script


  • This is basically a text document that can be saved (go to “File” -> “Save As”)
  • You can type and run more than one line at a time by highlighting the lines and clicking the “Run” button on the script toolbar


RStudio many lines

  • The bottom right panel can be used to find and open files, view plots, load packages, and look at help pages
  • The top right panel gives you information about what variables you’re working with during your R session
  • We’ll explain more about what to look for in those panels later

R basics

Doing math

  • Open up a script if you haven’t already (“File” -> “New File” -> “R Script”)
  • Try some math by either typing the lines below or copying and pasting the lines into your script
10 + 5
10 - 5
10 * 5
10 / 5
10 ^ 5
  • Remember, to run the lines, highlight your code and click the “Run” button on the toolbar of the script panel
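
For reference, running those five lines should print something like the following. This is a sketch of the expected console output, assuming R’s default print settings, under which a large round number such as 100000 is shown in scientific notation.

```r
10 + 5
## [1] 15
10 - 5
## [1] 5
10 * 5
## [1] 50
10 / 5
## [1] 2
10 ^ 5   # 100000, printed in scientific notation by default
## [1] 1e+05
```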

Reference table of arithmetic operators

Operator   Meaning          Example
+          addition         2 + 2
-          subtraction      2 - 2
*          multiplication   2 * 2
/          division         2 / 2
^ or **    exponentiation   2 ^ 2
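
These operators can be combined on one line. R follows the usual order of operations (exponents first, then multiplication and division, then addition and subtraction), and parentheses override it. A short sketch, with numbers chosen only for illustration:

```r
2 + 3 * 4      # multiplication happens before addition
## [1] 14
(2 + 3) * 4    # parentheses force the addition to happen first
## [1] 20
2 ^ 3          # exponentiation
## [1] 8
2 ** 3         # ** is an alternative spelling of ^
## [1] 8
```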

Creating objects

  • An object is used to store information in R
  • To create an object or variable in R we use the assignment operator <-, an arrow pointing left
  • Below we’ve created the variables x and y by assigning numbers to them, then added them together
x <- 10
y <- 5
x + y
## [1] 15
Above, the first three lines are what you run in your script; the line starting with ## is the output
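
An object can also store the result of a calculation so that it can be reused later. A minimal sketch, where the name total is a hypothetical choice made for this example:

```r
x <- 10
y <- 5
total <- x + y   # store the sum in a new object (name chosen for illustration)
total            # typing an object's name prints its value
## [1] 15
total * 2        # use the stored value in another calculation
## [1] 30
```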


  • In RStudio, you will see the variables we created in the top right panel


variables

  • If you’ve already created a variable, you can replace the value with another value
x
## [1] 10
x <- 20
x
## [1] 20

Creating a variable

  • In the top right panel you can see that the number stored in the variable x has changed
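
One thing to keep in mind: replacing the value of x does not update objects that were calculated from the old value. A small sketch of this behavior, where the name total is hypothetical:

```r
x <- 10
y <- 5
total <- x + y   # total is calculated once, using x = 10
x <- 20          # replace the value stored in x
total            # total still holds the old result
## [1] 15
total <- x + y   # re-run the calculation to pick up the new x
total
## [1] 25
```

This is one reason to keep your code in a script: after changing a value you can re-run the lines that depend on it.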


variables2


Exercise 1

  • Intro Exercises

Next session: