Introduction (Read Me First)

Hello! This document was created as part of the CC Bio INSITES Quantitative Workshop #2, specifically as a guide for the pre-activity to be completed before our meeting on ?????. This guide has been written assuming that you have little to no familiarity with R, and so begins with a brief overview of the software installation. Depending on your previous experience with R, you should feel free to scroll through and skip areas as needed. [Add additional bit about “your goal is to get stuck” and link to worksheet.]


Step 1: Installing R and RStudio

Installing R can be unexpectedly complicated at first, as there are two programs from separate websites that need to be installed. To provide a bit of background for the curious: the original ‘R’ is a programming language (like Java, Python, C, etc.) that began development in the 1990s. With the development of this programming language came the first computer program you’ll install, simply titled R, which is a bare-bones interface for writing and running code.

As R became more popular, the RStudio IDE (the second program you’ll install) was developed to build upon the original R and to make writing and running code a bit easier. RStudio doesn’t replace R – instead, it loads it in the background and interfaces between you and the R engine in order to provide some additional functionality. Therefore, in order to use RStudio, you have to install R first (note that you don’t have to open R before opening RStudio, so you should only need one additional shortcut on your desktop).

With that information in mind, hopefully getting R and RStudio installed won’t be too confusing anymore! This page provides a good overview for PC and Mac users, but in case it isn’t available (or you would just prefer to stay on this page) we’ll also provide directions for Windows and Macs here. If you use Linux, the link also provides step-by-step directions for Ubuntu towards the bottom of the page.


Step 1.1: Installing on Windows PC

Installing R

Head over to CRAN to download the current version of R. The link you need is right on the front page, labeled ‘Download R for Windows’; from there, look for the bolded ‘install R for the first time’ link and click to download the software. You can save the downloaded EXE wherever you like; when run, it will install R into the default directory, which you shouldn’t need to change.

Installing RStudio

The download page for RStudio makes it very easy to install RStudio for Windows. Hit the big blue button to get the installer, then run it to install RStudio. After that, you’re done! The link provided above will walk you through some additional stuff, but we’ll also cover that on our own so you don’t have to worry about it now.


Step 1.2: Installing on Mac

Installing R

Head over to CRAN to download the current version of R. The link you need is right on the front page, labeled ‘Download R for (Mac) OS X’; from there, look for the section titled ‘Latest release’ and download the indicated PKG. When run, it will install R into the default directory, which you shouldn’t need to change.

Installing RStudio

On the download page for RStudio, check out the section titled ‘All Installers’ and locate the link for macOS. Save the indicated file and run it; once again, you should accept the defaults (unless you’re familiar enough with your system to know why you shouldn’t). After that, you’re done! The link provided above will walk you through some additional stuff, but we’ll also cover that on our own so you don’t have to worry about it now.


Step 1.3: Run RStudio & Get Familiar With the Windows

Now that you’ve got everything installed, here’s a quick overview of what’s what in the program. When you open RStudio for the first time, you should see this:

Default setup of RStudio

The only thing that won’t match up is the contents of your ‘Files’ window, but that’s okay. Now for a quick breakdown of what the various windows are:

Default setup of RStudio, with some extra labels thrown in

1. Console

This window is where most of the output will be displayed, including errors and warnings. Everything in this window is text – if you create a plot, for instance, it will appear in the ‘Plots’ tab (in the same window as the ‘Files’ tab). The window does have a limit; if you run a bunch of code, or ask R to display something lengthy, some of your older code and output may get cut off as the window fills with new output. You can write code here, but it won’t be saved, so make sure you’re writing all of your code in the script window (#4, discussed more below).
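To see the difference between ordinary output, a warning, and an error, you could try something like the short sketch below. The values are made up for illustration, and tryCatch() is only used so the example keeps running after the error.

```r
# Ordinary output is printed to the console.
x <- 2 + 2
print(x)

# A warning: R still returns a result (NA here) but tells you that
# something looked wrong along the way.
y <- as.numeric("abc")
print(y)

# An error stops the command entirely. tryCatch() captures the error
# message here so the rest of the example can continue; typed directly
# into the console, log("abc") would simply print an error message.
msg <- tryCatch(log("abc"), error = function(e) conditionMessage(e))
print(msg)
```

Warnings are worth reading but do not stop your code; errors mean the command did not finish.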

2. Environment

This window lets you keep track of the objects you’re working with. If you import a dataset, that will be one object (usually a ‘dataframe’ that holds all of your rows and columns). If you run a regression and save the output, that will be another object (usually an ‘lm’, a collection of all the output produced by the model, such as coefficients, residuals, fitted values, and so on). Some objects in this window can be clicked (e.g., a dataframe), and the object will then be printed to the console or opened in a new window. There are some other tabs here as well, but they aren’t used often.
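As a small illustration of the two kinds of objects mentioned above, here is a minimal sketch. The dataset, its column names, and the object names (students, fit) are all made up for this example.

```r
# A small made-up dataset (hypothetical values, for illustration only).
students <- data.frame(
  interest = c(2, 4, 3, 5, 4),
  grade    = c(70, 85, 78, 92, 88)
)

# Run a simple linear regression and save the output as its own object.
fit <- lm(grade ~ interest, data = students)

# Each object has a type, which the Environment window also shows.
class(students)  # "data.frame"
class(fit)       # "lm"

# Some of the pieces stored inside the saved regression output.
coef(fit)
residuals(fit)
fitted(fit)

# ls() lists the names of the objects currently in your environment.
ls()
```

After running this, both students and fit should appear in the Environment window.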

3. Files (and Plots, Packages, Help, and Viewer)

Lots of stuff can be displayed in the tabs of this window, but it’s not as interactive as the others.

  • The ‘Files’ tab shows you the folder structure of your current working directory (more on working directories later).
  • ‘Plots’ shows you your most recently created plot, and has arrows to let you scroll through past plots or icons to help you save the plots to your computer.
  • ‘Packages’ lists all of the packages available to your current installation of R (more on packages, what they are, and why they’re important later).
  • ‘Help’ can be extremely useful, but it’s not as helpful as it sounds. If you’re new to R and working your way up the introductory learning curve, this window might leave you more confused than when you started. It works best when you have a specific question about a command you’re using (for instance, you want to know what the summary() command does, or how to specify variables for an lm() command).
  • ‘Viewer’ is used to display interactive output, which isn’t as common. For example, some packages will let you create interactive tables (e.g., sort your variables by mean), and this output will display in this window.
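Some of what these tabs show can also be requested with code. The sketch below uses only functions that come with R itself; the object name pkgs is just a label chosen for this example.

```r
# Packages tab: the names of the packages installed on this computer.
pkgs <- rownames(installed.packages())
head(pkgs)

# Check whether a particular package is installed ("stats" comes with R).
"stats" %in% pkgs

# Help tab: open the documentation page for a command.
help("summary")
?lm

# A quicker reminder of just the arguments a command accepts.
args(lm)
```

In RStudio, help("summary") and ?lm open their pages in the ‘Help’ tab rather than printing into the console.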

4. Code or Script Window

This is the window where you’ll spend most of your time, but it doesn’t show up automatically. This window only appears when you create a code file (File > New File) for writing and saving code. New files start out untitled and unsaved, as in other coding software or word processing programs you might be familiar with, so don’t forget to save! RStudio has a handy default setting that lets it save your workspace, specifically all of the code windows you have open and all of the objects in your environment, but sometimes there are issues, so don’t rely on it too much. Note: when you save files, the save location will default to your current working directory.


Step 2: Getting Started

The best way to get familiar with R is to use it, so let’s get started. First step: creating a Project!

Step 2.1: Creating a Project

What is a Project (uppercase P)?

Projects are a useful way of organizing your work environment, working directory, and file structure. They’re especially useful if you’re working on multiple projects (lowercase p). For instance, you might have a dataset consisting of information about your students’ grades, science interest, science self-efficacy, and growth mindsets. One collaborator might be interested in analyzing the relationship between grades and mindsets, while another wants to examine the relationship between interest and self-efficacy. You can, of course, run both analyses from the same Project, but splitting them helps keep things organized. For instance, one project might limit the dataset to first-year students, while the other looks at the entire sample, or even requires you to combine your data with some collected by your collaborator. Keeping track of any such decisions you’ve made is easier when you keep the projects in separate Projects.

They also make switching between projects easier. If you’re going from a meeting with Collaborator 1 to Collaborator 2, all you have to do is pull up the second project and everything (your working directory, the scripts you had open, the objects in your environment, etc.) will come up exactly as you left it.

So how do you create a project?

In the top right corner of your screen is a little blue box and a label that says Project: (None) (assuming, of course, that you don’t have a project open: see the screenshot below for more information).

The project button (outlined in black) and the tab that appears when it’s clicked (outlined in red)

Selecting ‘New Project’ will open a new window with the option to create a project in a new directory.

Create project window (select New Directory)

The next screen provides a list of potential project types. In this case, we want to select ‘New Project’ again.

The last screen lets you name your project and select the project’s working directory. You don’t need to create a new folder just for your new project! Whatever you input as the directory name will become the new folder for your project. If you created the folder structure recommended by the previous workshop, you can navigate to this location and select it as your new project subdirectory, or you can move your previously created folders into your new project location outside of R. I titled my project ‘Quant Workshop 2’.

Specify your directory and the name of the folder to be created for your new project

When complete, the project button will have the name of your new project, and the Files window will change to reflect your new working directory. If you look closely at the Files window in the screenshot below, you can see that the full directory for my new project is D:\Projects\Workshop 2.

Final view after creating new project

And you’re done! Your new project is ready for you to start creating script files, adding code, and importing your data.
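If you would like to double-check that the project is using the folder you expect, a quick sketch like the one below can be run in the console. The name project_dir is just a label chosen for this example.

```r
# The folder R is currently working in. With a Project open, this
# should be the project folder you just created.
project_dir <- getwd()
print(project_dir)

# Just the folder name, without the rest of the path.
basename(project_dir)

# The files in that folder. A newly created project should include a
# file ending in .Rproj, which RStudio uses to remember the Project.
list.files(project_dir)
```

The path printed by getwd() should match what the Files window shows.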

Step 2.2: Creating a Script File

File > New File will produce a list of the file types that RStudio can create, but the most commonly used is the R Script. Once you’ve created the new script, File > Save As will let you save it. The default location will be your project subdirectory, although you can save it outside this folder as well (not recommended).

RStudio after creating a new script file and saving it

In the screenshot above, you can see I created the project and script in a folder without any sub-folders. I added folders using the structure recommended in the previous workshop and moved my new script into the RCode location afterwards.
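Folders can be added and files moved in your usual file browser, but the same thing can also be done from R. The sketch below is one possible way, run inside a temporary folder so it doesn’t touch your real files; the names example_project and my_script.R are hypothetical, and only the RCode folder name comes from the text above.

```r
# Work in a temporary folder so this sketch does not touch real files.
project_dir <- file.path(tempdir(), "example_project")  # hypothetical name
dir.create(project_dir, showWarnings = FALSE)

# Stand-in for a script saved in the top level of the project.
script <- file.path(project_dir, "my_script.R")         # hypothetical name
file.create(script)

# Add the sub-folder, then move the script into it.
dir.create(file.path(project_dir, "RCode"), showWarnings = FALSE)
moved <- file.path(project_dir, "RCode", "my_script.R")
file.rename(script, moved)

# Show the resulting structure, including sub-folders.
list.files(project_dir, recursive = TRUE)
```

In a real project you would replace project_dir with your own project folder and use your own script name.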

Step 2.3: Annotating Your Code

Step 2.4: Organizing Your Code (Optional)


Step 3: Importing Data

Step 3.2: Importing from SPSS

Step 3.3: Basic Commands